Feature #15771

closed

Add `String#split` option to set `split_type string` with a single space separator

Added by 284km (kazuma furuhashi) over 6 years ago. Updated over 5 years ago.

Status: Feedback
Assignee: -
Target version: -
[ruby-core:92301]

Description

When the separator passed to String#split is a single space character, the method runs under split_type: awk, which splits on runs of whitespace and ignores leading whitespace.

When you want to split literally on a single space " " rather than on a run of whitespace characters, you need to take special care. For example, the CSV library works around this behavior like this:

    if @column_separator == " ".encode(@encoding)
      @split_column_separator = Regexp.new(@escaped_column_separator)
    else
      @split_column_separator = @column_separator
    end
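To make the difference concrete, here is a minimal sketch of the two behaviors in current Ruby. The variable names are only illustrative, and the regexp is built with Regexp.escape on the assumption that this is roughly what the escaped column separator in the CSV snippet amounts to.

```ruby
# Two spaces between "a" and "b", one space between "b" and "c".
line = "a  b c"

# A single-space string separator triggers awk-style splitting:
# runs of whitespace are treated as one delimiter.
awk_fields = line.split(" ")

# A regexp that matches exactly one space splits literally,
# so the doubled space produces an empty field.
literal_pattern = Regexp.new(Regexp.escape(" "))
literal_fields = line.split(literal_pattern)

p awk_fields      # => ["a", "b", "c"]
p literal_fields  # => ["a", "", "b", "c"]
```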

Unfortunately, using a regexp here is slower than using a string. The following benchmark result shows the regexp version is about nine times slower.

    $ be benchmark-driver string_split_string-regexp.yml --rbenv '2.6.2'
    Comparison:
      string:   3161117.6 i/s
      regexp:    344448.0 i/s - 9.18x slower

I want to add a :literal option to execute the method under split_type: string as follows:

    " a  b   c    ".split(" ")                    # => ["a", "b", "c"]
    " a  b   c    ".split(" ", literal: true)     # => ["", "a", "", "b", "", "", "c"]
    " a  b   c    ".split(" ", -1)                # => ["a", "b", "c", ""]
    " a  b   c    ".split(" ", -1, literal: true) # => ["", "a", "", "b", "", "", "c", "", "", "", ""]
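Until such an option exists, the proposed semantics can be approximated in plain Ruby. The sketch below defines a hypothetical helper, split_literal, which is not part of Ruby or of this proposal. It falls back to a regexp for the single-space case, so it carries the performance cost described above. The input uses runs of one to four spaces so that the empty fields are visible.

```ruby
# Hypothetical helper (not a Ruby API): emulates the proposed
# `literal: true` behavior using only the existing String#split.
def split_literal(str, sep, limit = 0)
  # A lone " " string would trigger awk-style splitting, so swap in
  # a regexp that matches exactly one space. Other separators are
  # already treated literally and are passed through unchanged.
  pattern = sep == " " ? Regexp.new(Regexp.escape(sep)) : sep
  str.split(pattern, limit)
end

input = " a  b   c    "

p split_literal(input, " ")
# => ["", "a", "", "b", "", "", "c"]
p split_literal(input, " ", -1)
# => ["", "a", "", "b", "", "", "c", "", "", "", ""]
```

As with the regular method, a limit of 0 drops trailing empty fields, while a negative limit keeps them.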

Implementation