Split an array of strings into an array of string arrays

I am looking for a way to split this array of strings:

["this", "is", "a", "test", ".", "I", "wonder", "if", "I", "can", "parse", "this",
"text", "?", "Without", "any", "errors", "!"]

into groups ending with punctuation:

[
  ["this", "is", "a", "test", "."],
  ["I", "wonder", "if", "I", "can", "parse", "this", "text", "?"],
  ["Without", "any", "errors", "!"]
]

Is there an easy way to do this? Or is the most sensible approach to iterate over the array, adding each element to a temporary array and appending that temporary array to the container array when punctuation is detected?

I was thinking about using slice or map, but I can't figure out whether this is possible.

+4
2 answers

Check out Enumerable#slice_after:

x.slice_after { |e| '.?!'.include?(e) }.to_a
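
For illustration, here is the one-liner applied to the array from the question, next to the manual loop the question describes. This is only a sketch: the names WORDS, PUNCTUATION and group_sentences are made up for this example, and the punctuation set is assumed to be just ".", "?" and "!".

```ruby
# Hypothetical names for illustration: WORDS, PUNCTUATION, group_sentences.
WORDS = ["this", "is", "a", "test", ".", "I", "wonder", "if", "I", "can",
         "parse", "this", "text", "?", "Without", "any", "errors", "!"]
PUNCTUATION = %w[. ? !]

# slice_after closes a group right after each element for which the block is truthy.
grouped = WORDS.slice_after { |e| PUNCTUATION.include?(e) }.to_a

# The manual approach from the question: collect into a temporary array
# and flush it into the container whenever punctuation is seen.
def group_sentences(tokens, enders = PUNCTUATION)
  groups = []
  current = []
  tokens.each do |token|
    current << token
    if enders.include?(token)
      groups << current
      current = []
    end
  end
  groups << current unless current.empty? # keep a trailing group without punctuation
  groups
end

p grouped
p group_sentences(WORDS) == grouped
```

Both give the three groups asked for. Note that the sketch tests membership in an array rather than using '.?!'.include?(e): the string version is also true for an empty string or for a token such as ".?", which may or may not be what you want.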
+11

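As an alternative to @ndn's answer, here is another approach that starts from the string rather than the array.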

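An array like the one in the question is often obtained by splitting a string into words and punctuation. For example: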

s = "this is a test. I wonder if I can parse this text? Without any errors!"
s.scan(/\w+|[.?!]/)
  #=> ["this", "is", "a", "test", ".", "I", "wonder", "if", "I", "can",
  #    "parse", "this", "text", "?", "Without", "any", "errors", "!"] 

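If that is the case, it may be more convenient to work with the string directly. For example, you could first use String#split with a regex to break s into sentences: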

r1 = /
     (?<=[.?!]) # positive lookbehind: match only right after one of these punctuation characters
     \s*   # match >= 0 whitespace characters to remove spaces
     /x    # extended/free-spacing regex definition mode

a = s.split(r1)
  #=> ["this is a test.", "I wonder if I can parse this text?",
  #    "Without any errors!"] 

Then split each sentence into words and punctuation:

r2 = /
     \s+       # match >= 1 whitespace characters
     |         # or
     (?=[.?!]) # use a positive lookahead to match a zero-width string
               # followed by one of the punctuation characters
     /x

b = a.map { |s| s.split(r2) }
  #=> [["this", "is", "a", "test", "."],
  #    ["I", "wonder", "if", "I", "can", "parse", "this", "text", "?"],
  #    ["Without", "any", "errors", "!"]]
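
The two steps can be wrapped in a single method. This is a minimal sketch reusing the regexes above; the names SENTENCE_BOUNDARY, TOKEN_BOUNDARY and sentences_to_tokens are hypothetical, and it assumes the input looks like s (words separated by whitespace, each sentence ending in a single ".", "?" or "!").

```ruby
# Hypothetical names for illustration only.
SENTENCE_BOUNDARY = /(?<=[.?!])\s*/ # split right after punctuation, dropping following whitespace
TOKEN_BOUNDARY    = /\s+|(?=[.?!])/ # split on whitespace, or just before punctuation

# Split the text into sentences, then each sentence into words and punctuation.
def sentences_to_tokens(text)
  text.split(SENTENCE_BOUNDARY).map { |sentence| sentence.split(TOKEN_BOUNDARY) }
end

s = "this is a test. I wonder if I can parse this text? Without any errors!"
p sentences_to_tokens(s)
  #=> [["this", "is", "a", "test", "."],
  #    ["I", "wonder", "if", "I", "can", "parse", "this", "text", "?"],
  #    ["Without", "any", "errors", "!"]]
```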
+2

Source: https://habr.com/ru/post/1621822/

