I think you just found a bug in the Ruby CSV module. From csv.rb:
1587: @re_chars = /#{%"[-][\\.^$?*+{}()|# \r\n\t\f\v]".encode(@encoding)}/
This regexp is used to escape characters that conflict with special regular expression symbols, including the pipe char | . I see no reason for the prepended [-] , so if you remove it, your example starts working.
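To make the effect of the prepended [-] concrete, here is a minimal sketch. The constant names and the `escape_for_regexp` helper are hypothetical, written only for illustration; they are not the CSV module's own code, and the patterns are built with `Regexp.new` instead of the interpolated literal used in csv.rb.

```ruby
# Pattern as quoted from csv.rb: a hyphen followed by ONE special character.
BUGGY_RE_CHARS = Regexp.new("[-][\\.^$?*+{}()|# \r\n\t\f\v]")

# Same pattern with the leading [-] removed: any single special character.
CLASS_ONLY_RE_CHARS = Regexp.new("[\\.^$?*+{}()|# \r\n\t\f\v]")

# Hypothetical helper (an assumption, not the library's method):
# prefix every match of re_chars with a backslash.
def escape_for_regexp(str, re_chars)
  str.gsub(re_chars) { |match| "\\" + match }
end

p escape_for_regexp("|", BUGGY_RE_CHARS)       # => "|"    (pipe left unescaped)
p escape_for_regexp("|", CLASS_ONLY_RE_CHARS)  # => "\\|"  (pipe escaped)
```

With the quoted pattern a lone pipe is never matched, so it would end up in a later regexp as an alternation operator instead of a literal quote character.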
Edit: a hyphen has to be escaped inside a character set expression (surrounded by brackets [] ) only when it is not the leading character. Therefore, I had to update the fixed Regexp:
1587: @re_chars = /#{%"(?<!\\[)-(?=.*\\])|[\\.^$?*+{}()|# \r\n\t\f\v]".encode(@encoding)}/

CSV.read('sample.csv', {quote_char: '|'})
# [["076N102 ",
#   "CARD ",
#   " 1", "NEW", "PCS "],
#  ["07-1801 ",
#   "BASE ",
#   " 18", "NEW", "PCS "]]
Since most languages, Ruby included, do not support lookbehind expressions with quantifiers, I had to write it as a negative lookbehind for the left bracket. As a result, it also matches hyphens whose pair is missing the left bracket. If you find a better solution, please leave a comment.
I would be glad to hear any comments before filing a bug report at ruby-lang.org.
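The behaviour of the updated pattern, including the caveat about a missing left bracket, can be checked with a short sketch. `FIXED_RE_CHARS` is a name chosen here for illustration, and the sample strings are made up; the pattern is the one above, built with `Regexp.new`.

```ruby
# Updated pattern: a hyphen not directly preceded by "[" but followed
# somewhere later on the line by "]", or any single special character.
FIXED_RE_CHARS = Regexp.new("(?<!\\[)-(?=.*\\])|[\\.^$?*+{}()|# \r\n\t\f\v]")

p "[a-z]".scan(FIXED_RE_CHARS)    # => ["-"]  hyphen inside a bracket pair
p "[-a]".scan(FIXED_RE_CHARS)     # => []     leading hyphen is skipped
p "a-z]".scan(FIXED_RE_CHARS)     # => ["-"]  left bracket missing, still matched
p "07-1801".scan(FIXED_RE_CHARS)  # => []     no closing bracket follows
p "|".scan(FIXED_RE_CHARS)        # => ["|"]  pipe is matched
```

The third case is the false positive mentioned above: the lookbehind only inspects the single character before the hyphen, so it cannot tell whether an opening bracket appears earlier in the string.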