Comma-Separated Values Containing Commas and Newlines

I have a string with some special characters. The goal is to get a String[] of the fields in each line (comma separated). The special character is the double quote ("): inside a quoted field there can be \n and commas.

For example, the main string is: Alpha,Beta,Gama,"23-5-2013,TOM",TOTO,"Julie, KameL Titi",God," timmy, tomy,tony, tini".

You can see that there are \n characters inside the "".

Can anyone help me parse this?

Thanks

More explanation

From the main string I need to separate these fields:

 Alpha
 Beta
 Gama
 23-5-2013,TOM
 TOTO
 Julie,KameL,Titi
 God
 timmy, tomy,tony,tini

Problem: for Julie, KameL, Titi there is a line break (\n) between KameL and Titi. Similarly, for timmy, tomy, tony, tini there is a line break (\n) between tony and tini.


This new text is in the file (it has to be read line by line):

 Alpha,Beta Charli,Delta,Delta Echo ,Frank George,Henry 1234-5,"Ida, John ", 25/11/1964, 15/12/1964,"40,000,000.00",0.0975,2,"King, Lincoln ",Mary / New York,123456 12543-01,"Ocean, Peter 

The output I want, with the " removed:

 Alpha Beta Charli Delta Delta Echo Frank George Henry 1234-5 Ida John " 25/11/1964 15/12/1964 40,000,000.00 0.0975 2 King Lincoln " Mary / New York 123456 12543-01 Ocean Peter 
3 answers

Try the following:

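    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class SplitCsv {
        public static void main(String[] args) {
            String source = "Alpha,Beta,Gama,\"23-5-2013,TOM\",TOTO,\"Julie, KameL\n"
                    + "Titi\",God,\" timmy, tomy,tony,\n"
                    + "tini\".";
            // Group 2: unquoted field. Group 3: quoted field without its quotes.
            Pattern p = Pattern.compile("(([^\"][^,]*)|\"([^\"]*)\"),?");
            Matcher m = p.matcher(source);
            while (m.find()) {
                // Strip the embedded line breaks before printing each field.
                if (m.group(2) != null)
                    System.out.println(m.group(2).replace("\n", ""));
                else if (m.group(3) != null)
                    System.out.println(m.group(3).replace("\n", ""));
            }
        }
    }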

If the pattern matches a field without quotes, the result is returned in group 2. Fields with quotes are returned in group 3. That is why I had to distinguish between the two in the while block. You may find a more elegant way.

Output:
Alpha
Beta
Gama
23-5-2013,TOM
TOTO
Julie, KameLTiti
God
 timmy, tomy,tony,tini
.
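
Since the question asks for a String[], here is a minimal sketch that collects the fields instead of printing them. It reuses the pattern from this answer unchanged. The class name RegexFieldCollector and the method name fields are made up for illustration. It inherits the pattern's limits: empty fields (two commas in a row) and escaped quotes are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
final class RegexFieldCollector {
    // Same pattern as above: group 2 = unquoted field, group 3 = quoted field body.
    private static final Pattern FIELD =
            Pattern.compile("(([^\"][^,]*)|\"([^\"]*)\"),?");

    static String[] fields(String source) {
        List<String> out = new ArrayList<>();
        Matcher m = FIELD.matcher(source);
        while (m.find()) {
            // Exactly one of the two groups is set for every match.
            String raw = m.group(2) != null ? m.group(2) : m.group(3);
            // Drop embedded line breaks, as the printing loop does.
            out.add(raw.replace("\n", ""));
        }
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String source = "Alpha,Beta,Gama,\"23-5-2013,TOM\",TOTO,\"Julie, KameL\n"
                + "Titi\",God,\" timmy, tomy,tony,\n"
                + "tini\".";
        System.out.println(Arrays.toString(fields(source)));
    }
}
```

Choosing the group with a single conditional expression removes the if/else branch from the loop body.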


Parsing CSV is much more complicated than it looks at first glance, so your best bet is to use a well-designed and tested library to do the job for you. Two such libraries are opencsv and Super CSV, and there are many others. Look at both and use the one that best suits your requirements and style.
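
To give a feel for the state a parser has to track, here is a minimal hand-written sketch of a quote-aware splitter for a single record. It is only an illustration and not a substitute for a tested library. The class name QuoteAwareSplitter and the method name splitRecord are hypothetical. Treating a doubled quote inside a quoted field as one literal quote is an assumption taken from the usual CSV convention (RFC 4180); it does not come from the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not a library API.
final class QuoteAwareSplitter {
    static List<String> splitRecord(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inQuotes) {
                if (c != '"') {
                    current.append(c);      // commas and newlines are plain data here
                } else if (i + 1 < record.length() && record.charAt(i + 1) == '"') {
                    current.append('"');    // assumed rule: "" means a literal quote
                    i++;
                } else {
                    inQuotes = false;       // closing quote
                }
            } else if (c == '"') {
                inQuotes = true;            // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);       // start the next field
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());     // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        String record = "Alpha,\"23-5-2013,TOM\",\"Julie, KameL\nTiti\",God";
        for (String field : splitRecord(record)) {
            System.out.println("[" + field + "]");
        }
    }
}
```

Even this sketch ignores malformed input, other delimiters, character encodings, and reading records that span several lines of a file, which is where a library earns its keep.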


Description

Consider the following PowerShell example of a universal regular expression, tested with a Java regex parser, which requires no extra processing to reassemble the parts of the data. The first capture group matches an opening quote, if there is one, and a backreference requires the same text at the end of the match, so you are sure to capture the entire value between the quotation marks but not the marks themselves. The commas are not captured either, unless they are embedded in a quote-delimited substring.

(?:^|,\s{0,})(["]?)\s{0,}((?:.|\n|\r)*?)\1(?=[,]\s{0,}|$)

Example

    $Matches = @()
    $String = 'Alpha,Beta,Gama,"23-5-2013,TOM",TOTO,"Julie, KameL\n Titi",God,"timmy, \n tomy,tony,tini"'
    $Regex = '(?:^|,\s{0,})(["]?)\s{0,}((?:.|\n|\r)*?)\1(?=[,]\s{0,}|$)'
    Write-Host start with
    write-host $String
    Write-Host
    Write-Host found
    ([regex]"(?i)(?m)$Regex").matches($String) | foreach {
        write-host "key at $($_.Groups[1].Index) = '$($_.Groups[1].Value)'`t= value at $($_.Groups[2].Index) = '$($_.Groups[2].Value)'"
        } # next match

Output

    start with
    Alpha,Beta,Gama,"23-5-2013,TOM",TOTO,"Julie, KameL\n Titi",God,"timmy, \n tomy,tony,tini"

    found
    key at 0 = '' = value at 0 = 'Alpha'
    key at 6 = '' = value at 6 = 'Beta'
    key at 11 = '' = value at 11 = 'Gama'
    key at 16 = '"' = value at 17 = '23-5-2013,TOM'
    key at 32 = '' = value at 32 = 'TOTO'
    key at 37 = '"' = value at 38 = 'Julie, KameL\n Titi'
    key at 60 = '' = value at 60 = 'God'
    key at 64 = '"' = value at 65 = 'timmy, \n tomy,tony,tini'

Summary


  • (?: start a non-capturing group
  • ^ require the start of the string
  • | or
  • ,\s{0,} a comma followed by any number of whitespace characters
  • ) close the non-capturing group
  • ( start capture group 1
  • ["]? consume a quote if it exists; I like writing it as a character class in case you want to allow characters other than a quote
  • ) close capture group 1
  • \s{0,} consume any whitespace if it exists, which means you don't need to trim the value later
  • ( start capture group 2
  • (?:.|\n|\r)*? capture all characters, including newlines, non-greedily
  • ) close capture group 2
  • \1 if there was a quote, it was stored in group 1, so if it was there at the start it must also be here
  • (?= start a zero-width lookahead assertion
  • [,]\s{0,} must be followed by a comma and optional whitespace
  • | or
  • $ end of the line
  • ) close the lookahead assertion
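
The pattern described above can be tried from Java roughly as follows. This is a sketch under assumptions: the class name BackreferenceCsvRegex and the method name values are made up, the (?i)(?m) flags of the PowerShell example are left out so that $ only matches at the end of the input, and the test string contains real line breaks rather than the literal \n text used in the PowerShell string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the regular expression from this answer.
final class BackreferenceCsvRegex {
    // Group 1 = optional opening quote, group 2 = the field value.
    private static final Pattern FIELD = Pattern.compile(
            "(?:^|,\\s{0,})([\"]?)\\s{0,}((?:.|\\n|\\r)*?)\\1(?=[,]\\s{0,}|$)");

    static List<String> values(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            out.add(m.group(2)); // value without quotes or leading whitespace
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "Alpha,Beta,Gama,\"23-5-2013,TOM\",TOTO,\"Julie, KameL\n"
                + "Titi\",God,\" timmy, tomy,tony,\ntini\"";
        for (String value : values(line)) {
            System.out.println("'" + value + "'");
        }
    }
}
```

Unlike the first answer, the embedded line breaks stay in the captured values here, so remove them afterwards if you need single-line fields.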

Source: https://habr.com/ru/post/1480884/

