How can I handle a delimited text file efficiently?

I am just trying to call File.ReadAllLines on a specific file and split each line on |. I have to use a regex for this.

This code below does not work, but you will see what I'm trying to do:

    string[] contents = File.ReadAllLines(filename);
    string[] splitlines = Regex.Split(contents, '|');
    foreach (string split in splitlines)
    {
        //Regex line = content.Split('|');
        //content.Split('|');
        string prefix = prefix = Regex.Match(line, @"(\S+)(\d+)").Groups[0].Value;
        File.AppendAllText(workingdirform2 + "configuration.txt", prefix+"\r\n");
    }
5 answers

It's not entirely clear to me what you are trying to do, but there are a number of errors in your code. I have tried to guess what you intend; if this is not what you want, please explain what you do want, preferably with some examples:

    // Requires: using System.IO; using System.Text.RegularExpressions;
    string inputFilename = "input.txt";
    string outputFilename = "output.txt";

    using (StreamWriter streamWriter = File.AppendText(outputFilename))
    {
        using (StreamReader streamReader = File.OpenText(inputFilename))
        {
            while (true)
            {
                // ReadLine returns null at the end of the file.
                string line = streamReader.ReadLine();
                if (line == null)
                {
                    break;
                }

                string[] splitlines = line.Split('|');
                foreach (string split in splitlines)
                {
                    Match match = Regex.Match(split, @"\S+\d+");
                    if (match.Success)
                    {
                        string prefix = match.Groups[0].Value;
                        streamWriter.WriteLine(prefix);
                    }
                    else
                    {
                        // Handle match failed...
                    }
                }
            }
        }
    }

Key points:

  • It seems you want to perform an operation on each line, so you need to iterate over the lines.
  • Use the simple string.Split method if you want to split on a single character. Regex.Split does not accept a char, and "|" has a special meaning in regular expressions, so it would not work unless you escaped it.
  • You open and close the output file on every iteration. You should open it only once and leave it open until you finish writing. The using keyword is useful here.
  • Use WriteLine instead of appending "\r\n".
  • If the input file is large, use StreamReader instead of ReadAllLines.
  • If the match fails, Groups[0].Value is just an empty string, so your program would silently write blank lines. You should check match.Success before using the match, and if it returns false, handle the failure accordingly (skip the line, report a warning, throw an exception with an appropriate message, etc.).
  • You are not actually using groups 1 and 2 in the regular expression, so you can remove the parentheses to keep the regex engine from storing results that you will never use.
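
The points about splitting and checking the match can be tried out on in-memory strings. This is only a minimal sketch, not the asker's actual program: the class name SplitSketch, both method names, and the sample input in the usage note are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SplitSketch
{
    // Hypothetical helper: splits one line on '|' and returns the part of each
    // field matched by \S+\d+ (non-whitespace characters followed by digits).
    public static List<string> ExtractMatches(string line)
    {
        var results = new List<string>();

        // string.Split treats the character literally, so no escaping is needed.
        foreach (string field in line.Split('|'))
        {
            Match match = Regex.Match(field, @"\S+\d+");
            if (match.Success)
            {
                results.Add(match.Value);
            }
            // Fields without a match are skipped here; logging them would work too.
        }

        return results;
    }

    // With Regex.Split the pipe has to be escaped, because an unescaped '|'
    // means alternation in a regular expression.
    public static string[] SplitWithRegex(string line)
    {
        return Regex.Split(line, @"\|");
    }
}
```

For example, SplitSketch.ExtractMatches("Text1231|abc|x9") keeps the first and last fields and skips "abc", which contains no digit.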

Regex.Split accepts a string, not an array of strings.

I would recommend calling Regex.Split on each line individually and then iterating over the results of that call. That means nested loops.

    string[] contents = File.ReadAllLines(filename);
    foreach (string line in contents)
    {
        // The pipe is escaped because '|' means alternation in a regex.
        string[] splitlines = Regex.Split(line, @"\|");
        foreach (string splitline in splitlines)
        {
            string prefix = Regex.Match(splitline, @"(\S+)(\d+)").Groups[0].Value;
            File.AppendAllText(workingdirform2 + "configuration.txt", prefix+"\r\n");
        }
    }

This, of course, is not the most efficient way to do it.

A more efficient way would be to split the whole file with a single regular expression that matches both line breaks and pipes. I think this works:

    // Split on line breaks or on a literal pipe; Regex.Split returns an array.
    string[] splitlines = Regex.Split(File.ReadAllText(filename), @"\r?\n|\|");
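
The single-pass idea can be checked on an in-memory string before pointing it at a file. The sketch below is an illustration under assumptions: the class name WholeTextSplitSketch, the method SplitAll, and the pattern (an optional carriage return plus newline, or an escaped pipe) are choices made for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WholeTextSplitSketch
{
    // Splits the full text on line breaks and on literal pipes in one call.
    public static string[] SplitAll(string text)
    {
        // \r?\n matches Windows or Unix line endings; \| is an escaped pipe.
        return Regex.Split(text, @"\r?\n|\|");
    }
}
```

Two things to keep in mind with this approach: the result is one flat array, so the information about which field came from which line is lost, and a trailing line break in the input produces an empty final element.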
  • You should pass a single source string to Regex.Split, not an array.

  • It looks like you are using line instead of split when setting prefix. Without knowing more about your code, I can't say whether that is intended, but it certainly looks like a mistake. (It should not compile either.)

  • It is really inefficient, on at least two levels :)

I have to assume, based on limited feedback, that this is what you are looking for:

    string inputFile = filename;
    string outputFile = Path.Combine( workingdirform2, "configuration.txt" );

    using ( StreamReader inputFileStream = File.OpenText( inputFile ) )
    {
        using ( StreamWriter outputFileStream = File.AppendText( outputFile ) )
        {
            // Iterate over the file contents to extract the prefix
            string currentLine;
            while ( ( currentLine = inputFileStream.ReadLine() ) != null )
            {
                // Notice the updated Regex - yours is a bit broken
                string prefix = Regex.Match( currentLine, @"^(\S+?)\d+" ).Groups[1].Value;
                outputFileStream.WriteLine( prefix );
            }
        }
    }

This takes a file full of lines like:

    Text1231|abc|abc
    Text1232|abc|abc
    Text1233|abc|abc
    Text1234|abc|abc

and writes:

    Text
    Text
    Text
    Text

to a new file.
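
To see why the pattern yields only the text before the digits, it helps to run it on a single line. The following is a small sketch, with the class name PrefixSketch and the method TryExtractPrefix invented for the example. The lazy quantifier in (\S+?) grows one character at a time until \d+ can match, so group 1 stops right before the first digit.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PrefixSketch
{
    // Hypothetical helper: extracts the text that precedes the first run of
    // digits at the start of the line. Returns false when the line has no
    // such shape, in which case prefix is set to an empty string.
    public static bool TryExtractPrefix(string line, out string prefix)
    {
        Match match = Regex.Match(line, @"^(\S+?)\d+");
        prefix = match.Success ? match.Groups[1].Value : string.Empty;
        return match.Success;
    }
}
```

Called on "Text1231|abc|abc", the helper returns true and sets prefix to "Text"; on a line with no digits it returns false, which lets the caller decide whether to skip the line or report it.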

Hope this at least puts you on the right track. My crystal ball is getting foggy .. haaazzzy ..


Probably one of the best ways to handle text files in C# is to use FileHelpers. Take a look. It lets you import the data as strongly typed records.


Source: https://habr.com/ru/post/1299132/

