C#: matching two text files, case-sensitivity issue

I have two files, sourcecolumns.txt and destcolumns.txt. I need to compare the source with dest and, if dest does not contain a source value, write that value to a new file. The code below works, except that I have problems like this:

source: CPI
dest: Cpi

They do not match because the letter case differs, so I get the wrong output. Any help is always appreciated!

    string[] sourcelinestotal = File.ReadAllLines("C:\\testdirectory\\" + "sourcecolumns.txt");
    string[] destlinestotal = File.ReadAllLines("C:\\testdirectory\\" + "destcolumns.txt");

    foreach (string sline in sourcelinestotal)
    {
        if (destlinestotal.Contains(sline))
        {
        }
        else
        {
            File.AppendAllText("C:\\testdirectory\\" + "missingcolumns.txt", sline);
        }
    }
+4
3 answers

You can do this with an extension method on IEnumerable<string>, for example:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class EnumerableExtensions
    {
        // Case-aware overload of Contains for sequences of strings.
        public static bool Contains(this IEnumerable<string> source, string value, StringComparison comparison)
        {
            if (source == null)
            {
                return false; // nothing is a member of the empty set
            }

            return source.Any(s => string.Equals(s, value, comparison));
        }
    }

then change

 if (destlinestotal.Contains( sline )) 

to

 if (destlinestotal.Contains( sline, StringComparison.OrdinalIgnoreCase )) 

However, if the sets are large and/or you are going to do this often, the way you are doing it is very inefficient. Essentially, you are performing an O(n²) operation: for each line in the source, you compare it against potentially every line in the destination. It would be better to create a HashSet from the destination columns using a case-insensitive comparer, and then iterate over the source columns to check whether each one exists in the HashSet of destination columns. This would be an O(n) algorithm. Note that Contains on the HashSet will use the comparer that you provide in the constructor.

    string[] sourcelinestotal = File.ReadAllLines("C:\\testdirectory\\" + "sourcecolumns.txt");
    HashSet<string> destlinestotal = new HashSet<string>(
        File.ReadAllLines("C:\\testdirectory\\" + "destcolumns.txt"),
        StringComparer.OrdinalIgnoreCase);

    foreach (string sline in sourcelinestotal)
    {
        if (!destlinestotal.Contains(sline))
        {
            File.AppendAllText("C:\\testdirectory\\" + "missingcolumns.txt", sline);
        }
    }

In retrospect, I actually prefer this solution over writing your own case-insensitive Contains for IEnumerable<string>, unless you need that method for something else. With the HashSet implementation there is less code of your own to maintain.
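As a rough sketch of the HashSet approach, the lookup can be pulled into a small helper that works on in-memory sequences, which makes it easy to try without touching the disk. The names ColumnDiff and FindMissing are hypothetical and not part of the original answer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not from the original answer.
public static class ColumnDiff
{
    // Returns the source lines that have no case-insensitive match in dest,
    // in their original order (duplicates in source are kept).
    public static List<string> FindMissing(IEnumerable<string> source, IEnumerable<string> dest)
    {
        // The comparer passed to the constructor is the one Contains uses for lookups.
        var destSet = new HashSet<string>(dest, StringComparer.OrdinalIgnoreCase);

        var missing = new List<string>();
        foreach (string line in source)
        {
            if (!destSet.Contains(line))
            {
                missing.Add(line);
            }
        }
        return missing;
    }
}
```

Assuming you want one missing value per line in the output file, the result could be passed to File.WriteAllLines together with the missingcolumns.txt path. Note that File.AppendAllText, as used in the question, does not add a line break after each value.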

+5

Use an extension method for Contains. A good example can be found here on Stack Overflow. The code is not mine, but I will post it below.

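    public static class StringExtensions
    {
        // Case-aware substring test; extension methods must live in a static class.
        public static bool Contains(this string source, string toCheck, StringComparison comp)
        {
            return source.IndexOf(toCheck, comp) >= 0;
        }
    }

    // Usage:
    string title = "STRING";
    bool contains = title.Contains("string", StringComparison.OrdinalIgnoreCase);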
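One thing to keep in mind is that this extension tests for a substring inside a single string, whereas the question asks whether an array of lines contains a whole line. The sketch below contrasts the two. The class and method names are hypothetical, and the whole-line check uses the LINQ Contains overload that accepts an IEqualityComparer<string>.

```csharp
using System;
using System.Linq;

// Hypothetical demo class for illustration; not from the original answer.
public static class ContainsDemo
{
    // Substring test: true if toCheck occurs anywhere inside source, ignoring case.
    public static bool ContainsText(string source, string toCheck)
    {
        return source.IndexOf(toCheck, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Whole-line test: true only if some element equals value, ignoring case.
    public static bool ContainsLine(string[] lines, string value)
    {
        return lines.Contains(value, StringComparer.OrdinalIgnoreCase);
    }
}
```

With a made-up value such as "CPI_ADJUSTED", ContainsText reports a match for "cpi" while ContainsLine does not, so the substring version could hide columns that are in fact missing.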
+4

If you don't need case sensitivity, convert your strings to uppercase using string.ToUpper before comparing.
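A minimal sketch of the upper-casing idea might look like the following. It uses ToUpperInvariant rather than ToUpper, which is an assumption on my part: ToUpper follows the current culture, so results can differ between machines (for example, under a Turkish culture "i" upper-cases to "İ"). The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not from the original answer.
public static class UpperCaseCompare
{
    // Returns the source lines whose upper-cased form is absent from dest.
    public static List<string> FindMissing(string[] source, string[] dest)
    {
        // Normalize the destination values once, up front.
        var destUpper = new HashSet<string>(dest.Select(d => d.ToUpperInvariant()));

        // Normalize each source value only for the lookup; the original text is returned.
        return source.Where(s => !destUpper.Contains(s.ToUpperInvariant())).ToList();
    }
}
```

This allocates a new string for every value it normalizes, which passing StringComparer.OrdinalIgnoreCase to the HashSet constructor avoids.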

0

Source: https://habr.com/ru/post/1308271/

