I have a text file that contains more than 14,000 lines, many of which are duplicates. How can I count the number of unique records?
This is just a one-liner:
var uniqueCount = File.ReadAllLines("FileToRead.txt").Distinct().Count(); // needs using System.IO; and using System.Linq;
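Edit: Be careful with this approach, though. File.ReadAllLines loads the entire file into memory, so very large files (larger than about 600 MB) may cause problems.
You can use the File.ReadLines method, which reads the file lazily, together with the LINQ Distinct and Count extension methods: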
var result = File.ReadLines("input.txt").Distinct().Count();
Iterate through the file, store each record you encounter in a collection, skip records you have already seen, and at the end check the size of the collection.
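The manual approach described above could be sketched roughly as follows. This is only an illustration: the class and method names (UniqueLineCounter, CountUniqueLines) are hypothetical, and it assumes that two records are "the same" when their lines match exactly, including case and whitespace.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class UniqueLineCounter
{
    // Hypothetical helper: counts distinct lines in a text file.
    // Assumption: lines are compared ordinally (case-sensitive, no trimming).
    public static int CountUniqueLines(string path)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);

        // File.ReadLines streams the file one line at a time,
        // so only the unique lines are kept in memory.
        foreach (string line in File.ReadLines(path))
        {
            // HashSet.Add returns false for a line that is already present
            // and leaves the set unchanged, so duplicates are ignored.
            seen.Add(line);
        }

        return seen.Count;
    }
}
```

Calling UniqueLineCounter.CountUniqueLines("input.txt") would then return the number of unique lines. If case or surrounding whitespace should not matter, a different comparer (for example StringComparer.OrdinalIgnoreCase) or a call to Trim() on each line would be a reasonable adjustment.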
Source: https://habr.com/ru/post/1382284/