Reading a text file by word using LINQ

I am learning LINQ and I want to read a text file (say, an e-book) word by word using LINQ.

Here is what I can come up with:

static void Main()
{
    string[] content = File.ReadAllLines("text.txt");

    // Select each line (the range variable c), not the whole array
    var query = from c in content
                select c;

    foreach (var line in query)
    {
        Console.Write(line + "\n");
    }
}

This reads the file line by line. If I change ReadAllLines to ReadAllText, the file is read character by character.

Any ideas?

+3
6 answers
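string[] content = File.ReadAllLines("text.txt");
// Split takes a char[] here; the single-char overload with options does not exist in .NET 4
var words = content.SelectMany(line => line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
foreach (string word in words)
{
    // process each word here
}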

You will need to add any other whitespace characters you need as separators. Using StringSplitOptions to handle consecutive spaces is cleaner than the Where clause I originally used.

In .NET 4, you can use File.ReadLines for lazy evaluation and therefore lower RAM usage when working with large files.
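
As a rough sketch of that lazy approach, assuming .NET 4 or later: the helper below combines File.ReadLines with SelectMany so that lines are pulled from the file only as the words are consumed. The names LazyWordReader and ReadWords are made up for illustration and do not come from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class LazyWordReader
{
    // Hypothetical helper: returns the words of a file as a lazy sequence.
    public static IEnumerable<string> ReadWords(string path)
    {
        // Separators are an assumption; add whatever else your text needs.
        char[] separators = { ' ', '\t' };

        // File.ReadLines yields one line at a time instead of loading
        // the whole file into an array like File.ReadAllLines does.
        return File.ReadLines(path)
                   .SelectMany(line => line.Split(separators, StringSplitOptions.RemoveEmptyEntries));
    }
}
```

Because the sequence is lazy, a call such as LazyWordReader.ReadWords("text.txt").Take(10) should only read as many lines as it needs to produce the first ten words.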

+3
string str = File.ReadAllText("text.txt");
char[] separators = { '\n', ',', '.', ' ', '"' };    // add your own
var words = str.Split(separators, StringSplitOptions.RemoveEmptyEntries);
+1
string content = File.ReadAllText("Text.txt");

var words = from word in content.Split(WhiteSpace, StringSplitOptions.RemoveEmptyEntries)
            select word;

where WhiteSpace is defined as:

// Split needs a char[]; Environment.NewLine is a string, so list its characters
char[] WhiteSpace = { '\r', '\n', ' ', '\t' };

You may also want to add punctuation such as ',', '(' and ')' to the separators.

0

Read the text with ReadAllText(), then match the words with a regular expression:

string text = File.ReadAllText("text.txt");
Regex re = new Regex("[a-zA-Z0-9_-]+", RegexOptions.Compiled); // You'll need to change the RE to fit your needs
Match m = re.Match(text);
while (m.Success)
{
    // The pattern has no capture groups, so take the whole match
    string word = m.Value;

    // do your processing here

    m = m.NextMatch();
}
0

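static IEnumerable<string> GetWords(string path)
{
    foreach (var line in File.ReadLines(path))
    {
        // A null separator splits on whitespace; the cast keeps the call
        // unambiguous on newer .NET versions that add more Split overloads
        foreach (var word in line.Split((char[])null))
        {
            yield return word;
        }
    }
}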

(Split(null) splits on whitespace.)

Usage:

foreach (var word in GetWords(@"text.txt")){
    Console.WriteLine(word);
}

With LINQ:

GetWords(@"text.txt").Take(25);
GetWords(@"text.txt").Where(w => w.Length > 3);

0

You can write content.ToList().ForEach(p => p.Split(' ').ToList().ForEach(Console.WriteLine)), but that is not really LINQ.

-1

Source: https://habr.com/ru/post/1769523/