It's not clear what your input file looks like, but you imply that it contains just a single line of many words.
300 KB is a long way from being a "big text file". You should read it in its entirety and pull the words out of it one by one. This program demonstrates
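    use strict;
    use warnings;

    # Slurp the whole file into a single string
    my $data = do {
        open my $fh, '<', 'data.txt' or die $!;
        local $/;    # undefine the input record separator to read everything at once
        <$fh>;
    };

    my $count = 0;

    # Each run of non-whitespace characters counts as one word
    while ( $data =~ /(\S+)/g ) {
        my $word = $1;
        ++$count;
        printf "%2d: %s\n", $count, $word;
    }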
Output
     1: alpha
     2: beta
     3: gamma
     4: delta
     5: epsilon
Without an explanation of what a "wrong word count" might be, it is very difficult to help, but it is certain that the problem is not the size of your array: if that were a problem, Perl would raise an exception and die.
But if you are comparing the result with the statistics from a word processor, then the difference is probably because the definition of "word" is different. For example, a word processor might count a hyphenated word as two words.
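To illustrate how the definition of a word changes the count, here is a minimal sketch that counts the same text in two ways. The sample string and the two helper subs (`count_whitespace_words`, `count_alnum_words`) are made up for this illustration, and the second rule is only a guess at how some other tool might count; a real word processor may well use different rules.

```perl
use strict;
use warnings;

# Hypothetical sample text, not taken from the question
my $text = "A well-known state-of-the-art word counter";

# Definition 1: a word is any run of non-whitespace characters
sub count_whitespace_words {
    my ($s) = @_;
    my $n = () = $s =~ /\S+/g;    # list assignment in scalar context yields the match count
    return $n;
}

# Definition 2 (an assumption for illustration): a word is a run of
# letters, digits or underscores, so a hyphen splits a token into several words
sub count_alnum_words {
    my ($s) = @_;
    my $n = () = $s =~ /\w+/g;
    return $n;
}

printf "whitespace-delimited: %d\n", count_whitespace_words($text);
printf "split at hyphens:     %d\n", count_alnum_words($text);
```

For this sample the first count is 5 and the second is 9, so a discrepancy of this kind can come purely from the counting rule and has nothing to do with the size of the file or the array.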