Count duplicate words in a file

Question

Purpose: find the number of times each word occurs in a file. The file contains more than 1000 words.

My approach: use a HashMap<String, Integer> to store and count the number of times each word appears in the file.
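For illustration, the HashMap approach described above might look roughly like the sketch below. The class and method names (WordCounter, addWords, countWords) are hypothetical, and the tokenization (lower-casing, splitting on anything that is not a letter) is an assumption; the question does not say how a "word" is defined.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class; not taken from the original post.
class WordCounter {

    // Adds every word found in `text` to the running counts.
    static void addWords(String text, Map<String, Integer> counts) {
        // Assumption: words are runs of letters, compared case-insensitively.
        for (String word : text.toLowerCase().split("[^\\p{L}]+")) {
            if (!word.isEmpty()) {
                // merge() inserts 1 for a new word, otherwise adds 1 to the old count.
                counts.merge(word, 1, Integer::sum);
            }
        }
    }

    // Counts the words across all given lines.
    static Map<String, Integer> countWords(Iterable<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            addWords(line, counts);
        }
        return counts;
    }

    // Usage: java WordCounter <file>; prints only the words seen more than once.
    public static void main(String[] args) throws IOException {
        Map<String, Integer> counts = countWords(Files.readAllLines(Paths.get(args[0])));
        counts.forEach((word, n) -> {
            if (n > 1) {
                System.out.println(word + " " + n);
            }
        });
    }
}
```

Each lookup and update is expected constant time on average, so the whole pass is roughly linear in the number of words in the file.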

Question: Would a HashMap be the better choice, or would it be better to use a binary tree to provide faster search, since the file has a large number of words?

Or is there a better way to do this?

HashMap will lead to a large amount of memory overhead, which is undesirable.

+3

Junior Oct 15 '10 at 13:01

5 answers

1000 - 10000 .


+5

Mitch Wheat Oct 15 '10 at 13:06

Perl/PHP. .

+1

Noam Oct 15 '10 at 13:45

A HashMap .

HashMap !

0

HenryTaylor Oct 15 '10 at 13:08

0

madhurtanwani Oct 15 '10 at 13:39

Michael D · Accepted Answer · 2010-10-15T13:08:49+0000

So, are you looking for distinct words?

The most efficient data structure I can think of for this is a trie.
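As a rough illustration of the trie idea, here is a minimal sketch of a counting trie. The class name WordTrie and its methods are hypothetical, and storing children in a HashMap per node is just one possible representation. Whether this actually uses less memory than a plain HashMap<String, Integer> depends on how many prefixes the words share and would need measuring.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical counting trie; not taken from the original answer.
class WordTrie {

    private static final class Node {
        // One child per next character; shared prefixes share nodes.
        final Map<Character, Node> children = new HashMap<>();
        // Number of times a word ending exactly at this node was added.
        int count;
    }

    private final Node root = new Node();

    // Records one occurrence of `word`.
    void add(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.count++;
    }

    // Returns how many times `word` was added (0 if never seen).
    int count(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.get(word.charAt(i));
            if (node == null) {
                return 0;
            }
        }
        return node.count;
    }
}
```

Adding or looking up a word takes time proportional to its length, independent of how many words are stored.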

Mitch Wheat - , HashMap ( ... HashMap, , )