I have a file of 7.6M lines. Each line has the form A, B, C, D, where A is a unique identifier for the line and B, C, D are values used to compute A's level of importance. My approach:
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.HashMap;
import java.util.Map;

// Fields used by read() and split(); each line holds 4 comma-separated tokens.
private static final char delimiter = ',';
private final String[] splitted = new String[4];

private void read(String filename) throws Throwable {
    BufferedReader br = new BufferedReader(new FileReader(filename));
    Map<String, Double> mmap = new HashMap<>(10000000, 0.8f);
    String line;
    long t0 = System.currentTimeMillis();
    while ((line = br.readLine()) != null) {
        split(line);
        mmap.put(splitted[0], 0.0);
    }
    long t1 = System.currentTimeMillis();
    br.close();
    System.out.println("Completed in " + (t1 - t0) / 1000.0 + " seconds");
}

private void split(String line) {
    int idxComma, idxToken = 0, fromIndex = 0;
    while ((idxComma = line.indexOf(delimiter, fromIndex)) != -1) {
        splitted[idxToken++] = line.substring(fromIndex, idxComma);
        fromIndex = idxComma + 1;
    }
    splitted[idxToken] = line.substring(fromIndex);
}
where a dummy value of 0.0 is inserted for profiling purposes, and splitted is a simple String array defined as a field of the class. At first I worked with the String.split() method, but found that the above was faster.
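For reference, the String.split() version I started with looked roughly like this (a sketch of the idea, not the exact code I had); the loop body in read() was simply:

// Earlier variant: String.split() allocates a fresh String[] for every line.
while ((line = br.readLine()) != null) {
    String[] parts = line.split(",");
    mmap.put(parts[0], 0.0);
}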
However, this is waaaay too slow: it takes about 12 seconds to complete. If I only read and split the lines without the HashMap put (i.e., drop the mmap.put call), it finishes in about 3 seconds.
So I suspect the extra time goes either into (i) the HashMap insertion itself (resizing, rehashing?) or (ii) the hashCode() computation for the keys. To test (ii), I also tried a HashSet instead of the HashMap, which took about 4 seconds.
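Concretely, the HashSet variant I timed was along these lines (a sketch; readIntoSet is just a name I'm using here, and split/splitted are the same as above):

import java.util.HashSet;
import java.util.Set;

// Same read loop, but only the identifier goes into a HashSet instead of the HashMap.
private void readIntoSet(String filename) throws Throwable {
    BufferedReader br = new BufferedReader(new FileReader(filename));
    Set<String> keys = new HashSet<>(10000000, 0.8f);
    String line;
    long t0 = System.currentTimeMillis();
    while ((line = br.readLine()) != null) {
        split(line);
        keys.add(splitted[0]); // HashSet.add still computes hashCode() on the key
    }
    long t1 = System.currentTimeMillis();
    br.close();
    System.out.println("Completed in " + (t1 - t0) / 1000.0 + " seconds");
}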
My question: why is the HashMap insertion so slow? Is hashCode() the culprit for these keys, and if so, is there anything I can do about it?
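To make the hashCode() part of the question concrete: what I have in mind is isolating the hash computation from the map entirely, along the lines of this sketch (hashOnly is just an illustrative name):

// Sketch: compute only the identifiers' hash codes, with no map or set involved.
private void hashOnly(String filename) throws Throwable {
    BufferedReader br = new BufferedReader(new FileReader(filename));
    String line;
    long sink = 0; // accumulate the hashes so the JIT cannot eliminate the calls
    long t0 = System.currentTimeMillis();
    while ((line = br.readLine()) != null) {
        split(line);
        sink += splitted[0].hashCode(); // String.hashCode() walks every character of a fresh string
    }
    long t1 = System.currentTimeMillis();
    br.close();
    System.out.println("hashCode only: " + (t1 - t0) / 1000.0 + " seconds (sink=" + sink + ")");
}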