Extremely slow even with TLongObjectHashMap

I need to put about 20 million entries into a hash map. I chose TLongObjectHashMap based on this question: Why does Java HashMap slow down?

The code looks like this:

StringBuilder sb = new StringBuilder();
StringBuilder value = new StringBuilder();
TLongObjectHashMap<String> map = new TLongObjectHashMap<String>();

in = new FileInputStream(new File(inputFile));
br = new BufferedReader(new InputStreamReader(in), 102400);
for (String inLine; (inLine = br.readLine()) != null;) {
    // Build the key from a fixed range of characters in the line.
    sb.setLength(0);
    for (i = 0; i < 2; i++) {
        for (j = 1; j < 12; j++) {
            sb.append(inLine.charAt(j));
        }
    }

    // Build the value from two fixed ranges of characters in the line.
    for (k = 2; k < 4; k++) {
        value.append(inLine.charAt(k));
    }
    for (k = 7; k < 11; k++) {
        value.append(inLine.charAt(k));
    }
    map.put(Long.parseLong(sb.toString()), value.toString());
    value.setLength(0);
}

I used GNU Trove. However, it becomes extremely slow and almost stops at about 15 million records. No OutOfMemoryError has been thrown yet. What is the problem?

Using a database is not an option for this.

Note: values such as 1, 12, 2, 4, etc. are calculated before this loop and stored in variables, which are then used here. I just replaced them with constants for this post.

2 answers

I used GNU Trove. However, it becomes extremely slow and almost stops at about 15 million records. No OutOfMemoryError has been thrown yet. What is the problem?

, . , , String.substring(). . , , , . , StringBuilder.

, . - , . , , , ( HashMap, Trove, 100 000 000 2 ). , .

// Requires: import java.util.HashMap; import java.util.Map; import java.util.Random;
private static Map<Long,String> fillMap(int items)
{
    // Presize the map with the expected number of items.
    Map<Long,String> map = new HashMap<Long,String>(items);
    Random rnd = new Random();

    long start = System.currentTimeMillis();

    for (int ii = 0 ; ii < items ; ii++)
    {
        // new String(...) creates a distinct String object for every entry.
        map.put(Long.valueOf(rnd.nextLong()), new String("123456789012345678901234567890"));
    }

    long finish = System.currentTimeMillis();
    double elapsed = ((finish - start) / 1000.0);
    System.out.format("time to produce %d items: %8.3f seconds (map size = %d)\n", items, elapsed, map.size());
    return map;
}

, JDK HashMap . ,

  • , necessary

, 75%

DEFAULT_INITIAL_CAPACITY = 16;  
DEFAULT_LOAD_FACTOR = 0.75;  
THRESHOLD = DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR;
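To make the constants above concrete, here is a minimal sketch that counts how many times a table starting at the default capacity has to double before it can hold a given number of entries. It is a simplified model, not the JDK source: it assumes the table doubles whenever the entry count exceeds capacity * load factor. The class and method names (`ResizeEstimate`, `resizesNeeded`) are made up for illustration.

```java
public class ResizeEstimate {

    // Simplified model (assumption): the table doubles each time the number
    // of entries exceeds capacity * loadFactor.
    static int resizesNeeded(long entries, int initialCapacity, double loadFactor) {
        long capacity = initialCapacity;
        int resizes = 0;
        while (entries > capacity * loadFactor) {
            capacity *= 2;
            resizes++;
        }
        return resizes;
    }

    public static void main(String[] args) {
        // Default capacity 16, load factor 0.75, 20 million entries.
        System.out.println(resizesNeeded(20_000_000L, 16, 0.75));      // 21
        // A table that is already 2^25 buckets never has to grow.
        System.out.println(resizesNeeded(20_000_000L, 1 << 25, 0.75)); // 0
    }
}
```

Under this model, each doubling rehashes every entry already stored, which is the work that presizing the map avoids.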

// Size the table so the expected number of entries stays below the resize threshold.
double expectedMaxEntries = 30000000d;
int capacity = (int) (expectedMaxEntries / 0.75 + 1);
HashMap<Long, String> map = new HashMap<Long, String>(capacity);
for (String inLine; (inLine = br.readLine()) != null;) {
    Long key = Long.parseLong(inLine.substring(1, 12));
    String value = inLine.substring(2, 4) + inLine.substring(7, 11);
    map.put(key, value);
}
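The loop above depends on a reader and an input file that are not shown. Below is a self-contained sketch of the same idea that can be run as is. The sample records, the class name `PresizedLoad`, and the helpers `capacityFor` and `load` are hypothetical; only the substring offsets and the capacity formula come from the snippet above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

public class PresizedLoad {

    // Initial capacity chosen so that expectedEntries stays below the
    // resize threshold (capacity * loadFactor).
    static int capacityFor(int expectedEntries, double loadFactor) {
        return (int) (expectedEntries / loadFactor + 1);
    }

    // Reads fixed-width records: key = characters 1..11, value = characters
    // 2..3 followed by characters 7..10 (the offsets used in the answer).
    static Map<Long, String> load(BufferedReader br, int expectedEntries) {
        Map<Long, String> map = new HashMap<>(capacityFor(expectedEntries, 0.75));
        try {
            for (String line; (line = br.readLine()) != null; ) {
                long key = Long.parseLong(line.substring(1, 12));
                String value = line.substring(2, 4) + line.substring(7, 11);
                map.put(key, value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return map;
    }

    public static void main(String[] args) {
        // Two made-up records standing in for the real input file.
        String sample = "X12345678901\nY98765432109\n";
        Map<Long, String> map = load(new BufferedReader(new StringReader(sample)), 2);
        System.out.println(map.get(12345678901L)); // 237890
        System.out.println(map.get(98765432109L)); // 873210
    }
}
```

For the real file, the reader would wrap the input stream as in the question, and `expectedEntries` would be an upper estimate of the record count.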

2 , , < 16s.

Source: https://habr.com/ru/post/1570137/

