What is the typical memory allocation speed in Java?

I profiled a Java application and found that object allocation is much slower than I expected. I ran a simple benchmark to establish the overall allocation speed of small objects, and found that it takes about 200 nanoseconds on my machine to allocate a small object (a 3-float vector). I'm running on a dual-core 2.0 GHz processor, so that is roughly 400 CPU cycles. I wanted to ask people who have profiled Java applications before what allocation speed to expect. 200 ns seems a little cruel and unusual to me. After all, I would think that a Java-like language, which can compact the heap and move objects, could implement object allocation roughly like this:

    int obj_addr = heap_ptr;
    heap_ptr += some_constant_size_of_object;
    return obj_addr;

.... which is a couple of lines of assembly. As for garbage collection, I don't allocate or discard enough objects for GC to come into play. When I optimize my code by reusing objects, I get performance on the order of 15 nanoseconds per object processed instead of 200 ns per object, so reusing objects improves performance enormously. I would really prefer not to reuse objects, because it makes the code look hairy (many methods have to accept a container to fill in instead of returning a value).
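To make the trade-off concrete, here is a minimal sketch of the two styles for a hypothetical 3-float vector (the class and method names are my own illustration, not from the question):

```java
// Hypothetical 3-float vector used in both styles.
class Vec3 {
    float x, y, z;
    Vec3(float x, float y, float z) { this.x = x; this.y = y; this.z = z; }
}

class Styles {
    // Allocating style: clean API, but one new object per call.
    static Vec3 add(Vec3 a, Vec3 b) {
        return new Vec3(a.x + b.x, a.y + b.y, a.z + b.z);
    }

    // Reusing style: the caller supplies a container, so no allocation,
    // but every call site must manage the output object's lifetime.
    static void add(Vec3 a, Vec3 b, Vec3 out) {
        out.x = a.x + b.x;
        out.y = a.y + b.y;
        out.z = a.z + b.z;
    }
}
```

The second signature is what the question means by "accept a container instead of returning a value": the math is identical, only the ownership of the result object moves to the caller.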

So the question is: is it normal for object allocation to take this long? Or is something misconfigured on my machine that, once fixed, would give me better performance? How long do small-object allocations usually take for other people, and is there a typical value? I'm using the client VM and no compiler flags at the moment. If allocation is faster on your machine, what are your JVM version and operating system?

I understand that individual mileage can vary a lot when it comes to performance, but I'm just asking whether the numbers I mentioned above are in the right ballpark.

+4
source share
3 answers

Creating objects is very fast when the object is small and there is no GC cost.

    final int batch = 1000 * 1000;
    Double[] doubles = new Double[batch];
    long start = System.nanoTime();
    for (int j = 0; j < batch; j++)
        doubles[j] = (double) j;
    long time = System.nanoTime() - start;
    System.out.printf("Average object allocation took %.1f ns.%n", (double) time / batch);

prints with -verbosegc

 Average object allocation took 13.0 ns. 

Note: no GC occurred. However, increase the size and the program has to wait for the GC to copy memory around.

 final int batch = 10 *1000 * 1000; 

prints

    [GC 96704K->94774K(370496K), 0.0862160 secs]
    [GC 191478K->187990K(467200K), 0.4135520 secs]
    [Full GC 187990K->187974K(618048K), 0.2339020 secs]
    Average object allocation took 78.6 ns.

I suspect your allocation is relatively slow because you are triggering GC. One workaround is to increase the memory available to the application (although this may just postpone the cost).

If I run it again using -verbosegc -XX:NewSize=1g

 Average object allocation took 9.1 ns. 
+4
source

I don't know how you are measuring the allocation time. It is probably at least the equivalent of

    intptr_t obj_addr = heap_ptr;
    heap_ptr += CONSTANT_SIZE;
    if (heap_ptr > young_region_limit)
        call_the_garbage_collector();
    return obj_addr;

But it is more complex than that, because you need to fill the memory at obj_addr; some JIT compilation or class loading may occur; the first few words are probably initialized (for example, a class pointer and a hash code, which may involve some random number generation); and the object's constructors are called. Those may need synchronization, etc.

More importantly, the newly allocated object may not be in the L1 cache, so some cache misses may occur.

So, although I am not a Java expert, I am not surprised by your measurements. I do believe that allocating fresh objects makes your code cleaner and more maintainable than trying to reuse old ones.

+2
source

Yes. The difference between what you think the runtime should be doing and what it actually does can be quite big. Pooling can be messy, but when allocation and garbage collection account for most of the execution time, which they certainly can, pooling is a big performance win.

The objects worth pooling are the ones you most often catch in the process of being allocated when you take stack samples.

Here's what such a sample looks like in C++. In Java the details differ, but the idea is the same:

    ... blah blah system stuff ...
    MSVCRTD! 102129f9()
    MSVCRTD! 1021297f()
    operator new() line 373 + 22 bytes
    operator new() line 65 + 19 bytes
    COpReq::Handler() line 139 + 17 bytes  <----- here is the line that is doing it
    doit() line 346 + 12 bytes
    main() line 367
    mainCRTStartup() line 338 + 17 bytes
    KERNEL32! 7c817077()

    V------ and this line shows what is being allocated
    COperation* pOp = new COperation(iNextOp++, jobid);
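In Java, a rough version of the same stack-sampling idea can be done without an external profiler. This is a minimal sketch (the StackSampler class and the allocating worker are hypothetical), assuming the thread you sample is busy in allocation-heavy code:

```java
import java.util.HashMap;
import java.util.Map;

public class StackSampler {
    // Take `samples` stack samples of thread t, `intervalMs` apart,
    // counting how often each top frame appears.
    static Map<String, Integer> sample(Thread t, int samples, long intervalMs)
            throws InterruptedException {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            StackTraceElement[] frames = t.getStackTrace();
            if (frames.length > 0)
                counts.merge(frames[0].toString(), 1, Integer::sum);
            Thread.sleep(intervalMs);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // Worker that allocates heavily (autoboxing in a tight loop).
        Thread worker = new Thread(() -> {
            double[] sink = {0};
            while (!Thread.currentThread().isInterrupted()) {
                Double boxed = Math.random();  // allocation on every iteration
                sink[0] += boxed;
            }
        });
        worker.setDaemon(true);
        worker.start();

        Map<String, Integer> counts = sample(worker, 100, 2);
        worker.interrupt();
        // Frames that appear most often are the best pooling candidates.
        counts.forEach((frame, n) -> System.out.println(n + "  " + frame));
    }
}
```

Sampling with `Thread.getStackTrace` is far coarser than a real profiler, but like the C++ trace above it points at the call sites that sit in allocation most often.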
+1
source

Source: https://habr.com/ru/post/1388195/
