Why is a short primitive type much slower than a long or int?

I tried to reduce RAM usage in my Android game by changing int primitives to shorts. Before doing so, I wanted to know how the primitive types in Java perform.

So I wrote this small benchmark using the Caliper library.

    import com.google.caliper.Benchmark;
    import com.google.caliper.Param;
    import java.util.Random;

    public class BenchmarkTypes extends Benchmark {

        @Param("10") private long testLong;
        @Param("10") private int testInt;
        @Param("10") private short testShort;

        @Param("5000") private long resultLong = 5000;
        @Param("5000") private int resultInt = 5000;
        @Param("5000") private short resultShort = 5000;

        @Override
        protected void setUp() throws Exception {
            // Use the same random operand for all three types.
            Random rand = new Random();
            testShort = (short) rand.nextInt(1000);
            testInt = (int) testShort;
            testLong = (long) testShort;
        }

        public long timeLong(int reps) {
            for (int i = 0; i < reps; i++) {
                resultLong += testLong;
                resultLong -= testLong;
            }
            return resultLong;
        }

        public int timeInt(int reps) {
            for (int i = 0; i < reps; i++) {
                resultInt += testInt;
                resultInt -= testInt;
            }
            return resultInt;
        }

        public short timeShort(int reps) {
            for (int i = 0; i < reps; i++) {
                resultShort += testShort;
                resultShort -= testShort;
            }
            return resultShort;
        }
    }

The test results surprised me.

Test conditions

The benchmark was run with the Caliper library.

Test results

https://microbenchmarks.appspot.com/runs/0c9bd212-feeb-4f8f-896c-e027b85dfe3b

Int 2.365 ns

Long 2.436 ns

Short 8.156 ns

Conclusion?

Is the short primitive type really significantly slower (roughly 3-4 times) than the long and int primitive types?

Questions

  • Why is a short primitive so much slower than an int or long? I would expect int to be the fastest on a 32-bit virtual machine, and long and short to take about the same time, or short to be even faster.

  • Does this also hold on Android phones? Android phones have generally run in a 32-bit environment, but more and more of them now ship with 64-bit processors.

2 answers

Java bytecode does not support the basic operations (+, -, *, /, >>, >>>, <<, %) on primitive types smaller than int. The instruction set simply has no bytecodes for such operations. The virtual machine therefore has to convert the short(s) to int(s), perform the operation, then truncate the int back to a short and store that as the result.
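The widening and narrowing described here is also visible at the source level. The following is a minimal sketch, not part of the original benchmark; the class and method names are hypothetical and chosen only for illustration.

```java
// Hypothetical demo class; names are illustrative only.
class ShortPromotionDemo {

    // short + short is evaluated as int arithmetic, so the natural result type is int.
    static int addAsInt(short a, short b) {
        return a + b;
    }

    // Storing the result in a short needs an explicit narrowing cast,
    // which keeps only the low 16 bits of the int sum.
    static short addAsShort(short a, short b) {
        return (short) (a + b);
    }

    public static void main(String[] args) {
        short max = Short.MAX_VALUE; // 32767
        short one = 1;
        System.out.println(addAsInt(max, one));   // prints 32768
        System.out.println(addAsShort(max, one)); // prints -32768
    }
}
```

The second call wraps around because the int sum 32768 does not fit in 16 signed bits.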

Check the generated bytecode with javap to see the difference between your short and int tests.

VM / JIT optimizations seem to be heavily biased towards int / long operations, which makes sense since they are the most common.

Types smaller than int have their uses, but primarily for saving memory in arrays. They are less suitable as simple class members (of course, you should still use them when they are the appropriate data type). Smaller members may not even reduce the size of objects. Current virtual machines are (again) designed mainly for speed of execution, so a virtual machine may even align fields to its native machine word boundaries to improve access performance at the cost of memory.
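As a rough illustration of where the savings in arrays come from, the sketch below compares only the element payload of a short[] and an int[] of the same length. It is an estimate under stated assumptions: the class and method names are hypothetical, array headers and padding are ignored, and the layout of fields inside ordinary objects is VM-specific and not modelled at all.

```java
// Hypothetical demo class; names are illustrative only.
class ArrayFootprintDemo {

    // Bytes used by the element data alone (array header and padding excluded).
    static long payloadBytes(int length, int bytesPerElement) {
        return (long) length * bytesPerElement;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // Short.BYTES is 2 and Integer.BYTES is 4 (constants available since Java 8).
        System.out.println(payloadBytes(n, Short.BYTES));   // prints 2000000
        System.out.println(payloadBytes(n, Integer.BYTES)); // prints 4000000
    }
}
```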


This is probably due to the way Java/Android handles integer arithmetic on primitives smaller than int.

When two primitives of a type smaller than int are added in Java, they are automatically promoted to int. Normally you then have to cast the result back to the desired data type.

The catch comes with compound assignment operators such as += and -=, where the cast happens implicitly, so the operation:

 resultShort += testShort; 

is really equivalent to something like this:

 resultShort = (short)((int) resultShort + (int) testShort); 
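The equivalence between the compound form and the explicit cast can be checked with a small sketch. The class and method names below are hypothetical and not taken from the benchmark.

```java
// Hypothetical demo class; names are illustrative only.
class CompoundAssignmentDemo {

    // Compound assignment: the compiler inserts the narrowing cast for us.
    static short compound(short result, short delta) {
        result += delta;
        return result;
    }

    // Explicit form: without the (short) cast this line would not compile,
    // because result + delta has type int.
    static short explicit(short result, short delta) {
        result = (short) (result + delta);
        return result;
    }

    public static void main(String[] args) {
        short[] samples = {0, 10, 5000, Short.MAX_VALUE, Short.MIN_VALUE};
        for (short r : samples) {
            for (short d : samples) {
                if (compound(r, d) != explicit(r, d)) {
                    throw new AssertionError("mismatch for " + r + " and " + d);
                }
            }
        }
        System.out.println("compound and explicit forms agree");
    }
}
```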

If we look at the disassembled bytecode of the method:

    public static int test(int a, int b) {
        a += b;
        return a;
    }

we see:

    public static int test(int, int);
      Code:
         0: iload_0
         1: iload_1
         2: iadd
         3: istore_0
         4: iload_0
         5: ireturn

Comparing this with an identical method that uses short instead of int, we get:

    public static short test(short, short);
      Code:
         0: iload_0
         1: iload_1
         2: iadd
         3: i2s
         4: istore_0
         5: iload_0
         6: ireturn

Note the additional i2s (int to short) instruction. It is the likely culprit for the performance loss. You may also notice that all the instructions are int-based, as indicated by the i prefix (e.g. iadd, meaning integer add). This means the short values are promoted to int somewhere around the iload step, which could also hurt performance.

As far as I recall, the bytecode for long arithmetic is identical to the int version, except that the instructions are long-specific (e.g. ladd instead of iadd).


Source: https://habr.com/ru/post/971046/

