I will give a short answer to the 2nd part (how expensive is casting):
Casting is very cheap and highly optimized where it counts. In if (o instanceof Clazz) ((Clazz)o) the cast is absolutely free (the cast, not the check itself).
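A minimal sketch of the guarded pattern, for illustration only. The class and method names (GuardedCast, lengthOf) are made up for this example, and whether the cast is actually dropped depends on the JIT compiler in use:

```java
// Hypothetical illustration class; not from the original answer.
final class GuardedCast {

    // The instanceof test does the real work: it has to look at the
    // object's class. The cast that follows checks the same condition
    // again, so a JIT compiler can typically remove it.
    static int lengthOf(Object o) {
        if (o instanceof String) {
            return ((String) o).length();
        }
        return -1; // not a String
    }

    public static void main(String[] args) {
        System.out.println(lengthOf("hello")); // prints 5
        System.out.println(lengthOf(42));      // prints -1
    }
}
```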
In general, if the compiler can prove that the cast is safe and elide it, it costs nothing. Otherwise, a load from the object header is required to determine the class. Some JVMs may even keep the class in a few bits of the pointer, which says something about how much this lookup matters. Either way, even the load is cheap and will likely hit the L1 cache. Virtually all cast branches are predicted correctly by the hardware: consider how unlikely it is to actually get a ClassCastException (that is the slow path, and in that case optimization does not matter).
Arrays are generally much faster than the collections framework and its generics, for many reasons, but casting plays only a very minor role.
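To show where the cast hides in generic code, here is a small sketch. The names (HiddenCast, sumList, sumArray) are invented for the example. Because of type erasure, javac inserts a checkcast to String on each element read from the List, while the array version needs none. This only illustrates where the cast comes from; it is not a benchmark and says nothing about the other costs of collections:

```java
import java.util.List;

// Hypothetical illustration class; not from the original answer.
final class HiddenCast {

    // Generic collection: the iterator returns Object after erasure,
    // so the compiler inserts an implicit (String) cast per element.
    static int sumList(List<String> items) {
        int total = 0;
        for (String s : items) {
            total += s.length();
        }
        return total;
    }

    // Array: the element type is known statically, no cast is emitted.
    static int sumArray(String[] items) {
        int total = 0;
        for (String s : items) {
            total += s.length();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumList(List.of("ab", "cde")));        // prints 5
        System.out.println(sumArray(new String[] {"ab", "cde"})); // prints 5
    }
}
```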
Last: blatantly "measuring" with a microbenchmark is a pretty terrible idea; measuring the performance of the application as a whole is the right way to go. Microbenchmarks require a deep understanding of how the JVM optimizes (or does not), and you must be certain that you are comparing what you actually need. Overall, it is an art in its own right.