A Netty server streams to a Netty client (point-to-point, 1 to 1):
Good
- Server and client are both 12 cores, 1 Gbit NIC => runs at a steady 300K 200-byte messages per second
Not very good
- Server and client are both 32 cores, 10 Gbit NIC => (same code) starts at 130K/s, then degrades to hundreds per second within minutes
Observations
Netperf shows that the "bad" environment is actually excellent (it can sustain 600 MB/s for half an hour).
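For scale, the message rates quoted above can be converted to payload bandwidth and compared with the NIC capacity and the netperf figure. This is a minimal sketch that assumes "K" means 1,000 messages and "MB" means 10^6 bytes, and it counts only message bytes (no TCP/IP or framing overhead). The object and method names are made up for illustration.

```scala
object Throughput {
  // Payload bandwidth in Mbit/s for a given message rate and message size.
  // Assumption: only message bytes are counted, protocol overhead is ignored.
  def mbitPerSec(messagesPerSec: Long, bytesPerMessage: Long): Double =
    messagesPerSec * bytesPerMessage * 8 / 1e6

  def main(args: Array[String]): Unit = {
    val good    = mbitPerSec(300000L, 200L) // "good" environment, 1 Gbit NIC
    val bad     = mbitPerSec(130000L, 200L) // "bad" environment at startup, 10 Gbit NIC
    val netperf = 600.0 * 8                 // 600 MB/s measured by netperf, in Mbit/s

    println("good:    " + good + " Mbit/s on a 1000 Mbit/s link")
    println("bad:     " + bad + " Mbit/s on a 10000 Mbit/s link")
    println("netperf: " + netperf + " Mbit/s on a 10000 Mbit/s link")
  }
}
```

Under these assumptions the "good" run uses roughly half of its 1 Gbit link, while the "bad" run starts at about 2% of its 10 Gbit link, far below what netperf sustains on the same hardware.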
This is not a client problem: if I swap the client for a known-good one (written in C) that sets the maximum OS SO_RCVBUF and does nothing but read byte[]s and discard them => the behavior stays the same.
The performance degradation starts before the high write watermark is reached (200 MB, but I have tried other values).
The heap grows quickly and, of course, once it reaches its maximum, GC kicks in and stops the world, but this happens after the "bad" symptoms surface. In the "good" environment the heap stays stable at around 1 GB, which is where it should be given the configuration.
One thing I noticed: most of the 32 cores are in use while the Netty server streams, which I tried to limit by setting all Boss / NioWorker threads to 1 (there is only a single channel anyway, but just in case):
    val bootstrap = new ServerBootstrap(
      new NioServerSocketChannelFactory(
        Executors.newFixedThreadPool(1),  // boss threads
        Executors.newFixedThreadPool(1),  // worker threads
        1))                               // worker count

    // 1 thread max, memory limits: 1 GB per channel, 2 GB global, 100 ms timeout for an inactive thread
    val pipelineExecutor = new OrderedMemoryAwareThreadPoolExecutor(
      1,
      1L * 1024 * 1024 * 1024,
      2L * 1024 * 1024 * 1024,  // Long arithmetic: 2 * 1024 * 1024 * 1024 overflows Int
      100, TimeUnit.MILLISECONDS,
      Executors.defaultThreadFactory())

    bootstrap.setPipelineFactory(new ChannelPipelineFactory {
      def getPipeline = {
        val pipeline = Channels.pipeline(serverHandlers.toArray: _*)
        pipeline.addFirst("pipelineExecutor", new ExecutionHandler(pipelineExecutor))
        pipeline
      }
    })
But this does not limit the number of cores used => most cores are still in use. I understand that Netty tries to round-robin worker tasks, but I suspect that 32 cores "at once" may simply be too much for the NIC to handle.
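One way to check whether the thread limits actually took effect is to count the live Java threads from inside the server process. The sketch below uses only the JDK and groups threads by name with the trailing number stripped. The object name `ThreadCensus` is hypothetical, and the assumption that Netty 3 names its threads along the lines of "New I/O worker #1" should be verified against your version. Note that `Thread.getAllStackTraces` only reports Java-level threads; JVM-internal threads such as GC workers do not appear here and need a tool like jstack or `top -H` instead.

```scala
object ThreadCensus {
  // Collapse names such as "New I/O worker #3" or "pool-1-thread-2"
  // into a family name by dropping the trailing number.
  def family(name: String): String =
    name.replaceAll("[#-]?\\d+$", "").trim

  // Count live Java threads per family name.
  def census(): Map[String, Int] = {
    val threads = Thread.getAllStackTraces.keySet.toArray(new Array[Thread](0))
    threads.groupBy(t => family(t.getName)).map { case (k, v) => (k, v.length) }
  }

  def main(args: Array[String]): Unit =
    census().toSeq.sortBy(-_._2).foreach { case (name, n) =>
      println(n + "\t" + name)
    }
}
```

Calling `ThreadCensus.census()` from the server while it streams shows whether more boss, worker, or executor threads exist than the configuration suggests. If only a handful of Java threads are listed, the load on the remaining cores is coming from somewhere other than the application's thread pools.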
Question(s)
- Any suggestions on the cause of the performance degradation?
- How do I limit the number of cores used by Netty (without having to go the OIO route)?
Side notes: I would have liked to discuss this on the Netty mailing list, but it is closed. I tried the Netty IRC channel, but it is dead.