Before planning any special programming, do some testing to find out how much you can handle with a vanilla setup. Put the data file and the sending process on the producer machine and a simple receiver/parser on the consumer machine, then profile thoroughly: where do you actually hit a bottleneck? Can you solve it by throwing better hardware at the problem, or do you need to make your processing faster?
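As a starting point, a throwaway loopback probe can exercise your measurement code before you move to the real network. The sketch below is hypothetical: the chunk size and transfer total are arbitrary, and a loopback number is only an upper bound on what the real link will do, not a prediction.

```python
# Hypothetical baseline probe: stream bytes from a "producer" thread to a
# "consumer" over a localhost TCP socket and report MB/s. Real numbers only
# come from the real sender/receiver on the real network; this just shows
# the shape of the measurement.
import socket
import threading
import time

def measure_loopback_throughput(total_mb=64, chunk_kb=64):
    """Send `total_mb` MB over a loopback TCP connection; return MB/s."""
    server = socket.socket()
    server.bind(("127.0.0.1", 0))          # any free port
    server.listen(1)
    port = server.getsockname()[1]
    chunk = b"\x00" * (chunk_kb * 1024)
    n_chunks = (total_mb * 1024) // chunk_kb

    def producer():
        with socket.create_connection(("127.0.0.1", port)) as s:
            for _ in range(n_chunks):
                s.sendall(chunk)

    t = threading.Thread(target=producer)
    t.start()
    conn, _ = server.accept()
    received = 0
    start = time.perf_counter()
    while received < n_chunks * len(chunk):
        data = conn.recv(1 << 20)
        if not data:
            break
        received += len(data)
    elapsed = time.perf_counter() - start
    t.join()
    conn.close()
    server.close()
    return received / (1024 * 1024) / elapsed
```

Run the same kind of probe end-to-end across the real link, and keep the profiling data for the tuning steps below.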
Make sure you start with a hardware platform that can support the data rate you expect. If you are working with something like the Intel 82598EB NIC, make sure you plug it into a PCIe 2.0 slot, preferably an x16 slot, to get the full bandwidth from the network board to the chipset.
You can also tune the NIC driver settings to suit your data stream. For example, enable jumbo frames on the link to minimize per-packet TCP/IP overhead, and experiment with the driver's interrupt throttle rate to cut down low-level processing cost.
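On a Linux host, those two knobs map to standard `ip link` and `ethtool` invocations. The sketch below only builds the commands as data so you can inspect them; the interface name (`eth0`) and the numeric values are placeholders, and what your driver actually supports is in its documentation.

```python
# Sketch of the link-level tuning described above, for a Linux host.
# "eth0", mtu=9000, and rx_usecs=100 are illustrative placeholders only.
def nic_tuning_commands(iface="eth0", mtu=9000, rx_usecs=100):
    """Return the shell commands (as argv lists) that would apply the tuning."""
    return [
        # Jumbo frames: raise the MTU so each frame carries more payload.
        ["ip", "link", "set", "dev", iface, "mtu", str(mtu)],
        # Interrupt coalescing: fewer, batched interrupts per received burst.
        ["ethtool", "-C", iface, "rx-usecs", str(rx_usecs)],
    ]

# To actually apply them (requires root), you would run something like:
#   import subprocess
#   for cmd in nic_tuning_commands():
#       subprocess.run(cmd, check=True)
```

Remember that jumbo frames only help if every hop on the path (switches included) is configured for the larger MTU.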
Is the processing of your data set parallelized? If one task pulls the data into memory, can you set up several more tasks that process pieces of it at the same time? That would let you take advantage of multi-core processors.
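A minimal sketch of that fan-out, assuming the work splits cleanly into independent chunks: one side feeds fixed-size chunks to a pool of worker processes, one per core. The `parse` function here is a stand-in for whatever per-chunk work you actually do.

```python
# Fan-out sketch: split the buffered data into chunks and process them on
# all cores with a multiprocessing pool. parse() is a placeholder.
from multiprocessing import Pool, cpu_count

def parse(chunk):
    # Stand-in for the real per-chunk processing.
    return sum(chunk)

def process_in_parallel(data, chunk_size=4096):
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with Pool(processes=cpu_count()) as pool:
        return pool.map(parse, chunks)
```

This only pays off if the per-chunk work dominates the cost of moving the chunks between processes; profile both before committing to a design.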
Finally, if this is not enough, use the profiling and timing data you collected to find the parts of the system you can tune for the biggest gains. Don't just assume you know where to tweak: back it up with real data - you might be surprised.
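In Python, a quick way to get that data is to run the consumer loop under `cProfile` and sort by cumulative time. The stage functions below are stand-ins for your real receive/parse steps.

```python
# Profile a (stand-in) consumer loop and report the top functions by
# cumulative time, so tuning targets come from data rather than guesses.
import cProfile
import io
import pstats

def receive():   # stand-in: pretend we pulled a buffer off the wire
    return bytes(1024)

def parse(buf):  # stand-in: pretend per-buffer parsing
    return sum(buf)

def consumer_loop(iterations=1000):
    total = 0
    for _ in range(iterations):
        total += parse(receive())
    return total

profiler = cProfile.Profile()
profiler.enable()
consumer_loop()
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
report = out.getvalue()  # top 5 functions by cumulative time
```

Whatever tool you use (`perf`, VTune, `cProfile`), the point is the same: let the measured hot spots, not intuition, pick what you optimize next.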