Node.js GPS Tracking Features

Using Node.js as the TCP server, I'm going to manage a relatively large number of GPS devices (~3000 devices). As a first step I just want to store the incoming data in the database, but even at this point I expect some performance problems, and I would like to catch them before they bite me.

1 - Looking at similar servers written in languages such as Java or Ruby, I see code like the following:

Java

    Thread serverThread = new Thread(() -> {
        System.out.println("Listening to server port 9000");
        while (true) {
            try {
                Socket socket = serverSocket.accept();
                ...

Ruby

    require 'socket'

    server = TCPServer.new("127.0.0.1", 8080)
    loop do
      Thread.start(server.accept) do |client|
        ...

It seems that they spawn a separate thread for each device (socket) that connects to the TCP server. Since Node.js is single-threaded and asynchronous, should I worry about the incoming connections, or will something like the following simple approach handle a large number of simultaneous connections?

    const net = require('net');

    net.createServer(function (device) {
        device.on('data', function (data) {
            // parse data
            // store in database
        });
    }).listen(9000); // port 9000 chosen here to match the Java example above
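(One caveat with this approach, independent of connection count: a 'data' event delivers whatever bytes the TCP stack has buffered, not necessarily one whole GPS message. Below is a minimal framing sketch; the newline delimiter is an assumption for illustration, since the actual device protocol is not specified here.)

    const net = require('net');

    const server = net.createServer(function (device) {
        let buffered = '';

        device.on('data', function (chunk) {
            buffered += chunk.toString('utf8');

            // One chunk may contain part of a message or several messages,
            // so split on the (assumed) newline delimiter and keep the tail.
            const parts = buffered.split('\n');
            buffered = parts.pop(); // incomplete trailing message, if any

            parts.forEach(function (message) {
                // parse the message and hand it to the database layer here
            });
        });

        device.on('error', function (err) {
            // one misbehaving device should not crash the whole server
            console.error('device socket error:', err.message);
        });
    });

    server.listen(9000);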

2 - Should I limit connections to the database with a connection pool? Given that the database will also be queried from the other side for GIS and monitoring, how large should the pool be?

3 - How can I use caching (for example, with Redis) in such a system?

It would be great if someone shed some light on these questions. I would also like to hear any other performance considerations you have run into when implementing such systems. Thanks.

+6
2 answers
  • Choosing among these options, I would say that Node.js is actually the best fit for your use case, because it does not use one thread per connection like the other two. Threads are usually the limiting resource on a given machine. That said, Java and Ruby do have "evented" servers too, and they are worth a look if you want to compare apples to apples.

  • I think you need to say more about which database you are going to use if you want advice on a connection pool. That said, reusing connections is worth doing if they are expensive to set up, and it is helpful to be able to configure a minimum and maximum pool size. Ultimately, the right size is a matter of load testing.

  • I think the benefit of caching in this system would be minimal, since you are mostly writing data. If the data is valuable, you will want it on disk, not just in memory. On the other hand, if you have clients that read the collected data, caching those reads in something like Redis might be a good idea.

+3

I'm sure you know this, but it looks like you are trying to optimize your application prematurely here.

1 - Node being event-driven and non-blocking makes it an ideal candidate for holding a large number of open socket connections without forking per connection. As always, make sure your application is clustered correctly (see the sketch below). I was able to hold ~100k open TCP sockets on a cheap laptop. If the number of devices you need to support keeps growing, it simply scales out.
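To illustrate the clustering point, here is a minimal sketch using Node's built-in cluster module, which forks one worker per CPU core and lets all workers accept connections on the same port (9000 is assumed, as above):

    const cluster = require('cluster');
    const net = require('net');
    const os = require('os');

    if (cluster.isMaster) {
        // Fork one worker per CPU core; the workers share the TCP port.
        os.cpus().forEach(function () {
            cluster.fork();
        });
        cluster.on('exit', function (worker) {
            console.log('worker ' + worker.process.pid + ' died, restarting');
            cluster.fork();
        });
    } else {
        net.createServer(function (device) {
            device.on('data', function (data) {
                // parse and store, as in the single-process version
            });
        }).listen(9000);
    }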

2 - I saw that you are planning to use Postgres. Pools are always good.
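As a rough sketch of what pooling could look like with the node-postgres (pg) module; note that the positions table and its columns are invented for illustration, and the pool size of 20 is only a starting point to load-test against:

    const { Pool } = require('pg'); // npm install pg

    // Pool size is a tuning knob; start small and load-test.
    const pool = new Pool({
        host: '127.0.0.1',
        database: 'tracking',
        max: 20,                  // maximum open connections
        idleTimeoutMillis: 30000  // close idle clients after 30s
    });

    function storePosition(deviceId, lat, lon, cb) {
        // The pool checks out a connection, runs the query, and returns it.
        pool.query(
            'INSERT INTO positions (device_id, lat, lon, recorded_at) VALUES ($1, $2, $3, NOW())',
            [deviceId, lat, lon],
            cb
        );
    }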

3 - Caching is useful for hot data: data that is requested often, so keeping it in memory (for example in Redis, which is in-memory) speeds up retrieval and takes load off the rest of the system. In your case, if you mainly need to pull out particular slices of the data, for analytics or more casual use, I would look at Spark or Solr rather than a plain cache layer. It will also be much cheaper and easier to maintain.
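If you do end up caching reads, here is a minimal sketch using the classic callback API of the node redis client (v3), keeping the latest known position per device in Redis so that dashboard-style reads never touch the database; the key naming scheme is invented:

    const redis = require('redis'); // npm install redis
    const client = redis.createClient();

    // On every write, also refresh the cached 'last position' for the device.
    function cacheLastPosition(deviceId, lat, lon) {
        client.set('last_position:' + deviceId, JSON.stringify({
            lat: lat,
            lon: lon,
            at: Date.now()
        }));
    }

    // Readers (e.g. a monitoring dashboard) hit Redis instead of Postgres.
    function getLastPosition(deviceId, cb) {
        client.get('last_position:' + deviceId, function (err, value) {
            if (err) return cb(err);
            cb(null, value ? JSON.parse(value) : null);
        });
    }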

+3

Source: https://habr.com/ru/post/1015250/

