Logging to MongoDB

I am building a logging system that will record requests and responses to a web service distributed across multiple application nodes. I was thinking of using MongoDB as the repository, logging either in real time or, more realistically, flushing to the database after every x requests. The application is expected to be fairly large and is written in Perl. Does anyone have experience with this? Recommendations? Or is it a no-no?

+4
4 answers

I have seen many companies use MongoDB to store application logs. Its schema-free nature is really flexible for application logs, where the schema tends to change from time to time. Also, its capped collection feature is really useful because it automatically purges old data, keeping the working set in memory.
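For instance, a capped collection for logs could be created from Perl roughly like this (a sketch only; the database name, collection name, and size cap are placeholders):

    use MongoDB;
    use Tie::IxHash;

    my $conn = MongoDB::Connection->new(host => 'localhost');
    my $db   = $conn->get_database('logs');

    # "create" must be the first key in the command document,
    # so an ordered hash is used.
    $db->run_command(Tie::IxHash->new(
        create => 'requests',
        capped => 1,
        size   => 50 * 1024 * 1024,  # ~50 MB ring buffer; oldest entries are evicted
    ));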

People aggregate logs using normal group queries or MapReduce, but these are not that fast. In particular, MongoDB's MapReduce runs in a single thread, and its JavaScript execution overhead is huge. The new aggregation framework could solve this problem.
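As a rough illustration of the aggregation framework (the collection and field names here are assumed), counting log entries per level could be issued through run_command:

    use Tie::IxHash;

    # Group log documents by level and count them, most frequent first.
    my $result = $db->run_command(Tie::IxHash->new(
        aggregate => 'log',
        pipeline  => [
            { '$group' => { _id => '$level', count => { '$sum' => 1 } } },
            { '$sort'  => { count => -1 } },
        ],
    ));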

Another concern is that, although MongoDB inserts are fire-and-forget by default, issuing a large number of insert commands causes heavy write-lock contention. This may affect application performance and prevent readers from aggregating/filtering the stored logs.
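One common mitigation, sketched here with assumed names, is to buffer log documents in the application and flush them in batches instead of inserting one at a time:

    my @buffer;

    sub log_request {
        my ($doc) = @_;
        push @buffer, $doc;

        # Flush in batches of 100 to amortize the write lock.
        if (@buffer >= 100) {
            $collection->batch_insert([@buffer]);
            @buffer = ();
        }
    }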

One solution is to use a log collector daemon such as Fluentd, Logstash, or Flume. These daemons run on every application node and collect the logs from the application processes.

[diagram: Fluentd plus MongoDB]

They buffer the logs and write the data out asynchronously to other systems such as MongoDB/PostgreSQL/etc. The writes are batched, so they are much more efficient than writing directly from the applications. This link describes how to send logs to Fluentd from Perl.
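A minimal sketch of that with the Fluent::Logger module from CPAN (the tag and event fields are illustrative):

    use Fluent::Logger;

    # Connect to the Fluentd daemon running on this node.
    my $logger = Fluent::Logger->new(
        host => '127.0.0.1',
        port => 24224,
    );

    # post(tag, hashref) hands the event to Fluentd, which buffers it
    # and forwards it to MongoDB (or another backend) in batches.
    $logger->post('webservice.request', {
        method => 'GET',
        path   => '/api/v1/users',
        status => 200,
    });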

+3

I use it in several applications through Log::Dispatch::MongoDB; works like a charm!

    # Declaration
    use Log::Dispatch;
    use Log::Dispatch::MongoDB;
    use Log::Dispatch::Screen;
    use Moose;

    has log => (
        is      => 'ro',
        isa     => 'Log::Dispatch',
        default => sub { Log::Dispatch->new },
        lazy    => 1,
    );
    ...

    # Configuration
    $self->log->add(
        Log::Dispatch::Screen->new(
            min_level => 'debug',
            name      => 'screen',
            newline   => 1,
        )
    );
    $self->log->add(
        Log::Dispatch::MongoDB->new(
            collection => MongoDB::Connection->new(
                host => $self->config->mongodb
            )->saveme->log,
            min_level => 'debug',
            name      => 'crawler',
        )
    );
    ...

    # The logging facility
    $self->log->log(
        level   => 'info',
        message => 'Crawler finished',
        info    => {
            origin  => $self->origin,
            country => $self->country,
            counter => $self->counter,
            start   => $self->start,
            finish  => time,
        }
    );

And here is a sample entry from the capped collection:

 { "_id" : ObjectId("50c453421329307e4f000007"), "info" : { "country" : "sa", "finish" : NumberLong(1355043650), "origin" : "onedayonly_sa", "counter" : NumberLong(2), "start" : NumberLong(1355043646) }, "level" : "info", "name" : "crawler", "message" : "Crawler finished" } 
+2

I did this on a webapp that runs on two application servers. Inserts in MongoDB are non-blocking by default (the Java driver just fires off the request for you and returns immediately; I assume it is the same for Perl, but you'd better check), which is ideal for this use case, since you do not want your users to wait while a log entry is written.

The downside is that in some failure scenarios you can lose a few log entries (for example, your application crashes before Mongo receives the data).
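If losing entries matters, you can trade some throughput for durability by asking for an acknowledgement on each insert; a sketch, assuming a $collection handle and the old driver's safe option:

    # Wait for the server to acknowledge the write instead of
    # fire-and-forget; slower, but the entry is confirmed stored.
    $collection->insert(
        { level => 'info', message => 'request served', ts => time },
        { safe => 1 },
    );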

+1

For some interesting ideas for your application, I recommend checking out Graylog2 if you haven't already. It effectively uses a combination of MongoDB and Elasticsearch. Adding a powerful search engine to the mix can give you interesting query and analysis options.

For reference, here is the Elasticsearch page dedicated to log processing tools and techniques.

If you plan to queue log entries before processing (which I would recommend), Kestrel is a solid message-queue option. It is what Gaug.es uses, and I've been putting it through its paces lately. It's a Java application, extremely fast and atomic, and it conveniently speaks the memcache protocol. It's a great way to scale horizontally, and the in-memory queue is backed by a journal file for a good balance of speed and durability.
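Since Kestrel speaks the memcache protocol, enqueueing a log entry from Perl can be as simple as a memcached set; a sketch with an assumed queue name (22133 is Kestrel's default port):

    use Cache::Memcached;
    use JSON;

    # A "set" against Kestrel enqueues the value on the named queue.
    my $kestrel = Cache::Memcached->new({ servers => ['127.0.0.1:22133'] });
    $kestrel->set('weblogs', encode_json({
        level   => 'info',
        message => 'request served',
        ts      => time,
    }));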

0
