"Error L10 (output buffer overflow)" when writing to Splunk drain

I push my logs to a local Splunk installation. I recently noticed that the following error recurs frequently (about once per minute):

Error L10 (output buffer overflow): 7150 messages dropped since 2013-06-26T19:19:52+00:00.134 <13>1 2013-07-08T14:59:47.162084+00:00 host app web.1 - [\x1B[37minfo\x1B[0m] application - Perf - it took 31 milliseconds to get line identifiers ...

The error repeats quite often, and the documentation says that it occurs when your application produces a lot of logs.
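
As a rough illustration of that mechanism, here is a minimal Python sketch of a bounded output buffer that discards messages once the consumer falls behind. The `DrainBuffer` class, the `simulate` helper, the capacity values, and the choice to discard the oldest message are all assumptions made for this sketch, not Logplex's actual implementation.

```python
from collections import deque


class DrainBuffer:
    """Hypothetical bounded buffer; not Logplex's real implementation."""

    def __init__(self, capacity=1024):  # capacity is an assumed value
        self.capacity = capacity
        self.queue = deque()
        self.dropped = 0

    def push(self, message):
        # When the buffer is full, discard the oldest message and count it.
        if len(self.queue) >= self.capacity:
            self.queue.popleft()
            self.dropped += 1
        self.queue.append(message)

    def drain(self, n):
        # Simulate the consumer accepting up to n messages.
        taken = 0
        while self.queue and taken < n:
            self.queue.popleft()
            taken += 1
        return taken


def simulate(seconds, produced_per_sec, consumed_per_sec, capacity=1024):
    """Return how many messages were dropped over the simulated period."""
    buf = DrainBuffer(capacity)
    for _ in range(seconds):
        for i in range(produced_per_sec):
            buf.push(i)
        buf.drain(consumed_per_sec)
    return buf.dropped
```

In this model the absolute log rate matters less than the gap between producer and consumer: `simulate(60, 30, 30)` drops nothing, while a consumer that accepts fewer messages than are produced eventually overflows any finite buffer.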

The thing is, I produce only about 20-30 log lines per second, which is really not much. I tested with other drains (I added the built-in Papertrail add-on), and these errors do not happen there, so they are specific to the Splunk drain.

I thought that maybe the Splunk machine was overloaded and therefore was not receiving the logs quickly enough, but its CPU is idle and it has plenty of disk space and memory.

In addition, I believe the application (a Play 2 application) automatically flushes its logs to the console, so there is no large build-up of unflushed logs followed by a sudden flush.

What can cause a slow drain rate for the Splunk drain? How can I debug it?

2 answers

After a long back-and-forth with the Heroku team, we found the answer:

I had used the http:// URL prefix when setting up the log drain instead of syslog://. When I changed the URL to syslog://, the error went away and the logs flowed correctly into Splunk.
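
A small Python sketch of the kind of sanity check that would have caught this mismatch: compare the drain URL's scheme with the protocol the receiver actually listens for. The `SCHEMES_BY_RECEIVER` mapping, the `drain_url_matches` helper, and the host and port in the usage note are hypothetical and only serve the illustration.

```python
from urllib.parse import urlparse

# Assumed mapping for this sketch: which URL schemes suit which kind of receiver.
SCHEMES_BY_RECEIVER = {
    "syslog": {"syslog", "syslog+tls"},
    "http": {"http", "https"},
}


def drain_url_matches(url, receiver):
    """Return True if the drain URL's scheme fits the receiver's protocol."""
    scheme = urlparse(url).scheme.lower()
    return scheme in SCHEMES_BY_RECEIVER[receiver]
```

For a receiver that listens for plain syslog, `drain_url_matches("http://splunk.example.com:5140", "syslog")` is false, while the same address with the syslog:// prefix passes.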

In my view, the fact that the errors are gone does not mean that you have solved the problem. HTTP gives a synchronous response, so if you hit a threshold, whether a capacity limit or a limit set by a business agreement, the endpoint returns an HTTP response code. With Sumo Logic, if you exceed the burst rate limit, we return a 429 response code. Heroku Logplex is not set up to handle such error response codes and will drop the data. With a syslog endpoint you can also lose data; the difference is that syslog has no response channel, so the endpoint's only option is to drop the data.

For Sumo Logic, you will see notifications in the audit log indicating that throttling is being applied. When this happens, contact the support team or your account team to adjust your limits or upgrade your plan.
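
To make the difference concrete, here is a hedged Python sketch of what a sender could do with that synchronous response: back off and retry on a 429 instead of discarding the batch. This is not what Logplex does. `send_with_backoff`, its parameters, and the `post` callable (which takes a batch and returns an HTTP status code) are hypothetical names for this example.

```python
import time


def send_with_backoff(post, batch, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call post(batch) and retry with exponential backoff while it returns 429.

    Returns a (last_status, attempts_made) tuple.
    """
    for attempt in range(1, max_attempts + 1):
        status = post(batch)
        if status != 429:
            # Anything other than "too many requests" ends the retry loop.
            return status, attempt
        if attempt < max_attempts:
            # Throttled: wait 0.5s, 1s, 2s, ... before trying again.
            sleep(base_delay * 2 ** (attempt - 1))
    return 429, max_attempts
```

A sender over syslog has no equivalent of the `status` value to branch on, which is why throttling on that path can only show up as missing data.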

Source: https://habr.com/ru/post/1490339/

