The correct way to stop a custom asynchronous logback appender

I created Amazon SQS and SNS logback appenders using the Amazon Java SDK. The base appenders use the synchronous Java APIs, but I also created asynchronous versions of both by extending the ch.qos.logback.classic.AsyncAppender class.

Stopping the logback context with the asynchronous appenders does not work as expected. When the context is stopped, all asynchronous appenders try to flush the remaining events before exiting. The problem comes from the ch.qos.logback.core.AsyncAppenderBase#stop method, which interrupts the worker thread. The interrupt is triggered while the Amazon SDK is still processing events from the queue, and it surfaces as a com.amazonaws.AbortedException. In my tests the AbortedException occurred while the SDK was processing the response from the API, so the actual message got through, but this may not always be the case.

Is logback supposed to interrupt the worker even though the worker should still process the remaining event queue? And if so, how can I work around the AbortedException caused by the interrupt? I could override all the stop methods and remove the interrupt, but that would require copy-pasting most of the implementation.
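To illustrate the failure mode without any AWS dependency, here is a minimal sketch (class and message names are mine): AsyncAppenderBase#stop interrupts its worker thread, and any blocking call that is in flight at that moment throws, which the AWS SDK v1 wraps into com.amazonaws.AbortedException.

```java
// Minimal illustration of the interrupt problem; the sleep stands in
// for a blocking SDK HTTP call that is still draining the event queue.
public class InterruptDemo {
    public static void main(String[] args) throws Exception {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(10_000); // simulated blocking SDK call
            } catch (InterruptedException e) {
                // the SDK would surface this as AbortedException
                System.out.println("worker interrupted mid-call");
            }
        });
        worker.start();
        Thread.sleep(100);
        worker.interrupt(); // this is what AsyncAppenderBase#stop does
        worker.join();
    }
}
```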

1 answer

I finally managed to find a solution which, I think, is not optimal and far from simple, but it works.

My first attempt was to use the asynchronous versions of the AWS SDK APIs with the executor provided by logback, because with an internal executor the interrupt problem could have been avoided. But this did not work, because the worker queues would be shared between appenders, and the queue must be appender specific so that each appender can be stopped correctly. Therefore I needed a dedicated executor for each appender.

First I needed an executor for the AWS SDK clients. The gotcha with the executor is that the supplied thread factory must create daemon threads, otherwise it will block JVM shutdown indefinitely when logback's shutdown hook is used.

    public static ExecutorService newExecutor(Appender<?> appender, int threadPoolSize) {
        final String name = appender.getName();
        return Executors.newFixedThreadPool(threadPoolSize, new ThreadFactory() {
            private final AtomicInteger idx = new AtomicInteger(1);

            @Override
            public Thread newThread(Runnable r) {
                Thread thread = new Thread(r);
                thread.setName(name + "-" + idx.getAndIncrement());
                // daemon threads so a hanging executor cannot block JVM shutdown
                thread.setDaemon(true);
                return thread;
            }
        });
    }

The next question: how to stop the appender when the stop is delivered as an interrupt? This required catching the InterruptedException and retrying the wait, because otherwise the executor would skip waiting for the queue to drain.

    public static void shutdown(Appender<?> appender, ExecutorService executor, long waitMillis) {
        executor.shutdown();
        boolean completed = awaitTermination(appender, executor, waitMillis);
        if (!completed) {
            appender.addWarn(format("Executor for %s did not shut down in %d milliseconds, " +
                    "logging events might have been discarded", appender.getName(), waitMillis));
        }
    }

    private static boolean awaitTermination(Appender<?> appender, ExecutorService executor, long waitMillis) {
        long started = System.currentTimeMillis();
        try {
            return executor.awaitTermination(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException ie1) {
            // the worker loop is stopped by interrupt, but the remaining queue should still be handled
            long waited = System.currentTimeMillis() - started;
            if (waited < waitMillis) {
                try {
                    return executor.awaitTermination(waitMillis - waited, TimeUnit.MILLISECONDS);
                } catch (InterruptedException ie2) {
                    appender.addError(format("Shut down of executor for %s was interrupted", appender.getName()));
                }
            }
            Thread.currentThread().interrupt();
        }
        return false;
    }
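The retry-after-interrupt behavior can be demonstrated with plain JDK classes (this demo and its names are mine, not part of the appender code): the main thread is interrupted mid-wait, exactly as AsyncAppenderBase#stop would do, yet a second awaitTermination still lets the queued task finish.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ShutdownRetryDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        // a queued "logging event" that takes 200 ms to deliver
        executor.submit(() -> {
            try { Thread.sleep(200); } catch (InterruptedException ignored) {}
        });
        executor.shutdown();

        // simulate AsyncAppenderBase#stop interrupting us mid-wait
        Thread main = Thread.currentThread();
        new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException ignored) {}
            main.interrupt();
        }).start();

        boolean completed;
        try {
            completed = executor.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException ie) {
            // retry once so the remaining queue can still drain
            completed = executor.awaitTermination(1, TimeUnit.SECONDS);
            Thread.currentThread().interrupt();
        }
        System.out.println("completed=" + completed);
    }
}
```

Without the retry in the catch block, the wait would end after roughly 50 ms and the queued event would be discarded.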

Regular logback appenders are expected to work synchronously and therefore should not lose logging events even without a proper shutdown. This is a problem with the asynchronous AWS SDK API calls. I decided to use a CountDownLatch to give the appender blocking behavior.

    public class LoggingEventHandler<REQUEST extends AmazonWebServiceRequest, RESULT>
            implements AsyncHandler<REQUEST, RESULT> {

        private final ContextAware contextAware;
        private final CountDownLatch latch;
        private final String errorMessage;

        public LoggingEventHandler(ContextAware contextAware, CountDownLatch latch, String errorMessage) {
            this.contextAware = contextAware;
            this.latch = latch;
            this.errorMessage = errorMessage;
        }

        @Override
        public void onError(Exception exception) {
            contextAware.addWarn(errorMessage, exception);
            latch.countDown();
        }

        @Override
        public void onSuccess(REQUEST request, RESULT result) {
            latch.countDown();
        }
    }

And the wait is handled with the latch:

    public static void awaitLatch(Appender<?> appender, CountDownLatch latch, long waitMillis) {
        if (latch.getCount() > 0) {
            try {
                boolean completed = latch.await(waitMillis, TimeUnit.MILLISECONDS);
                if (!completed) {
                    appender.addWarn(format("Appender '%s' did not complete sending event in %d milliseconds, " +
                            "the event might have been lost", appender.getName(), waitMillis));
                }
            } catch (InterruptedException ex) {
                appender.addWarn(format("Appender '%s' was interrupted, " +
                        "a logging event might have been lost or shutdown was initiated", appender.getName()));
                Thread.currentThread().interrupt();
            }
        }
    }
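The latch pattern itself can be shown AWS-free (this demo and its names are mine): the appender thread blocks until the asynchronous callback counts down, just as LoggingEventHandler#onSuccess or #onError would.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LatchPatternDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch latch = new CountDownLatch(1);
        // stands in for sqs.sendMessageAsync(request, handler): the callback
        // releases the latch when the (simulated) API call completes
        executor.submit(() -> {
            try { Thread.sleep(50); } catch (InterruptedException ignored) {}
            latch.countDown(); // LoggingEventHandler#onSuccess / #onError
        });
        boolean completed = latch.await(1, TimeUnit.SECONDS);
        System.out.println("completed=" + completed);
        executor.shutdown();
    }
}
```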

And finally, all together. The following example is a simplified version of the real implementation, showing only the parts relevant to this problem.

    public class SqsAppender extends UnsynchronizedAppenderBase<ILoggingEvent> {

        private AmazonSQSAsyncClient sqs;

        @Override
        public void start() {
            sqs = new AmazonSQSAsyncClient(
                    getCredentials(),
                    getClientConfiguration(),
                    // use the daemon-thread executor from above, not a plain
                    // Executors.newFixedThreadPool, or JVM shutdown can block
                    AppenderExecutors.newExecutor(this, getThreadPoolSize())
            );
            super.start();
        }

        @Override
        public void stop() {
            super.stop();
            if (sqs != null) {
                AppenderExecutors.shutdown(this, sqs.getExecutorService(), getMaxFlushTime());
                sqs.shutdown();
                sqs = null;
            }
        }

        @Override
        protected void append(final ILoggingEvent eventObject) {
            SendMessageRequest request = ...
            CountDownLatch latch = new CountDownLatch(1);
            sqs.sendMessageAsync(request,
                    new LoggingEventHandler<SendMessageRequest, SendMessageResult>(this, latch, "Error"));
            AppenderExecutors.awaitLatch(this, latch, getMaxFlushTime());
        }
    }

All of this is needed to handle the following cases correctly:

  • Drain the remaining event queue on appender stop or context shutdown when the asynchronous appender wrapper is used
  • Do not block JVM shutdown indefinitely when logback's delayed shutdown hook is used
  • Provide blocking behavior when the asynchronous appender is not used
  • Survive the interrupt from the async appender stop, which is propagated to all AWS SDK threads
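For completeness, a hypothetical logback.xml wiring the appender both standalone and behind the AsyncAppender wrapper (the class name, property names, and queue URL are illustrative, not the project's actual configuration):

```xml
<configuration>
  <!-- flush on context stop / JVM shutdown -->
  <shutdownHook class="ch.qos.logback.core.hook.DelayingShutdownHook"/>

  <appender name="SQS" class="com.example.logging.SqsAppender">
    <queueUrl>https://sqs.us-east-1.amazonaws.com/123456789012/app-logs</queueUrl>
    <threadPoolSize>4</threadPoolSize>
    <maxFlushTime>5000</maxFlushTime>
  </appender>

  <!-- optional async wrapper; stop() on this is what triggers the interrupt -->
  <appender name="ASYNC-SQS" class="ch.qos.logback.classic.AsyncAppender">
    <appender-ref ref="SQS"/>
    <maxFlushTime>5000</maxFlushTime>
  </appender>

  <root level="INFO">
    <appender-ref ref="ASYNC-SQS"/>
  </root>
</configuration>
```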

The above is used in an open source logback extensions project that I maintain.


Source: https://habr.com/ru/post/1239353/
