Why does RabbitMQ keep breaking from a corrupt persister log file?

I am running Celery in a Django application with RabbitMQ as the message broker. However, RabbitMQ keeps breaking down in the same way. The first error I get comes from Django. The traceback is mostly irrelevant because I know what causes the error, as you will see.

Traceback (most recent call last):
  ...
  File "/usr/local/lib/python2.6/dist-packages/amqplib/client_0_8/transport.py", line 85, in __init__
    raise socket.error, msg
error: [Errno 111] Connection refused

I know this is due to a corrupt rabbit_persister.log file. I know because after I kill all the processes associated with RabbitMQ and run "sudo rabbitmq-server start", I get the following failure:

...
starting queue recovery ...done
starting persister ...BOOT ERROR: FAILED
Reason: {{badmatch,{error,{{{badmatch,eof},
    [{rabbit_persister,internal_load_snapshot,2},
     {rabbit_persister,init,1},
     {gen_server,init_it,6},
     {proc_lib,init_p_do_apply,3}]},
    {child,undefined,rabbit_persister,
     {rabbit_persister,start_link,[]},
     transient,100,worker,
     [rabbit_persister]}}}},
    [{rabbit_sup,start_child,2},
     {rabbit,'-run_boot_step/1-lc$^1/1-1-',1},
     {rabbit,run_boot_step,1},
     {rabbit,'-start/2-lc$^0/1-0-',1},
     {rabbit,start,2},
     {application_master,start_it_old,4}]}
Erlang has closed

My current fix: each time this happens, I rename the corresponding rabbit_persister.log file to something else (rabbit_persister.log.bak), and then RabbitMQ starts successfully. But the problem keeps coming back, and I cannot figure out why. Any ideas?
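The rename step described above can be scripted so that each corrupt file is kept under a distinct name instead of overwriting the previous .bak. This is only a minimal sketch of that workaround, not a fix for the underlying corruption. The function name and the `stamp` parameter are made up for illustration, and the location of rabbit_persister.log depends on the installation, so it is passed in rather than assumed.

```python
import os
import shutil
import time


def set_aside_persister_log(log_path, stamp=None):
    """Move a persister log out of the way and return the backup path.

    log_path is the full path to rabbit_persister.log (installation
    specific, so the caller supplies it). Returns None if no file exists.
    """
    if not os.path.exists(log_path):
        return None
    if stamp is None:
        # Timestamp suffix so repeated incidents do not overwrite each other.
        stamp = time.strftime("%Y%m%d-%H%M%S")
    backup_path = "%s.%s.bak" % (log_path, stamp)
    shutil.move(log_path, backup_path)
    return backup_path
```

Call it with the path to the log file while the broker is stopped, then start RabbitMQ as before. Keeping the old files around also preserves them in case they are useful for a bug report.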

Also, as a disclaimer, I have no experience with Erlang; I only use RabbitMQ because Celery uses it as its broker.

Thanks in advance. This problem is really annoying because I keep applying the same fix over and over again.

2 answers

The persister is RabbitMQ's internal message database. That "log" is presumably like a database log, and deleting it will cause you to lose messages. I guess it is getting corrupted by unclean broker shutdowns, but that is a bit beside the point.

It is interesting that the error comes from the rabbit_persister module. The last version of RabbitMQ that has that file is 2.2.0, so I strongly advise you to upgrade. The best version is always the latest one, which you can get from the RabbitMQ APT repository. In particular, the persister has received quite a large number of fixes in the versions after 2.2.0, so there is a good chance your problem has already been resolved.

If you still see the problem after upgrading, you should report it on the RabbitMQ Discuss mailing list. The developers (of both Celery and RabbitMQ) make a point of fixing any problems reported there.


a. Because you are using an old version of RabbitMQ, earlier than 2.7.1.
b. Because RabbitMQ does not have enough RAM. You need to run RabbitMQ on a server by itself and give that server enough RAM: 2.5 times the largest size your persisted message log can reach.

You may be able to fix this without any software changes, just by adding more RAM and shutting down other services on the box.
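As a worked example of the 2.5x sizing rule above, here is a minimal sketch. The function name is made up for illustration, and the 4 GiB log size is only an example figure, not a measurement from any real broker.

```python
GIB = 1024 ** 3


def required_ram_bytes(max_log_bytes, factor=2.5):
    """RAM needed under the rule 'RAM = factor x largest persisted log size'."""
    return int(max_log_bytes * factor)


# Example: a persisted message log that can grow to 4 GiB
# calls for 10 GiB of RAM under this rule.
needed = required_ram_bytes(4 * GIB)
print("RAM needed: %.1f GiB" % (needed / float(GIB)))
```

Comparing this figure with the memory left over once other services on the same machine are accounted for shows whether adding RAM or moving those services elsewhere is the simpler route.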

Another approach is to build RabbitMQ from source yourself and enable the toke extension, which persists messages using Tokyo Cabinet. Make sure you use a local hard drive and not NFS partitions, because Tokyo Cabinet has corruption issues on NFS. And, of course, use version 2.7.1 for this. Depending on your message content, you may also be able to use Tokyo Cabinet's compression settings to reduce the read/write activity of persisted messages.


Source: https://habr.com/ru/post/1384514/