MongoDB replica set auto-reconnect not working after a node goes down and up, with nginx + uwsgi and multiple processes

Hi everyone, I have the following environment for Python 2.7.5:

    flask==0.10.1
    flask-wtf==0.8.4
    jinja2==2.7
    werkzeug==0.9.1
    flask-mongoengine==0.7.0
    mongoengine==0.8.2
    pymongo==2.5.2
    uwsgi==1.9.13

and the following app.py application:

    from flask import Flask
    from flask.ext.mongoengine import Document, MongoEngine
    from mongoengine import StringField


    class Config(object):
        DEBUG = True
        MONGODB_HOST = ('mongodb://localhost:27017,localhost:27018/'
                        'test?replicaSet=rs0')
        MONGODB_DB = True


    app = Flask(__name__)
    app.config.from_object(Config)
    MongoEngine(app)


    class Test(Document):
        test = StringField(default='test')

        meta = {
            'allow_inheritance': False,
        }

        def __unicode__(self):
            return self.test


    Test(test='test1').save()


    @app.route('/')
    def hello_world():
        return unicode(Test.objects.first())


    if __name__ == '__main__':
        app.run('0.0.0.0', 8080, True)

I have the following nginx configuration:

    server {
        listen 80;
        server_name localhost;

        location / {
            include uwsgi_params;
            uwsgi_pass unix:/tmp/uwsgi.sock;
        }
    }

I run uwsgi like:

    /path/to/env/bin/uwsgi \
        --module app:app \
        --env /path/to/env/ \
        --pythonpath /path/to/app/ \
        --socket /tmp/uwsgi.sock \
        --pidfile /tmp/uwsgi.pid \
        --daemonize /tmp/uwsgi.log \
        --processes 2 \
        --threads 2 \
        --master

I have two instances of mongodb:

    mongod --port 27017 --dbpath /path/to/mongo/data/rs0-0 --replSet rs0 \
        --smallfiles --oplogSize 128

and

    mongod --port 27018 --dbpath /path/to/mongo/data/rs0-1 --replSet rs0 \
        --smallfiles --oplogSize 128

I configured the replica set in the mongo console like this:

    rsconf = {
        _id: "rs0",
        members: [{_id: 0, host: "127.0.0.1:27017"}]
    };
    rs.initiate(rsconf);
    rs.add("127.0.0.1:27018");

So far everything works well. But when I take the primary or secondary mongo instance down and bring it back up, my application cannot reconnect, and every request afterwards raises the following exception:

    ...
      File "/path/to/app/replica.py", line 33, in hello_world
        return unicode(Test.objects.first())
      File "/path/to/env/local/lib/python2.7/site-packages/mongoengine/queryset/queryset.py", line 325, in first
        result = queryset[0]
      File "/path/to/env/local/lib/python2.7/site-packages/mongoengine/queryset/queryset.py", line 211, in __getitem__
        return queryset._document._from_son(queryset._cursor[key],
      File "/path/to/env/local/lib/python2.7/site-packages/pymongo/cursor.py", line 470, in __getitem__
        for doc in clone:
      File "/path/to/env/local/lib/python2.7/site-packages/pymongo/cursor.py", line 814, in next
        if len(self.__data) or self._refresh():
      File "/path/to/env/local/lib/python2.7/site-packages/pymongo/cursor.py", line 763, in _refresh
        self.__uuid_subtype))
      File "/path/to/env/local/lib/python2.7/site-packages/pymongo/cursor.py", line 700, in __send_message
        **kwargs)
      File "/path/to/env/local/lib/python2.7/site-packages/pymongo/mongo_replica_set_client.py", line 1546, in _send_message_with_response
        raise AutoReconnect(msg, errors)
    pymongo.errors.AutoReconnect: No replica set primary available for query with ReadPreference PRIMARY

When I use mongoengine==0.7.10, which uses ReplicaSetConnection instead of the MongoReplicaSetClient used by mongoengine==0.8.2, I get the following exceptions:

  • Take the secondary down, send requests, bring the secondary back up, send requests:

    The first request fails with:

     pymongo.errors.AutoReconnect: 127.0.0.1:27017: [Errno 104] Connection reset by peer 

    and every request after that with:

     pymongo.errors.AutoReconnect: No replica set primary available for query with ReadPreference PRIMARY 
  • Take the primary down, send requests, bring the primary back up, send requests:

    The first request fails with:

     pymongo.errors.AutoReconnect: 127.0.0.1:27017: [Errno 111] Connection refused 

    and every request after that with:

     pymongo.errors.AutoReconnect: No replica set primary available for query with ReadPreference PRIMARY 
  • Take the primary or secondary down, bring it back up, and only then send requests. I always get:

     pymongo.errors.AutoReconnect: not master and slaveOk=false 

Two mongo instances are just a simple example. If I add another instance (three in total), then:

  • If I take any one secondary down and bring it back up, everything works fine. If I take one secondary down and up, and then take the other secondary down (or down and up), everything still works fine.

  • If I take both secondaries down and bring them back up, the same problem occurs.

  • If I take the primary down and up, or just take the primary down (with two secondaries still available), the same problem occurs, despite the fact that mongo elects a new primary!

If I start a single uwsgi process (without --master, with two or three mongo instances):

    /path/to/env/bin/uwsgi \
        --module app:app \
        --env /path/to/env/ \
        --pythonpath /path/to/app/ \
        --socket /tmp/uwsgi.sock \
        --pidfile /tmp/uwsgi.pid \
        --daemonize /tmp/uwsgi.log \
        --processes 1 \
        --threads 2

or run the application with the dev server:

    /path/to/env/bin/python app.py

then the application reconnects without problems after mongo instances go down and come back up.

I have a deployment in production, and sometimes the connection to the mongo instances is lost (for up to a few seconds). After that, my application does not work properly until uwsgi is restarted.

I have two questions:

  • Why does this happen only with several uwsgi processes?
  • How can I restore normal operation of the application after mongo instances go down and come back up?

UPD1: While trying to understand the problem, I found that self.__schedule_refresh() behaves differently when a connection exception occurs after I take one mongo node down:

  • For one process:

    • Before this call: rs_state has two members: the active one with up == True and the downed one with up == False.
    • After this call: rs_state has one member, the active one with up == True.
  • For two processes:

    • Before this call: rs_state has two members: the active one with up == True and the downed one with up == False.
    • After this call: rs_state has two members: the active one with up == True and the downed one with up == False (no change).

When I bring the mongo node back up, self.__schedule_refresh(sync=sync) also behaves differently:

  • For one process:

    • Before this call: rs_state has one member, the active one with up == True.
    • After this call: rs_state has two members, both with up == True.
  • For two processes:

    • Before this call: rs_state has two members: the active one with up == True and the other with up == False.
    • After this call: rs_state has two members: the active one with up == True and the other with up == False (no change).

So it looks like pymongo cannot update its view of the replica set state when there are several processes (see __schedule_refresh):

    def __schedule_refresh(self, sync=False):
        """Awake the monitor to update our view of the replica set state.

        If `sync` is True, block until the refresh completes.

        If multiple application threads call __schedule_refresh while refresh
        is in progress, the work of refreshing the state is only performed
        once.
        """
        self.__monitor.schedule_refresh()
        if sync:
            self.__monitor.wait_for_refresh(timeout_seconds=5)
2 answers

Try using the uwsgi --lazy-apps option. MongoReplicaSetClient spawns a MonitorThread that tracks the replica set, and this thread does not survive the fork that creates the uwsgi worker processes. With --lazy-apps the application is loaded in each worker after the fork, so the pymongo MonitorThread is started in every worker process.
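
The fork behavior described above can be reproduced without uwsgi or pymongo. The following is only a minimal sketch, assuming Python 3 on a POSIX system; `monitor` and `threads_alive_after_fork` are hypothetical names, and the thread is a stand-in for a background monitor thread, not pymongo's actual code.

```python
import os
import threading
import time


def monitor(stop_event):
    # Stand-in for a background monitor thread (hypothetical, not pymongo code).
    while not stop_event.is_set():
        time.sleep(0.01)


def threads_alive_after_fork():
    """Return (alive_in_parent, alive_in_child) for a thread started before fork."""
    stop_event = threading.Event()
    worker = threading.Thread(target=monitor, args=(stop_event,))
    worker.daemon = True
    worker.start()

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: only the thread that called fork() is copied,
        # so the monitor thread is not running here.
        os.close(read_fd)
        os.write(write_fd, b"1" if worker.is_alive() else b"0")
        os.close(write_fd)
        os._exit(0)

    # Parent process: the monitor thread keeps running.
    os.close(write_fd)
    alive_in_child = os.read(read_fd, 1) == b"1"
    os.close(read_fd)
    os.waitpid(pid, 0)
    alive_in_parent = worker.is_alive()

    # Shut the stand-in thread down cleanly.
    stop_event.set()
    worker.join()
    return alive_in_parent, alive_in_child
```

The thread started before the fork is still alive in the parent but not in the child. That matches the symptom in the question, where rs_state is never refreshed in the preforked workers but is refreshed when the client is created in the process that serves the requests.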


After a change in the replica set (no primary, a new primary, etc.), the next operation will raise an AutoReconnect exception. After this failed operation, the underlying PyMongo MongoReplicaSetClient will reconnect to the replica set, and future operations may succeed.

If there is a new primary, MongoReplicaSetClient will find it, and future operations will succeed.

If there is no primary, no operations can succeed unless you set the read preference to PRIMARY_PREFERRED, which lets reads fall back to a secondary; writes always need a primary. See the docs here:

http://mongoengine-odm.readthedocs.org/en/latest/guide/connecting.html#replicasets

The reconnection process should occur once per uwsgi process. Therefore, if there are changes to your replica set, you can expect one AutoReconnect exception per uwsgi process.
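
Since each worker is expected to raise one AutoReconnect after a replica set change, one option is to retry the failed operation instead of returning an error. The decorator below is a generic sketch of that idea, not part of pymongo or mongoengine; `retry_on`, `attempts` and `delay` are hypothetical names, and in this application the exception type passed in would be pymongo.errors.AutoReconnect.

```python
import functools
import time


def retry_on(exc_type, attempts=3, delay=0.1):
    """Retry the wrapped callable when it raises `exc_type`.

    Hypothetical helper: re-raises the exception once `attempts` is exhausted.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exc_type:
                    if attempt == attempts:
                        raise  # out of attempts, surface the error
                    time.sleep(delay)  # give the client time to reconnect
        return wrapper
    return decorator


# Assumed usage in the application from the question:
#
#     from pymongo.errors import AutoReconnect
#
#     @retry_on(AutoReconnect)
#     def first_test():
#         return Test.objects.first()
```

Retrying like this is only safe for operations that can be repeated without side effects, such as the read in hello_world. It also does not help if the monitor thread is missing in the worker, in which case the retries would keep failing.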


Source: https://habr.com/ru/post/948973/

