Fixed strptime exception with a thread lock, but it slows down the program

I have the following code, which runs inside a thread (the full code is here: https://github.com/eWizardII/homobabel/blob/master/lovebird.py ):

    for null in range(0,1):
        while True:
            try:
                with open('C:/Twitter/tweets/user_0_' + str(self.id) + '.json', mode='w') as f:
                    f.write('[')
                    threadLock.acquire()
                    for i, seed in enumerate(Cursor(api.user_timeline,screen_name=self.ip).items(200)):
                        if i>0:
                            f.write(", ")
                        f.write("%s" % (json.dumps(dict(sc=seed.author.statuses_count))))
                        j = j + 1
                    threadLock.release()
                    f.write("]")
            except tweepy.TweepError, e:
                with open('C:/Twitter/tweets/user_0_' + str(self.id) + '.json', mode='a') as f:
                    f.write("]")
                print "ERROR on " + str(self.ip) + " Reason: ", e
                with open('C:/Twitter/errors_0.txt', mode='a') as a_file:
                    new_ii = "ERROR on " + str(self.ip) + " Reason: " + str(e) + "\n"
                    a_file.write(new_ii)
            break

Without the thread lock, I get the following error:

    Exception in thread Thread-117:
    Traceback (most recent call last):
      File "C:\Python27\lib\threading.py", line 530, in __bootstrap_inner
        self.run()
      File "C:/Twitter/homobabel/lovebird.py", line 62, in run
        for i, seed in enumerate(Cursor(api.user_timeline,screen_name=self.ip).items(200)):
      File "build\bdist.win-amd64\egg\tweepy\cursor.py", line 110, in next
        self.current_page = self.page_iterator.next()
      File "build\bdist.win-amd64\egg\tweepy\cursor.py", line 85, in next
        items = self.method(page=self.current_page, *self.args, **self.kargs)
      File "build\bdist.win-amd64\egg\tweepy\binder.py", line 196, in _call
        return method.execute()
      File "build\bdist.win-amd64\egg\tweepy\binder.py", line 182, in execute
        result = self.api.parser.parse(self, resp.read())
      File "build\bdist.win-amd64\egg\tweepy\parsers.py", line 75, in parse
        result = model.parse_list(method.api, json)
      File "build\bdist.win-amd64\egg\tweepy\models.py", line 38, in parse_list
        results.append(cls.parse(api, obj))
      File "build\bdist.win-amd64\egg\tweepy\models.py", line 49, in parse
        user = User.parse(api, v)
      File "build\bdist.win-amd64\egg\tweepy\models.py", line 86, in parse
        setattr(user, k, parse_datetime(v))
      File "build\bdist.win-amd64\egg\tweepy\utils.py", line 17, in parse_datetime
        date = datetime(*(time.strptime(string, '%a %b %d %H:%M:%S +0000 %Y')[0:6]))
      File "C:\Python27\lib\_strptime.py", line 454, in _strptime_time
        return _strptime(data_string, format)[0]
      File "C:\Python27\lib\_strptime.py", line 300, in _strptime
        _TimeRE_cache = TimeRE()
      File "C:\Python27\lib\_strptime.py", line 188, in __init__
        self.locale_time = LocaleTime()
      File "C:\Python27\lib\_strptime.py", line 77, in __init__
        raise ValueError("locale changed during initialization")
    ValueError: locale changed during initialization

The problem is that with the thread lock, the threads essentially run serially, and each loop takes so long that there is no longer any advantage to using threads. So, if there is no way to get rid of the thread lock, is there a way to make the for loop inside the try statement run faster?

2 answers

According to a previous question on Stack Overflow, time.strptime is not thread safe. Unfortunately, the error mentioned in that question is different from the one you are experiencing.

Their solution was to call time.strptime before any threads were initialized, and then subsequent calls to time.strptime on different threads would work.

After looking at the standard library modules _strptime and locale, I think the same solution may work in your situation. I cannot be sure that it will, since I cannot test your code locally, but I thought I would offer it as a potential solution.
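A minimal sketch of that warm-up idea, not a tested fix for this program. The format string is the one shown in the traceback; parse_created_at and worker are hypothetical stand-ins for whatever the threads really do, and the sample date strings are made up for illustration.

```python
import threading
import time
from datetime import datetime

# Format string copied from the traceback (tweepy's parse_datetime).
TWITTER_FORMAT = '%a %b %d %H:%M:%S +0000 %Y'

# Warm-up call in the main thread, before any worker thread exists.
# It makes _strptime build its locale-dependent cache single-threaded.
time.strptime('Thu Jan 01 00:00:00 +0000 1970', TWITTER_FORMAT)


def parse_created_at(text):
    """Hypothetical stand-in for the parsing that the workers trigger."""
    return datetime(*time.strptime(text, TWITTER_FORMAT)[0:6])


results = []
results_lock = threading.Lock()


def worker(text):
    parsed = parse_created_at(text)
    # The lock only guards the shared list, not the parsing itself.
    with results_lock:
        results.append(parsed)


threads = [
    threading.Thread(target=worker, args=('Mon Sep 24 03:35:21 +0000 2012',))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whether the warm-up alone is enough depends on what else touches the locale while the threads run, so it is worth testing before the lock is removed for good.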

Let me know if this works.

Edit:

I did a bit more research, and the Python standard library calls setlocale, which is declared in the C header file locale.h. According to the setlocale documentation, this is not thread safe, and calls to setlocale should happen before the threads are initialized, as I mentioned earlier.

Unfortunately, setlocale is called every time you call time.strptime . Therefore, I propose the following:

  1. Test the solution outlined earlier: call time.strptime before initializing the threads, and remove the locks.
  2. If step 1 does not work, you will probably have to roll your own thread-safe replacement for time.strptime, as pointed out in the Python documentation for locale.
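For the second option, a hand-rolled parser could look roughly like the sketch below. It avoids time.strptime and the locale machinery entirely, under the assumption that the timestamps always match the fixed format from the traceback ('%a %b %d %H:%M:%S +0000 %Y') with English month abbreviations. The name parse_twitter_datetime is hypothetical, and how to make tweepy use such a function is not shown here.

```python
from datetime import datetime

# English month abbreviations, assumed to be what the API always sends.
MONTHS = {
    'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
    'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12,
}


def parse_twitter_datetime(text):
    """Parse e.g. 'Mon Sep 24 03:35:21 +0000 2012' without strptime.

    Uses no shared state and no locale lookups, so concurrent calls
    from several threads do not interfere with each other.
    """
    _weekday, month, day, clock, _offset, year = text.split()
    hour, minute, second = (int(part) for part in clock.split(':'))
    return datetime(int(year), MONTHS[month], int(day), hour, minute, second)


parsed = parse_twitter_datetime('Mon Sep 24 03:35:21 +0000 2012')
```

The weekday and the offset are ignored on purpose: the format string hard-codes +0000, and the weekday carries no extra information once the date is known.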

The problem you are facing is the lack of thread safety in the functions and modules you use.

As you can see here, tweepy is neither reentrant nor thread safe. As you can see here, neither is Python's LocaleTime.

For a multi-threaded application like yours, wrap the tweepy API in your own class that is synchronized (RLock'ed). Do not derive from the tweepy class, though; use a has-a relationship, with the tweepy instance held in a private attribute.
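A rough sketch of such a wrapper, under stated assumptions: SynchronizedTimeline and FakeApi are hypothetical names, and FakeApi only stands in for a tweepy API instance so that the example runs without network access. The wrapper holds the wrapped object in a private attribute and takes an RLock around every call.

```python
import threading


class SynchronizedTimeline(object):
    """Hypothetical wrapper: owns the API object (has-a), does not subclass it."""

    def __init__(self, api):
        self._api = api                  # private reference to the wrapped API
        self._lock = threading.RLock()   # reentrant, so nested calls are safe

    def user_timeline(self, screen_name):
        with self._lock:
            # list() forces any lazy iteration to finish while the lock is held.
            return list(self._api.user_timeline(screen_name=screen_name))


class FakeApi(object):
    """Stand-in for a tweepy API instance (assumption, for illustration only)."""

    def __init__(self):
        self.calls = 0

    def user_timeline(self, screen_name):
        self.calls += 1                  # not atomic; relies on the wrapper's lock
        yield {'author': screen_name}


fake = FakeApi()
safe = SynchronizedTimeline(fake)
collected = []


def worker(name):
    collected.extend(safe.user_timeline(name))


threads = [threading.Thread(target=worker, args=('user%d' % n,)) for n in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because a Cursor fetches pages lazily while it is iterated, the iteration itself has to happen inside the lock, which is why the wrapper materializes the result with list(). This still serializes the API calls, but the file writing and error logging in each thread can run outside the lock.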


Source: https://habr.com/ru/post/1334652/

