Unable to stop the stream in tweepy after one minute

I am trying to stream Twitter data for a fixed period of time, say 5 minutes, using the Stream.filter() method. I save the retrieved tweets in a JSON file. The problem is that I cannot stop the filter() method from within the program; I have to stop execution manually. I tried to stop the stream based on system time using the time package. I was able to stop writing tweets to the JSON file, but the stream method keeps running, so execution never reaches the next line of code. I use IPython to write and execute the code. Here is the code:

auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)

from tweepy import Stream
from tweepy.streaming import StreamListener

class MyListener(StreamListener):

    def __init__(self, start_time, time_limit=60):
        self.time = start_time
        self.limit = time_limit

    def on_data(self, data):
        while (time.time() - self.time) < self.limit:
            try:
                saveFile = open('abcd.json', 'a')
                saveFile.write(data)
                saveFile.write('\n')
                saveFile.close()
                return True
            except BaseException as e:
                print 'failed ondata,', str(e)
                time.sleep(5)
        return True

    def on_status(self, status):
        if (time.time() - self.time) >= self.limit:
            print 'time is over'
            return false

    def on_error(self, status):
        if (time.time() - self.time) >= self.limit:
            print 'time is over'
            return false
        else:
            print(status)
            return True

start_time = time.time()
stream_data = Stream(auth, MyListener(start_time,20))
stream_data.filter(track=['name1','name2',...list ...,'name n'])#list of the strings I want to track

These links are similar, but they do not directly answer my question:

Tweepy: streaming data in X minutes?

Stopping a Tweepy stream after a duration parameter (# lines, seconds, #Tweets, etc.)

Tweepy Streaming - x

http://stats.seandolinar.com/collecting-twitter-data-using-a-python-stream-listener/

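  • To close the stream, you need to return False from on_data() or on_status().

  • tweepy.Stream() runs its own while loop, so you do not need the while loop in on_data().

  • When initializing MyListener, you did not call the parent class's __init__ method, so the listener was not initialized properly.

So, for what you are trying to do, the code should look something like this: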

import time
import tweepy

class MyStreamListener(tweepy.StreamListener):
    def __init__(self, time_limit=60):
        self.start_time = time.time()
        self.limit = time_limit
        self.saveFile = open('abcd.json', 'a')
        # Initialize the parent class as well
        super(MyStreamListener, self).__init__()

    def on_data(self, data):
        if (time.time() - self.start_time) < self.limit:
            self.saveFile.write(data)
            self.saveFile.write('\n')
            return True
        else:
            # Time is up: close the file and return False to stop the stream
            self.saveFile.close()
            return False

# api is the tweepy.API object created in the question
myStream = tweepy.Stream(auth=api.auth, listener=MyStreamListener(time_limit=20))
myStream.filter(track=['test'])
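
To check the time-limit logic without connecting to Twitter, here is a minimal sketch of the same on_data() check with the tweepy base class left out. The class name TimeLimitedWriter, the clock parameter, and the fake clock are all hypothetical and exist only for this illustration; they are not part of tweepy.

```python
import io
import time


class TimeLimitedWriter(object):
    """Hypothetical stand-in for the listener, without the tweepy base class."""

    def __init__(self, out, time_limit=60, clock=time.time):
        self.out = out             # any file-like object
        self.limit = time_limit    # seconds to keep collecting
        self.clock = clock         # injectable so the limit can be simulated
        self.start_time = clock()

    def on_data(self, data):
        # Keep writing while still inside the time window.
        if (self.clock() - self.start_time) < self.limit:
            self.out.write(data)
            self.out.write(u'\n')
            return True
        # Window elapsed: False is the signal that asks the stream to stop.
        return False


# Simulated clock: every call advances by 10 "seconds".
ticks = iter(range(0, 100, 10))


def fake_clock():
    return next(ticks)


buf = io.StringIO()
listener = TimeLimitedWriter(buf, time_limit=20, clock=fake_clock)
results = [listener.on_data(u'{"id": %d}' % i) for i in range(3)]
print(results)
```

With a 20-second limit and a clock that jumps 10 seconds per call, only the first message falls inside the window; every later call returns False.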

You can use myListener.running, but instead of passing MyListener directly to Stream, keep it in a variable first:

myListener = MyListener()
# timeout code here, such as time.sleep(20)
myListener.running = False
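
The general pattern behind this suggestion, flipping a flag from a timer so that a blocking loop exits, can be sketched without tweepy. PollingLoop below is a hypothetical stand-in, not a tweepy class, and the sketch does not settle which object carries the running flag in a given tweepy version; it only shows the mechanism under that assumption.

```python
import threading
import time


class PollingLoop(object):
    """Hypothetical stand-in for a blocking stream that checks a running flag."""

    def __init__(self):
        self.running = True
        self.iterations = 0

    def filter(self):
        # Blocks the calling thread until something sets running to False.
        while self.running:
            self.iterations += 1
            time.sleep(0.01)

    def stop(self):
        self.running = False


loop = PollingLoop()

# The timer fires on a separate thread, so it can flip the flag
# while filter() is still blocking the main thread.
timer = threading.Timer(0.1, loop.stop)
timer.start()

loop.filter()   # returns shortly after the timer fires
timer.join()
print('stopped after %d iterations' % loop.iterations)
```

Note that a plain time.sleep(20) placed after a blocking filter() call would never run, which is why the flag has to be changed from another thread or from inside the listener.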

I had this problem too. Fortunately, Tweepy is open source, so it is easy to dig into the problem.

The most important part is here:

def _data(self, data):
    if self.listener.on_data(data) is False:
        self.running = False

This is in the Stream class in streaming.py.

This means that to close the connection you just need to return False from the listener's on_data() method.
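
One consequence of the `is False` check is that only an explicit False stops the stream; a method that falls off the end and returns None keeps it running. The following sketch mirrors just that check. FakeStream, ReturnsNone, and StopsOnSecond are hypothetical names made up for this illustration, and the run() loop is a simplification, not tweepy's actual read loop.

```python
class FakeStream(object):
    """Hypothetical stand-in that mirrors only the _data() check quoted above."""

    def __init__(self, listener):
        self.listener = listener
        self.running = True

    def _data(self, data):
        # Identity test: None, 0 or '' do not count, only False does.
        if self.listener.on_data(data) is False:
            self.running = False

    def run(self, messages):
        delivered = 0
        for message in messages:
            if not self.running:
                break
            self._data(message)
            delivered += 1
        return delivered


class ReturnsNone(object):
    def on_data(self, data):
        pass  # implicit None: the stream keeps running


class StopsOnSecond(object):
    def __init__(self):
        self.seen = 0

    def on_data(self, data):
        self.seen += 1
        return self.seen < 2  # False on the second message


messages = ['a', 'b', 'c', 'd']
never_stops = FakeStream(ReturnsNone()).run(messages)
stops_early = FakeStream(StopsOnSecond()).run(messages)
print(never_stops, stops_early)
```

The first listener receives all four messages, while the second one ends the loop as soon as it returns False.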


Source: https://habr.com/ru/post/1614261/

