I want to search Twitter for a word (say #google) and then create a tag cloud from the words used in the matching tweets, broken down by time (for example, a one-hour window that moves forward 10 minutes at a time and shows how different words are used more or less often throughout the day).
I would appreciate any help on how to do this: resources for information, programming code (R is the only language I use) and visualization ideas. Questions:
How do I get the information?
In R, I found that the twitteR package has a searchTwitter command, but I don't know how large a sample I can get from it. In addition, it does not return the dates on which the tweets were posted.
I see here that I can get up to 1,500 tweets, but that requires me to do the parsing manually (which leads me to my second question). Also, for my purposes, I would need tens of thousands of tweets. Is it even possible to get them retrospectively (for example, by asking for older posts each time through the API URL)? If not, there is the more general question of how to build a personal store of tweets on a home computer (a question that might be better left to another SO thread, although any ideas from people here would be very interesting to me).
How do I parse the information (in R)? I know that the RCurl and twitteR packages have functions that can help, but I do not know which ones or how to use them. Any suggestions would help.
How do I analyze it? How do I remove all the "uninteresting" words? I found that the "tm" package in R has this example:
reuters <- tm_map(reuters, removeWords, stopwords("english"))
Will this do the trick? Should I do something else, or something more?
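To make clear what I mean by removing "uninteresting" words, here is a minimal base-R sketch of the effect I am after. The `stop_words` vector and the `remove_stop_words` helper are my own illustrative assumptions, not part of tm; I assume tm's stop word list is much longer than this one.

```r
# Tiny, hand-picked stop word list for illustration only (an assumption);
# a real analysis would use a far longer list.
stop_words <- c("the", "a", "is", "to", "and", "of", "in", "for", "on")

# Hypothetical helper: lower-case the text, split it into words,
# and drop empty strings and anything found in the stop word list.
remove_stop_words <- function(text, stop_words) {
  words <- unlist(strsplit(tolower(text), "[^a-z#@']+"))
  words[nzchar(words) & !(words %in% stop_words)]
}

kept <- remove_stop_words("The new Google phone is a rival to the iPhone",
                          stop_words)
print(kept)
```

The split pattern keeps `#` and `@` so that hashtags and user names survive as single tokens, which seems useful for tweets, but that is a guess on my part.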
In addition, I assume I would want to do this after subsetting my dataset by time, which will require some POSIX-like functions (I'm not quite sure which ones are needed here or how to use them).
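To make the windowing idea concrete, here is a rough sketch of what I have in mind, using only base R and `POSIXct` times. The sample data frame, its column names (`created`, `text`) and the `window_counts` helper are all hypothetical; I am assuming the tweets have already been retrieved with a timestamp attached.

```r
# Hypothetical sample data standing in for real search results.
tweets <- data.frame(
  created = as.POSIXct(c("2010-01-01 10:05:00", "2010-01-01 10:20:00",
                         "2010-01-01 10:50:00", "2010-01-01 11:15:00"),
                       tz = "UTC"),
  text = c("google search", "google maps", "search engine", "maps app"),
  stringsAsFactors = FALSE
)

# Count words in a window of `width` seconds that advances by `step` seconds.
window_counts <- function(tweets, width = 3600, step = 600) {
  # Window start times, from the first tweet to the last one.
  starts <- seq(min(tweets$created), max(tweets$created), by = step)
  lapply(starts, function(s) {
    # Keep tweets with start <= time < start + width.
    in_window <- tweets$created >= s & tweets$created < s + width
    words <- unlist(strsplit(tolower(tweets$text[in_window]), "[^a-z#@']+"))
    words <- words[nzchar(words)]
    list(start = s, counts = sort(table(words), decreasing = TRUE))
  })
}

res <- window_counts(tweets)
print(res[[1]]$counts)
```

Each element of `res` holds one window's start time and its word frequencies, so a separate tag cloud could be drawn per window. Stop word removal would presumably go right before the `table()` call.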
And finally, there is the question of visualization. How do I create a tag cloud of the words? I found a solution for this here; any other suggestions or recommendations?
I believe that I am asking a huge question here, but I tried to break it down into the simplest questions possible. Any help would be appreciated!
Best
Tal