Scraping page to get prices from google finance

Question

I am trying to get stock prices by scraping Google Finance pages. I do this in Python, using the urllib2 package to fetch each page and a regex to extract the price data.

When I leave my Python script running, it works for a while (several minutes) and then throws an exception: [HTTP Error 503: Service Unavailable]

I assume this happens because the web server treats the frequent page requests as coming from a robot and, after some time, starts returning this error.

Is there a way around this, for example by deleting a cookie or creating one?

Or, even better, does Google offer an API? I would like to do this in Python because the whole application is in Python, but if nothing is available in Python I can consider alternatives. This is the Python method I call in a loop to get the data, sleeping a few seconds between calls:

    # Python 2 code; these imports live at module level
    import re
    import urllib2

    def getPriceFromGOOGLE(self, symbol):
        """
        gets last traded price from google for given security
        """
        toReturn = 0.0
        try:
            base_url = 'http://google.com/finance?q='
            req = urllib2.Request(base_url + symbol)
            content = urllib2.urlopen(req).read()
            # group(2) of this pattern captures the last traded price
            namestr = 'name:\"' + symbol + '\",cp:(.*),p:(.*),cid(.*)}'
            m = re.search(namestr, content)
            if m:
                data = str(m.group(2).strip().strip('"'))
                price = data.replace(',', '')
                toReturn = float(price)
            else:
                print 'ERROR ' + str(symbol) + ' --- ' + str(content)
        except Exception, exc:
            print 'Exc: ' + str(exc)
        finally:
            return toReturn

+4

python screen-scraping urllib google-finance stockquotes

user424060 Apr 12 '11 at 14:32

4 answers

This question is quite old, and the accepted answer is no longer valid: the API has been deprecated.

There is an open source project that scrapes all Google Finance companies and matches them with their current prices: http://scrape-google-finance.compunect.com/
The project solves most of the problems: it includes caching and IP management, and it runs stably without getting blocked.
It uses the internal Google Finance company-matching API to scrape the companies and the chart API to get prices. However, it is PHP code, not Python. You can still study how it solves these problems and adapt the approach.
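As a rough illustration of the caching idea only (this is not taken from the PHP project), here is a minimal Python sketch of a time-based cache placed in front of whatever function fetches a price. The names QuoteCache, fetch, ttl and clock are hypothetical, chosen for this example; the point is simply that repeated lookups within ttl seconds never reach the server.

```python
import time


class QuoteCache:
    """Remember each quote for ttl seconds so repeated lookups skip the network.

    Hypothetical helper for illustration: fetch is any callable that takes a
    symbol and returns its price; clock is injectable so it can be tested.
    """

    def __init__(self, fetch, ttl=60.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # symbol -> (time stored, price)

    def get(self, symbol):
        now = self._clock()
        entry = self._entries.get(symbol)
        # Serve the cached price while it is still younger than ttl.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        # Otherwise fetch a fresh price and remember when we got it.
        price = self._fetch(symbol)
        self._entries[symbol] = (now, price)
        return price
```

Wrapping the price lookup this way means a polling loop can run as often as it likes while the remote site only sees one request per symbol per ttl window.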

+5

John Apr 2 '14 at 10:48

To get around most rate limiting or bot detection on sites like Google, Wikipedia, or Yahoo, spoof your user agent.

This makes your script's requests appear to come from a recent version of Google Chrome.

    headers = {'User-Agent': "Mozilla/5.0 (Windows NT 6.0; WOW64) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.16 Safari/534.24"}
    req = urllib2.Request(url, None, headers)
    content = urllib2.urlopen(req).read()
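For anyone on Python 3, where urllib2 was split into urllib.request and urllib.error, the same idea might look like the sketch below. It also retries with growing delays when the server answers 503, which is an addition of mine rather than part of the snippet above. The names build_request, fetch_with_backoff, max_tries and base_delay are hypothetical, and a spoofed user agent plus backoff is not guaranteed to avoid blocking.

```python
import time
import urllib.error
import urllib.request

# Same User-Agent string as in the Python 2 snippet.
USER_AGENT = ("Mozilla/5.0 (Windows NT 6.0; WOW64) AppleWebKit/534.24 "
              "(KHTML, like Gecko) Chrome/11.0.696.16 Safari/534.24")


def build_request(url):
    """Return a Request that carries the browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_with_backoff(url, opener=urllib.request.urlopen,
                       max_tries=4, base_delay=2.0, sleep=time.sleep):
    """Fetch url, retrying on HTTP 503 with exponentially growing delays.

    Hypothetical helper for illustration; opener and sleep are parameters
    only so the function can be exercised without network access.
    """
    for attempt in range(max_tries):
        try:
            with opener(build_request(url)) as response:
                return response.read()
        except urllib.error.HTTPError as exc:
            # Re-raise anything other than 503, and give up on the last try.
            if exc.code != 503 or attempt == max_tries - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 2s, then 4s, then 8s
```

A call such as fetch_with_backoff(url) returns the response body as bytes, so decode it before running a text regex over it.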

+3

Aphex Apr 12 '11 at 21:11

Yahoo Finance is also a good place to get financial information that spans more countries and stocks.

For Python 2 you can use ystockquote. For Python 3 you can use yfq, which I rewrote based on ystockquote.

Get current quotes for Google and Intel.

    >>> import yfq
    >>> yfq.get_price('GOOG+INTL')
    {'GOOG': '600.25', 'INTL': '22.25'}

Get historical Yahoo quotes from March 1, 2012 to March 3, 2012.

    >>> yfq.get_historical_prices('YHOO','20120301','20120303')
    [['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'],
     ['2012-03-02', '14.89', '14.92', '14.66', '14.72', '9164900', '14.72'],
     ['2012-03-01', '14.89', '14.96', '14.79', '14.93', '12283300', '14.93']]
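Every value in that result is a string, so some conversion is usually needed before doing arithmetic. The sketch below uses only the standard library and works on a copy of the sample output above rather than calling yfq. to_records is a hypothetical helper name, and it assumes the first row is always the header shown.

```python
from datetime import date, datetime

# Sample copied from the output above: a header row followed by data rows.
rows = [
    ['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'],
    ['2012-03-02', '14.89', '14.92', '14.66', '14.72', '9164900', '14.72'],
    ['2012-03-01', '14.89', '14.96', '14.79', '14.93', '12283300', '14.93'],
]


def to_records(rows):
    """Turn [header, row, row, ...] into a list of dicts with typed values."""
    header, *data = rows
    records = []
    for row in data:
        record = dict(zip(header, row))
        # Dates become datetime.date, volume an int, prices floats.
        record['Date'] = datetime.strptime(record['Date'], '%Y-%m-%d').date()
        record['Volume'] = int(record['Volume'])
        for key in ('Open', 'High', 'Low', 'Close', 'Adj Close'):
            record[key] = float(record[key])
        records.append(record)
    return records


records = to_records(rows)
print(records[0]['Date'], records[0]['Close'])
```

For real money calculations decimal.Decimal may be a better choice than float; float is used here only to keep the example short.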

+3

angelo Mar 12 '12 at 0:53

Aj. · Accepted Answer · 2011-04-12T14:37:46+0000

There is a Google Finance API:

http://code.google.com/apis/finance/docs/2.0/developers_guide_protocol.html

And there is a Python client library for it:

http://code.google.com/p/gdata-python-client/
