Using Python and urllib to get data from Yahoo Finance

I am using urllib in Python to get stock prices from Yahoo Finance. Here is my code:

import urllib
import re

name = raw_input(">")

htmlfile = urllib.urlopen("http://finance.yahoo.com/q?s=%s" % name)

htmltext = htmlfile.read()

# The problem area
regex = '<span id="yfs_l84_%s">(.+?)</span>' % name

pattern = re.compile(regex)

price = re.findall(pattern, htmltext)

print price

The idea is that I enter a ticker symbol and the stock price comes out. But right now I can't get it to display the price; it just prints an empty list, []. I left a comment where I think the problem is. Any suggestions? Thank you.

+4
4 answers

You have not escaped the slash in your regular expression. Change the regex:

<span id="yfs_l84_%s">(.+?)</span>

to

<span id="yfs_l84_goog">(.+?)<\/span>

This will fix your problem as long as you enter the company's ticker symbol as the input, e.g. goog for Google.
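The pattern can be checked offline before pointing it at the live page. The sketch below is written for Python 3; the `sample_html` fragment and the `find_price` helper are made up for illustration and only assume that the page contains a span whose id ends in the ticker symbol exactly as typed.

```python
import re

# Hypothetical page fragment; the real Yahoo markup may differ.
sample_html = '<div><span id="yfs_l84_goog">1,033.56</span></div>'


def find_price(html, ticker):
    """Return the text of every span whose id is yfs_l84_<ticker>."""
    # re.escape guards against tickers that contain regex metacharacters.
    pattern = re.compile(r'<span id="yfs_l84_%s">(.+?)</span>' % re.escape(ticker))
    return pattern.findall(html)


print(find_price(sample_html, "goog"))  # the id matches, so one price is found
print(find_price(sample_html, "GOOG"))  # different case: nothing matches
```

In this sample, a ticker typed in a different case than the id finds nothing, which is one way to end up with an empty list. As far as Python's `re` module is concerned, `/` is an ordinary character, so `</span>` and `<\/span>` match the same text.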

Alternatively, you can use BeautifulSoup, a Python library for parsing HTML. With BeautifulSoup, the code looks like this:

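from bs4 import BeautifulSoup
import requests

name = raw_input('>')
url = 'http://finance.yahoo.com/q?s={}'.format(name)
r = requests.get(url)
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(r.text, 'html.parser')
# Look up the span whose id ends with the ticker symbol
data = soup.find('span', attrs={'id': 'yfs_l84_{}'.format(name)})
print data.text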
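
If installing BeautifulSoup is not an option, the same lookup by `id` can be sketched with the standard library's `html.parser`. This is a swapped-in technique, not the BeautifulSoup API. It is written for Python 3, and the `SpanTextParser` class and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class SpanTextParser(HTMLParser):
    """Collect the text inside the first <span> with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self._inside = False
        self.text = None  # stays None if no matching span is seen

    def handle_starttag(self, tag, attrs):
        is_target = tag == "span" and dict(attrs).get("id") == self.target_id
        if is_target and self.text is None:
            self._inside = True
            self.text = ""

    def handle_data(self, data):
        if self._inside:
            self.text += data

    def handle_endtag(self, tag):
        # Simplification: the first closing </span> ends the capture,
        # so nested spans inside the target are not handled.
        if tag == "span" and self._inside:
            self._inside = False


# Hypothetical page fragment; the real Yahoo markup may differ.
sample_html = '<div><span id="yfs_l84_goog">1,033.56</span><span id="other">x</span></div>'
parser = SpanTextParser("yfs_l84_{}".format("goog"))
parser.feed(sample_html)
print(parser.text)
```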
+4

Have you tried pandas? It can fetch this data for you.

http://pandas.pydata.org/pandas-docs/stable/remote_data.html

An example using yahoo as the data source:

In [1]: import pandas.io.data as web
In [2]: import datetime
In [3]: start = datetime.datetime(2010, 1, 1)
In [4]: end = datetime.datetime(2013, 1, 27)
In [5]: f=web.DataReader("F", 'yahoo', start, end)
In [6]: f.ix['2010-01-04']
Out[6]: 
Open               10.17
High               10.28
Low                10.05
Close              10.28
Volume       60855800.00
Adj Close           9.75
Name: 2010-01-04 00:00:00, dtype: float64
+1

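Also, Yahoo provides the data as CSVs. You can use the csv module to read them.

If you do need to parse HTML, use BeautifulSoup. It is built for working with HTML.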
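
As a rough illustration of the CSV route, here is a sketch (Python 3) that reads price rows with the `csv` module. The `sample_csv` text is invented for the example, and its column names are an assumption about what a Yahoo price file contains; `read_closes` is a hypothetical helper.

```python
import csv
import io

# Invented sample data; a real download may use different columns.
sample_csv = """Date,Open,High,Low,Close,Volume
2010-01-04,10.17,10.28,10.05,10.28,60855800
2010-01-05,10.45,11.24,10.40,10.96,215620200
"""


def read_closes(text):
    """Map each date string to its closing price as a float."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["Date"]: float(row["Close"]) for row in reader}


closes = read_closes(sample_csv)
print(closes["2010-01-04"])
```

`csv.DictReader` uses the header row for the keys, so the code keeps working if the columns are reordered.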

0

The best way to get data from Yahoo Finance using Python 2 or Python 3 is to use the POST method.
You can easily verify this with a REST client such as Postman.

Open Postman, select the POST method, and send a request to the URL below. You will then see the data. Then just recreate the request in Python:

import requests

# Request the daily price history for GOOG, as in the Postman test
url = "https://query1.finance.yahoo.com/v7/finance/download/GOOG?period1=1519938930&period2=1522354530&interval=1d&events=history&crumb=.tLvYBkGDu3"

response = requests.post(url)
print(response.text)

I used to get the data with urllib2, but now it gives an authorization error. They probably filter requests by HTTP method, such as GET and POST.
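
The `period1` and `period2` values in the URL above look like Unix timestamps in seconds. That is an assumption based on their magnitude, not on any Yahoo documentation. Under that assumption, the URL could be assembled from dates like this in Python 3; `build_history_url` is a hypothetical helper and the crumb shown is only a placeholder.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

BASE = "https://query1.finance.yahoo.com/v7/finance/download/"


def build_history_url(symbol, start, end, crumb):
    """Assemble the download URL; start and end are timezone-aware datetimes."""
    params = {
        "period1": int(start.timestamp()),  # assumed: seconds since the epoch
        "period2": int(end.timestamp()),
        "interval": "1d",
        "events": "history",
        "crumb": crumb,
    }
    # urlencode escapes the values and joins them with "&", so no stray spaces
    return BASE + symbol + "?" + urlencode(params)


url = build_history_url(
    "GOOG",
    datetime(2018, 3, 1, tzinfo=timezone.utc),
    datetime(2018, 3, 29, tzinfo=timezone.utc),
    crumb="YOUR_CRUMB",  # placeholder, not a real value
)
print(url)
```

Building the query string with `urlencode` avoids hand-typed mistakes in a long URL.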

0

Source: https://habr.com/ru/post/1536708/

