Get HTML table in pandas Dataframe, not a list of dataframe objects

Question

I apologize if this question was answered elsewhere, but I could not find a satisfactory answer here or elsewhere.

I'm fairly new to Python and pandas, and I'm having some difficulty getting HTML data into a pandas DataFrame. The pandas documentation says that read_html() returns a list of DataFrame objects, so when I try to do some data manipulation to drop some rows, I get an error.

Here is my code for reading HTML:

import pandas as pd
df = pd.read_html('http://espn.go.com/nhl/statistics/player/_/stat/points/sort/points/year/2015/seasontype/2', header=1)

Then I try to clean it up:

df = df.dropna(axis=0, thresh=4)

And I get the following error:

Traceback (most recent call last):
  File "module4.py", line 25, in <module>
    df = df.dropna(axis=0, thresh=4)
AttributeError: 'list' object has no attribute 'dropna'

How do I get this data into an actual DataFrame, similar to what read_csv() returns?

python pandas html-parsing dataframe

schaefferda Jul 20 '16 at 16:55

Laurent S · Accepted Answer · 2016-07-20T17:21:55+0000

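From http://pandas.pydata.org/pandas-docs/version/0.17.1/io.html#io-read-html: "read_html returns a list of DataFrame objects, even if there is only a single table contained in the HTML content".

So df = df[0].dropna(axis=0, thresh=4) should do what you want: index the list to get the first DataFrame, then call dropna on it.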
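
To see the list behaviour without depending on the ESPN page, here is a minimal sketch that parses a small inline table. The HTML string, its column names, and the variable names `html` and `tables` are made up for illustration. It also assumes an HTML parser backend that pandas can use (for example lxml) is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the real page: one table whose third row
# has only a single non-empty cell.
html = """
<table>
  <tr><th>RK</th><th>PLAYER</th><th>TEAM</th><th>GP</th><th>PTS</th></tr>
  <tr><td>1</td><td>Player A</td><td>AAA</td><td>82</td><td>87</td></tr>
  <tr><td></td><td>Player C</td><td></td><td></td><td></td></tr>
  <tr><td>2</td><td>Player B</td><td>BBB</td><td>82</td><td>86</td></tr>
</table>
"""

# read_html always returns a list, even when the document holds one table.
tables = pd.read_html(StringIO(html), header=0)
print(type(tables), len(tables))

# Pick the table you want by position; the result is an ordinary DataFrame.
df = tables[0]

# Keep only rows with at least 4 non-missing values.
df = df.dropna(axis=0, thresh=4)
print(df)
```

If a page contains several tables, the same indexing applies: inspect len(tables) and choose the index of the table you need rather than assuming it is always 0.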
