Difference between index_col=None / 0 / False in pandas read_csv

I used the following read_csv command:

    In [20]:
    dataframe = pd.read_csv('D:/UserInterest/output/ENFP_0719/Bookmark.csv', index_col=None)
    dataframe.head()

    Out[20]:
       Unnamed: 0     timestamp                                                url  visits
    0           0  1.404028e+09  http://m.blog.naver.com/PostView.nhn?blogId=mi...       2
    1           1  1.404028e+09  http://m.facebook.com/l.php?u=http%3A%2F%2Fblo...       1
    2           2  1.404028e+09                market://details?id=com.kakao.story       1
    3           3  1.404028e+09        https://story-api.kakao.com/upgrade/install       4
    4           4  1.403889e+09  http://m.cafe.daum.net/WorldcupLove/Knj/173424...       1

The result shows an Unnamed: 0 column, and it is similar when I use index_col=False. But when I use index_col=0, the result is as follows:

    In [21]:
    dataframe = pd.read_csv('D:/UserInterest/output/ENFP_0719/Bookmark.csv', index_col=0)
    dataframe.head()

    Out[21]:
          timestamp                                                url  visits
    0  1.404028e+09  http://m.blog.naver.com/PostView.nhn?blogId=mi...       2
    1  1.404028e+09  http://m.facebook.com/l.php?u=http%3A%2F%2Fblo...       1
    2  1.404028e+09                market://details?id=com.kakao.story       1
    3  1.404028e+09        https://story-api.kakao.com/upgrade/install       4
    4  1.403889e+09  http://m.cafe.daum.net/WorldcupLove/Knj/173424...       1

This time the result does not show the Unnamed: 0 column. What is the difference between index_col=None, index_col=0 and index_col=False? I read the documentation, but I still do not understand.

1 answer

UPDATE

I think that as of version 0.16.1, passing True to index_col raises an error, to avoid this ambiguity.

ORIGINAL

A lot of people get confused by this. To specify your index column by its ordinal position, you have to pass that position as an int, in this case 0.

    In [3]:
    import io
    import pandas as pd
    t="""index,a,b
    0,hello,pandas"""
    pd.read_csv(io.StringIO(t))

    Out[3]:
       index      a       b
    0      0  hello  pandas

The default value is index_col=None as shown above.

If we set index_col=0 we explicitly declare that the first column should be considered as an index:

    In [4]:
    pd.read_csv(io.StringIO(t), index_col=0)

    Out[4]:
               a       b
    index
    0      hello  pandas

If we pass index_col=False we get the same result as None for this data:

    In [5]:
    pd.read_csv(io.StringIO(t), index_col=False)

    Out[5]:
       index      a       b
    0      0  hello  pandas
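On well-formed input like the one above, None and False behave the same. As far as I know, the two can differ on a malformed file where every data row ends with an extra delimiter: with the default, pandas may infer that the first field is the index, while index_col=False tells it not to. A minimal sketch, assuming a made-up `data` string (not from the question); depending on the pandas version, the second call may also emit a warning:

```python
import io
import pandas as pd

# Hypothetical malformed data: each data row ends with an extra comma,
# so the rows have one more field than the header.
data = "a,b,c\n4,apple,bat,\n8,orange,cow,\n"

# Default (index_col=None): the first field is inferred to be the index.
inferred = pd.read_csv(io.StringIO(data))

# index_col=False: do not use the first column as the index.
forced = pd.read_csv(io.StringIO(data), index_col=False)

print(inferred)
print(forced)
```

So index_col=False is mostly useful as an explicit "do not infer an index" switch; for ordinary files it is indistinguishable from None.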

If we now pass index_col=None we get the same behavior as when we did not pass this parameter at all:

    In [6]:
    pd.read_csv(io.StringIO(t), index_col=None)

    Out[6]:
       index      a       b
    0      0  hello  pandas
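To check the comparisons so far in one go, here is a small self-contained sketch using the same string `t`. The variable names (`default`, `with_none`, `with_false`, `with_zero`) are mine, chosen only for illustration:

```python
import io
import pandas as pd

t = """index,a,b
0,hello,pandas"""

default = pd.read_csv(io.StringIO(t))                      # parameter omitted
with_none = pd.read_csv(io.StringIO(t), index_col=None)    # explicit default
with_false = pd.read_csv(io.StringIO(t), index_col=False)  # no index column
with_zero = pd.read_csv(io.StringIO(t), index_col=0)       # first column as index

# None and False leave 'index' as an ordinary data column here.
print(default.equals(with_none), default.equals(with_false))
print(list(default.columns))

# 0 moves the first column out of the data and into the index.
print(list(with_zero.columns), with_zero.index.name)
```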

There was a bug where passing True was erroneously treated as index_col=1, because True was converted to 1:

    In [6]:
    pd.read_csv(io.StringIO(t), index_col=True)

    Out[6]:
           index       b
    a
    0      hello  pandas
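On recent pandas versions you should no longer see that output. A hedged sketch of what I would expect instead: I am assuming the exception raised for index_col=True is a ValueError, which may differ on older releases. If you really want the second column as the index, pass its position explicitly:

```python
import io
import pandas as pd

t = """index,a,b
0,hello,pandas"""

# Assumption: recent pandas rejects index_col=True with a ValueError.
try:
    pd.read_csv(io.StringIO(t), index_col=True)
    outcome = "no error"
except ValueError as exc:
    outcome = f"ValueError: {exc}"

print(outcome)

# Unambiguous alternative: select the second column by position.
by_position = pd.read_csv(io.StringIO(t), index_col=1)
print(by_position)
```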

EDIT

For the case where the index column has a blank header, as in your file, you get:

    In [7]:
    import io
    import pandas as pd
    t=""",a,b
    0,hello,pandas"""
    pd.read_csv(io.StringIO(t))

    Out[7]:
       Unnamed: 0      a       b
    0           0  hello  pandas

    In [8]:
    pd.read_csv(io.StringIO(t), index_col=0)

    Out[8]:
           a       b
    0  hello  pandas

    In [9]:
    pd.read_csv(io.StringIO(t), index_col=False)

    Out[9]:
       Unnamed: 0      a       b
    0           0  hello  pandas

    In [10]:
    pd.read_csv(io.StringIO(t), index_col=None)

    Out[10]:
       Unnamed: 0      a       b
    0           0  hello  pandas
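One common way a blank-header first column like this arises is a DataFrame written with to_csv using its default index=True. I cannot tell whether that is how the file in the question was produced, so treat this as a sketch of the general round trip; the frame `df` and the variable names are made up:

```python
import io
import pandas as pd

# Hypothetical frame standing in for the original data.
df = pd.DataFrame({"a": ["hello"], "b": ["pandas"]})

with_index = df.to_csv()                # index written, header field left blank
without_index = df.to_csv(index=False)  # index not written at all

# Reading the first form back with defaults yields an 'Unnamed: 0' column.
reread_default = pd.read_csv(io.StringIO(with_index))

# Either consume that column as the index when reading...
reread_indexed = pd.read_csv(io.StringIO(with_index), index_col=0)

# ...or avoid writing it in the first place.
reread_clean = pd.read_csv(io.StringIO(without_index))

print(list(reread_default.columns))
print(list(reread_indexed.columns))
print(list(reread_clean.columns))
```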

Source: https://habr.com/ru/post/985933/

