I'm having trouble using pandas to open tab-delimited data without headers.
My test data (the file actually contains 200 rows; I am showing the first 10):
Tag19184 CTAAC hffef 1 a 36 - chr1 10006 0 36M 36
Tag19184 CTAAC hffef 1 a 36 - chr1 10012 0 36M 36
Tag19184 CTAAC hffef 1 a 36 - chr1 10018 0 36M 36
Tag19184 CTAAC hffef 1 a 36 - chr1 10024 0 36M 36
Tag19184 CTAAC hffef 1 a 36 - chr1 10030 0 36M 36
Tag19184 CTAAC hffef 1 a 36 - chr1 10036 0 36M 36
Tag19184 CTAAC hffef 1 a 36 - chr1 10042 0 36M 36
Tag20198 CTAAC hffef 1 a 36 - chr1 10048 0 36M 36
Tag20198 CTAAC hffef 1 a 36 - chr1 10054 0 36M 36
Tag45093 CTAAC hffef 1 a 36 - chr1 10060 0 36M 36
My code is:
import pandas as pd
df = pd.read_csv('in_test.txt',sep='\t',header=None)
print df
However, I get the following output, which I think I can't use for further data processing (?):
<class 'pandas.core.frame.DataFrame'>
Int64Index: 200 entries, 0 to 199
Data columns:
X.1     200  non-null values
X.2     200  non-null values
X.3     200  non-null values
X.4     200  non-null values
X.5     200  non-null values
X.6     200  non-null values
X.7     200  non-null values
X.8     200  non-null values
X.9     200  non-null values
X.10    200  non-null values
X.11    200  non-null values
X.12    200  non-null values
dtypes: int64(5), object(7)
here suggests that print df should just provide me with the appropriate data frame. What am I doing wrong?
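For reference, the summary printed above is how older pandas versions display a frame that is too large to show in full; it does not by itself mean the data failed to load. Here is a minimal sketch of how one might check the contents. It assumes a recent pandas and Python 3 (hence `print(...)` as a function), and uses a two-row inline sample (the name `sample` is made up for illustration) in place of in_test.txt:

```python
import io
import pandas as pd

# Two tab-separated rows standing in for in_test.txt (assumption for illustration).
sample = (
    "Tag19184\tCTAAC\thffef\t1\ta\t36\t-\tchr1\t10006\t0\t36M\t36\n"
    "Tag19184\tCTAAC\thffef\t1\ta\t36\t-\tchr1\t10012\t0\t36M\t36\n"
)

# header=None tells pandas the first line is data, not column names.
df = pd.read_csv(io.StringIO(sample), sep="\t", header=None)

print(df.shape)        # (number of rows, number of columns)
print(df.head())       # first rows rendered as a table
print(df.to_string())  # whole frame as text, independent of display limits
print(df.iloc[0, 8])   # a single cell: first row, ninth column
```

If `df.shape` and `df.head()` look right, the frame should be usable for further processing even when a plain print shows only a summary.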