Setting the pandas.read_table field and record separators

I am trying to read a file that uses two colons in a row (::) to separate fields and a pipe (|) to separate records. For example, the data file test.txt may look like this:

testcol1::testcol2|testdata1::testdata2

And my code is as follows:

pd.read_table('test.txt', sep='::', lineterminator='|')

This triggers the following warning:

C:\Users\jordan\AppData\Local\Enthought\Canopy\User\lib\site-packages\ipykernel\__main__.py:4: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators; you can avoid this warning by specifying engine='python'.

And the following "parsed" data:

testcol1   testcol2|testdata1   testdata2

... with three columns, one header row, and zero rows of data. If I add the engine='c' kwarg, I get the following error:

ValueError: the 'c' engine does not support regex separators

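So it seems that the Python engine is needed for a multi-character separator such as ::, but the Python engine ignores the lineterminator kwarg, which pandas supports only with the c engine. Is there a way to use both separators?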

You can use the c engine, which supports lineterminator, to split the records, and then split the fields with str.split:

In [20]:
import pandas as pd
import io
t="""testcol1::testcol2|testdata1::testdata2"""
df = pd.read_csv(io.StringIO(t), lineterminator=r'|')
df

Out[20]:
     testcol1::testcol2
0  testdata1::testdata2

In [37]:
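# Split the single parsed column into one column per field
df1 = df['testcol1::testcol2'].str.split('::', expand=True)
# Recover the column names by splitting the original header the same way
df1.columns = list(df.columns.str.split('::', expand=True)[0])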
df1

Out[37]:
    testcol1   testcol2
0  testdata1  testdata2
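
A possible alternative, sketched here under some assumptions: turn the record separator into a newline before parsing, so that the Python engine can handle the :: separator without needing lineterminator. The helper name read_double_colon_records is hypothetical and not part of pandas, and the sketch assumes that neither separator, nor a newline, appears inside a field value.

```python
import io
import re

import pandas as pd


def read_double_colon_records(text, field_sep="::", record_sep="|"):
    """Parse text with custom field and record separators.

    Hypothetical helper, not part of pandas. Assumes that neither
    separator, nor a newline, occurs inside a field value.
    """
    # Replace each record separator with a newline so the parser's
    # default line handling can split the records.
    normalized = text.strip().replace(record_sep, "\n")
    # A multi-character sep is treated as a regular expression, which
    # only the python engine supports; escape it so it matches literally.
    return pd.read_csv(
        io.StringIO(normalized),
        sep=re.escape(field_sep),
        engine="python",
    )


sample = "testcol1::testcol2|testdata1::testdata2"
df2 = read_double_colon_records(sample)
print(df2)
```

For a real file, the text could be read first with open('test.txt').read() and passed to the helper. This loads the whole file into memory, so it may not suit very large inputs.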

Source: https://habr.com/ru/post/1624017/

