Convert pandas dataframe to utf8

How do I convert a pandas DataFrame to Unicode?

 messages = pandas.read_csv('data/SMSSpamCollection', sep='\t', quoting=csv.QUOTE_NONE, names=["label", "message"])

 def split_into_tokens(message):
     message = unicode(message, 'utf8')  # convert bytes into proper unicode
     return TextBlob(message).words

 messages.head().apply(split_into_tokens(messages))

It raises an error:

 Traceback (most recent call last):
   File "minor.py", line 46, in <module>
     messages.head().apply(split_into_tokens(messages))
   File "minor.py", line 42, in split_into_tokens
     message = unicode(message, 'utf8') # convert bytes into proper unicode
 TypeError: coercing to Unicode: need string or buffer, DataFrame found
2 answers

Change the line

 messages.head().apply(split_into_tokens(messages)) 

to

 messages.head().apply(split_into_tokens) 

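When you use `apply` with a function, pass the function object itself and do not call it. Writing `split_into_tokens(messages)` calls the function immediately with the whole DataFrame as its argument, and `unicode()` cannot convert a DataFrame, which is exactly what the TypeError reports. Note also that `DataFrame.apply` hands the function a whole column (a Series) at a time, so to tokenize individual messages, apply the function to the `message` column: `messages.message.head().apply(split_into_tokens)`.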
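To make the difference concrete, here is a minimal sketch in Python 3. It rests on several assumptions: the two-row DataFrame is a hypothetical stand-in for the `SMSSpamCollection` file, `str.split()` replaces `TextBlob(message).words` so that no extra package is needed, and `bytes.decode` replaces the Python 2 `unicode()` call from the question.

```python
import pandas as pd

# Hypothetical stand-in for data/SMSSpamCollection: same column names, two rows.
messages = pd.DataFrame({
    "label": ["ham", "spam"],
    "message": [b"Ok lar... Joking wif u oni", b"Free entry in 2 a wkly comp"],
})

def split_into_tokens(message):
    # In Python 3, str is already Unicode, so decode only raw bytes.
    if isinstance(message, bytes):
        message = message.decode("utf8")
    # Assumption: str.split() stands in for TextBlob(message).words.
    return message.split()

# Pass the function itself (no parentheses) and apply it to the message
# column, so that it is called once per cell rather than once per column.
tokens = messages["message"].head().apply(split_into_tokens)
print(tokens.tolist())

# If only decoding is needed, the vectorized string accessor avoids apply.
decoded = messages["message"].str.decode("utf8")
print(decoded.tolist())
```

With real data read by `pandas.read_csv` under Python 3, the `message` column normally already holds `str` values, so the decoding branch would simply be skipped.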


Source: https://habr.com/ru/post/1264714/

