I have a .xlsx file that I open with this code:
import pandas as pd
df = pd.read_excel('file.xlsx')
df['Description'].head()
and I get the following result, which looks fine:
ID | Description
:----- | :-----------------------------
0 | Some Description with no hash
1 | Text with #one hash
2 | Text with #two #hashes
Now I want to create a new column that keeps only the words starting with #, like this:
ID | Description | Only_Hash
:----- | :----------------------------- | :-----------------
0 | Some Description with no hash | NaN
1 | Text with #one hash | #one
2 | Text with #two #hashes | #two #hashes
I managed to count the rows whose description contains a #:
descriptionWithHash = df['Description'].str.contains('#').sum()
but now I want to create a column as described above. What is the easiest way to do this?
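One possible approach, as a minimal sketch rather than a definitive answer: `Series.str.findall` can collect every token that starts with `#`, and `Series.str.join` can turn each list of matches back into a single string. The sample DataFrame below is an assumption standing in for `file.xlsx`, the pattern `#\w+` assumes a hash word is `#` followed by letters, digits, or underscores, and joining the matches with a space is just one choice of output format.

```python
import pandas as pd

# Assumed sample data mirroring the table in the question (stand-in for file.xlsx).
df = pd.DataFrame({
    "Description": [
        "Some Description with no hash",
        "Text with #one hash",
        "Text with #two #hashes",
    ]
})

# Collect every token that starts with '#'; each row becomes a list of matches.
hash_words = df["Description"].str.findall(r"#\w+")

# Join each list with spaces, and keep NaN where a row had no matches at all.
df["Only_Hash"] = hash_words.str.join(" ").where(hash_words.str.len() > 0)

print(df)
```

If a list per row is more convenient than a joined string, `hash_words` could be assigned to the new column directly instead.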
Thanks!
PS: the tables are supposed to render as tables in the question, but I can't figure out why they don't!