Removing space in python dataframe

Question

I am getting an error in my code when I try to build a dataframe from columns read out of a CSV file. I pull two columns from the file: CompanyName and QualityIssue. There are three types of quality issue: Equipment Quality, User, and Neither. I run into problems when I try to access df.Equipment Quality, which obviously does not work because of the space. I want to take Equipment Quality from the source file and replace the space with an underscore.

input:

 Top Calling Customers, Equipment Quality, User, Neither,
 Customer 3,            2,                 2,    0,
 Customer 1,            0,                 2,    1,
 Customer 2,            0,                 1,    0,
 Customer 4,            0,                 1,    0,

Here is my code:

 import numpy as np
 import pandas as pd
 import pandas.util.testing as tm; tm.N = 3

 # Get the data.
 data = pd.DataFrame.from_csv('MYDATA.csv')

 # Group the data by calling CompanyName and QualityIssue columns.
 byqualityissue = data.groupby(["CompanyName", "QualityIssue"]).size()

 # Make a pandas dataframe of the grouped data.
 df = pd.DataFrame(byqualityissue)

 # Change the formatting of the data to match what I want SpiderPlot to read.
 formatted = df.unstack(level=-1)[0]

 # Replace NaN values with zero.
 formatted[np.isnan(formatted)] = 0

 includingtotals = pd.concat([formatted, pd.DataFrame(formatted.sum(axis=1), columns=['Total'])], axis=1)
 sortedtotal = includingtotals.sort_index(by=['Total'], ascending=[False])
 sortedtotal.to_csv('byqualityissue.csv')

This seems to be a frequently asked question, and I tried many solutions, but they didn't seem to work. Here is what I tried:

 with open('byqualityissue.csv', 'r') as f:
     reader = csv.reader(f, delimiter=',', quoting=csv.QUOTE_NONE)
     return [[x.strip() for x in row] for row in reader]
 sentence.replace(" ", "_")

and

 sortedtotal['QualityIssue'] = sortedtotal['QualityIssue'].map(lambda x: x.rstrip(' '))

And the one I considered most promising, taken from http://pandas.pydata.org/pandas-docs/stable/text.html :

 formatted.columns = formatted.columns.str.strip().str.replace(' ', '_')

but I got this error: AttributeError: 'Index' object has no attribute 'str'

Thanks for your help in advance!

+6

python pandas whitespace dataframe strip

jenryb Jun 10 '15 at 17:30

2 answers

If I understand your question correctly, this should work (try it with inplace=False first to see how the result looks, if you want to be careful):

 sortedtotal.rename(columns=lambda x: x.replace(" ", "_"), inplace=True)

And if the column names have surrounding whitespace, for example " This example ", use:

 sortedtotal.rename(columns=lambda x: x.strip().replace(" ", "_"), inplace=True)

which strips the leading/trailing whitespace, then converts the internal spaces to "_".
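To see what the rename does before committing to it, here is a minimal sketch. The small frame below is a hypothetical stand-in for sortedtotal (the numbers are copied from the sample input in the question, and the extra spaces around one column name are an assumption added for illustration). Leaving inplace at its default of False returns a renamed copy, so the original stays untouched:

```python
import pandas as pd

# Hypothetical stand-in for the asker's `sortedtotal`; values come from the
# sample input, and the padded column name is added for illustration.
sortedtotal = pd.DataFrame(
    {" Equipment Quality ": [2, 0, 0, 0], "User": [2, 2, 1, 1], "Neither": [0, 1, 0, 0]},
    index=["Customer 3", "Customer 1", "Customer 2", "Customer 4"],
)

# inplace=False (the default) returns a renamed copy for inspection.
preview = sortedtotal.rename(columns=lambda x: x.strip().replace(" ", "_"))
print(list(preview.columns))

# With the space gone, attribute-style access works on the renamed copy.
print(preview.Equipment_Quality.tolist())

# The original frame still carries the old column names.
print(list(sortedtotal.columns))
```

Once the preview looks right, the same call with inplace=True (or assigning the result back to sortedtotal) makes the change permanent.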

+3

JBWhitmore Nov 22 '15 at 1:13

Alexander · Accepted Answer · 2015-06-10T19:35:14+0000

Try:

 formatted.columns = [x.strip().replace(' ', '_') for x in formatted.columns]
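A small self-contained sketch of the list comprehension, for anyone who wants to try it out. The frame here is a hypothetical stand-in for formatted, with made-up padding around one column name. The comparison against the .str version assumes a pandas release recent enough that Index has the .str accessor, which the error in the question suggests the asker's version did not:

```python
import pandas as pd

# Hypothetical stand-in for the asker's `formatted` frame.
formatted = pd.DataFrame(
    [[2, 2, 0], [0, 2, 1]],
    index=["Customer 3", "Customer 1"],
    columns=["Equipment Quality", " User ", "Neither"],
)

# Plain Python string methods on each label; no .str accessor needed.
cleaned = [x.strip().replace(" ", "_") for x in formatted.columns]

# Equivalent vectorised form, assuming a pandas version where Index.str exists.
via_str = formatted.columns.str.strip().str.replace(" ", "_")

formatted.columns = cleaned
print(formatted.columns.tolist())
print(formatted.Equipment_Quality.tolist())
```

The list comprehension only relies on built-in str methods, so it should behave the same regardless of the pandas version, as long as every column label is a string.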
