I am trying to sort the data in a CSV file with the pandas sort_values function, using the code below. The source file has 229 lines, but the sorted output has 245 lines: some field values are split onto the next line, and some of the lines are blank.
import pandas as pd

sample = pd.read_csv("sample.csv", encoding='latin-1', skipinitialspace=True)
sample_sorted = sample.sort_values(by=['rating'])
sample_sorted.to_csv("sample_sorted.csv")
I think this happens because some cells contain line breaks, i.e. the text in those cells was entered on more than one line. For example, below is the content of a single cell in the source file. After sorting, the second line of this cell is printed on a new line in the output, and three blank lines appear between the first and second lines.
"Side effects are way to extreme.
E-mail me if you have experianced the same things."
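To check whether the extra lines are only line breaks inside quoted cells, a rough sketch like the following could compare the number of physical lines with the number of CSV records. It uses Python's standard csv module rather than pandas, and the data, the column names other than rating, and the helper name count_lines_and_records are all made up for illustration:

```python
import csv
import io

# Hypothetical sample data: the last record has a line break inside a quoted field.
raw = (
    'rating,comment\n'
    '3,"Works fine."\n'
    '1,"Side effects are way to extreme.\n'
    'E-mail me if you have experianced the same things."\n'
)

def count_lines_and_records(text):
    """Return (physical line count, CSV record count) for CSV text."""
    physical_lines = len(text.splitlines())
    # newline="" keeps line breaks intact so the csv module can treat
    # a newline inside a quoted field as part of that field.
    records = list(csv.reader(io.StringIO(text, newline="")))
    return physical_lines, len(records)

lines, records = count_lines_and_records(raw)
print(lines, records)  # prints: 4 3
```

If the record counts of the source and sorted files match while the physical line counts differ, the difference would come from line breaks inside cells, not from rows being added or lost.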
Any suggestions? Thanks!