Duplicate row.names error reading table. row.names = NULL shifts columns

Question

This is similar to "read.csv row.names" and https://stackoverflow.com/questions/12425599/duplicated-row-names, but I do not see any answers there that help.

Problem: I am trying to read a file that contains duplicate numbers in the first column, but the column headers are shifted when I set row.names = NULL.

I am trying to read the following file into R:

 TripId       VID  TspVID VWT VCLS Week
 201110041426 2226 33889  1   0    41
 201110041501 2226 33889  1   0    41
 201110041510 2226 33889  1   0    41
 201110041557 2226 33889  1   0    41

(This is a small excerpt from a CSV file exported from Excel, with many thousands of rows and ~200 columns. The first row has as many entries as the rest. The first column has duplicates. The columns do not line up with the labels in this view, but they do in the CSV file.)

The command

 > lm.table <- read.table(file= file.in, sep=",", header=TRUE)
 Error in read.table(file = file.in, sep = ",", header = TRUE) :
   duplicate 'row.names' are not allowed

does not work. Using the first column as row.names implies that the first row (the header) has one fewer value than the others, which is not the case. I do not want the first column to be used as row.names.
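One way to check whether R really sees a shorter header is count.fields(), which reports the number of fields on each line. This is only a minimal sketch: csv_text is a hypothetical three-column sample whose data rows end in a trailing comma, an assumption made for illustration (the post does not say this is the cause).

```r
# Hypothetical sample (not the poster's file): the data rows end in a
# trailing comma, so they carry one more field than the header.
csv_text <- "TripId,VID,Week
201110041426,2226,41,
201110041501,2226,41,"

con <- textConnection(csv_text)
n_fields <- count.fields(con, sep = ",")  # fields per line, header first
close(con)

print(n_fields)

# read.table() takes the first column as row names when the header has
# exactly one fewer field than the data rows.
header_is_short <- n_fields[1] == max(n_fields[-1]) - 1
print(header_is_short)
```

With a real file, the path can be passed directly, for example count.fields(file.in, sep = ",").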

I tried setting row.names = NULL

 > lm.table <- read.table(file= file.in, sep=",", header=TRUE, row.names=NULL)

but the column names are shifted

 > head(lm.table)
      row.names TripId   VID TspVID VWT VCLS       Week     Date TimeStart TimeEnd     Lat1
 1 201110010006   2226 33889      1   0   40 2011/09/30 17:06:37  17:25:16 47.5168 -122.209
 2 201110010028   2226 33889      1   0   40 2011/09/30 17:28:45  17:43:14 47.5517 -122.058
 3 201110010000   2231 45781      1   0   40 2011/09/30 17:00:00  18:02:30 32.9010 -117.193
 4 201110011407   2231 45781      1   0   40 2011/10/01 07:07:57  07:48:17 32.7044 -117.004

Notice that the first column is now named "row.names", and the entire row of column names is shifted one position to the right.

Here is the tail end of the head(lm.table) output. The column labels have been shifted over onto an otherwise undefined column (I think this also shows that the number of column labels equals the number of columns, which is also true on inspection).

      FVavR FVstdR FIdlR
 1 3.959140      2    NA
 2 5.285770     20    NA
 3 4.274140     26    NA

Any idea why the columns are shifted, and how I can avoid the shift and have row.names be just ascending numbers?

r

Chris Wilson Nov 05

4 answers

adrianoesch · Answer 1 · 2014-03-14 15:15

Had the same problem. I just added these lines:

 colnames(rec) <- c(colnames(rec)[-1],"x")
 rec$x <- NULL
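For illustration, here is a small sketch of what those two lines do to a shifted data frame. The sample text is hypothetical: it assumes the data rows carry a trailing comma, which reproduces the shift but is not confirmed as the cause in the question.

```r
# Hypothetical sample: data rows end in a trailing comma, so each has
# one more field than the header (an assumption made to reproduce the shift).
csv_text <- "TripId,VID,Week
201110041426,2226,41,
201110041426,2226,41,"

# row.names = NULL avoids the duplicate row.names error, but the column
# names come back as: row.names, TripId, VID, Week (shifted right by one).
rec <- read.table(text = csv_text, sep = ",", header = TRUE, row.names = NULL)
shifted_names <- colnames(rec)

colnames(rec) <- c(colnames(rec)[-1], "x")  # move every name one place left
rec$x <- NULL                               # drop the empty trailing column

print(shifted_names)
print(rec)
```

This relabels the columns after the fact. It only helps when the extra column really is empty, so it is worth inspecting the last column before dropping it.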

Ryan · Answer 2 · 2012-11-05 20:31

I used the following code:

 lm.table <- read.table("file name", header=TRUE, row.names=NULL)

This added a column to the left with numbered row names, but I did not find that the column names were shifted. Could it be that the column names still matched the correct columns, but the output of R looked as if the names were shifted?

Tammy · Answer 3 · 2015-08-24 10:11

I had the same problem. I combined the date with timestamps and now I can read from csv.

You can generate the row number as the first column (say using python) in your csv and then read it again.

Kemin Zhou · Answer 4 · 2016-04-16 00:49

My problem was with the field separator for a tab-delimited file:

If I do not specify a field separator:

 > condensed <- read.table("condense_report.tab", header=T)
 Error in read.table("condense_report.tab", header = T) :
   duplicate 'row.names' are not allowed

If I add sep="\t":

 condensed <- read.table("condense_report.tab", header=TRUE, sep="\t")

Then there is no error message.

Here are the contents of my file (^I is the TAB character and $ is the end of the line):

 Example^Imethod^IGenome^Imean_frac^Istd_frac^Imean_dep^Imean_clsz^Inumrep$
 asterix_potion^Imothur^IEnterococcus faecalis^I0.32290000^I0.021755985650701942^I3293.5000^I3309.700^I[...]
 [...].28010000^I0.021539851624928375^I2869.5000^I2880.7500^I4$

The problem is the species column (the one holding values such as "Enterococcus faecalis"). Its values contain spaces, and by default R treats any whitespace, spaces as well as tabs, as a delimiter. So if the sep parameter is not specified, R sees one more column in the data rows than in the header. This is the root of the problem.
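The effect can be reproduced with a small sketch. The text below is a hypothetical, shortened stand-in for the file above (the column names and values are illustrative only); the only feature that matters is the space inside the third field.

```r
# Hypothetical tab-delimited sample modelled on the file above; the
# genome field contains a space. "\t" inside the strings is a real tab.
tab_text <- paste(
  "sample\tmethod\tgenome\tmean_frac",
  "asterix_potion\tmothur\tEnterococcus faecalis\t0.3229",
  "asterix_potion\tmothur\tEnterococcus faecalis\t0.2801",
  sep = "\n"
)

# Default sep = "" splits on any whitespace, so each data row has one
# more field than the header and the first column becomes row names.
err <- tryCatch(
  read.table(text = tab_text, header = TRUE),
  error = function(e) conditionMessage(e)
)
print(err)

# With sep = "\t" the space inside the genome field is kept.
condensed <- read.table(text = tab_text, header = TRUE, sep = "\t")
print(dim(condensed))
```

Stating the separator explicitly is usually the safer choice for any delimited file whose fields may contain spaces.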
