R - how to prevent row.names when selecting rows from a data frame

Suppose I create a dataframe (just to make it simple):

testframe <- data.frame( a = c(1,2,3,4), b = c(5,6,7,8)) 

So I have two variables (columns) and four cases (rows).

If I select a range of rows that starts with the first row, I get a subset of the data frame, for example:

 testframe2 <- testframe[1:2,] #selecting the first two rows 

But if I do the same with a range that does not start with the first row, I get what looks like another column containing the row numbers of the original frame.

 testframe3 <- testframe[3:4,] #selecting the last two rows 

leads to:

   a b
 3 3 7
 4 4 8

What can I do to prevent the new row.names variable? I know that I can delete it later, but perhaps it can be avoided from the very beginning.

Thank you for your help!

1 answer

Subsetting copies the row.names from the source data frame. Just reset the row names using rownames<-, like this:

 rownames( testframe3 ) <- seq_len( nrow( testframe3 ) )
 #   a b
 # 1 3 7
 # 2 4 8
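As a quick check, here is a minimal base R sketch showing that the labels are an attribute of the data frame rather than an extra column, so the column count does not change when you subset:

```r
testframe <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))
testframe3 <- testframe[3:4, ]

ncol(testframe3)      # still 2 columns: a and b
names(testframe3)     # "a" "b"
rownames(testframe3)  # "3" "4", carried over from the original frame

# Renumber the rows starting from 1
rownames(testframe3) <- seq_len(nrow(testframe3))
rownames(testframe3)  # "1" "2"
```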

When you do this programmatically, seq_len( nrow( x ) ) is preferable to 1:nrow( x ), because it behaves correctly in the edge case where you select a data.frame with zero rows:

 df <- testframe[0,]
 # [1] a b
 # <0 rows> (or 0-length row.names)
 rownames(df) <- seq_len( nrow( df ) )
 # No error thrown - returns a length 0 vector of rownames
 # But...
 rownames(df) <- 1:nrow( df )
 # Error in `row.names<-.data.frame`(`*tmp*`, value = value) :
 #   invalid 'row.names' length
 # Because...
 1:nrow( df )
 # [1] 1 0

Alternatively, you can do this in one step by wrapping the subset in a call to data.frame, but it is inefficient if you want to get the number of rows programmatically (because you have to compute the subset twice), so I recommend the rownames<- method instead:

 data.frame( testframe[3:4,] , row.names = 1:2 )
 #   a b
 # 1 3 7
 # 2 4 8
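
Another base R option, not part of the original answer, is to assign NULL to the row names. This resets them to the automatic 1..n sequence and also works for a zero-row data frame. A minimal sketch (the variable name empty is only illustrative):

```r
testframe <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))

testframe3 <- testframe[3:4, ]
rownames(testframe3) <- NULL   # reset to automatic row names
rownames(testframe3)           # "1" "2"

empty <- testframe[0, ]
rownames(empty) <- NULL        # no error for a zero-row data frame
nrow(empty)                    # 0
```

This avoids computing the number of rows at all, so the 1:nrow( x ) pitfall does not arise.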

Source: https://habr.com/ru/post/956640/

