Repeat rows data.frame N times

Question

I have the following data frame:

 data.frame(a = c(1,2,3), b = c(1,2,3))
   a b
 1 1 1
 2 2 2
 3 3 3

and I want to turn this into:

   a b
 1 1 1
 2 2 2
 3 3 3
 4 1 1
 5 2 2
 6 3 3
 7 1 1
 8 2 2
 9 3 3

or, more generally, repeat it N times. Is there a simple function to do this in R? Thanks!

+61

r dataframe

Michael Jan 6 2018-12-12T00:

7 answers

For data.frame objects, this solution is several times faster than those from @mdsumner and @wojciech-sobala.

 d[rep(seq_len(nrow(d)), n), ]

For data.table objects, @mdsumner's approach is slightly faster than applying the above after converting to a data.frame. For large n, this may reverse.

(Figure: microbenchmark results.)
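To make the indexing one-liner concrete, here is a minimal sketch, assuming the small data frame from the question and an illustrative repeat count n = 3 (both chosen only for demonstration). rep(seq_len(nrow(d)), n) builds a row index that cycles through the rows n times, and subsetting with that index returns the rows in that order.

```r
# Example data from the question; n is an illustrative repeat count.
d <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3))
n <- 3

# Row index that cycles through all rows n times: 1 2 3 1 2 3 1 2 3
idx <- rep(seq_len(nrow(d)), n)

# Subset by the repeated index; the trailing comma selects all columns.
out <- d[idx, ]

# Optional: reset the row names so they run from 1 to nrow(out).
rownames(out) <- NULL
```

Resetting the row names is only cosmetic; skip it if the suffixed names produced by the indexing are useful for tracing rows back to their origin.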

Full code:

 packages <- c("data.table", "ggplot2", "RUnit", "microbenchmark")
 lapply(packages, require, character.only = TRUE)

 Repeat1 <- function(d, n) {
   return(do.call("rbind", replicate(n, d, simplify = FALSE)))
 }

 Repeat2 <- function(d, n) {
   return(Reduce(rbind, list(d)[rep(1L, times = n)]))
 }

 Repeat3 <- function(d, n) {
   if ("data.table" %in% class(d)) return(d[rep(seq_len(nrow(d)), n)])
   return(d[rep(seq_len(nrow(d)), n), ])
 }

 Repeat3.dt.convert <- function(d, n) {
   if ("data.table" %in% class(d)) d <- as.data.frame(d)
   return(d[rep(seq_len(nrow(d)), n), ])
 }

 # Try with data.frames
 mtcars1 <- Repeat1(mtcars, 3)
 mtcars2 <- Repeat2(mtcars, 3)
 mtcars3 <- Repeat3(mtcars, 3)
 checkEquals(mtcars1, mtcars2)
 # Only difference is row.names having ".k" suffix instead of "k" from 1 & 2
 checkEquals(mtcars1, mtcars3)

 # Works with data.tables too
 mtcars.dt <- data.table(mtcars)
 mtcars.dt1 <- Repeat1(mtcars.dt, 3)
 mtcars.dt2 <- Repeat2(mtcars.dt, 3)
 mtcars.dt3 <- Repeat3(mtcars.dt, 3)
 # No row.names mismatch since data.tables don't have row.names
 checkEquals(mtcars.dt1, mtcars.dt2)
 checkEquals(mtcars.dt1, mtcars.dt3)

 # Time test
 res <- microbenchmark(Repeat1(mtcars, 10),
                       Repeat2(mtcars, 10),
                       Repeat3(mtcars, 10),
                       Repeat1(mtcars.dt, 10),
                       Repeat2(mtcars.dt, 10),
                       Repeat3(mtcars.dt, 10),
                       Repeat3.dt.convert(mtcars.dt, 10))
 print(res)
 ggsave("repeat_microbenchmark.png", autoplot(res))

+27

Max Ghenis Feb 13 '15 at 18:55

The dplyr package contains the bind_rows() function, which directly combines all the data frames in a list, so there is no need to use do.call() together with rbind():

 df <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3))
 library(dplyr)
 bind_rows(replicate(3, df, simplify = FALSE))

For a large number of repetitions, bind_rows() is also much faster than rbind():

 library(microbenchmark)
 microbenchmark(rbind = do.call("rbind", replicate(1000, df, simplify = FALSE)),
                bind_rows = bind_rows(replicate(1000, df, simplify = FALSE)),
                times = 20)
 ## Unit: milliseconds
 ##       expr       min        lq      mean   median        uq       max neval cld
 ##      rbind 31.796100 33.017077 35.436753 34.32861 36.773017 43.556112    20   b
 ##  bind_rows  1.765956  1.818087  1.881697  1.86207  1.898839  2.321621    20  a

+13

Stibu Aug 11 '17 at 15:30

 d <- data.frame(a = c(1,2,3), b = c(1,2,3))
 r <- Reduce(rbind, list(d)[rep(1L, times = 3L)])

+5

Wojciech Sobala Jan 06 2018-12-12T00:

Just use simple indexing with the rep function.

 mydata <- data.frame(a = c(1,2,3), b = c(1,2,3))  # create the data frame
 n <- 10                                           # number of times to repeat the rows
 mydata <- mydata[rep(rownames(mydata), n), ]      # index with rep() to repeat the rows
 rownames(mydata) <- 1:NROW(mydata)                # renumber the rows for a cleaner look

+3

learner Apr 01 '16 at 11:22

Even simpler:

 library(data.table)
 my_data <- data.frame(a = c(1,2,3), b = c(1,2,3))
 rbindlist(replicate(n = 3, expr = my_data, simplify = FALSE))

+2

Arturo Sbr Feb 20 '19 at 21:16

With the data.table package you can use the special symbol .I along with rep:

 library(data.table)
 df <- data.frame(a = c(1,2,3), b = c(1,2,3))
 dt <- as.data.table(df)
 n <- 3
 dt[rep(dt[, .I], n)]

which gives:

    a b
 1: 1 1
 2: 2 2
 3: 3 3
 4: 1 1
 5: 2 2
 6: 3 3
 7: 1 1
 8: 2 2
 9: 3 3

0

Jaap Sep 13 '19 at 8:10

mdsumner · Accepted Answer · 2012-01-06 05:23

EDIT: Updated to the best modern R answer.

You can use replicate() and then rbind the results back together. The row names are automatically altered to run from 1 to the total number of rows.

 d <- data.frame(a = c(1,2,3), b = c(1,2,3))
 n <- 3
 do.call("rbind", replicate(n, d, simplify = FALSE))

A more traditional way is to use indexing, but here the row names are altered less neatly (though more informatively):

  d[rep(seq_len(nrow(d)), n), ]
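As a rough illustration of the row-name difference between the two base R approaches, the sketch below runs both on the data frame from the question with n = 3. The names by_rbind and by_index are hypothetical, introduced only for this comparison.

```r
# Example data from the question; n is an illustrative repeat count.
d <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3))
n <- 3

# Approach 1: replicate the data frame n times and bind the copies.
by_rbind <- do.call("rbind", replicate(n, d, simplify = FALSE))

# Approach 2: index the rows with a repeated row index.
by_index <- d[rep(seq_len(nrow(d)), n), ]

# rbind renumbers the rows sequentially.
rownames(by_rbind)
# "1" "2" "3" "4" "5" "6" "7" "8" "9"

# Indexing keeps the original row name and appends a suffix per repeat.
rownames(by_index)
# "1" "2" "3" "1.1" "2.1" "3.1" "1.2" "2.2" "3.2"
```

The cell values are the same either way; only the row names differ, and the suffixed names show which original row each copy came from.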

Here are some improvements on the above, the first two using purrr functional programming. Idiomatic purrr:

 purrr::map_dfr(seq_len(3), ~d)

and less idiomatic purrr (identical result, albeit more awkward):

 purrr::map_dfr(seq_len(3), function(x) d)

and finally, via indexing rather than applying over a list, using dplyr:

 library(dplyr)
 d %>% slice(rep(row_number(), 3))
