Reading the last n lines from a huge text file

I tried something like this

 file_in <- file("myfile.log", "r")
 x <- readLines(file_in, n = -100)

but I'm still waiting ...

Any help would be greatly appreciated.

I would use scan with its skip argument for this, if you know how many lines the log has.
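A minimal sketch of that idea, assuming the line count is known in advance. The temporary file and the variable names path, total_lines and n are made up for illustration:

```r
# Build a small throwaway log so the example is self-contained.
path <- tempfile(fileext = ".log")
writeLines(sprintf("line %d", 1:1000), path)

total_lines <- 1000   # assumed to be known in advance
n <- 100

# Skip everything except the last n lines, then read the rest as character.
last_lines <- scan(path, what = character(), sep = "\n",
                   skip = total_lines - n, quiet = TRUE)
```

With sep = "\n", each line is read as a single field, so spaces inside a line do not split it.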


If you don’t know how many lines you need to skip, you have no choice but to either

  • read in everything and take the last n lines (if that is feasible),
  • use scan("foo.txt",sep="\n",what=list(NULL)) to find out how many records there are, or
  • use some algorithm that goes through the file and keeps only the last n lines at each step.
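The second option (count first, then skip) could be sketched roughly as follows. This is only an illustration: it counts lines with readLines in chunks rather than with scan, and the helper name count_lines, the chunk size and the temporary file are all made up.

```r
# Hypothetical helper: count lines without holding the whole file in memory.
count_lines <- function(path, chunk = 10000L) {
  con <- file(path, "r")
  on.exit(close(con))
  total <- 0L
  repeat {
    got <- length(readLines(con, n = chunk))
    if (got == 0L) break   # nothing left to read
    total <- total + got
  }
  total
}

# Throwaway file for the demonstration.
path <- tempfile(fileext = ".log")
writeLines(sprintf("line %d", 1:250), path)

n <- 10
total <- count_lines(path, chunk = 100L)

# Second pass: skip all but the last n lines.
last_lines <- scan(path, what = character(), sep = "\n",
                   skip = max(total - n, 0), quiet = TRUE)
```

This reads the file twice, so it trades time for memory.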

The last option might look like this:

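 ReadLastLines <- function(x, n, ...) {
   con <- file(x)
   open(con)
   # Read the first n lines; extra arguments such as skip go to this call only.
   out <- scan(con, what = character(), nmax = n, sep = "\n", quiet = TRUE, ...)
   while (TRUE) {
     # Read one more line; an empty result means the end of the file.
     tmp <- scan(con, what = character(), nmax = 1, sep = "\n", quiet = TRUE)
     if (length(tmp) == 0) {
       close(con)
       break
     }
     # Drop the oldest line and append the newest one.
     out <- c(out[-1], tmp)
   }
   out
 }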





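Because extra arguments are passed on to the first scan call, you can also supply skip (for example skip=1e7) if you know that the file has more than 10 million lines. This can save reading time once your logs become very large.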
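Calling scan once per line can be slow on big files. A possible variation on the same rolling-buffer idea reads the file in chunks and keeps only the last n lines seen so far. This is a sketch, not the function above: it uses readLines instead of scan, and the name read_last_lines_chunked and the chunk size are assumptions.

```r
# Hypothetical chunked variant of the "keep only the last n lines" approach.
read_last_lines_chunked <- function(path, n, chunk = 10000L) {
  con <- file(path, "r")
  on.exit(close(con))
  out <- character()
  repeat {
    tmp <- readLines(con, n = chunk)
    if (length(tmp) == 0L) break   # end of file
    # Keep at most n lines: the tail of what we had plus the new chunk.
    out <- tail(c(out, tmp), n)
  }
  out
}

# Throwaway file for the demonstration.
path <- tempfile(fileext = ".log")
writeLines(sprintf("line %d", 1:1000), path)

last_lines <- read_last_lines_chunked(path, n = 5, chunk = 64L)
```

Memory use stays bounded by roughly n + chunk lines, whatever the size of the file.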

EDIT: Actually, I wouldn't even use R for this, given the size of your file. On Unix, you can use the tail command. There is a Windows version of it too, somewhere in a toolkit. I have not tried that one yet, though.
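If the result is still needed inside R, one option is to call tail through system2 and capture its output. A sketch, assuming a Unix-like system with tail on the PATH; the temporary file is made up for illustration:

```r
# Throwaway file for the demonstration.
path <- tempfile(fileext = ".log")
writeLines(sprintf("line %d", 1:1000), path)

# Assumes a Unix-like system where `tail` is available on the PATH.
# stdout = TRUE returns the command's output as a character vector,
# one element per line.
last_lines <- system2("tail", c("-n", "100", shQuote(path)), stdout = TRUE)
```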


You can do this with read.table by specifying the skip parameter. If your lines are not to be parsed into variables, specify the separator as '\n', as @Joris Meys pointed out, and also set as.is=TRUE to get character vectors instead of factors.

A small example (skipping the first 2000 lines):

 df <- read.table('foo.txt', sep='\n', as.is=TRUE, skip=2000)
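For a self-contained illustration of the read.table route, the sketch below builds a temporary file first. The extra arguments quote = "" and comment.char = "" are my own additions, not part of the answer above: they are meant to keep quote characters and # signs in log lines from being interpreted, which may or may not matter for your file.

```r
# Throwaway file for the demonstration.
path <- tempfile(fileext = ".log")
writeLines(sprintf("line %d", 1:1000), path)

# Skip the first 900 lines and read the remaining ones as one character column.
df <- read.table(path, sep = "\n", as.is = TRUE, skip = 900,
                 quote = "", comment.char = "")

last_lines <- df$V1   # read.table names the single column V1
```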

As @JorisMeys already mentioned, the Unix tail command is the easiest way to solve this problem. However, I want to offer a seek-based R solution that reads the file starting from its end:

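 tailfile <- function(file, n) {
   bufferSize <- 1024L
   size <- file.info(file)$size
   if (size == 0) return(character())

   f <- file(file, "rb")
   on.exit(close(f))

   pos <- size   # byte offset where the text read so far begins
   text <- ""
   k <- 0L       # number of newlines seen so far

   # Walk backwards one buffer at a time until more than n newlines
   # have been seen or the start of the file is reached.
   while (pos > 0 && k <= n) {
     len <- min(bufferSize, pos)
     pos <- pos - len
     seek(f, where = pos)
     # useBytes = TRUE keeps the read length in step with the byte offsets
     chars <- readChar(f, nchars = len, useBytes = TRUE)
     k <- k + sum(charToRaw(chars) == as.raw(10L))
     text <- paste0(chars, text)   # prepend: this chunk precedes the rest
   }

   tail(strsplit(text, "\n", fixed = TRUE)[[1L]], n)
 }

 tailfile("myfile.log", n = 100)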

