Why don't you just try the read.fwf function from the utils package? The column widths are specified in the readme.txt file (see Section IV).
    IV. FORMAT OF "ghcnd-stations.txt"

    ------------------------------
    Variable       Columns   Type
    ------------------------------
    ID                1-11   Character
    LATITUDE         13-20   Real
    LONGITUDE        22-30   Real
    ELEVATION        32-37   Real
    STATE            39-40   Character
    NAME             42-71   Character
    GSN FLAG         73-75   Character
    HCN/CRN FLAG     77-79   Character
    WMO ID           81-85   Character
    ------------------------------
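As a rough sketch of how the widths used in this answer relate to that table: the fields are separated by single blank columns, and one way to handle them is to let each field absorb the blank column in front of it. The vector names `start`, `end`, `gaps`, and `widths` are just illustrative.

```r
# Start and end columns copied from Section IV of readme.txt.
start <- c(1, 13, 22, 32, 39, 42, 73, 77, 81)
end   <- c(11, 20, 30, 37, 40, 71, 75, 79, 85)

# Number of blank columns in front of each field (0 for the first field).
gaps <- start - c(0, head(end, -1)) - 1

# Let each field absorb the blank column(s) before it: its width is the
# distance from the previous field's end column to its own end column.
widths <- diff(c(0, end))
widths
# [1] 11  9 10  7  3 31  4  4  6
```

Because the blank separator is folded into the following field, character columns read this way start with a space, which can be trimmed afterwards if needed.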
However, the following attempt returns an error:
    data <- read.fwf("ghcnd-stations.txt", widths = c(11,9,10,7,3,31,4,4,6))
    Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
      line 25383 did not have 7 elements
Checking line 25383 shows the cause of the error.
    > x <- readLines("ghcnd-stations.txt", 25383)
    > tail(x, 1)
    [1] "CA002100627 60.8167 -137.7333 846.0 YT HAINES APPS #4 "
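The station name contains a #, which is the default comment character, so everything after it on that line is dropped. Work around this by including the comment.char argument and changing the default (#) to something else, for example an empty string, which turns off comment handling altogether.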
data <- read.fwf("ghcnd-stations.txt", widths = c(11,9,10,7,3,31,4,4,6), comment.char="")
Reading the whole file this way only takes 20 seconds, so there is no real need for fread.
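To see the comment.char effect in isolation, here is a minimal sketch on a made-up two-line file. The three-field layout, the record contents, and the names `records`, `demo_widths`, `default_result`, and `fixed` are all hypothetical and are not taken from ghcnd-stations.txt.

```r
# Hypothetical three-field records (ID, NAME, FLAG), padded to fixed widths.
# The second name contains a '#', like "HAINES APPS #4" in the real file.
records <- sprintf("%-11s%-15s%-4s",
                   c("AAA00000001", "BBB00000002"),
                   c(" FIRST STATION", " CAMP #4"),
                   c(" GSN", " GSN"))
tmp <- tempfile(fileext = ".txt")
writeLines(records, tmp)
demo_widths <- c(11, 15, 4)

# With the default comment.char = "#", the second line is cut off at '#',
# leaving it with too few fields, so the read fails. The error is caught
# here and its message is kept as a string.
default_result <- tryCatch(
  read.fwf(tmp, widths = demo_widths),
  error = function(e) conditionMessage(e)
)
print(default_result)

# With comment.char = "", '#' is treated as ordinary data and both rows load.
fixed <- read.fwf(tmp, widths = demo_widths, comment.char = "")
print(fixed)

unlink(tmp)
```

The same idea should carry over to the full station file: only the comment.char argument changes, and the widths stay as they are.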