It is recommended to separate the cleaning and analysis steps. Since you mention that your dataset changes frequently, the cleanup should be automated. Here is one way to do that.
```r
# Read in the data without parsing it
lines <- readLines("Skewdailyprices.csv")

# The bad lines have more than two fields
# (no skip, so n_fields lines up one-to-one with lines)
n_fields <- count.fields("Skewdailyprices.csv", sep = ",")

# View the dubious lines
lines[n_fields != 2]

# Fix them by stripping the trailing ",," or ",,x"
library(stringr)  # gsub() from base R works too
lines <- str_replace(lines, ",,x?$", "")

# Write the cleaned lines (header included) back out to a new file
writeLines(lines, "Skewdailyprices_cleaned.csv")

# Read in the clean version as a zoo time series
library(zoo)
sdp <- read.zoo(
  "Skewdailyprices_cleaned.csv",
  format = "%m/%d/%Y",
  header = TRUE,
  sep = ","
)
```
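To see what the regex is doing, here is a minimal sketch on made-up sample lines (the dates and values are hypothetical, not from your file): a trailing `,,` or `,,x` is stripped, while well-formed two-field lines pass through unchanged.

```r
# Hypothetical sample lines: two good rows and two malformed rows
samples <- c(
  "12/31/2019,17.5",     # good: date,value
  "01/02/2020,18.1,,",   # bad: trailing empty fields
  "01/03/2020,18.3,,x",  # bad: trailing ",,x"
  "01/06/2020,18.0"      # good
)

# Base-R equivalent of str_replace(lines, ",,x?$", "")
cleaned <- gsub(",,x?$", "", samples)
print(cleaned)
```

Because the pattern is anchored with `$`, only the end of each line is touched, so the date and price fields are never altered.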