I have a very large survey dataset and want to find the rows where the start zip code matches the end zip code, then put those rows in a new data frame. I created an example data frame for illustration.
ID = c(1,2,3,4,5)
StartPC = c("AF2 4RE","AF3 5RE","AF1 3DR","AF2 4RE","AF2 4PE")
EndPC = c("AF2 4RE","NA","AF2 3DR","AX2 4RE","AF2 4PE")
data <- data.frame(ID,StartPC,EndPC)
data2 <- subset(data, StartPC==EndPC,na.rm=TRUE)
Using the code above, I want to create a data frame (data2) that includes only the IDs whose start and end zip codes are the same. However, I get the error message:
Error in Ops.factor (StartPC, EndPC): sets of factor levels vary
The expected output is a new data frame containing only IDs 1 and 5.
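For reference, here is a minimal sketch of one way the comparison could be made to work. It assumes the error arises because data.frame() converted the two columns to factors with different level sets (the default behaviour before R 4.0.0), and it assumes the "NA" entry is meant to be a real missing value rather than the string "NA". Both are assumptions, not something confirmed above.

```r
# Example data; NA is written as a real missing value (assumption)
ID <- c(1, 2, 3, 4, 5)
StartPC <- c("AF2 4RE", "AF3 5RE", "AF1 3DR", "AF2 4RE", "AF2 4PE")
EndPC <- c("AF2 4RE", NA, "AF2 3DR", "AX2 4RE", "AF2 4PE")

# Keep the zip codes as character vectors so == compares the strings
# directly instead of comparing factors with different level sets
data <- data.frame(ID, StartPC, EndPC, stringsAsFactors = FALSE)

# subset() treats rows where the condition is NA as FALSE,
# so the row with the missing end zip code is dropped
data2 <- subset(data, StartPC == EndPC)
print(data2)
```

If the columns are already factors in the real dataset, wrapping each side in as.character() inside the subset() condition should have the same effect.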