I've previously asked this question, and the answer I got worked: R: How to use setdiff for two string vectors, only comparing the first 3 elements with tab delimiters on each line? However, that answer relies on qdap, which requires rJava and a correctly configured user system (see: cannot load qdap R-package). So I am asking again: is there a way to do this without using qdap? I repeat the question below:
I am trying to find a way in R to take the set difference of two string vectors, but based only on the first three tab-delimited columns of each row. For example, here are list1 and list2:
list1:
"1\t1113200\t1118399\t1\t1101465\t1120176\tENSRNOG00000040300\tRaet1l\t0\n"
"1\t1180200\t1187599\t1\t1177682\t1221416\tENSRNOG00000061316\tAABR07000121.1\t0\n"
"1\t1180200\t1187599\t1\t1177632\t1221416\tENSRNOG00000061316\tAABR07000121.1\t0\n"
list2:
"1\t1113200\t1118399\t1\t1101465\t1120176\tENSRNOG00000040300\tRaet1l\t0\n"
"1\t1180200\t1187599\t1\t1177682\t1221416\tENSRNOG00000061316\tAABR07000121.1\t0\n"
I want to do setdiff(list2, list1), so that I get everything in list2 that is not in list1, but comparing only the first three tab-delimited fields of each line. So for list1, I would only consider:
"1\t1113200\t1118399"
from the first entry. However, I still need the full line returned; only the comparison should be restricted to the first three columns. I'm having trouble figuring out how to do this, so any help would be appreciated. I've already looked at a few SO posts, but none of them seemed to help.
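To make the goal concrete, here is a minimal base-R sketch of the behaviour I am after, with no qdap or rJava involved. The helper names first_fields and setdiff_by_key are made up for illustration and are not from any package. It assumes every line has at least three tab-separated fields.

```r
# Sample data from the question
list1 <- c(
  "1\t1113200\t1118399\t1\t1101465\t1120176\tENSRNOG00000040300\tRaet1l\t0\n",
  "1\t1180200\t1187599\t1\t1177682\t1221416\tENSRNOG00000061316\tAABR07000121.1\t0\n",
  "1\t1180200\t1187599\t1\t1177632\t1221416\tENSRNOG00000061316\tAABR07000121.1\t0\n"
)
list2 <- c(
  "1\t1113200\t1118399\t1\t1101465\t1120176\tENSRNOG00000040300\tRaet1l\t0\n",
  "1\t1180200\t1187599\t1\t1177682\t1221416\tENSRNOG00000061316\tAABR07000121.1\t0\n"
)

# Hypothetical helper: build a comparison key from the first n
# tab-delimited fields of each string
first_fields <- function(x, n = 3) {
  vapply(
    strsplit(x, "\t", fixed = TRUE),
    function(parts) paste(head(parts, n), collapse = "\t"),
    character(1)
  )
}

# Hypothetical helper: keep the full lines of `a` whose key does not
# occur among the keys of `b`
setdiff_by_key <- function(a, b, n = 3) {
  a[!(first_fields(a, n) %in% first_fields(b, n))]
}

setdiff_by_key(list2, list1)
```

With the sample data above this returns character(0), because every three-field key in list2 also occurs in list1. Unlike setdiff, this sketch does not drop duplicate lines from the result; wrapping the result in unique() would do that if needed.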