R: How to use setdiff on two string vectors, comparing only the first 3 tab-delimited fields of each line?

I am trying to find a way in R to take the difference between two string vectors, but based only on the first three columns of each row. For example, here are list1 and list2:

list1:

        "1\t1113200\t1118399\t1\t1101465\t1120176\tENSRNOG00000040300\tRaet1l\t0\n" 
        "1\t1180200\t1187599\t1\t1177682\t1221416\tENSRNOG00000061316\tAABR07000121.1\t0\n"
        "1\t1180200\t1187599\t1\t1177632\t1221416\tENSRNOG00000061316\tAABR07000121.1\t0\n"

list2:

        "1\t1113200\t1118399\t1\t1101465\t1120176\tENSRNOG00000040300\tRaet1l\t0\n"
        "1\t1180200\t1187599\t1\t1177682\t1221416\tENSRNOG00000061316\tAABR07000121.1\t0\n"

I want to do setdiff(list2, list1), so that I get everything in list2 that does not exist in list1. However, I want the comparison to be based only on the first three tab-delimited columns of each row. So for list1, I would only consider:

   "1\t1113200\t1118399"

from the first entry. However, I still need the full line returned; I only want to compare the first three columns. I am finding it difficult to work out how to do this, and any help would be appreciated. I've already looked at a few SO posts, but none of them seemed to help.

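There are several ways to do this (for example, converting to a dataframe...), but one option is beg2char() from the qdap package. (Base R string functions such as substr() are another route.)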

library(qdap)
beg2char(list1, '\t', 3) # Extracts from the beginning up to the third tab delimiter

Instead of setdiff, you can use %in% to check which entries of list2 have a match in list1.

beg2char(list2, '\t', 3) %in% beg2char(list1, '\t', 3) # will give you TRUE/FALSE
list2[!(beg2char(list2, '\t', 3) %in% beg2char(list1, '\t', 3))]

This returns the full lines of list2 whose first three fields do not appear in list1.
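
If you would rather not depend on qdap, the same idea can be sketched in base R. This is only a minimal sketch and not part of the original answer: first_fields() is a hypothetical helper name, and the last row of list2 is made up for illustration so that the result is not empty.

```r
# Hypothetical helper: build a comparison key from the first n tab-separated fields.
first_fields <- function(x, n = 3) {
  vapply(
    strsplit(x, "\t", fixed = TRUE),
    function(parts) paste(head(parts, n), collapse = "\t"),
    character(1)
  )
}

list1 <- c(
  "1\t1113200\t1118399\t1\t1101465\t1120176\tENSRNOG00000040300\tRaet1l\t0\n",
  "1\t1180200\t1187599\t1\t1177682\t1221416\tENSRNOG00000061316\tAABR07000121.1\t0\n",
  "1\t1180200\t1187599\t1\t1177632\t1221416\tENSRNOG00000061316\tAABR07000121.1\t0\n"
)

list2 <- c(
  list1[1:2],
  # Made-up row (not from the question) so that something differs from list1.
  "2\t500\t900\t2\t100\t1000\tHYPOTHETICAL_ID\tMadeUpGene\t0\n"
)

keys1 <- first_fields(list1)
keys2 <- first_fields(list2)

# Keep the full lines of list2 whose three-field key is absent from list1.
only_in_list2 <- list2[!(keys2 %in% keys1)]
print(only_in_list2)
```

Note that, unlike setdiff(), indexing with %in% keeps duplicate lines from list2; wrap the result in unique() if you want set semantics.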

Source: https://habr.com/ru/post/1656026/

