R: Identifying rows that contain a text string in a data frame column

One column of my data frame contains words and phrases. I am trying to create a dummy variable that flags the entries in this column that contain a specific string anywhere in the text.

For instance:

  • kite
  • cars
  • box kites
  • model cars
  • i like kites that fly
  • cars of the world

    myvector<-c("kite","cars","box kites","model cars","i like kites that fly", "cars of the world") 

I would like to identify all entries that contain the string "kite".

I tried several things like any(), which() and %in%, but so far nothing has worked.

Any help is much appreciated.

1 answer

You have not provided a reproducible example, but what you are looking for is grepl.

 grepl("kite", df$words) 

It returns a logical vector with one element per string: TRUE where the string contains the pattern and FALSE where it does not.
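As a minimal sketch of how this could become the dummy variable from the question, the following builds a small data frame from the question's sample vector. The names `df`, `words`, `has_kite`, `kite_dummy` and `exact_only` are assumptions for illustration, not names confirmed by the asker. It also shows why `%in%` did not help: it compares whole elements rather than searching inside them.

```r
# Sample data from the question; `df` and its column `words` are assumed names
# borrowed from the answer's snippet.
myvector <- c("kite", "cars", "box kites", "model cars",
              "i like kites that fly", "cars of the world")
df <- data.frame(words = myvector, stringsAsFactors = FALSE)

# TRUE for every entry that contains "kite" anywhere in the text
has_kite <- grepl("kite", df$words)

# Convert the logical vector to a 0/1 dummy variable (hypothetical column name)
df$kite_dummy <- as.integer(has_kite)

# %in% tests whole-element equality, so only the exact entry "kite" matches
exact_only <- df$words %in% "kite"
```

If a logical column is enough, `df$kite_dummy <- has_kite` works as well; `as.integer()` is only needed when a numeric 0/1 variable is wanted.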

If you want to match any of several words, separate them with the regular-expression "or" operator | inside the pattern string:

 grepl("kite|cars|box kites", df$words) 
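A short sketch of how the alternation behaves on the question's sample vector, assuming the default regular-expression mode of `grepl`. The variable names (`any_word`, `kite_or_world`, `upper`, `literal`) and the pattern `"kite|world"` are made up for illustration. Note that in the pattern above, "box kites" is already covered by "kite", so listing it separately does not change the result.

```r
myvector <- c("kite", "cars", "box kites", "model cars",
              "i like kites that fly", "cars of the world")

# Every sample entry contains either "kite" or "cars", so all are TRUE
any_word <- grepl("kite|cars", myvector)

# A narrower alternation: entries containing "kite" or "world"
kite_or_world <- grepl("kite|world", myvector)

# Matching is case-sensitive by default; ignore.case = TRUE relaxes that
upper <- grepl("KITE", myvector, ignore.case = TRUE)

# With fixed = TRUE the pattern is taken literally, so "|" is no longer "or"
literal <- grepl("kite|cars", myvector, fixed = TRUE)
```

Here `literal` is FALSE everywhere, because no entry contains the literal text "kite|cars".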

Source: https://habr.com/ru/post/1434022/

