Extract number words from a string in R

I am trying to extract numbers written out as words from strings, together with the word that follows each number. I managed to do this by writing my own pattern that lists the number words to search for (here is an example using stringr::sentences):

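library(stringr)

# One alternative per number word, each followed by the next space-free token
numbers <- str_c(c(" one ", " two ", " three ", " four ", " five ", " six ", " seven ", " eight ", " nine ", " ten "), "([^ ]+)")
number_match <- str_c(numbers, collapse = "|")

# Keep only the sentences that contain a number word, then extract the match
reduced <- str_detect(sentences, number_match)
sent <- sentences[reduced]
str_extract(sent, number_match)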

These are the extracted lines:

 [1] " seven books"   " two met"       " two factors"   " three lists"   " seven is"      " two when"      " ten inches."   " one war"      
 [9] " one button"    " six minutes."  " ten years"     " two shares"    " two distinct"  " five cents"    " two pins"      " five robins." 
[17] " four kinds"    " three story"   " three inches"  " six comes"     " three batches" " two leaves."
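The matches above keep the leading space and any trailing punctuation, and a pattern that requires a space before the number word cannot match at the start of a sentence. A minimal sketch of one way around this, using word boundaries and two capture groups so the number and the following word come back separately. It uses base R only to stay dependency-free, and the sample sentences are made up for illustration rather than taken from stringr::sentences:

```r
# Hypothetical sample sentences (illustrative only)
sample_sentences <- c(
  "Two blue fish swam in the tank.",
  "The stack held seven books.",
  "No number words appear here."
)

number_words <- c("one", "two", "three", "four", "five",
                  "six", "seven", "eight", "nine", "ten")

# \\b stops "one" from matching inside "stone";
# group 1 is the number word, group 2 is the word that follows it
pattern <- paste0("\\b(", paste(number_words, collapse = "|"),
                  ")\\s+([[:alpha:]']+)")

# regexec() reports the full match plus each capture group
matches <- regmatches(
  sample_sentences,
  regexec(pattern, sample_sentences, ignore.case = TRUE, perl = TRUE)
)

# Drop sentences without a match and stack the rest into a matrix
# (columns: full match, number word, following word)
hits <- do.call(rbind, matches[lengths(matches) > 0])

result <- data.frame(
  number    = tolower(hits[, 2]),
  next_word = hits[, 3],
  stringsAsFactors = FALSE
)
print(result)
```

Only the first number word in each sentence is captured here. The sketch also assumes at least one sentence matches; with no matches at all, `hits` would be NULL and the indexing would fail.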

Since I cannot know in advance whether I have covered every possible number, I was wondering whether R provides a tool that can identify numbers written as words. I found similar questions, for example one on converting number words to numbers, but unfortunately it is not about R.

Any help is appreciated.
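One possible way to avoid listing every number by hand is to generate the vocabulary from its parts. The following is only a sketch under stated assumptions: it covers English cardinals from one to ninety-nine, assumes compounds are hyphenated ("twenty-one"), and uses base R only. The variable names and test strings are made up for illustration. Packages such as `english` on CRAN can spell out integers and might serve as another way to build the list, but that route is not shown here.

```r
units <- c("one", "two", "three", "four", "five",
           "six", "seven", "eight", "nine")
teens <- c("ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
           "sixteen", "seventeen", "eighteen", "nineteen")
tens  <- c("twenty", "thirty", "forty", "fifty",
           "sixty", "seventy", "eighty", "ninety")

# Compound forms such as "twenty-one": outer() pairs every ten with every unit
compounds <- as.vector(t(outer(tens, units, paste, sep = "-")))

number_words <- c(units, teens, tens, compounds)

# Longest words first, so "twenty-one" is tried before "twenty"
number_words <- number_words[order(nchar(number_words), decreasing = TRUE)]

number_pattern <- paste0("\\b(", paste(number_words, collapse = "|"), ")\\b")

# Hypothetical test strings
x <- c("She waited twenty-one days.",
       "Seventeen geese flew by.",
       "Nothing here.")

# regexpr() finds the first match per string; regmatches() keeps only the hits
found <- regmatches(x, regexpr(number_pattern, x, ignore.case = TRUE, perl = TRUE))
print(found)
```

Larger numbers ("one hundred and five") and ordinals ("third") are outside what this pattern covers and would need additional vocabulary.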


No one has answered this question yet.

Source: https://habr.com/ru/post/1694805/

