Retrieve all characters before the first occurrence of a special character in R

I want all the characters before the first "." if there is one. Otherwise, I want the string returned unchanged ("8" → "8").

Example:

v<-c("7.7.4","8","12.6","11.5.2.1")

I want to get something like this:

[1] "7" "8" "12" "11"

My idea was to split each element on "." and then take only the first piece, but I did not find a solution that worked.

+4
2 answers

You can use sub:

sub("\\..*", "", v)
#[1] "7"  "8"  "12" "11"

or one of several stringi options:

library(stringi)
stri_replace_first_regex(v, "\\..*", "")
#[1] "7"  "8"  "12" "11"
# extract vs. replace
stri_extract_first_regex(v, "[^\\.]+")
#[1] "7"  "8"  "12" "11"

If you want to use a splitting approach, these will work:

unlist(strsplit(v, "\\..*"))
#[1] "7"  "8"  "12" "11"

# stringi option
unlist(stri_split_regex(v, "\\..*", omit_empty=TRUE))
#[1] "7"  "8"  "12" "11"
unlist(stri_split_fixed(v, ".", n=1, tokens_only=TRUE))
unlist(stri_split_regex(v, "[^\\w]", n=1, tokens_only=TRUE))
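
The idea from the question (split on "." and keep only the first piece) can also be sketched in base R without a regex. This is just one possible way to write it; the names pieces and first_piece are made up for illustration. Unlike unlist(), vapply() always returns exactly one value per element of v.

```r
v <- c("7.7.4", "8", "12.6", "11.5.2.1")

# Split on a literal "." (fixed = TRUE means no regex escaping is needed).
pieces <- strsplit(v, ".", fixed = TRUE)

# Keep only the first piece of each element; vapply() guarantees
# one character value per input element.
first_piece <- vapply(pieces, function(p) p[1], character(1))
first_piece
#[1] "7"  "8"  "12" "11"
```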

Other sub variations:

sub("(\\w+).+", "\\1", v) # \w matches [[:alnum:]_] (i.e. alphanumerics and underscores)
sub("([[:alnum:]]+).+", "\\1", v) # exclude underscores

# variations on a theme
sub("(\\w+)\\..*", "\\1", v)
sub("(\\d+)\\..*", "\\1", v) # narrower: \d for digits specifically
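sub("(.+?)\\..*", "\\1", v, perl = TRUE) # broader: "." matches any single character; the lazy +? stops at the first "." (a greedy .+ would run to the last one and give "7.7")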

# stringi variation just for fun:
stri_extract_first_regex(v, "\\w+")
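
If the special character is not always ".", the same idea can be wrapped in a small helper. This is only a sketch: before_first() is a hypothetical name, not a base R or stringi function, and it assumes the special character is passed as a literal string, not a regex.

```r
# before_first() is a hypothetical helper, not part of base R or stringi.
before_first <- function(x, ch) {
  # Position of the first literal occurrence of ch, or -1 if there is none.
  pos <- as.integer(regexpr(ch, x, fixed = TRUE))
  # No match: return the element unchanged; otherwise keep what precedes it.
  ifelse(pos == -1L, x, substr(x, 1L, pos - 1L))
}

v <- c("7.7.4", "8", "12.6", "11.5.2.1")
before_first(v, ".")
#[1] "7"  "8"  "12" "11"
before_first(c("a-b-c", "abc"), "-")
#[1] "a"   "abc"
```

Because fixed = TRUE treats ch literally, characters such as "." or "+" need no escaping here.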
+8

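You can also use scan(). With comment.char set to ".", scan() treats everything from the first "." onward as a comment and drops it from each element of v.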

scan(text = v, comment.char = ".")
# [1]  7  8 12 11

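Note that this returns a numeric vector, because scan() reads numbers by default. To get a character vector, set the what argument to a character value.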

scan(text = v, comment.char = ".", what = "")
# [1] "7"  "8"  "12" "11"

Data:

v <- c("7.7.4", "8", "12.6", "11.5.2.1")
+3

Source: https://habr.com/ru/post/1621003/

