Split a string into substrings of a given length with the remainder

Question

For a string such as:

text <- "abcdefghijklmnopqrstuvwxyz"

I would like to cut the string into substrings of a given length, for example 10, and keep the remainder:

 "abcdefghij" "klmnopqrst" "uvwxyz"

None of the methods I know for creating substrings gives me the residual substring of 6 characters. I have tried answers to previous similar questions, such as:

 > substring(text, seq(1, nchar(text), 10), seq(10, nchar(text), 10))
 [1] "abcdefghij" "klmnopqrst" ""

Any advice on how to get all the substrings of the desired length plus the remaining characters would be greatly appreciated.

+6

string substring r string-split

grdn Dec 15 '14 at 18:25

3 answers

Try

 strsplit(text, '(?<=.{10})', perl=TRUE)[[1]]
 #[1] "abcdefghij" "klmnopqrst" "uvwxyz"

Or, for a faster approach, you can use the stringi package:

 library(stringi)
 stri_extract_all_regex(text, '.{1,10}')[[1]]
 #[1] "abcdefghij" "klmnopqrst" "uvwxyz"

+10

akrun Dec 15 '14 at 18:27

Here is an example that uses strapplyc from the gsubfn package with a fairly simple regular expression. It works because .{1,10} is greedy: it always matches the longest possible run of at most 10 characters:

 library(gsubfn)
 strapplyc(text, ".{1,10}", simplify = c)

giving:

 [1] "abcdefghij" "klmnopqrst" "uvwxyz"
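
The same greedy pattern can also be applied without an extra package. The following is a minimal base R sketch, assuming the regmatches()/gregexpr() pair is an acceptable stand-in for strapplyc; it is not part of the answer above, and the variable name chunks is only illustrative:

```r
text <- "abcdefghijklmnopqrstuvwxyz"

# gregexpr() locates every non-overlapping match of the greedy pattern;
# regmatches() extracts the matched pieces as a list with one element
# per input string, so [[1]] picks the pieces for our single string.
chunks <- regmatches(text, gregexpr(".{1,10}", text))[[1]]
print(chunks)
```

Because each match consumes as many characters as it can (up to 10), the final match simply picks up whatever is left over.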

Visualization. This regular expression is simple enough that it does not really need a visualization, but here it is anyway:

 .{1,10}

Debuggex Demo

+3

G. Grothendieck Dec 15 '14 at 21:41

Rich Scriven · Accepted Answer · 2014-12-15T18:28:01+0000

The vectors passed as the first and last arguments of substring can exceed the number of characters in the string without causing errors or warnings. So you can do:

 text <- "abcdefghijklmnopqrstuvwxyz"
 sq <- seq.int(to = nchar(text), by = 10)
 substring(text, sq, sq + 9)
 # [1] "abcdefghij" "klmnopqrst" "uvwxyz"
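
For context, the attempt in the question returns "" for the last piece because its last vector, seq(10, nchar(text), 10), has only two elements (10 and 20). substring recycles it to match the three start positions, so the third call asks for characters 21 to 10. The idea in this answer can be wrapped in a small helper; the sketch below assumes a single input string, and split_fixed is a hypothetical name, not a base R function:

```r
# split_fixed() is a hypothetical helper name, not part of base R.
split_fixed <- function(x, n = 10L) {
  stopifnot(is.character(x), length(x) == 1L, n >= 1L)
  # Start positions 1, n + 1, 2n + 1, ...; max() guards against the empty string.
  starts <- seq.int(1L, max(nchar(x), 1L), by = n)
  # The last end position may run past nchar(x); substring() stops at the end.
  substring(x, starts, starts + n - 1L)
}

split_fixed("abcdefghijklmnopqrstuvwxyz", 10)
split_fixed("abcdefghijklmnopqrstuvwxyz", 13)
```

With a chunk length of 13 the string divides evenly, so no shorter remainder appears; with 10 the last element holds the leftover 6 characters.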
