Find the whole combination of substrings that make up the given string

I am trying to create a data structure that contains all possible combinations of substrings that are added to the original string. For example, if the string is "java" , then the actual results will be "j", "ava" , "ja", "v", "a" , the invalid result would be "ja", "a" or "a", "jav"

It was very easy for me to find all possible substrings

  String string = "java"; List<String> substrings = new ArrayList<>(); for( int c = 0 ; c < string.length() ; c++ ) { for( int i = 1 ; i <= string.length() - c ; i++ ) { String sub = string.substring(c, c+i); substrings.add(sub); } } System.out.println(substrings); 

and now I'm trying to build a structure containing only valid substrings. But it is not so simple. I'm in a fog of very ugly code, messing around with indexes, and no where near the finish, most likely, on the wrong track completely. Any hints?

+6
source share
4 answers

Here is one approach:

 static List<List<String>> substrings(String input) { // Base case: There only one way to split up a single character // string, and that is ["x"] where x is the character. if (input.length() == 1) return Collections.singletonList(Collections.singletonList(input)); // To hold the result List<List<String>> result = new ArrayList<>(); // Recurse (since you tagged the question with recursion ;) for (List<String> subresult : substrings(input.substring(1))) { // Case: Don't split List<String> l2 = new ArrayList<>(subresult); l2.set(0, input.charAt(0) + l2.get(0)); result.add(l2); // Case: Split List<String> l = new ArrayList<>(subresult); l.add(0, input.substring(0, 1)); result.add(l); } return result; } 

Output:

 [java] [j, ava] [ja, va] [j, a, va] [jav, a] [j, av, a] [ja, v, a] [j, a, v, a] 
+9
source

This seems to be the problem of finding string length compositions and using these compositions to create substrings. Thus, there are 2 ^ n-1 compositions of number n, which can be time consuming for long lines ...

+1
source

Probably someone will want another irregular solution and will not save memory for storing the list:

 public static List<List<String>> substrings(final String input) { if(input.isEmpty()) return Collections.emptyList(); final int size = 1 << (input.length()-1); return new AbstractList<List<String>>() { @Override public List<String> get(int index) { List<String> entry = new ArrayList<>(); int last = 0; while(true) { int next = Integer.numberOfTrailingZeros(index >> last)+last+1; if(next == last+33) break; entry.add(input.substring(last, next)); last = next; } entry.add(input.substring(last)); return entry; } @Override public int size() { return size; } }; } public static void main(String[] args) { System.out.println(substrings("java")); } 

Output:

 [[java], [j, ava], [ja, va], [j, a, va], [jav, a], [j, av, a], [ja, v, a], [j, a, v, a]] 

It simply calculates the next combination based on its index.

+1
source

Just in case, someone will look for the same algorithm in python, here is an implementation in Python:

 from itertools import combinations def compositions(s): n = len(s) for k in range(n): for c in combinations(range(1, n), k): yield tuple(s[i:j] for i, j in zip((0,) + c, c + (n,))) 

An example of how this works:

 >>> for x in compositions('abcd'): ... print(x) ('abcd',) ('a', 'bcd') ('ab', 'cd') ('abc', 'd') ('a', 'b', 'cd') ('a', 'bc', 'd') ('ab', 'c', 'd') ('a', 'b', 'c', 'd') 

With a little modification, you can create compositions in a different order:

 def compositions(s): n = len(s) for k in range(n): for c in itertools.combinations(range(n - 1, 0, -1), k): yield tuple(s[i:j] for i, j in zip((0,) + c[::-1], c[::-1] + (n,))) 

This will give you the following:

 >>> for x in compositions('abcd'): ... print(x) ('abcd',) ('abc', 'd') ('ab', 'cd') ('a', 'bcd') ('ab', 'c', 'd') ('a', 'bc', 'd') ('a', 'b', 'cd') ('a', 'b', 'c', 'd') 

And with one more small addition, you can only generate the specified number of partitions:

 def compositions(s, r=None): n = len(s) r = range(n) if r is None else [r - 1] for k in r: for c in itertools.combinations(range(n - 1, 0, -1), k): yield tuple(s[i:j] for i, j in zip((0,) + c[::-1], c[::-1] + (n,))) 
 >>> for x in compositions('abcd', 3): ... print(x) ('ab', 'c', 'd') ('a', 'bc', 'd') ('a', 'b', 'cd') 
0
source

Source: https://habr.com/ru/post/986972/


All Articles