Problem: Split a string into a list of words with delimiters passed as a list.
Input string: "After the flood ... all the colors came out."
Required output: ['After', 'the', 'flood', 'all', 'the', 'colors', 'came', 'out']
I wrote the following function. Note: I know there are better ways to split the string using some of Python's built-in functions, but for the sake of learning I decided to continue this way:
    def split_string(source,splitlist):
        result = []
        for e in source:
            if e in splitlist:
                end = source.find(e)
                result.append(source[0:end])
                tmp = source[end+1:]
                for f in tmp:
                    if f not in splitlist:
                        start = tmp.find(f)
                        break
                source = tmp[start:]
        return result

    out = split_string("After the flood ... all the colors came out.", " .")
    print out

    ['After', 'the', 'flood', 'all', 'the', 'colors', 'came out', '', '', '', '', '', '', '', '', '']
I can’t understand why “came out” is not split into “came” and “out” as two separate words. It's as if the whitespace between the two words is being ignored. I think the empty strings in the rest of the output are garbage caused by the same problem that affects “came out”.
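One Python detail that may be relevant here, offered as a minimal sketch rather than a full diagnosis: a `for` loop keeps iterating over the string object it started with, even if the name `source` is reassigned inside the loop body. The helper name `count_iterations` below is hypothetical and only illustrates that behaviour.

```python
def count_iterations(source):
    # Hypothetical helper: records every character the loop visits
    # while the name `source` is rebound to a shorter string each time.
    seen = []
    for ch in source:
        seen.append(ch)
        source = source[1:]  # rebinding the name does not shorten the running loop
    return seen, source

seen, remaining = count_iterations("a b c")
print(seen)       # all five characters of the original string are visited
print(remaining)  # the rebound name ends up as an empty string
```

If the same thing happens in the function above, the loop still visits every delimiter of the original string, while `find` and the slices operate on the shortened `source`, so the two can drift out of step.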
EDIT:
I followed @Ivc's suggestion and came up with the following code:
    def split_string(source,splitlist):
        result = []
        lasti = -1
        for i, e in enumerate(source):
            if e in splitlist:
                tmp = source[lasti+1:i]
                if tmp not in splitlist:
                    result.append(tmp)
                lasti = i
            if e not in splitlist and i == len(source) - 1:
                tmp = source[lasti+1:i+1]
                result.append(tmp)
        return result

    out = split_string("This is a test-of the,string separation-code!"," ,!-")
    print out
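For comparison, here is a rough sketch of the built-in route mentioned at the top, using `re.split` from the standard library. The function name `split_with_regex` is made up for this example, and it assumes `splitlist` is a non-empty string of single-character delimiters.

```python
import re

def split_with_regex(source, splitlist):
    # Build a character class from the delimiters; re.escape keeps
    # characters such as '-' or ']' from being read as regex syntax.
    # Assumes splitlist is non-empty (an empty class is a regex error).
    pattern = "[" + re.escape(splitlist) + "]+"
    # A run of delimiters counts as one separator; the filter drops the
    # empty strings produced when the text starts or ends with a delimiter.
    return [word for word in re.split(pattern, source) if word]

print(split_with_regex("After the flood ... all the colors came out.", " ."))
print(split_with_regex("This is a test-of the,string separation-code!", " ,!-"))
```

This is only meant as a cross-check for the hand-written loop: both versions should agree on inputs like the two test strings above.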