String splitting problem

Problem: Split a string into a list of words with delimiters passed as a list.

String: "After the flood ... all the colors came out."

Required output: ['After', 'the', 'flood', 'all', 'the', 'colors', 'came', 'out']

I wrote the following function. Note: I know there are better ways to split the string using some of Python's built-in functions, but for the sake of learning I decided to continue this way:

    def split_string(source,splitlist):
        result = []
        for e in source:
            if e in splitlist:
                end = source.find(e)
                result.append(source[0:end])
                tmp = source[end+1:]
                for f in tmp:
                    if f not in splitlist:
                        start = tmp.find(f)
                        break
                source = tmp[start:]
        return result

    out = split_string("After the flood ... all the colors came out.", " .")
    print out

    ['After', 'the', 'flood', 'all', 'the', 'colors', 'came out', '', '', '', '', '', '', '', '', '']

I can't understand why "came out" is not split into "came" and "out" as two separate words. It's as if the whitespace between the two words is ignored. I think the rest of the output is junk caused by the same issue as the "came out" problem.

EDIT:

I followed @Ivc's suggestion and came up with the following code:

    def split_string(source,splitlist):
        result = []
        lasti = -1
        for i, e in enumerate(source):
            if e in splitlist:
                tmp = source[lasti+1:i]
                if tmp not in splitlist:
                    result.append(tmp)
                lasti = i
            if e not in splitlist and i == len(source) - 1:
                tmp = source[lasti+1:i+1]
                result.append(tmp)
        return result

    out = split_string("This is a test-of the,string separation-code!"," ,!-")
    print out
    #>>> ['This', 'is', 'a', 'test', 'of', 'the', 'string', 'separation', 'code']

    out = split_string("After the flood ... all the colors came out.", " .")
    print out
    #>>> ['After', 'the', 'flood', 'all', 'the', 'colors', 'came', 'out']

    out = split_string("First Name,Last Name,Street Address,City,State,Zip Code",",")
    print out
    #>>>['First Name', 'Last Name', 'Street Address', 'City', 'State', 'Zip Code']

    out = split_string(" After the flood ... all the colors came out...............", " .")
    print out
    #>>>['After', 'the', 'flood', 'all', 'the', 'colors', 'came', 'out']
6 answers

You seem to be expecting that

    source = tmp[start:]

will change the source that the outer for loop is iterating over. It won't: that loop keeps iterating over the string you originally gave it, not over whatever object now uses that name. This means the character the loop is currently on may no longer be in what is left of source.
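A minimal sketch of that behavior, in case it helps. The helper name `chars_seen_by_loop` is made up for this illustration and is not part of the question's code. It rebinds `source` on every pass and records which characters the loop still visits:

```python
def chars_seen_by_loop(source):
    """Record the characters the for loop visits while `source` is rebound inside it."""
    seen = []
    for ch in source:        # the iterator is created once, over the original string
        seen.append(ch)
        source = source[1:]  # rebinds the local name only; the loop is unaffected
    return seen, source

seen, leftover = chars_seen_by_loop("abc")
print(seen)      # ['a', 'b', 'c']
print(leftover)  # '' (empty string)
```

The loop still walks all of "abc" even though the name `source` ends up bound to an empty string.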

Instead of trying to do this, keep track of the current index in the string, like this:

 for i, e in enumerate(source): ... 

Then what you append will always be source[lasti+1:i], and you only need to keep track of lasti.


You don't need the inner loop. This is enough:

    def split_string(source,splitlist):
        result = []
        for e in source:
            if e in splitlist:
                end = source.find(e)
                result.append(source[0:end])
                source = source[end+1:]
        return result

You can eliminate the "garbage" (that is, the empty strings) by checking whether source[:end] is an empty string before appending it to the list.
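A sketch of that empty-string check applied to the function above, written to run on Python 3. The final `if source:` step is an addition that is not in the answer: it keeps a trailing word when the input does not end with a delimiter.

```python
def split_string(source, splitlist):
    result = []
    for e in source:
        if e in splitlist:
            end = source.find(e)
            if end > 0:                # skip the empty piece between adjacent delimiters
                result.append(source[:end])
            source = source[end + 1:]  # drop the piece and the delimiter just handled
    if source:                         # assumption: keep a trailing word with no delimiter after it
        result.append(source)
    return result

print(split_string("After the flood ... all the colors came out.", " ."))
# ['After', 'the', 'flood', 'all', 'the', 'colors', 'came', 'out']
```

This works because the delimiters are consumed in the same order the loop meets them, so `source.find(e)` always lands on the next delimiter in the remaining text.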


If you only want the words in the string above, I think a regular expression gets you there easily.

    >>> import re
    >>> string="After the flood ... all the colors came out."
    >>> re.findall(r'\w+',string)
    ['After', 'the', 'flood', 'all', 'the', 'colors', 'came', 'out']
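Note that `\w+` ignores the delimiter list from the question. If the delimiters need to be honored, one possible sketch is to build a character class from them with `re.escape` and split on it. This assumes `splitlist` is a non-empty string of single-character delimiters; it is an illustration, not the method from the answer above.

```python
import re

def split_string(source, splitlist):
    # Assumption: splitlist is non-empty; an empty character class "[]" is not a valid pattern.
    pattern = "[" + re.escape(splitlist) + "]+"
    # re.split can return empty strings at the edges, so filter them out.
    return [word for word in re.split(pattern, source) if word]

print(split_string("This is a test-of the,string separation-code!", " ,!-"))
# ['This', 'is', 'a', 'test', 'of', 'the', 'string', 'separation', 'code']
```

Unlike `\w+`, this keeps "First Name" together when the only delimiter is a comma.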

Why overcomplicate it? It really is this simple. Try:
str.split(strSplitter, intMaxSplitCount), where intMaxSplitCount is optional.
In your case you also have to do some housekeeping if you want to get rid of the "...". One option is to replace the periods, for example str.replace(".", "", 3). The 3 is optional; with it, only the first 3 periods are replaced.

So, in short, you would do something like the following:
print((str.replace(".", "", 3)).split(" "))

I ran it, just check here, ...
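A short sketch of what that chained call returns for the question's string, next to a variant that removes every period and splits on runs of whitespace. The variable name `line` is used here instead of `str` so the built-in type is not shadowed; the second variant is a suggestion, not the answer's code.

```python
line = "After the flood ... all the colors came out."

# The chained call from the answer: only the first 3 periods are removed,
# and split(" ") keeps the empty string left between two adjacent spaces.
partial = line.replace(".", "", 3).split(" ")
print(partial)
# ['After', 'the', 'flood', '', 'all', 'the', 'colors', 'came', 'out.']

# Removing every period and splitting on any run of whitespace instead.
words = line.replace(".", "").split()
print(words)
# ['After', 'the', 'flood', 'all', 'the', 'colors', 'came', 'out']
```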

 [x for x in a.replace('.', '').split(' ') if len(x)>0] 

Here a is your input string.


A simpler way, or at least one that looks simpler:

    import string

    def split_string(source, splitlist):
        table = string.maketrans(splitlist, ' ' * len(splitlist))
        return string.translate(source, table).split()

See the documentation for string.maketrans and string.translate.
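The module-level `string.maketrans` and `string.translate` functions exist only in Python 2. On Python 3 the same idea can be sketched with the `str.maketrans` static method and the `str.translate` method:

```python
def split_string(source, splitlist):
    # Map every delimiter character to a space, then split on runs of whitespace.
    table = str.maketrans(splitlist, " " * len(splitlist))
    return source.translate(table).split()

print(split_string("This is a test-of the,string separation-code!", " ,!-"))
# ['This', 'is', 'a', 'test', 'of', 'the', 'string', 'separation', 'code']
```

One caveat with this approach: `split()` with no arguments always splits on whitespace, so spaces act as delimiters even when they are not in `splitlist`. With "," as the only delimiter, "First Name" would come back as two words.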


Source: https://habr.com/ru/post/916967/