Replace punctuation with a space

I have a problem with the code and cannot figure out how to move forward.

    tweet = "I am tired! I like fruit...and milk"
    clean_words = tweet.translate(None, ",.;@#?!&$")
    words = clean_words.split()
    print tweet
    print words

Output:

 ['I', 'am', 'tired', 'I', 'like', 'fruitand', 'milk'] 

What I would like is to replace punctuation with a space, but I don't know which function to use, or whether I need a loop. Can anybody help me?

+15
5 answers

This is easy to do by building the translation table with maketrans so that every punctuation character maps to a space:

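    import string

    tweet = "I am tired! I like fruit...and milk"
    # Map each punctuation character to a space (Python 3 form)
    translator = str.maketrans(string.punctuation, ' ' * len(string.punctuation))
    print(tweet.translate(translator))

This form works on my machine with Python 3.5.2. On Python 2.x, str.maketrans does not exist, so call string.maketrans with the same two arguments instead. Hope this works for you too.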
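To get from the translated string to the word list the question asks for, the same table can be followed by split(). This is a minimal sketch for Python 3 only; the helper name punctuation_to_words is made up for illustration and is not part of the answer above.

```python
import string

# Hypothetical helper name; Python 3 only (str.maketrans is not available in Python 2).
def punctuation_to_words(text):
    # Map every ASCII punctuation character to a single space.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    # split() with no arguments treats any run of whitespace as one separator,
    # so the extra spaces left by "..." do not produce empty strings.
    return text.translate(table).split()

words = punctuation_to_words("I am tired! I like fruit...and milk")
print(words)
```

Note that string.punctuation also contains the apostrophe, so a word such as "don't" would be split into two tokens with this table.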

+22

Here is a regex-based solution that has been tested in Python 3.5.1. I think it's simple and concise.

    import re

    tweet = "I am tired! I like fruit...and milk"
    clean = re.sub(r"""
                   [,.;@#?!&$]+  # Accept one or more copies of punctuation
                   \ *           # plus zero or more copies of a space,
                   """,
                   " ",          # and replace it with a single space
                   tweet, flags=re.VERBOSE)
    print(tweet + "\n" + clean)

Results:

    I am tired! I like fruit...and milk
    I am tired I like fruit and milk

Compact version:

    import re

    tweet = "I am tired! I like fruit...and milk"
    clean = re.sub(r"[,.;@#?!&$]+\ *", " ", tweet)
    print(tweet + "\n" + clean)

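If the fixed list of characters is too narrow, the character class could be built from string.punctuation instead. The sketch below assumes that every ASCII punctuation character should be treated the same way; the names PUNCT_RUN and replace_punctuation are hypothetical.

```python
import re
import string

# Character class built from every ASCII punctuation character.
# re.escape protects characters such as ], \, ^ and - inside the class.
PUNCT_RUN = re.compile("[" + re.escape(string.punctuation) + "]+ *")

def replace_punctuation(text):  # hypothetical helper name
    # Replace each run of punctuation (plus trailing spaces) with one space,
    # then trim the space left behind by leading or trailing punctuation.
    return PUNCT_RUN.sub(" ", text).strip()

print(replace_punctuation("I am tired! I like fruit...and milk"))
```

Compiling the pattern once is only a convenience when the function is called many times; re.sub with the pattern string behaves the same way.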
+6

There are several ways to solve this problem. I have one that works, but I think it is suboptimal. Hopefully someone who knows regex better will come along and improve this answer or suggest a better one.

Your question is tagged python-3.x, but your code is Python 2.x, so my code is 2.x as well. I have included a commented-out line that works in 3.x.

    #!/usr/bin/env python
    import re

    tweet = "I am tired! I like fruit...and milk"
    # print tweet

    clean_words = tweet.translate(None, ",.;@#?!&$")  # Python 2
    # clean_words = tweet.translate(str.maketrans("", "", ",.;@#?!&$"))  # Python 3
    print(clean_words)  # Does not handle fruit...and (prints "fruitand")

    regex_sub = re.sub(r"[,.;@#?!&$]+", ' ', tweet)  # + means match one or more
    print(regex_sub)  # extra space between tired and I

    regex_sub = re.sub(r"\s+", ' ', regex_sub)  # Replaces any run of whitespace with one space
    print(regex_sub)  # looks good
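The two substitutions above could probably be merged into one by adding whitespace to the character class, so that a run such as "! " is replaced in a single pass. This is a sketch under that assumption, written for Python 3, using the same punctuation set as the question.

```python
import re

tweet = "I am tired! I like fruit...and milk"

# One character class covering whitespace plus the question's punctuation set.
# Any mixed run of these characters collapses to a single space.
pattern = re.compile(r"[\s,.;@#?!&$]+")
clean = pattern.sub(" ", tweet).strip()  # strip() drops a leading or trailing space

print(clean)
print(clean.split())
```

The trade-off is that all whitespace, including tabs and newlines, is normalized to a plain space, which may or may not be wanted.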
+1

I'm not sure I fully understand your requirements, but have you thought of adding another line to your current code:

    >>> a = ['I', 'am', 'tired', 'I', 'like', 'fruitand', 'milk']
    >>> " ".join(a)
    'I am tired I like fruitand milk'

Is this what you are asking for, or do you need something more specific? Best wishes.

-1

If you are using Python 2.x, you can try:

    import string

    tweet = "I am tired! I like fruit...and milk"
    clean_words = tweet.translate(string.maketrans("", ""), string.punctuation)
    print clean_words

For Python 3.x, this works:

    import string

    tweet = "I am tired! I like fruit...and milk"
    transtable = str.maketrans('', '', string.punctuation)
    clean_words = tweet.translate(transtable)
    print(clean_words)

These pieces of code remove all punctuation characters from the string.
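Since the question asks for replacement rather than removal, it may help to see the two forms of str.maketrans side by side. The following Python 3 sketch is only an illustration; the variable names are made up.

```python
import string

tweet = "I am tired! I like fruit...and milk"

# Three-argument form: characters in the third argument are deleted.
delete_table = str.maketrans("", "", string.punctuation)
removed = tweet.translate(delete_table)

# Two-argument form: each punctuation character maps to a space instead.
space_table = str.maketrans(string.punctuation, " " * len(string.punctuation))
replaced = tweet.translate(space_table)

# Collapse the repeated spaces left behind by "..." and "! ".
normalized = " ".join(replaced.split())

print(removed)
print(normalized)
```

With deletion, "fruit...and" becomes the single token "fruitand"; with the space mapping, "fruit" and "and" stay separate words.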

-1

Source: https://habr.com/ru/post/1240867/

