Expression to remove URLs from Twitter tweets

Question


I want to find and remove every Twitter URL in a line of text (a tweet):

Input:

This is a tweet from the URL: http://t.co/0DlGChTBIx

Output:

This is a tweet from the URL:

I tried this:

 p = re.compile(r'\<http.+?\>', re.DOTALL)
 tweet_clean = re.sub(p, '', tweet)

+12

python string regex

hagope Jun 25 '14 at 3:45


5 answers

The following regular expression captures two groups: the first contains everything in the tweet up to the URL, and the second contains everything after the URL (empty in the example you gave above):

 import re

 tweet = 'This is a tweet with a url: http://t.co/0DlGChTBIx'
 # \S* consumes the URL itself; the final greedy group captures what follows it
 clean_tweet = re.match(r'(.*?)http\S*\s?(.*)', tweet)
 if clean_tweet:
     print(clean_tweet.group(1))  # everything before the URL
     print(clean_tweet.group(2))  # everything after the URL

+2

alfasin Jun 25 '14 at 3:59


You can try the following re.sub function to remove the URL link from your string,

 >>> str = 'This is a tweet with a url: http://t.co/0DlGChTBIx'
 >>> m = re.sub(r':.*$', ":", str)
 >>> m
 'This is a tweet with a url:'

The pattern matches from the first : character to the end of the string, and the replacement string ":" puts the colon back at the end.

Alternatively, this returns everything up to and including the first : character:

 >>> m = re.search(r'^.*?:', str).group()
 >>> m
 'This is a tweet with a url:'
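Note that both variants key on the first colon rather than on the URL itself. A short check, using a made-up second tweet and a hypothetical helper name (both assumptions, not from the question), sketches what that means when a colon appears earlier in the text:

```python
import re


def strip_after_first_colon(text):
    # Same substitution as above: drop everything from the first ':' onward,
    # then put the colon back. (Hypothetical helper name, for illustration.)
    return re.sub(r':.*$', ':', text)


# The tweet from the answer: the first colon sits right before the URL.
print(strip_after_first_colon('This is a tweet with a url: http://t.co/0DlGChTBIx'))

# Made-up tweet with an earlier colon: the text after the colon is lost too.
print(strip_after_first_colon('Note: see http://t.co/0DlGChTBIx for details'))
```

So this is only a fit when the URL is known to follow the first colon and nothing after it needs to be kept.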

0

Avinash raj Jun 25 '14 at 4:35


Try using this:

 text = re.sub(r"http\S+", "", text)

0

Garima rawat Jun 14 '18 at 9:43


 clean_tweet = re.match(r'(.*?)http(.*?)\s(.*)', content)

 while clean_tweet:
     content = clean_tweet.group(1) + "" + clean_tweet.group(3)
     clean_tweet = re.match(r'(.*?)http(.*?)\s(.*)', content)

0

nancy agarwal Jun 17 '19 at 13:33


zx81 · Accepted Answer · 2014-06-25T03:51:43+0000

Do this:

 result = re.sub(r"http\S+", "", subject)

http matches the literal characters "http"
\S+ matches one or more non-whitespace characters (the rest of the URL)
the whole match is replaced with the empty string
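As a minimal sketch of this pattern in use: the function name remove_urls and the whitespace clean-up step are assumptions added for illustration and are not part of the original answer.

```python
import re

# Pattern from the answer: literal "http" followed by non-whitespace characters.
URL_PATTERN = re.compile(r"http\S+")


def remove_urls(tweet):
    """Return the tweet with every http/https URL removed (hypothetical helper)."""
    without_urls = URL_PATTERN.sub("", tweet)
    # Assumption: collapse the whitespace left behind and trim both ends.
    return re.sub(r"\s+", " ", without_urls).strip()


print(remove_urls("This is a tweet from the URL: http://t.co/0DlGChTBIx"))
```

Because re.sub replaces every non-overlapping match, a single call handles tweets that contain several links. The pattern also removes any non-URL word that happens to start with "http", so a stricter prefix such as https?:// may be preferable, depending on the data.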
