I would like to remove the URLs from a string and replace them with the titles of the pages they point to.
For instance:
mystring = "Ah I like this site: http://www.stackoverflow.com. Also I must say I like http://www.digg.com"
sanitize(mystring)
To get the title for a given URL, I wrote this snippet:
import urllib
import BeautifulSoup

def get_title(url):
    """Returns the title of the page at the input URL"""
    output = BeautifulSoup.BeautifulSoup(urllib.urlopen(url))
    return output.title.string
I somehow need to apply this function to a string so that it finds each URL and replaces it with the title returned by get_title.
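One way this could work, as a rough sketch rather than a definitive solution: use re.sub with a callback, so every matched URL is passed to a title lookup and replaced by the result. The URL pattern below is a deliberately simplified assumption (anything starting with http:// or https:// up to the next whitespace), and passing the lookup in as a get_title parameter is also an assumption, made so the example runs without network access. The fake_titles dictionary is a hypothetical stand-in for the real get_title.

```python
import re

# Assumption: a simplified URL pattern, "http://" or "https://" followed by
# any run of non-whitespace characters. Real-world URLs may need more care.
URL_RE = re.compile(r"https?://\S+")


def sanitize(text, get_title):
    """Replace each URL in text with the title returned by get_title(url)."""

    def replace(match):
        raw = match.group(0)
        # Sentence punctuation right after a URL (e.g. "...com.") is not part
        # of the URL, so split it off and re-append it after the title.
        url = raw.rstrip(".,;:!?)")
        trailing = raw[len(url):]
        try:
            title = get_title(url)
        except Exception:
            # If the lookup fails, leave the original text untouched.
            return raw
        if not title:
            # No title found: keep the URL as it was.
            return raw
        return title + trailing

    return URL_RE.sub(replace, text)


# Hypothetical stand-in for get_title so the example needs no network access.
fake_titles = {
    "http://www.stackoverflow.com": "Stack Overflow",
    "http://www.digg.com": "Digg",
}

mystring = ("Ah I like this site: http://www.stackoverflow.com. "
            "Also I must say I like http://www.digg.com")
print(sanitize(mystring, fake_titles.get))
```

In real use, the get_title function from the question would be passed instead of fake_titles.get. Keeping the URL when the lookup fails or returns nothing is a design choice made here so that one unreachable page does not break the whole string.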