I need to extract the src attribute from all image tags in an HTML document.
So the input is an HTML page, and the output is a list of URLs pointing to images, e.g. http://www.google.com/intl/en_ALL/images/logo.gif
The following is what I came up with:
<img\s+src=""(http:
This does not work for tags where src is not the first attribute after img, for example:
<img height="1px" src="spacer.gif">
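To illustrate the kind of pattern I am after, here is a minimal sketch in Python. The language and the names IMG_SRC and extract_img_srcs are assumptions made up for this example, not part of my actual code. It lets other attributes appear between img and src, but it probably does not cover every case (unquoted values, odd markup), and an HTML parser may well be more robust:

```python
import re

# Hypothetical pattern: skip over any other attributes inside the tag,
# then require whitespace before "src" so that e.g. "data-src" is not matched.
IMG_SRC = re.compile(
    r'<img\b[^>]*?\ssrc\s*=\s*["\']([^"\']*)["\']',
    re.IGNORECASE,
)

def extract_img_srcs(html):
    """Return the quoted src value of every <img> tag found in html."""
    return IMG_SRC.findall(html)

sample = (
    '<img src="http://www.google.com/intl/en_ALL/images/logo.gif">'
    '<img height="1px" src="spacer.gif">'
)
print(extract_img_srcs(sample))
```
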
Can someone help fill out this regex? It is probably pretty easy, but I thought asking here might be a faster way to get an answer.
James