Regex to extract image URLs?

Question

I need to extract a bunch of image URLs from a document in which each image is assigned to a name, like this:

bellpepper = "http://images.com/bellpepper.jpg"
cabbage = "http://images.com/cabbage.jpg"
lettuce = "http://images.com/lettuce.jpg"
pumpkin = "http://images.com/pumpkin.jpg"

I assume that I can detect the start of the link with:

/http:[^ ,]+/i

But how can I pull all of the links out of the document?

EDIT: To clarify the question: I just want to pull the URLs out of the file, minus the variable name, the equals sign and the double quotes, so that I end up with a new file that is just a list of URLs, one per line.

url regex parsing image

boysenberry Jul 17 '09 at 12:08

4 answers

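If the format is constant, then this should work (Python):

import re
s = """bellpepper = "http://images.com/bellpepper.jpg" (...) """
# Capture whatever sits between a quote followed by http:// and the next quote.
urls = re.findall(r'"(http://.+?)"', s)
print("\n".join(urls))  # one URL per line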
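For the full task in the question (a list of URLs, one per line), a minimal sketch along the same lines might look like this. The sample text is copied from the question; the names `URL_IN_QUOTES` and `extract_urls` are made up for illustration, and the sketch assumes every URL starts with http:// and is wrapped in double quotes.

```python
import re

# Sample input copied from the question.
document = '''bellpepper = "http://images.com/bellpepper.jpg"
cabbage = "http://images.com/cabbage.jpg"
lettuce = "http://images.com/lettuce.jpg"
pumpkin = "http://images.com/pumpkin.jpg"
'''

# Non-greedy capture: everything between a quote followed by http://
# and the next double quote.
URL_IN_QUOTES = re.compile(r'"(http://.+?)"')


def extract_urls(text):
    """Return every double-quoted http:// URL found in text, in order."""
    return URL_IN_QUOTES.findall(text)


urls = extract_urls(document)

# One URL per line; this string could be written to a new file.
output = "\n".join(urls)
print(output)
```

Because the pattern has exactly one capture group, `findall` returns the captured URLs directly rather than the full matches with their quotes.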

Wojciech Bederski Jul 17 '09 at 0:18

Do you mean that your document has this format and you just want the http part? You can simply split each line on the " = " delimiter, no regex needed:

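// Read the file line by line and print only the URL part of each line.
$f = fopen("file", "r");
if ($f) {
    while (($line = fgets($f, 4096)) !== false) {
        // Split 'name = "url"' into the name and the quoted URL.
        $s = explode(" = ", trim($line), 2);
        if (count($s) < 2) {
            continue; // skip blank or malformed lines
        }
        // Strip the double quotes and print one URL per line.
        print str_replace('"', '', $s[1]) . "\n";
    }
    fclose($f);
}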

On the command line:

#php5 myscript.php > newfile.ext

If you use a language other than PHP, there is a similar string-splitting function you can use, e.g. split() in Python or Perl. Check your language's documentation for the details.
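As a rough Python equivalent of the split approach, something like the following might do. The helper name `url_from_line` is hypothetical, and the sketch assumes each line looks exactly like `name = "url"` with single spaces around the equals sign.

```python
def url_from_line(line):
    """Return the URL from a line like: name = "http://...", else None."""
    # partition splits on the first " = " only; sep is empty if it is absent.
    _name, sep, value = line.strip().partition(" = ")
    if not sep:
        return None
    # Drop the surrounding double quotes.
    return value.strip('"')


# Two sample lines from the question plus a blank line.
lines = [
    'bellpepper = "http://images.com/bellpepper.jpg"',
    'cabbage = "http://images.com/cabbage.jpg"',
    '',
]

urls = [u for u in map(url_from_line, lines) if u is not None]
print("\n".join(urls))
```

Using `partition` instead of `split` means a line without the delimiter is skipped cleanly rather than raising an IndexError.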

ghostdog74 Jul 17 '09 at 12:29

You can try this if your tool supports a positive lookbehind:

/(?<=")[^"\n]+/
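Python's `re` module supports this kind of fixed-width lookbehind, so a quick way to try the pattern is sketched below. One thing to watch for: the pattern matches anything that follows a double quote, so text after the closing quote on the same line is also picked up. The line with the trailing semicolon and the stricter `AFTER_QUOTE_HTTP` variant are assumptions added for illustration, not part of the question's data.

```python
import re

# The match must be preceded by a double quote and runs until the next
# double quote or the end of the line.
AFTER_QUOTE = re.compile(r'(?<=")[^"\n]+')

clean = (
    'bellpepper = "http://images.com/bellpepper.jpg"\n'
    'cabbage = "http://images.com/cabbage.jpg"\n'
)
clean_matches = AFTER_QUOTE.findall(clean)

# Hypothetical line with trailing text after the closing quote.
trailing = 'lettuce = "http://images.com/lettuce.jpg";'
trailing_matches = AFTER_QUOTE.findall(trailing)

# Stricter variant (an assumption): also require the match to start with http://
AFTER_QUOTE_HTTP = re.compile(r'(?<=")http://[^"\n]+')
strict_matches = AFTER_QUOTE_HTTP.findall(trailing)

print(clean_matches)
print(trailing_matches)
print(strict_matches)
```

On the question's data the plain pattern is enough; the stricter variant only matters if the file can contain other text after a quote.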

Nakilon Nov 25 '12 at 11:14

user110714 · Accepted Answer · 2009-07-17T00:48:16+0000

Try this:

(http://)([a-zA-Z0-9\/\\.])*
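To see what this pattern does in practice, here is a small Python sketch. Since the pattern contains capture groups, it uses `finditer` and `group(0)` to get the whole match rather than the groups. Note that the character class only allows letters, digits, slashes, backslashes and dots, so a URL containing other characters is cut short; the hyphenated URL below is a made-up example to show that.

```python
import re

# The pattern from the answer, written as a Python raw string.
URL = re.compile(r'(http://)([a-zA-Z0-9\/\\.])*')


def find_urls(text):
    """Return the full text of every match (group 0), in order."""
    return [m.group(0) for m in URL.finditer(text)]


# Sample lines from the question.
sample = (
    'bellpepper = "http://images.com/bellpepper.jpg"\n'
    'cabbage = "http://images.com/cabbage.jpg"\n'
)
urls = find_urls(sample)

# Hypothetical URL with a hyphen, which is not in the character class.
truncated = find_urls('x = "http://images.com/bell-pepper.jpg"')

print("\n".join(urls))
print(truncated)
```

If the real file has hyphens, underscores or query strings in its URLs, the class would need to be widened, or one of the quote-based approaches used instead.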
