I have a script that I consider a fairly simple scraper (call it what you want), but it takes at least 6 seconds on average. Is it possible to speed it up? The $date variables are only there to time the code and add nothing significant to the running time. I set intermediate timestamps, and each of the two segments takes approximately 3 seconds. An example URL for testing is given below.
    $date = date('m/d/Y h:i:s a', time());
    echo "start of timing $date<br /><br />";

    include('simple_html_dom.php');

    function getUrlAddress()
    {
        $url = $_SERVER['HTTPS'] == 'on' ? 'https' : 'http';
        return $url .'://'.$_SERVER['HTTP_HOST'].$_SERVER['REQUEST_URI'];
    }

    $date = date('m/d/Y h:i:s a', time());
    echo "<br /><br />after geturl $date<br /><br />";

    $parts = explode("/",$url);
    $html = file_get_html($url);

    $date = date('m/d/Y h:i:s a', time());
    echo "<br /><br />after file_get_url $date<br /><br />";

    $file_string = file_get_contents($url);
    preg_match('/<title>(.*)<\/title>/i', $file_string, $title);
    $title_out = $title[1];

    foreach($html->find('img') as $e){
        $image = $e->src;

        if (preg_match("/orangeBlue/", $image)) { $image = ''; }
        if (preg_match("/BeaconSprite/", $image)) { $image = ''; }

        if($image != ''){
            if (preg_match("/http/", $image)) {
                $image = $image;
            } elseif (preg_match("*//*", $image)) {
                $image = 'http:'.$image;
            } else {
                $image = $parts['0']."//".$parts[1].$parts[2]."/".$image;
            }

            $size = getimagesize($image);
            if (($size[0]>110)&&($size[1]>110)){
                if (preg_match("/http/", $image)) { $image = $image; }
                echo '<img src='.$image.'><br>';
            }
        }
    }

    $date = date('m/d/Y h:i:s a', time());
    echo "<br /><br />end of timing $date<br /><br />";
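The three-way branch inside the loop (absolute URL, protocol-relative URL, everything else) could be pulled into a small helper so it is easier to read and test. The following is only a sketch, assuming PHP 7 or later: `resolveImageUrl` is a hypothetical name that does not appear in the script above, and like the original it treats every non-absolute path as relative to the site root.

```php
<?php
// Hypothetical helper (not part of the original script): turn an <img> src
// into an absolute URL, using the page URL it was found on.
function resolveImageUrl(string $src, string $pageUrl): string
{
    // Already absolute (http:// or https://): leave untouched.
    if (preg_match('#^https?://#i', $src)) {
        return $src;
    }

    $page   = parse_url($pageUrl);
    $scheme = $page['scheme'] ?? 'http';

    // Protocol-relative ("//host/path"): borrow the page's scheme.
    if (strpos($src, '//') === 0) {
        return $scheme . ':' . $src;
    }

    // Anything else is treated as relative to the site root, which is the
    // same simplification the original script makes.
    return $scheme . '://' . ($page['host'] ?? '') . '/' . ltrim($src, '/');
}
```

Anchoring the first pattern with `^` means a relative path that merely contains the text "http" is no longer mistaken for an absolute URL, which the unanchored `/http/` test would do.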
Example URL:
http://www.ebay.co.uk/itm/Duke-Nukem-Forever-XBOX-360-Game-BRAND-NEW-SEALED-UK-PAL-UK-Seller-/170739972246?pt=UK_PC_Video_Games_Video_Games_JS&hash=item27c0e53896
UPDATE
This is what the timestamps actually show:
start of timing 01/24/2012 12:31:50 am
after geturl 01/24/2012 12:31:50 am
after file_get_url 01/24/2012 12:31:53 am
end of timing 01/24/2012 12:31:57 am
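The timestamps above only have one-second resolution, so it is hard to tell how the time splits between the page download and the per-image work. A finer-grained measurement could be taken with `microtime(true)`. This is a minimal sketch, assuming PHP 7.1 or later: `timeStep` is a hypothetical helper name, and the `usleep` call is just a stand-in for a real step such as the page fetch or the image loop.

```php
<?php
// Hypothetical helper: run a callable and report its wall-clock duration.
// Returns the callable's result together with the elapsed seconds.
function timeStep(string $label, callable $step): array
{
    $start   = microtime(true);           // float seconds, sub-second precision
    $result  = $step();
    $elapsed = microtime(true) - $start;

    printf("%s: %.3f s<br />\n", $label, $elapsed);

    return [$result, $elapsed];
}

// Stand-in workload; in the script this would wrap one of the real steps.
[$value, $seconds] = timeStep('sleep 50 ms', function () {
    usleep(50000);
    return 'done';
});
```

Wrapping each step separately (the page fetch, the title extraction, each `getimagesize` call) would show which of them accounts for most of the 6 to 7 seconds.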