I am writing a spider that will crawl various sites and data-mine them.
Since I need to fetch each page separately, this could take a VERY long time (maybe 100 pages). I have already used set_time_limit to allow 2 minutes per page, but it seems that Apache kills the script after 5 minutes no matter what.
This is usually not a problem, since the spider will run from cron or something similar that does not have this time limit. However, I would also like administrators to be able to start a fetch manually through an HTTP interface.
Apache does not need to stay alive for the full duration: I am going to use AJAX to trigger the fetch, and then check back once in a while, also using AJAX.
My problem is how to start the fetch from within a PHP script without the fetch being terminated when the script that started it ends.
Maybe I could use system('script.php &'), but I'm not sure that will do the trick. Any other ideas?
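To make the system() idea concrete, here is a rough sketch of what I have in mind. The names fetch.php, /tmp/fetch.log, buildBackgroundCommand and startBackgroundFetch are placeholders I made up for illustration, and it assumes a Unix-like host with the php CLI on the PATH. As far as I understand, the output has to be redirected, otherwise PHP waits for the child process to finish. I have not verified that Apache leaves the child alone once the request ends.

```php
<?php
// Sketch only: start a long-running fetch in the background so the
// HTTP request can return immediately. All names are hypothetical.

// Build the shell command: nohup ignores hangups, the redirection
// detaches the child's output, '&' backgrounds it, and 'echo $!'
// prints the PID of the background process.
function buildBackgroundCommand(string $script, string $logFile): string
{
    return sprintf(
        'nohup php %s > %s 2>&1 & echo $!',
        escapeshellarg($script),
        escapeshellarg($logFile)
    );
}

// Run the command and return the child's PID (0 if none was reported).
function startBackgroundFetch(string $script, string $logFile): int
{
    $output = [];
    exec(buildBackgroundCommand($script, $logFile), $output);
    return (int) ($output[0] ?? 0);
}

// Usage from the admin page (assumed file names):
// $pid = startBackgroundFetch('fetch.php', '/tmp/fetch.log');
// echo json_encode(['pid' => $pid]);
```

The AJAX status check could then look at the log file or at some progress record written by the fetch, rather than keeping the original request open.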