If the server returns the directory listing as a well-formed XHTML document full of links, you can use DOMDocument and code like the following to get the list of files:
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false; // property names are case-sensitive: "WhiteSpace"
$doc->load('directorylisting.html');
$files = $doc->getElementsByTagName('a');
$files is now a DOMNodeList of DOMElement objects, which you can iterate over, reading each element's href attribute to get the full path to the files in the list.
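As a minimal sketch of that iteration: the listing markup below is a made-up example inlined with loadXML() so the snippet runs on its own, and the $paths variable is only an illustrative name.

```php
<?php
// Hypothetical XHTML directory listing, inlined so the sketch is self-contained.
$xhtml = <<<'XML'
<html>
  <body>
    <ul>
      <li><a href="/pub/docs/">docs</a></li>
      <li><a href="/pub/readme.txt">readme.txt</a></li>
    </ul>
  </body>
</html>
XML;

$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->loadXML($xhtml);

$paths = [];
foreach ($doc->getElementsByTagName('a') as $link) {
    // Each item is a DOMElement; getAttribute() returns '' when href is absent.
    $href = $link->getAttribute('href');
    if ($href !== '') {
        $paths[] = $href;
    }
}

print_r($paths);
```

With a real listing you would call load() on the file or URL instead of loadXML() on a string.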
Note that this approach requires a well-formed directory list returned from the server. You cannot, for example, execute a request on stackoverflow.com and get a list of files in a directory.
If this does not work (for example because the HTML is malformed), you can fall back to regular expressions (e.g. preg_match_all) to find the <a> tags, for example:
preg_match_all('@<a href\="([a-zA-Z\.\-\_\/ ]*)">(.*)</a>@', file_get_contents('http://www.ibiblio.org/pub/'), $files);
var_dump($files);
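$files will still contain the matched elements, just as an array of arrays: $files[0] holds the full matches, $files[1] the href values, and $files[2] the link text.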
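A rough sketch of how that result could be used, assuming a small hard-coded fragment in place of the fetched page (the $html and $links names are illustrative only). Note that the character class does not include digits, so hrefs containing numbers would not match:

```php
<?php
// Hypothetical listing fragment standing in for the downloaded page.
$html = '<li><a href="docs/">docs</a></li>' . "\n"
      . '<li><a href="readme.txt">readme.txt</a></li>';

// Same pattern as above; matches are grouped per capture group by default.
preg_match_all('@<a href\="([a-zA-Z\.\-\_\/ ]*)">(.*)</a>@', $html, $files);

// Pair each href ($files[1]) with its link text ($files[2]).
$links = array_combine($files[1], $files[2]);

print_r($links);
```

Because (.*) is greedy, this relies on each link sitting on its own line, as in the fragment here.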
UPDATE: I tested your URL (http://www.ibiblio.org/pub/) and it works fine with the preg_match_all method.