If you must have the exact date/time displayed on the page, you will need a scraping solution. Start by scraping the index page (where your second snippet came from), then extract the URLs and timestamps and download each file under a new file name.
Also, you cannot reliably parse arbitrary HTML with regular expressions. If you know that this HTML is fairly static in the way it is generated, you might get away with it. But be prepared for things to break if the court changes its page even slightly.
If you do not need this level of accuracy, you can use the HTTP response headers that come with the file when it is downloaded. From these you can get the Last-Modified date reported by the server, that is, the last date/time the file was modified on the server itself. This is not necessarily the date you see on the web page, but rather when the file was put there (so if there were a 2-hour lag between production and publication, you would see that difference).
R: is my RAM disk, which I use for temp files. Adjust the paths as needed.
$client = New-Object System.Net.WebClient
$client.DownloadFile("http://app1.co.madison.il.us/circuitclerk/dockets/63/489641.TXT", "r:\tempfile.txt")
# ResponseHeaders is populated once the download has completed
$updated = Get-Date $client.ResponseHeaders["Last-Modified"] -Format "yyyyMMdd"
# -NewName takes only the new file name, not a full path
Rename-Item -Path "r:\tempfile.txt" -NewName "July-Updated$updated.txt"
If you are using PowerShell 3.0 or later, you can use Invoke-WebRequest to fetch the file into memory and then write it directly to disk under the appropriate name, since Invoke-WebRequest returns an object containing both the response data and the headers, which you can then process as needed.
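A minimal sketch of that Invoke-WebRequest approach, reusing the docket URL and the R: drive from the WebClient example. It assumes the server sends a Last-Modified header and serves the file as text; the variable names are only illustrative.

```powershell
# Assumption: same docket URL as the WebClient example; R: is a RAM disk, adjust as needed.
$url = "http://app1.co.madison.il.us/circuitclerk/dockets/63/489641.TXT"

# -UseBasicParsing avoids the Internet Explorer DOM dependency in Windows PowerShell.
$response = Invoke-WebRequest -Uri $url -UseBasicParsing

# The cast copes with header values that come back as a string or as a string array.
[string]$lastModified = $response.Headers["Last-Modified"]

if ($lastModified) {
    # Get-Date converts the GMT header value to local time before formatting.
    $updated = Get-Date $lastModified -Format "yyyyMMdd"
    # Write the body straight to its final name; no temp file or rename step.
    Set-Content -Path "r:\July-Updated$updated.txt" -Value $response.Content
} else {
    Write-Warning "No Last-Modified header returned for $url"
}
```

Because the header is checked before anything is written, a response without Last-Modified produces a warning rather than a file with a blank date in its name.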
Another option would be to contact the court and ask whether they offer a more machine-friendly way to access the data, such as an RSS or XML feed or some other gateway designed for what you are trying to do.
alroc