I have a Python program that parses files. It takes a path as an argument and parses all the files in that path and in all subdirectories, using os.walk(path). I want to call it from my PHP web application, so the user can specify a path, which is then passed as an argument to the parser. (Passing a user-supplied path through is fine, because it is all on the internal network.)
I can call the parser and pass the arguments fine with popen(), but the path that the Python program gets is always invalid. I had the PHP script output the command it sends to the browser. If I copy and paste this command into a command window, the parser works fine.
I know that the path passed from the PHP script is invalid from the result of os.path.exists(path) in the Python script.
This is the code to invoke the Python program:
$path = $_REQUEST['location'];
echo "Path given is: ".$path;
$command = 'python C:\Workspaces\parsers\src\main\main.py '. intval($mode).' "'.$path.'"';
echo "<p>".$command."</p>";
$parser = popen($command, 'r');
if ($parser){
    echo "<p>Ran the program</p>";
    while (!feof($parser)){
        $read = fgets($parser);
        if (!$read)
            echo "<p>Reached end of file</p>";
        else
            echo "<p>".$read."</p>";
    }
}
The command displayed in the browser is as follows:
python C:\Workspaces\parsers\src\main\main.py 2 "I:\Dir1\Dir2\Dir3"
Here 2 is just another script argument, and $_REQUEST['location'] is taken from a text input field of the form on the calling page.
This is on a Windows system, so I assume this has something to do with the backslashes of the path.
Basically, I'm not sure how all the backslashes are handled. I would like to understand how strings containing backslashes are sent to the PHP page, and how they are sent on again using popen(). I think the string printed in the browser is not the actual command line, and I can't be sure how many backslashes are really in the command issued by popen().
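A minimal sketch of how the backslash question could be checked from the Python side, assuming Python 3 and a hypothetical helper named describe_argument (not part of the original parser). It reports the exact characters each argument contains once it reaches the script, so a doubled or missing backslash would show up in the count and in the repr() output rather than being hidden by how the browser renders the string.

```python
import os
import sys


def describe_argument(arg):
    """Describe exactly what the script received for one argument.

    Hypothetical diagnostic helper, not part of the original parser.
    """
    return {
        "repr": repr(arg),                # unambiguous view of every character
        "length": len(arg),               # total number of characters
        "backslashes": arg.count("\\"),   # number of literal backslashes
        "exists": os.path.exists(arg),    # same check the parser performs
    }


if __name__ == "__main__":
    # Report on every argument passed on the command line.
    for index, arg in enumerate(sys.argv[1:], start=1):
        print(index, describe_argument(arg))
```

Running this once through popen() and once from the command window, then comparing the two reports, would show whether the path text itself differs between the two invocations.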
If anyone has any ideas, I would really appreciate it.
Edit:
So, in the Python program, the path is used as follows:
nfiles=0
print 'Parsing all files in directory tree '+path+"<br />"
start = time.time()
if not os.path.exists(path):
    print "<p>Path is NOT REAL!!!</p>"
else:
    print "<p>Path IS real!</p>"
for root, dirs, files in os.walk(path):
    for f in files:
        file = os.path.join(root,f)
        print file
        nfiles+=1
        ...Code to run parser...
print nfiles, "Files parsed<br />"
This output is echoed to the browser via the $read variable.
Output:
Parsing all files in directory tree I:\Dir1\Dir2\Dir3
Path is NOT REAL!!!
0 Files parsed
This is identical to the output when the command is run from the command line (the command is copied from the browser and pasted into the cmd window), EXCEPT that when it is run that way, the path is real and all files are parsed. (The HTML markup is also displayed in the command window.)
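Since the same command behaves differently in the two settings, one thing that could be compared is the environment each run sees. The following is only a sketch, assuming Python 3 and a hypothetical helper named describe_environment (not part of the original parser). It reports the user the process runs as, its working directory, and whether the drive portion of the path is visible at all. It uses ntpath so that the Windows-style path is split the same way on any platform.

```python
import getpass
import ntpath
import os


def describe_environment(path):
    """Summarise the context in which the script sees a Windows-style path.

    Hypothetical diagnostic helper, not part of the original parser.
    """
    try:
        user = getpass.getuser()
    except Exception:
        # The user name cannot always be determined; do not fail the report.
        user = "unknown"

    # Split "I:\Dir1\Dir2\Dir3" into the drive "I:" and the remainder.
    drive, _rest = ntpath.splitdrive(path)

    return {
        "user": user,
        "cwd": os.getcwd(),
        "drive": drive,
        # True only if the drive root itself is visible to this process.
        "drive_exists": bool(drive) and os.path.exists(drive + "\\"),
        "path_exists": os.path.exists(path),
    }


if __name__ == "__main__":
    print(describe_environment("I:\\Dir1\\Dir2\\Dir3"))
```

If the report differs between the web server run and the command window run, the difference lies in the environment rather than in the backslashes.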
The web server and parsers are hosted on my local machine.