I am trying to use a Python script to edit a large directory of .html files in a loop. I'm having problems traversing the files with os.walk(). This piece of code simply reads the HTML files into strings that I can work with, but the script never enters the loop, as if the files did not exist. It prints point1 but never reaches point2, and it ends without an error message. The directory is a folder called "amazon", and inside it are 20 subfolders with 20 HTML files in each of them.
Oddly enough, the code works fine on a neighboring directory that contains only .txt files, but for some reason it does not pick up my .html files. Is there something I don't understand about the structure of the loop for root, dirs, filenames in os.walk()? This is my first time using os.walk, and I looked at a number of other pages on this site to try to get it working.
import os
rootdir = 'C:\filepath\amazon'
print "point1"
for root, dirs, filenames in os.walk(rootdir):
    print "point2"
    for file in filenames:
        with open(os.path.join(root, file), 'r') as myfile:
            g = myfile.read()
            print g
Any help is greatly appreciated.
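One possible cause worth checking, offered as a guess rather than a confirmed diagnosis: in a normal Python string literal, `\f` and `\a` are escape sequences (form feed and bell), so `'C:\filepath\amazon'` may not be the path it appears to be. By default, os.walk() silently ignores errors from listing a directory, so a path that does not exist yields nothing and the loop body never runs. That would also fit the .txt directory working, if its path happens not to contain such sequences. The sketch below compares the literal with a raw string and adds an explicit existence check; `read_all_files` is a hypothetical helper name, and the code uses `print()` so that it runs on Python 3.

```python
import os

# The literal from the question: '\f' becomes a form feed, '\a' a bell character.
plain = 'C:\filepath\amazon'
# A raw string (r'...') keeps the backslashes exactly as typed.
raw = r'C:\filepath\amazon'

print(repr(plain))  # 'C:\x0cilepath\x07mazon'
print(repr(raw))    # 'C:\\filepath\\amazon'


def read_all_files(rootdir):
    """Return a dict mapping each file path under rootdir to its contents.

    Hypothetical helper for illustration only.
    """
    # os.walk() yields nothing for a missing directory, so fail loudly instead.
    if not os.path.isdir(rootdir):
        raise ValueError('not a directory: %r' % rootdir)
    contents = {}
    for root, dirs, filenames in os.walk(rootdir):
        for name in filenames:
            path = os.path.join(root, name)
            with open(path, 'r') as myfile:
                contents[path] = myfile.read()
    return contents
```

If this is the problem, writing the path as `r'C:\filepath\amazon'`, with doubled backslashes, or with forward slashes should let the loop reach point2.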