Pythonic way to handle multiple loops with different filters in one list?

Here is a small program I wrote that creates a CSV by classifying the files in a directory:

    import fnmatch
    import os
    import re

    matches = []
    for root, dirnames, filenames in os.walk(directory):
        # Flag file names that contain a capital letter
        for filename in fnmatch.filter(filenames, '*[A-Z]*'):
            matches.append([os.path.join(root, filename), "No Capital Letters!"])

        # Flag Python and PHP files
        test = re.compile(r".*\.(py|php)", re.IGNORECASE)
        for filename in filter(test.search, filenames):
            matches.append([os.path.join(root, filename), "Invalid File type!"])

The idea is that the user selects a folder and the program reports the problem files, which can be of several kinds (only two are shown here: no file names with capital letters, and no PHP or Python files). There will probably be five or six cases in total.

While this works, I would like to refactor it. Is it possible to do something like

    for filename in itertools.izip(fnmatch.filter(filenames, '*[A-Z]*'), filter(test.search, filenames), ...):
        matches.append([os.path.join(root, filename), "Violation"])

while still being able to track which of the original lists caused the "violation"?

2 answers

A simpler solution is probably to iterate over the files first and then apply your checks to each file name in turn:

    # Assumes os, fnmatch and re are imported, as the question's code requires
    matches = []
    reTest = re.compile(r".*\.(py|php)", re.IGNORECASE)
    for root, dirnames, filenames in os.walk(directory):
        for filename in filenames:
            error = None
            if fnmatch.fnmatch(filename, '*[A-Z]*'):
                error = 'No capital letters!'
            elif reTest.search(filename):
                error = 'Invalid file type!'

            # Record at most one error per file
            if error:
                matches.append([os.path.join(root, filename), error])

This not only simplifies the logic, since each check only has to look at a single file name (instead of every check having to work out how to apply itself to a sequence of file names), it also iterates through the list of file names only once.

It also avoids generating multiple matches for a single file name: at most one error (the first) is recorded. If you do not want this, you can make error a list and append to it in each check; in that case, change elif to if so that all checks are evaluated.
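A minimal sketch of that "report every violation" variant, assuming Python 3. The checks live in a table of (predicate, message) pairs, so every check runs for every file and each recorded match still says which check failed. The names `CHECKS` and `find_violations` are made up for illustration, and `fnmatch.fnmatchcase` is swapped in for `fnmatch.fnmatch` so the capital-letter test stays case-sensitive on every platform.

```python
import fnmatch
import os
import re

# Same pattern as in the answer above, written as a raw string.
INVALID_TYPE = re.compile(r".*\.(py|php)", re.IGNORECASE)

# Hypothetical table of checks: each entry is (predicate, message).
# Supporting a fifth or sixth case means adding one more entry here.
CHECKS = [
    (lambda name: fnmatch.fnmatchcase(name, "*[A-Z]*"), "No capital letters!"),
    (lambda name: INVALID_TYPE.search(name) is not None, "Invalid file type!"),
]


def find_violations(directory):
    """Return [path, message] pairs, one per failed check per file."""
    matches = []
    for root, dirnames, filenames in os.walk(directory):
        for filename in filenames:
            # No elif here: every check is evaluated for every file.
            for is_violation, message in CHECKS:
                if is_violation(filename):
                    matches.append([os.path.join(root, filename), message])
    return matches
```

Calling `find_violations(directory)` walks the tree once; a file that fails two checks shows up twice, once per message.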


I recommend you take a look at these slides.

David Beazley gives an example of using yield to process log files.

Edit: here are two examples from the PDF, one without a generator:

    wwwlog = open("access-log")
    total = 0
    for line in wwwlog:
        bytestr = line.rsplit(None,1)[1]
        if bytestr != '-':
            total += int(bytestr)
    print "Total", total

and one with generator expressions (for more complex cases, a function with yield can be used):

    wwwlog = open("access-log")
    bytecolumn = (line.rsplit(None,1)[1] for line in wwwlog)
    bytes = (int(x) for x in bytecolumn if x != '-')
    print "Total", sum(bytes)
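
When a generator expression gets too cramped, the same pipeline stage can be written as a generator function with `yield`. The following is a minimal sketch in Python 3, not code from the slides; the function name `byte_counts` and the sample log lines are invented for illustration.

```python
def byte_counts(lines):
    """Yield the last column of each log line as an int, skipping '-' entries."""
    for line in lines:
        bytestr = line.rsplit(None, 1)[1]
        if bytestr != "-":
            yield int(bytestr)


# Made-up sample lines standing in for the contents of "access-log".
sample_log = [
    'host1 - - "GET /index.html" 200 7587',
    'host2 - - "GET /missing" 404 -',
    'host3 - - "GET /logo.png" 200 413',
]

total = sum(byte_counts(sample_log))
print("Total", total)
```

Because `byte_counts` accepts any iterable of lines, an open file object can be passed in place of the list, and lines are still processed one at a time.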

Source: https://habr.com/ru/post/988041/

