Is it possible (at least theoretically) to read in the Huffman coding of a ZIP file and then translate the regular expression into the corresponding Huffman code, so that the regex runs directly on the compressed data? Could this be more efficient than decompressing the data first and then running the regex?
(Note: I know it would not be that simple: you would also have to deal with other aspects of ZIP encoding, such as the file layout, block structures and back-references, but one can imagine that it might be fairly doable.)
EDIT: Also note that it is probably much wiser to use a zipfile solution.
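
For comparison, here is a minimal sketch of the decompress-first baseline, assuming the "zipfile solution" means Python's standard `zipfile` module. The function name `grep_zip` and the sample archive contents are hypothetical and only there for illustration.

```python
import io
import re
import zipfile


def grep_zip(zip_source, pattern):
    """Yield (member_name, matched_bytes) for each regex match in each member.

    Hypothetical helper: every member is fully decompressed first,
    then a bytes regex is run over the decompressed data.
    """
    regex = re.compile(pattern)
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            data = archive.read(name)  # inflates the whole member into memory
            for match in regex.finditer(data):
                yield name, match.group(0)


# Build a small DEFLATE-compressed archive in memory (made-up sample data).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("a.txt", "error 42\nok\nerror 7\n")
    archive.writestr("b.txt", "nothing here\n")
buffer.seek(0)

results = list(grep_zip(buffer, rb"error \d+"))
for name, text in results:
    print(name, text)
```

This is the approach any matcher working on the compressed stream would have to beat: its cost is one full decompression per member plus an ordinary regex scan.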