How about this:
Concept: since you are essentially reading a single large file, index the .7z by block.
Read the compressed file block by block, giving each block a number and, if useful, an offset within the large file. While scanning, look for occurrences of the target key in the data stream (for example, article titles in a Wikipedia dump). For each key found, record the number of the block in which that record starts (it may have begun in the previous block).
Write the index to some kind of O(log n) store. To look an item up, fetch the block number and offset from the index, decompress that block, and search for the item within it. The cost is decompressing one block (or very few) plus a string search inside that block.
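As a minimal sketch of the idea: the code below assumes the data has been (re)compressed as independently decompressible LZMA blocks, since .7z solid blocks are not trivially seekable, and uses a sorted in-memory list with `bisect` as a stand-in for a real O(log n) store. The `<title>` key format and `BLOCK_SIZE` are illustrative assumptions, not part of any real tool.

```python
import bisect
import lzma

BLOCK_SIZE = 64 * 1024  # uncompressed bytes per block (arbitrary choice)

def build_blocks(text_lines):
    """Compress the stream as independent LZMA blocks and record, for each
    key line (here: a line starting with '<title>'), the block it starts in."""
    blocks, index = [], []          # index: sorted list of (key, block_no)
    buf, size, block_no = [], 0, 0
    for line in text_lines:
        if line.startswith("<title>"):
            # the record may spill into the next block; remember where it starts
            index.append((line, block_no))
        buf.append(line)
        size += len(line)
        if size >= BLOCK_SIZE:
            blocks.append(lzma.compress("".join(buf).encode()))
            buf, size = [], 0
            block_no += 1
    if buf:
        blocks.append(lzma.compress("".join(buf).encode()))
    index.sort()
    return blocks, index

def lookup(blocks, index, key):
    """Binary-search the index, then decompress one block (plus the next,
    in case the record spills over) and scan it for the key."""
    i = bisect.bisect_left(index, (key, 0))
    if i == len(index) or index[i][0] != key:
        return None
    block_no = index[i][1]
    data = lzma.decompress(blocks[block_no]).decode()
    if block_no + 1 < len(blocks):
        data += lzma.decompress(blocks[block_no + 1]).decode()
    start = data.find(key)
    return data[start:start + 200]  # a slice starting at the record
```

A real version would persist `index` (e.g. in a B-tree or SQLite table) and store block offsets into one large file instead of a Python list, but the access pattern is the same: one index probe, one or two block decompressions.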
To build the index you need to read the file once, but you can stream it and discard each block after processing, so nothing has to touch the disk.
DARN: you basically postulated this in your question... it seems I should have read the question before answering...