Since, as you said, distributions are not the same thing as the modules they contain, we have a problem. The typical installation process for a distribution (which is, afaik, the set of packages plus the installer) is: download it, unzip it, then run setup.py, which handles the rest of the installation.
The upshot is that even with a Python distribution in hand, you cannot say what setup.py will do without running it. There may be conventions, and you can extract a lot of information and make a lot of good guesses, but running setup.py is really the only way to see what it actually installs into site-packages. Therefore parse_requirements, or indeed any other part of pip, won't really be useful to you unless distributions are what you are interested in.
So, I think the best way to deal with your problem is:
- Set up a virtual environment without site packages.
- Run pip install -r requirements.txt to actually install all the packages.
- Go through sys.path, looking for .py and .pyc files, and for subfolders containing __init__.py (or __init__.pyc) files, to build a list of modules.
- Kill the virtual environment and move on.
There may be other, better ways to do step three; I'm not sure. You also still risk missing dynamically created modules or other trickery, but this should cover most modules.
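The steps above can be sketched roughly as follows. This is only an outline under some assumptions: it uses the standard library's venv module rather than the virtualenv tool, and the names env_python, install_command and modules_from_requirements are made up for illustration. The scan argument stands for whatever step three ends up being.

```python
import os
import shutil
import subprocess
import venv


def env_python(env_dir):
    """Path of the interpreter inside the virtual environment."""
    if os.name == "nt":
        return os.path.join(env_dir, "Scripts", "python.exe")
    return os.path.join(env_dir, "bin", "python")


def install_command(env_dir, requirements):
    """argv that installs a requirements file with the environment's own pip."""
    return [env_python(env_dir), "-m", "pip", "install", "-r", requirements]


def modules_from_requirements(env_dir, requirements, scan):
    """Build the env, install into it, scan its sys.path, then delete it.

    `scan` is a hypothetical callable taking one directory and returning
    an iterable of the modules found there (step three).
    """
    # Step 1: system_site_packages=False keeps the global site-packages out.
    venv.create(env_dir, system_site_packages=False, with_pip=True)
    try:
        # Step 2: this actually runs each distribution's installer.
        subprocess.check_call(install_command(env_dir, requirements))
        # Ask the environment's interpreter which paths it would import from.
        # Note that this also includes the standard library directories.
        out = subprocess.check_output(
            [env_python(env_dir), "-c", "import sys; print('\\n'.join(sys.path))"],
            text=True,
        )
        paths = [p for p in out.splitlines() if os.path.isdir(p)]
        # Step 3: scan while the environment still exists on disk.
        return [mod for p in paths for mod in scan(p)]
    finally:
        # Step 4: throw the environment away.
        shutil.rmtree(env_dir, ignore_errors=True)
```

The scan has to happen before the cleanup in the finally block, because the installed files disappear together with the environment.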
Edit:
Here is the code that should work for everything except zip files:
```python
import sys
import os


def walk_modules_os(root):
    """Yield each module under root as a tuple of name parts, e.g. ('pkg', 'mod')."""

    def inner_walk(dir_path, mod_path):
        filelist = os.listdir(dir_path)
        pyfiles = set()
        dirs = []
        # Split the directory entries into subdirectories and Python files.
        for name in filelist:
            if os.path.isdir(os.path.join(dir_path, name)):
                dirs.append(name)
            else:
                pre, ext = os.path.splitext(name)
                if ext in ('.py', '.pyc', '.pyo'):
                    pyfiles.add(pre)
        # Below the root, a directory only counts if it is a package.
        if len(mod_path):
            if '__init__' not in pyfiles:
                return
            pyfiles.remove('__init__')
            yield mod_path
        for pyfile in pyfiles:
            yield mod_path + (pyfile,)
        # Recurse into subdirectories, extending the module path.
        for directory in dirs:
            sub = os.path.join(dir_path, directory)
            for mod in inner_walk(sub, mod_path + (directory,)):
                yield mod

    root = os.path.realpath(root)
    if not os.path.isdir(root):
        return iter([])
    return iter(inner_walk(root, tuple()))
```
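Since step three may well have better implementations, here is one alternative sketch built on the standard library's pkgutil.iter_modules, which lists the modules in a directory without importing them. The name walk_modules_pkgutil is made up for illustration, it yields dotted names rather than tuples, and it assumes every package is a plain directory named after the package. Like the function above, it skips directories that have no __init__ file.

```python
import os
import pkgutil


def walk_modules_pkgutil(root):
    """Yield dotted module names found under root, without importing anything."""

    def inner(dir_path, prefix):
        for info in pkgutil.iter_modules([dir_path], prefix):
            yield info.name
            if info.ispkg:
                # Assumption: the package lives in a subdirectory of
                # dir_path that carries the package's own short name.
                short = info.name[len(prefix):]
                sub = os.path.join(dir_path, short)
                for name in inner(sub, info.name + "."):
                    yield name

    return inner(os.path.realpath(root), "")
```

To cover a whole environment, call it once per entry in sys.path and collect the results into a set.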
Edit 2:
Well, shoot. GWW has the right idea: a much better solution than mine.