I have not solved the problem, but here are possible reasons why the results are not always the same, roughly ordered from most likely / easiest to fix to least likely / hardest to fix. I also try to give a solution after each problem.
- Human error: you misread a number or made a typo when copying a result from a shell onto paper. Solution: logging. Create a file such as
2017-12-31-23-54-experiment-result.log
for each experiment you run. Do not create it by hand; let the experiment create it. Yes, put a timestamp in the name so it is easier to find again. Everything that follows should be logged to this file for every single experiment.
- Code changed: version control (e.g. git)
- Configuration file changed: version control
- Pseudo-random number changed: set the seed for random / tensorflow / numpy (yes, you may need to set multiple seeds)
- Loading data in different ways / in a different order: version control + seed (is the preprocessing really the same?)
- Environment variables changed: Docker
- Software (version) changed: Docker
- Driver (version) changed: logging
- Hardware changed: logging
- Hardware / software has inherent reproducibility issues: for example, floating point multiplication is not associative, and different cores on a GPU might finish their computations at different times (I'm not sure about this)
- Hardware has errors
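The first items of the list (a log file created by the experiment itself, and fixed seeds) could be sketched as follows. This is only a minimal sketch: the helper names `start_experiment_log`, `log_result` and `set_seeds` are made up for illustration, and NumPy is seeded only if it happens to be installed.

```python
import os
import random
import tempfile
from datetime import datetime


def start_experiment_log(log_dir, now=None):
    """Create an empty, timestamped log file and return its path."""
    now = now or datetime.now()
    # Same naming scheme as the example file name above.
    filename = now.strftime("%Y-%m-%d-%H-%M") + "-experiment-result.log"
    path = os.path.join(log_dir, filename)
    with open(path, "a"):
        pass  # only make sure the file exists
    return path


def log_result(path, message):
    """Append one line to the experiment log."""
    with open(path, "a") as f:
        f.write(message + "\n")


def set_seeds(seed):
    """Seed every random number generator we know about; return their names."""
    random.seed(seed)
    seeded = ["random"]
    try:
        import numpy as np  # optional dependency
        np.random.seed(seed)
        seeded.append("numpy")
    except ImportError:
        pass
    # TensorFlow has its own seed as well, e.g. tf.set_random_seed(seed)
    # in 1.x or tf.random.set_seed(seed) in 2.x (not called here).
    return seeded


if __name__ == "__main__":
    log_path = start_experiment_log(tempfile.mkdtemp())
    log_result(log_path, "seeded: {}".format(", ".join(set_seeds(42))))
    log_result(log_path, "result: {}".format(random.random()))
    print("wrote", log_path)
```

Logging the seed next to the result means a surprising number can later be re-run with exactly the same seed.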
In any case, running the "same" thing several times can help you get a gut feeling for how different the results are.
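To get such a gut feeling on a toy problem, one can add up the same numbers in a different order several times and look at the spread. This is only an illustration of floating point rounding depending on the order of operations (shown here with addition), not a model of what a GPU actually does; the function names and the sizes are arbitrary choices.

```python
import random
import statistics


def naive_sum(values):
    """Plain left-to-right accumulation, one rounding step per addition."""
    total = 0.0
    for value in values:
        total += value
    return total


def repeated_sums(values, runs, seed=0):
    """Sum the same numbers `runs` times, each time in a shuffled order."""
    rng = random.Random(seed)  # local generator, so the shuffles are repeatable
    results = []
    for _ in range(runs):
        shuffled = list(values)
        rng.shuffle(shuffled)
        results.append(naive_sum(shuffled))
    return results


data_rng = random.Random(1)
values = [data_rng.random() for _ in range(10000)]
results = repeated_sums(values, runs=20)

print("distinct results:", len(set(results)))
print("spread (max - min):", max(results) - min(results))
print("standard deviation:", statistics.stdev(results))
```

The differences are tiny, but in a long training run such differences can be amplified, so it helps to know how large the run-to-run spread of your own experiment is before interpreting a single number.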
Writing a paper
If you write a paper, I think the following would be best practice for reproducibility:
- Add a link to the repository (e.g. git) where all the code is
- The code must be containerized (e.g. Docker)
- If there is Python code and a requirements.txt, you should pin the exact version of each package: not something like tensorflow>=1.0.0, but tensorflow==1.2.3
- Add the git hash of the version used for the experiments. There may be several hashes if you changed something in between.
- Always log driver information (e.g. for NVIDIA) and hardware information. Add this to the appendix of your paper. That way, if the numbers change later, you can at least check whether something changed that could explain the difference.
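Two of these points can be checked automatically. The following is a rough sketch with hypothetical helper names: `unpinned_requirements` uses a deliberately simple pattern (it does not understand environment markers or other advanced requirement syntax), and `git_commit_hash` assumes the `git` command line tool is installed and returns None otherwise.

```python
import re
import subprocess

# "name==version", optionally with extras such as "name[extra]==version".
PINNED = re.compile(r"^[A-Za-z0-9_.\-\[\],]+\s*==\s*[^=*\s]+$")


def unpinned_requirements(lines):
    """Return the requirement lines that do not pin an exact version."""
    unpinned = []
    for line in lines:
        requirement = line.split("#", 1)[0].strip()  # drop comments
        if not requirement or requirement.startswith("-"):
            continue  # blank lines and pip options such as "-r other.txt"
        if not PINNED.match(requirement):
            unpinned.append(requirement)
    return unpinned


def git_commit_hash(repo_dir="."):
    """Return the hash of the current commit, or None if it is unavailable."""
    try:
        out = subprocess.check_output(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.CalledProcessError, OSError):
        return None  # not a git repository, or git is not installed
    return out.decode("ascii").strip()


if __name__ == "__main__":
    example = ["tensorflow>=1.0.0", "tensorflow==1.2.3"]
    print("unpinned:", unpinned_requirements(example))
    print("git hash:", git_commit_hash())
```

Writing the hash into the experiment log at the start of every run ties each result to the exact code that produced it.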
To log the versions you can use something like this:
    #!/usr/bin/env python

    # core modules
    import subprocess


    def get_logstring():
        """
        Get important environment information that might influence experiments.

        Returns
        -------
        logstring : str
        """
        logstring = []
        with open('/proc/cpuinfo') as f:
            cpuinfo = f.readlines()
        for line in cpuinfo:
            if "model name" in line:
                logstring.append("CPU: {}".format(line.strip()))
                break

        with open('/proc/driver/nvidia/version') as f:
            version = f.read().strip()
        logstring.append("GPU driver: {}".format(version))
        logstring.append("VGA: {}".format(find_vga()))
        return "\n".join(logstring)


    def find_vga():
        # Raw string so that "\|" reaches grep unchanged; decode because
        # check_output returns bytes on Python 3.
        vga = subprocess.check_output(r"lspci | grep -i 'vga\|3d\|2d'",
                                      shell=True,
                                      executable='/bin/bash')
        return vga.decode().strip()


    print(get_logstring())
which gives something like
    CPU: model name : Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
    GPU driver: NVRM version: NVIDIA UNIX x86_64 Kernel Module 384.90 Tue Sep 19 19:17:35 PDT 2017
    GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5)
    VGA: 00:02.0 VGA compatible controller: Intel Corporation Skylake Integrated Graphics (rev 06)
    02:00.0 3D controller: NVIDIA Corporation GM108M [GeForce 940MX] (rev a2)