There are several options for storing passwords and other secrets that a Python program needs, especially a program that runs in the background, where it cannot simply ask the user to type in a password.
Problems to avoid:
- Checking the password in to version control, where other developers or even the public can see it.
- Other users on the same server reading the password from a configuration file or the source code.
- Having a password in the source file, where others can see it over your shoulder while you are editing it.
Option 1: SSH
This is not always an option, but it is probably the best one. Your private key is never transmitted over the network; SSH just does the math to prove that you hold the right key.
For this to work, you need the following:
- The database or whatever you are accessing must be accessible via SSH. Try searching for "SSH" plus the name of the service you are accessing, for example "ssh postgresql". If your database does not support this, move on to the next option.
- Create an account to run the service that will make calls to the database, and generate an SSH key for it.
- Either add the public key to the service you are going to call, or create a local account on that server and install the public key there.
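As a rough illustration of the steps above, a background program might open an SSH tunnel to the database host and connect through the local end of it. The sketch below only builds and prints the ssh command line. The account name, host name, key path, and ports are hypothetical placeholders, and it assumes the OpenSSH client is installed.

```python
import subprocess


def build_tunnel_command(ssh_user, ssh_host, key_path,
                         local_port=5433, remote_port=5432):
    """Return the argument list for an OpenSSH local port forward."""
    return [
        "ssh",
        "-N",                    # forward ports only, run no remote command
        "-o", "BatchMode=yes",   # never prompt; fail instead of asking for a password
        "-i", key_path,          # private key of the service account (placeholder path)
        "-L", f"{local_port}:localhost:{remote_port}",
        f"{ssh_user}@{ssh_host}",
    ]


def open_tunnel(command):
    """Start the tunnel in the background; the caller should terminate() it when done."""
    return subprocess.Popen(command)


# Hypothetical values, for illustration only.
command = build_tunnel_command("my_app", "dbhost1", "/home/my_app/.ssh/id_ed25519")
print(" ".join(command))
```

With the tunnel running, the program would connect to localhost on the local port as if it were the database server. Whether the database then accepts the connection without a password depends on how it is configured.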
Option 2: environment variables
This one is the simplest, so it can be a good place to start. It is described well in the Twelve-Factor App methodology. The basic idea is that your source code simply reads the password or other secrets from environment variables, and you then configure those environment variables on each system where you run the program. It can also help to provide defaults that work for most developers, but you have to balance that convenience against making your software "secure by default."
Here is an example that retrieves the server, username, and password from environment variables.
    import os

    server = os.getenv('MY_APP_DB_SERVER', 'localhost')
    user = os.getenv('MY_APP_DB_USER', 'myapp')
    password = os.getenv('MY_APP_DB_PASSWORD', '')

    db_connect(server, user, password)
Look up how to set environment variables on your operating system, and consider running the service under its own account. That way, the sensitive data is not sitting in your environment variables when you run other programs under your own account. When you set up these environment variables, take extra care that other users cannot read them; check the permissions on whatever file defines them, for example. Of course, any user with root privileges can still read them, but that cannot be helped.
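One way to strike the balance between convenient defaults and "secure by default" is to give defaults only to the harmless settings and to fail loudly when the secret itself is missing. This is a minimal sketch of that idea, not the only way to do it; require_env and load_db_settings are hypothetical helper names, and the variable names follow the example above.

```python
import os


def require_env(name):
    """Return an environment variable's value, or fail if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} must be set")
    return value


def load_db_settings():
    """Collect database settings; only the non-sensitive ones have defaults."""
    return {
        "server": os.environ.get("MY_APP_DB_SERVER", "localhost"),
        "user": os.environ.get("MY_APP_DB_USER", "myapp"),
        "password": require_env("MY_APP_DB_PASSWORD"),
    }
```

Calling load_db_settings() at startup means a misconfigured service stops immediately instead of silently trying an empty password.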
Option 3: configuration files
This is very similar to environment variables, but you are reading secrets from a text file. I still find environment variables more flexible for things like deployment tools and continuous integration servers. If you decide to use a configuration file, Python supports several formats in the standard library, such as JSON, INI, netrc, and XML. You can also find external packages such as PyYAML and TOML. Personally, I think JSON and YAML are the easiest to use, and YAML lets you comment.
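As one example of a standard-library format, the netrc module can look up a login and password by host name. The sketch below writes a throwaway file so that it is self-contained; in practice the file would already exist, and the host, login, and password shown are placeholders.

```python
import netrc
import os
import tempfile

# Write a throwaway netrc-style file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".netrc", delete=False) as f:
    f.write("machine dbhost1 login my_app password correcthorsebatterystaple\n")
    path = f.name

try:
    # authenticators() returns a (login, account, password) tuple for the host.
    login, _account, password = netrc.netrc(path).authenticators("dbhost1")
finally:
    os.remove(path)

print(login)
```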
Three things to consider with configuration files:
- Where is the file? A default location is possible, for example ~/.my_app, and a command-line option to use a different location.
- Make sure that other users cannot read the file.
- Obviously, do not commit the configuration file to source control. You might want to commit a template that users can copy to their home directory.
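The first two points can be sketched roughly as follows, assuming a JSON file and a POSIX system where file modes are meaningful. load_config, parse_args, and the --config option are hypothetical names chosen for this illustration.

```python
import argparse
import json
import os
import stat

DEFAULT_CONFIG_PATH = os.path.expanduser("~/.my_app")


def load_config(path=DEFAULT_CONFIG_PATH):
    """Load a JSON config file, refusing files that group or others can access."""
    mode = os.stat(path).st_mode
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(
            f"{path} is accessible to other users; run: chmod 600 {path}"
        )
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def parse_args(argv=None):
    """Let the user point at a config file other than the default location."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", default=DEFAULT_CONFIG_PATH,
                        help="path to the configuration file")
    return parser.parse_args(argv)
```

A program would then call load_config(parse_args().config) at startup. Refusing to read an overly permissive file is a design choice borrowed from tools such as SSH; a warning would be a gentler alternative.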
Option 4: Python module
Some projects simply put their secrets directly into a Python module.

    # settings.py
    db_server = 'dbhost1'
    db_user = 'my_app'
    db_password = 'correcthorsebatterystaple'

Then import this module to get the values.
One project that uses this technique is Django. Obviously, you should not commit settings.py to the version control system, although you might want to commit a file called settings_template.py that users can copy and modify.
I see several problems with this technique:
- Developers may accidentally commit the file to version control. Adding it to .gitignore reduces this risk.
- Part of your code is not under source control. If you are disciplined and only put strings and numbers in here, that will not be a problem. If you start writing logging filter classes in here, stop!
If your project already uses this technique, it's easy to switch to environment variables. Just move all the settings to environment variables and change the Python module to read from these environment variables.
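A converted module might look roughly like this. It keeps the same attribute names, so code that imports the module does not need to change; the environment variable names are assumptions carried over from the environment variable example, and the non-secret defaults are only illustrative.

```python
# settings.py (sketch): same names as before, values now come from the environment.
import os

db_server = os.environ.get("MY_APP_DB_SERVER", "dbhost1")
db_user = os.environ.get("MY_APP_DB_USER", "my_app")
# No real default for the password: an empty string means "not configured".
db_password = os.environ.get("MY_APP_DB_PASSWORD", "")
```

Because the file no longer contains any secrets, it can be committed to version control like the rest of the code.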