Changes in IMDbPY and in the IMDb data file format mean that existing answers no longer work (as of January 2018).
I am using Ubuntu 17.10 and MariaDB 10.1 (not MySQL, but the following should work with MySQL as well).
Changes to IMDbPY
The latest version of IMDbPY is 6.2. It is implemented in Python 3, and the dependencies on gcc and SQLObject have been removed. In addition, the Python package MySQL-python is not available for Python 3, so we install mysqlclient instead; see below. (The mysqlclient API is compatible with MySQL-python.)
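Because mysqlclient is importable under the same module name that MySQL-python used (MySQLdb), a quick check after installation can confirm that Python 3 sees it. This is only a minimal sketch; the function name mysqldb_available is made up for illustration.

```python
import importlib.util


def mysqldb_available() -> bool:
    """Return True if a module named MySQLdb can be found."""
    # mysqlclient installs itself under the module name MySQLdb,
    # the same name that MySQL-python used.
    return importlib.util.find_spec("MySQLdb") is not None


if __name__ == "__main__":
    if mysqldb_available():
        print("MySQLdb module found")
    else:
        print("MySQLdb module not found; try: pip3 install mysqlclient")
```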
Changes to the IMDb Data File Format
Changes in the format of IMDb data files were introduced in December 2017, and IMDbPY 6.2 (the current version) does not yet work with the new file format. (See this GitHub issue.)
Until this is fixed, use the last version of the IMDb data published in the old format, which is available at ftp://ftp.fu-berlin.de/pub/misc/movies/database/frozendata/. Download all *.list.gz files (excluding files from subdirectories).
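The download can be scripted. The following is a rough sketch using Python's standard ftplib, assuming the server accepts anonymous FTP logins; the function names and the target directory are illustrative and not part of IMDbPY.

```python
import fnmatch
import os
from ftplib import FTP

FTP_HOST = "ftp.fu-berlin.de"
FTP_DIR = "/pub/misc/movies/database/frozendata"


def select_list_files(names):
    """Keep only top-level *.list.gz entries (nothing inside subdirectories)."""
    return sorted(
        n for n in names
        if "/" not in n and fnmatch.fnmatchcase(n, "*.list.gz")
    )


def download_list_files(dest_dir):
    """Download every *.list.gz file in FTP_DIR into dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    with FTP(FTP_HOST) as ftp:
        ftp.login()  # anonymous login (assumed to be allowed by the server)
        ftp.cwd(FTP_DIR)
        for name in select_list_files(ftp.nlst()):
            target = os.path.join(dest_dir, name)
            with open(target, "wb") as fh:
                ftp.retrbinary("RETR " + name, fh.write)
            print("downloaded", name)


# Example call (hypothetical target directory):
# download_list_files("imdb_data")
```

The directory passed to download_list_files is the one to give to imdbpy2sql.py with -d later on.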
New steps to complete
Install Python 3 and the necessary packages:
sudo apt install python3
pip3 install mysqlclient
In MariaDB, create the database imdb and grant all privileges on it to the user user with the password password.
CREATE DATABASE imdb;
GRANT ALL PRIVILEGES ON imdb.* TO 'user'@'localhost' IDENTIFIED BY 'password';
FLUSH PRIVILEGES;
Get IMDbPY 6.2:
wget https://github.com/alberanid/imdbpy/archive/6.2.zip
unzip 6.2.zip
cd imdbpy-6.2
python3 setup.py install
Import the IMDb data into MariaDB:
cd bin
python3 imdbpy2sql.py -d [imdb_dataset_directory] -u 'mysql://user:password@localhost/imdb'
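If you would rather launch the import from a script, the same command line can be assembled in Python. This is a sketch only: build_import_command is a made-up helper, it uses only the -d and -u options shown above, and it does not escape special characters, so a password containing '@', ':' or '/' would produce a broken URI.

```python
import subprocess
import sys


def build_import_command(data_dir, user, password, host="localhost", db="imdb"):
    """Return the argument list for the imdbpy2sql.py call shown above."""
    uri = "mysql://{}:{}@{}/{}".format(user, password, host, db)
    return [sys.executable, "imdbpy2sql.py", "-d", data_dir, "-u", uri]


def run_import(data_dir, user, password):
    """Run the import; call this from the bin directory of imdbpy-6.2."""
    cmd = build_import_command(data_dir, user, password)
    print("running:", " ".join(cmd))
    # Passing a list avoids shell quoting problems with the URI.
    subprocess.run(cmd, check=True)


# Example call (hypothetical data directory):
# run_import("/path/to/imdb_data", "user", "password")
```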
Edit: IMDbPY 6.2 does not create foreign keys. See this GitHub issue. If you need foreign keys to be created, you will have to use an earlier version of IMDbPY, but generating foreign keys in older versions also has problems (see the related GitHub issue).
Update: It took 4.5 hours to import, and I had no problems using InnoDB tables.
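To sanity-check the result of the import, you can count the rows in a few tables through mysqlclient. This is a hedged sketch: the table names title, name and cast_info are my assumption about the schema IMDbPY creates (check with SHOW TABLES), and the credentials are the placeholder ones used above.

```python
def count_rows(conn, tables):
    """Return {table: row_count} using any DB-API 2.0 connection."""
    counts = {}
    cur = conn.cursor()
    for table in tables:
        # Table names cannot be bound as query parameters,
        # so only pass trusted names here.
        cur.execute("SELECT COUNT(*) FROM `{}`".format(table))
        counts[table] = cur.fetchone()[0]
    cur.close()
    return counts


def report_import(tables=("title", "name", "cast_info")):
    """Connect with mysqlclient and print row counts (assumed table names)."""
    import MySQLdb  # provided by the mysqlclient package

    conn = MySQLdb.connect(host="localhost", user="user",
                           passwd="password", db="imdb")
    try:
        for table, n in count_rows(conn, tables).items():
            print("{:<12} {:>12}".format(table, n))
    finally:
        conn.close()


# Example call:
# report_import()
```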
Edit: If you want to use IMDbPY 6.2 and require foreign keys, you will need to add them manually to the database after it is created. Some minor data cleanup is required before the foreign keys can be added. This cleanup and the foreign keys that need to be added are described in this GitHub issue.
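To illustrate the general shape of that manual work, here is a small sketch that generates a cleanup statement and an ALTER TABLE statement for one relationship. It is not the list from the GitHub issue: the helper name is made up, the example relationship (cast_info.movie_id referencing title.id) is an assumption about the schema, and deleting orphaned rows is only one possible cleanup strategy.

```python
def foreign_key_statements(child, column, parent, parent_column="id"):
    """Return SQL that removes orphan rows, then adds one foreign key."""
    # Step 1: remove rows whose reference points at a missing parent row.
    cleanup = (
        "DELETE FROM `{c}` WHERE `{col}` IS NOT NULL AND `{col}` NOT IN "
        "(SELECT `{pc}` FROM `{p}`);"
    ).format(c=child, col=column, p=parent, pc=parent_column)
    # Step 2: add the constraint once the data is consistent.
    add_key = (
        "ALTER TABLE `{c}` ADD FOREIGN KEY (`{col}`) "
        "REFERENCES `{p}` (`{pc}`);"
    ).format(c=child, col=column, p=parent, pc=parent_column)
    return [cleanup, add_key]


if __name__ == "__main__":
    # Hypothetical example relationship; take the real ones from the issue.
    for statement in foreign_key_statements("cast_info", "movie_id", "title"):
        print(statement)
```

Printing the statements rather than executing them lets you review them against the issue before running anything on a database that took hours to import.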