Extracting a sample of your database - PostgreSQL

We have a semi-large database that takes a long time to import locally on our individual machines. I was wondering if there are any tools that can connect to the database, analyze it a little, and extract a sample database, preserving all the relationships, with the sample size controlled by a passed parameter. Something like:

pg_sample --rough_size 1000 --dbhost mydbhost --dbuname uname --dbpword pword

This would create a dump that I can import into my own database for testing and development, but that does not take 45 minutes.

Thanks.

2 answers

I wrote just such a tool: https://github.com/mla/pg_sample

From the README:

pg_sample is a utility for exporting a small, sample dataset from a larger PostgreSQL database. The output and command-line options closely resemble the pg_dump backup utility (although only the plain-text format is supported).

The sample database produced includes all tables from the original, maintains referential integrity, and supports circular dependencies.
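
A typical invocation might look something like this (a sketch only: the host, user, database, and table names are placeholders, and the exact --limit syntax should be checked against pg_sample --help for the version you install):

    # connection options mirror pg_dump; --limit caps the rows sampled per table
    pg_sample --host=mydbhost --username=uname --limit="mytable = 1000" mydb > sample.sql

    # load the sample into a fresh local database
    createdb mydb_dev
    psql --dbname=mydb_dev --file=sample.sql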


I suggest you look into pg_dump --format=custom and pg_restore -j 12. The custom format allows a parallel restore (and, if necessary, can be converted back to raw SQL, so you lose no functionality here). The -j option tells pg_restore to run the restore in parallel, which can give a substantial speedup, especially if you have set maintenance_work_mem nice and big.
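
Concretely, that workflow might look like this (a sketch: database and file names are placeholders, and --jobs should roughly match the local machine's CPU cores):

    # dump in custom format (compressed, restorable in parallel)
    pg_dump --format=custom --file=mydb.dump mydb

    # restore into a fresh local database with 12 parallel jobs
    createdb mydb_dev
    pg_restore --jobs=12 --dbname=mydb_dev mydb.dump

    # the custom-format dump can still be turned into plain SQL if needed
    pg_restore --file=mydb.sql mydb.dump

Setting maintenance_work_mem high in postgresql.conf before the restore speeds up the index and foreign-key rebuilds that pg_restore performs.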

It will not solve the problem you actually asked about, for the reasons Erwin gives above, and because sampling well requires knowing which data is important to represent and which is not, and that knowledge simply is not in the schema. Producing a good sample takes developer time, and it is usually best done at the same time the schema is first built.

