PostgreSQL development workflow

I am starting to build a new database for my project using PostgreSQL. (I am new to PostgreSQL and to databases in general.)

I suspect my development workflow is quite bad; here is part of it:

  • Create a table / view / function using pgAdmin.
  • Determine the file name before saving the code.
    The goal is to be able to recreate the database automatically by running all of the saved scripts,
    so I need to know in which order to run them because of dependencies.
    Therefore, I prefix each file name with an ordering number, e.g. 001_create_role_user.ddl, 002_create_database_project.ddl, 013_user_table.ddl.
  • Save the code.
  • Commit the file to the repository using Git.

Here are some problems that I can already see:

  • I can easily forget what changes I made, for example a newly created type or an edited comment.
  • It is difficult to determine the name (that is, the order number) for a file.
  • Changing existing code will be a pain, especially when the new code changes the order.

So my workflow is bad. I was wondering what a PostgreSQL developer's workflow looks like.

Are there any good tools (free or cheap) for editing and saving scripts? Could a good IDE help here?

It would be great if I could create automated unit tests for the database.

Is there a tool to recreate the database? A CI server tool?

Basically, I am looking for any advice, good practices, or good tools for developing databases.

(Sorry if this question is not well suited to the Q&A format, but I do not know where else to ask it.)

+4
5 answers

Check out Liquibase. We use it at the company I work for to set up our PostgreSQL databases. It is open source, easy to use, and produces a changelog file that you can add to source control. Each changeset receives an identifier so that each set of changes is run only once. As a result, you get two extra tables in the database that track which changes have been applied.

While Liquibase is database-agnostic, you can still use raw PostgreSQL SQL directly in each changeset, and each changeset can carry its own comments.

The only caveat is that you should warn yourself and others against editing a changeset after it has been applied to a database. Any change to an already-applied changeset (even whitespace) produces a different checksum, which causes Liquibase to refuse to run the update. This can lead to failed database updates in the field, so any change to an existing changeset must first be tested locally. Instead, every change, however minor, should go into a new changeset with a new identifier. There is a changeset subtag called "validCheckSum" that lets you get around this, but I think it is best to stick to the rule that a new changeset is always created.

Here are the doc links for creating a table and creating a view, for example.
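As a rough illustration, here is a minimal changelog in Liquibase's SQL format; the author name, table, and rollback statement are made up for this sketch:

  --liquibase formatted sql

  --changeset alice:001-create-user-table
  CREATE TABLE user_table (
      id   serial PRIMARY KEY,
      name text NOT NULL
  );
  --rollback DROP TABLE user_table;

Running liquibase update against this changelog applies the changeset once and records its id and checksum in the tracking tables mentioned above.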

+2

Well, your question is actually important for any database developer, and if I understand you correctly, there is another way to achieve the results you want.

It is interesting to note that your idea of splitting the various changes into different files is exactly the concept behind Ruby on Rails migrations. You could even use rake to drive a workflow like yours.

But now, on to what I think is your solution. PostgreSQL (and other databases too, to be fair) ships specific utilities for handling data and schemas, which are probably exactly what you need.

The pg_dumpall command-line executable dumps the entire database cluster to a file or to the console, in a form that the psql utility can simply replay into the same or another (virgin) cluster.

So, if you want to save only the current schema (without data!) of a working database cluster, you can run, as the operating-system user that owns the postgres process:

$ pg_dumpall --schema-only > schema.sql 

Now schema.sql will contain exactly the same users / databases / tables / triggers / etc., but no data. If you want a full-backup-style dump (which is also one way to take a full backup of a database cluster), simply remove the --schema-only option from the command line.
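For example, such a full dump of the whole cluster would be a one-liner (the output file name is just an example):

 $ pg_dumpall > full-backup.sql 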

You can replay schema.sql into another cluster (it must be a virgin one; you could ruin a database that already holds other data):

 $ psql -f schema.sql postgres 

Now, if you want to dump only one database, one table, and so on, you should use the pg_dump utility instead:

 $ pg_dump --schema-only <database> > database-schema.sql 

And then, to reload that database into a running PostgreSQL server:

 $ psql <database> < database-schema.sql 

As for version control, you can simply keep the schema.sql file under it and re-dump the database into the file before each commit. That way, at any given version-control revision, you have both the code and the working database schema that goes with it.
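A tiny helper script in that spirit might look like this (a sketch only; the schema.sql file name follows the example above, and the commit message is made up):

  #!/bin/bash
  # Hypothetical pre-commit helper: re-dump the schema so that the committed
  # schema.sql always matches the committed application code.
  pg_dumpall --schema-only > schema.sql
  git add schema.sql
  git commit -m "Code change plus matching schema dump"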

Oh, and all the tools I mentioned are free; pg_dump and pg_dumpall come with a standard PostgreSQL installation.

Hope this helps,

Marco

+1

You're not far off. I am a Java developer, not a database administrator, but growing a database along with the project has been an important task on the teams I have been part of, and here is how I have seen it done best:

  • All changes to the database are made through DDL text scripts (SQL create, alter, or drop statements). No changes through a database client GUI. Use a text editor that supports syntax highlighting, such as vim or Notepad++, since highlighting can help you spot errors before running the script.
  • Use a number at the beginning of each DDL script's name to determine the order in which the scripts are run. Base scripts get the lowest numbers.
  • Use a script and the psql client to load the DDL scripts from lowest to highest. The bash script below does this; you could use it as the basis for a .bat script on Windows.

  #!/bin/bash

  export PGDATABASE=your_db
  export PGUSER=your_user
  export PGPASSWORD=your_password

  # run every .sql file under the current directory in lexical (numeric-prefix) order
  for SQL_SCRIPT in $(find . -name "*.sql" -print | sort); do
      echo "**** $SQL_SCRIPT ****"
      psql -q < $SQL_SCRIPT
  done

  • As the project grows, use new scripts to modify a table; do not redefine the table in its initial script (see the example after this list).

  • All scripts are checked into source control. Each release is tagged so that you can build that version of the database again in the future.

  • For unit testing and CI, most CI servers can run a script to drop and recreate the schema. One frequently cited PostgreSQL unit-testing framework is pgTAP (a small sketch follows below).
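To make the "new scripts, never edited scripts" rule concrete: suppose an early script such as 013_user_table.ddl created a user table, and a later feature needs an email column. The file, table, and column names here are made up:

  -- 021_add_email_to_user_table.sql: a new, higher-numbered script alters
  -- the table; the original create script is never edited once applied
  ALTER TABLE user_table ADD COLUMN email text;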
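And here is a minimal pgTAP sketch, assuming the pgtap extension is installed and that the scripts above created a user_table with an id column (both assumptions):

  -- run with: psql -f test_user_table.sql (or via the pg_prove runner)
  BEGIN;
  SELECT plan(2);

  SELECT has_table('user_table');          -- the table exists
  SELECT has_column('user_table', 'id');   -- and has an id column

  SELECT * FROM finish();
  ROLLBACK;                                -- leave the schema untouched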

+1

I am a database administrator, and my workflow is almost the same as the one suggested by @Ireeder... but instead of a shell script to keep the DDL scripts applied, I use a tool called DbMaintain.

DbMaintain needs some configuration, but it is not painful... It keeps track of which scripts have been executed, and in what order. The main advantage is that if an SQL script that has already been executed is changed, by default it complains (or, if configured that way, re-runs only that script)... The same behavior applies when you add a new script to the environment... it runs only that new script.

It is ideal for deploying and for keeping development and production environments up to date... there is no need to execute all the scripts every time (as with the shell script Ireeder proposed), nor to execute each new script by hand.

+1

If the changes come in over time, you can create scripts that apply the DDL changes and then dump the new expected state (version) of the database.

  # all the commands to create and populate the database as of yesterday
  pg_dump -f database-dump-production-yesterday.sql

Say that today you need to introduce a new table for a new feature:

  # DDL and DML commands that bring the database to the new state
  psql -f change-production-for-today.sql

  # all the new commands needed to create today's version of the database
  pg_dump --schema-only -f dump-production-today.sql

  # append the seed data the new table needs into the dump
  cat sql-append-table-needed-data-into-dump.sql >> dump-production-today.sql

From then on, all developers should create their databases from the new dump script.
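For instance, each developer could rebuild a scratch copy from the new dump like this (the database name project_dev is illustrative):

 $ createdb project_dev 
 $ psql -f dump-production-today.sql project_dev 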

0
