I'm trying to find out how much stress PostgreSQL puts on disks, and the results so far are discouraging. Please take a look at the methodology; apparently I missed something or calculated the numbers incorrectly.
Environment
PostgreSQL 9.6.0-1.pgdg16.04+1 runs inside a separate LXC container with Ubuntu 16.04.1 LTS (kernel version 4.0.0-38, ext4 file system on top of an SSD) and has only one client connection, from which I run the tests.
I disabled autovacuum to prevent extraneous writes. I want to find the total number of bytes written by all PostgreSQL processes (including the WAL writer), which I calculate with the following command:
pgrep postgres | xargs -I {} cat /proc/{}/io | grep ^write_bytes | cut -d' ' -f2 | python -c "import sys; print sum(int(l) for l in sys.stdin)"
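The same measurement as a standalone script may be easier to tweak; here is a minimal Python sketch of it (the Python 3 rewrite is mine, the one-liner above is Python 2):

import subprocess

def postgres_write_bytes():
    # PIDs of all processes named "postgres", same as `pgrep postgres`
    pids = subprocess.check_output(["pgrep", "postgres"]).decode().split()
    total = 0
    for pid in pids:
        # /proc/<pid>/io holds per-process I/O counters; write_bytes is
        # the number of bytes the process caused to be sent to the block layer
        with open("/proc/%s/io" % pid) as f:
            for line in f:
                if line.startswith("write_bytes:"):
                    total += int(line.split(":")[1])
    return total

print(postgres_write_bytes())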
Test
Database commands are marked with the # sign, and the resulting write_bytes sum after each command is marked with →. The test itself is simple: a table with one int4 column filled with 10,000,000 values.
Before each test, I run a set of commands to free up disk space and prevent additional writes:
# DELETE FROM test_inserts;
Test #1: UNLOGGED table
As stated in the documentation, changes to an UNLOGGED table are not written to the WAL, so this is a good place to start:
# CREATE UNLOGGED TABLE test_inserts (f1 INT);
→ 1526276096
# INSERT INTO test_inserts SELECT generate_series(1, 10000000);
→ 1902977024
The difference is 376700928 bytes (~359 MB), which makes sense (ten million 4-byte integers plus row headers, page overhead and other costs), but it still looks like too much, almost 10x the raw data size.
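As a sanity check on that overhead, the expected heap size can be estimated from the page layout. A back-of-the-envelope Python sketch, assuming stock 9.6 settings (8192-byte pages, a 24-byte page header, a 4-byte line pointer and a 24-byte tuple header per row, 8-byte MAXALIGN):

PAGE = 8192           # default block size
PAGE_HEADER = 24      # PageHeaderData
LINE_POINTER = 4      # ItemIdData, one per tuple
TUPLE = 24 + 4        # tuple header + one int4 column
ROWS = 10000000

per_row = ((TUPLE + 7) // 8) * 8 + LINE_POINTER    # MAXALIGN(28) + 4 = 36
rows_per_page = (PAGE - PAGE_HEADER) // per_row    # 226
pages = -(-ROWS // rows_per_page)                  # ceil -> 44248
print(pages * PAGE)                                # 362479616, ~346 MB

That is close to the ~359 MB observed above; the remainder is plausibly the free space map, the visibility map and whatever else was written between the two samples.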
Test #2: UNLOGGED table with a primary key
# CREATE UNLOGGED TABLE test_inserts (f1 INT PRIMARY KEY);
→ 2379882496
# INSERT INTO test_inserts SELECT generate_series(1, 10000000);
→ 2967339008
The difference is 587456512 bytes (~560 MB).
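A similar estimate for the primary key's btree, with assumed 9.6 internals (8-byte index tuple header, MAXALIGNed entries, 16 bytes of btree special space per page, default leaf fillfactor of 90, inner pages ignored):

PAGE = 8192
PAGE_HEADER = 24
SPECIAL = 16          # BTPageOpaqueData at the end of each btree page
ENTRY = 16 + 4        # MAXALIGN(8-byte header + int4) + line pointer
ROWS = 10000000

usable = int((PAGE - PAGE_HEADER - SPECIAL) * 0.90)  # leaf fillfactor 90
pages = -(-ROWS // (usable // ENTRY))                # ceil -> 27323
print(pages * PAGE)                                  # 223830016, ~213 MB

Roughly 213 MB of leaf pages, which is in the same ballpark as the extra ~200 MB this test writes compared to test #1.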
Test #3: a regular table
# CREATE TABLE test_inserts (f1 INT);
→ 6460669952
# INSERT INTO test_inserts SELECT generate_series(1, 10000000);
→ 7603630080
This time the difference is already 1142960128 bytes (~1090 MB).
Test No. 4: a regular table with a primary key
# CREATE TABLE test_inserts (f1 INT PRIMARY KEY);
→ 12740534272
# INSERT INTO test_inserts SELECT generate_series(1, 10000000);
→ 14895218688
Now the difference is 2154684416 bytes (~2054 MB), and about 30 seconds later an additional ~100 MB were written.
In this test case, I did a breakdown by process:
Process                                              | Bytes written
/usr/lib/postgresql/9.6/bin/postgres                 | 0
 \_ postgres: 9.6/main: checkpointer process         | 99270656
 \_ postgres: 9.6/main: writer process               | 39133184
 \_ postgres: 9.6/main: wal writer process           | 186474496
 \_ postgres: 9.6/main: stats collector process      | 0
 \_ postgres: 9.6/main: postgres testdb [local] idle | 1844658176
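A script along these lines reproduces the breakdown (a Python sketch of mine; labeling processes via /proc/<pid>/cmdline is my choice, since postgres rewrites its argv with the process title):

import glob

for io_path in sorted(glob.glob("/proc/[0-9]*/io")):
    try:
        pid_dir = io_path[: -len("/io")]
        with open(pid_dir + "/cmdline") as f:
            # postgres sets its process title, which shows up in cmdline
            title = f.read().replace("\0", " ").strip()
        if "postgres" not in title:
            continue
        with open(io_path) as f:
            counters = dict(line.strip().split(": ") for line in f)
        print("%-55s | %s" % (title, counters["write_bytes"]))
    except OSError:
        continue  # process exited between listing and reading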
Any ideas or suggestions on how to measure the values I'm looking for correctly? Is this perhaps a kernel bug, or is PostgreSQL really writing that much?
Edit: To check what write_bytes means, I wrote a simple Python script which confirmed that this value is indeed the number of bytes actually written.
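Roughly, the check can be reconstructed like this (a sketch of the idea, not the original script; the temp file path and the 100 MB payload are my choices): write a known amount of data, fsync it, and compare write_bytes from /proc/self/io before and after.

import os

def self_write_bytes():
    # write_bytes for the current process from /proc/self/io
    with open("/proc/self/io") as f:
        for line in f:
            if line.startswith("write_bytes:"):
                return int(line.split(":")[1])

before = self_write_bytes()
fd = os.open("/tmp/io_check", os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
os.write(fd, b"x" * 100 * 1024 * 1024)  # 100 MB of data
os.fsync(fd)                            # force writeback so it is attributed to us
os.close(fd)
print(self_write_bytes() - before)      # expected: ~104857600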
Edit 2: I repeated the tests on PostgreSQL 9.5: test #1 shows 362577920 bytes and test #4 shows 2141343744 bytes, so this does not depend on the PG version.
Edit 3: Richard Huxton mentions Database Page Layout, and I would like to clarify: I agree with the storage cost including the 24-byte row header, 4 bytes of data and another 4 bytes of alignment padding (MAXALIGN is usually 8 bytes), which gives 32 bytes per row and, with this many rows, about 320 MB for the table, and that is roughly what I got in test #1. I could assume that the primary key should be about the same size as the data, and that in test #4 both the data and the PK are also written to the WAL. That gives something like 360 MB x 4 = 1.4 GB, which is still less than the result I got.