My question is about how Postgres works internally.
I have a table:
CREATE TABLE A (
    id SERIAL,
    name VARCHAR(32),
    type VARCHAR(32) NOT NULL,
    priority SMALLINT NOT NULL,
    x SMALLINT NOT NULL,
    y SMALLINT NOT NULL,
    start timestamp with time zone,
    "end" timestamp with time zone,  -- quoted because END is a reserved word
    state Astate NOT NULL,
    other_table_id1 bigint REFERENCES W,
    other_table_id2 bigint NOT NULL REFERENCES S,
    PRIMARY KEY (id)
);
with additional indices on other_table_id1, state and other_table_id2.
The table is quite large and sees a lot of updates to two columns: other_table_id1 and state. There are a few updates to the start and end columns, and the rest are immutable. (Astate is an enumerated type for the state column.)
I am wondering whether it makes sense to split the two most frequently updated columns into a separate table. What I hope to gain is faster reads when I only look at that information, or cheaper updates, because (maybe?) reading and writing a shorter row is less expensive. But I need to weigh this against the cost of the joins that are (sometimes) needed to get all the data for a particular item at once.
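To make the proposed split concrete, here is a rough sketch of what I have in mind. The names A_status and A_full are hypothetical, the ON DELETE CASCADE is just my assumption about how the two halves should stay in sync, and the sketch assumes state and other_table_id1 have already been removed from A:

```sql
-- Hypothetical names: A_status and A_full are placeholders, not existing objects.
-- Assumes state and other_table_id1 have been moved out of A.
CREATE TABLE A_status (
    id integer PRIMARY KEY REFERENCES A (id) ON DELETE CASCADE,  -- cascade is an assumption
    state Astate NOT NULL,
    other_table_id1 bigint REFERENCES W
);

-- Same indexes the two columns have today.
CREATE INDEX ON A_status (state);
CREATE INDEX ON A_status (other_table_id1);

-- View that joins the two halves back together for queries that need the whole item.
CREATE VIEW A_full AS
SELECT a.id, a.name, a.type, a.priority, a.x, a.y, a.start, a."end",
       s.state, s.other_table_id1, a.other_table_id2
FROM A a
JOIN A_status s USING (id);
```

Existing queries that need every column could then read from A_full, while the frequent updates would only touch A_status.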
At one point I had the impression that each column is stored separately. I changed my mind when I read somewhere that reducing the width of one column in a table also speeds up lookups on a different column, because a row is stored together and the total row length becomes shorter. So I now believe that all the data for a row is physically stored together on disk, which suggests that the proposed split would be useful. When I currently write 4 bytes to update the state, am I also rewriting the 64 bytes of text (name, type) that never actually change?
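As a starting point I could at least measure how wide the rows really are. This is only a sketch: I am assuming that pg_column_size applied to the whole row gives a fair estimate of the stored row size (it will not match the on-disk tuple exactly), and the query scans the entire table, so it is not cheap on a large one:

```sql
-- Rough estimate only: average size of a whole-row value, plus the heap size.
-- Scans all of A, so run it off-peak or against a copy.
SELECT avg(pg_column_size(a.*)) AS avg_row_bytes,
       pg_size_pretty(pg_relation_size('a')) AS heap_size
FROM A a;
```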
I am not very good at "normalizing" tables and am not familiar with Postgres internals, so I'm looking for tips and tricks to evaluate this trade-off without having to do the work first and only then find out whether it was worthwhile. The change would require considerable effort to rewrite queries that have already been optimized, so I would like to better understand what result I can expect. Thank you, m.