Implications of using ADD COLUMN for a large dataset

The docs for Redshift say:

ALTER TABLE locks the table for reads and writes until the operation completes.

My question is:
Let's say I have a table with 500 million rows and I want to add a column. That sounds like a heavy operation that could lock the table for a long time - right? Or is it actually a quick operation, since Redshift is a columnar db? Or does it depend on whether the new column is nullable / has a default value?
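For concreteness, the statement in question would be a plain DDL change along these lines; the table and column names below are placeholders, not from the original post:

    -- adding a column with no default: existing rows simply read as NULL
    ALTER TABLE big_table ADD COLUMN note VARCHAR(256);

    -- adding a column with a constant default value for existing rows
    ALTER TABLE big_table ADD COLUMN is_archived BOOLEAN DEFAULT FALSE;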

+4
3 answers

I find that adding (and dropping) columns is a very fast operation, even on tables with many billions of rows, regardless of whether there is a default value or just NULL.

As the question suggests, this appears to be a consequence of Redshift's columnar storage: the existing rows are not rewritten, and the new column simply gets its own (empty or default-filled) storage on each node.

+5

I added a column to a 65M-row Redshift table and it completed almost instantly. This was on a single dw2.large (SSD) node.
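If you want to check how long such an ALTER actually ran on your own cluster, one option is to look at the DDL history; this is a small sketch, assuming you have access to the STL_DDLTEXT system table:

    -- recent DDL statements and how long they took
    -- (long statements are split across rows; join on xid/sequence if you need the full text)
    SELECT starttime,
           endtime,
           DATEDIFF(millisecond, starttime, endtime) AS duration_ms,
           TRIM(text) AS ddl
    FROM stl_ddltext
    WHERE text ILIKE '%ADD COLUMN%'
    ORDER BY starttime DESC
    LIMIT 10;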


+3

We take a different approach when adding columns to very large tables, without running ALTER TABLE on the original. Roughly (see the SQL sketch at the end of this answer):

  • Create a new table N_OLD_TABLE with the same definition as OLD_TABLE
  • Add / remove the columns you need on it
  • Run insert into N_OLD_TABLE (old_columns) select (old_columns) from OLD_TABLE, then rename OLD_TABLE to OLD_TABLE_BKP
  • Rename N_OLD_TABLE to OLD_TABLE

This is a much faster process. It does not lock any table, and you always have a backup of the old table if something goes wrong.
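A minimal SQL sketch of those steps, assuming an existing table old_table; the column names and the added new_col are placeholders:

    -- 1. create the new table with the same definition as the old one
    CREATE TABLE n_old_table (LIKE old_table);

    -- 2. add / remove the columns you need (new_col is illustrative)
    ALTER TABLE n_old_table ADD COLUMN new_col VARCHAR(64) DEFAULT 'unknown';

    -- 3. copy the existing data (new_col takes its default),
    --    then move the old table out of the way
    INSERT INTO n_old_table (old_col1, old_col2)
    SELECT old_col1, old_col2 FROM old_table;
    ALTER TABLE old_table RENAME TO old_table_bkp;

    -- 4. put the new table in place under the original name
    ALTER TABLE n_old_table RENAME TO old_table;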

+1

Source: https://habr.com/ru/post/1542934/

