Implications of using ADD COLUMN for a large dataset

The docs for Redshift say:

ALTER TABLE locks the table for reads and writes until the operation completes.

My question is:
Let's say I have a table with 500 million rows and I want to add a column. That sounds like a heavy operation that could lock the table for a long time - right? Or is it actually a quick operation, since Redshift is a columnar db? Or does it depend on whether the new column is nullable / has a default value?
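For concreteness, the statement in question would be a plain DDL change along these lines; the table and column names below are placeholders, not from the original post:

    -- adding a column with no default: existing rows simply read as NULL
    ALTER TABLE big_table ADD COLUMN note VARCHAR(256);

    -- adding a column with a constant default value for existing rows
    ALTER TABLE big_table ADD COLUMN is_archived BOOLEAN DEFAULT FALSE;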

+4
3 answers

I find that adding (and dropping) columns is a very fast operation, even on tables with many billions of rows, regardless of whether there is a default value or just NULL.

As the question suggests, this appears to be a consequence of Redshift's columnar storage: the existing rows are not rewritten, and the new column simply gets its own (empty or default-filled) storage on each node.

+5

I added a column to a 65M-row Redshift table and it completed almost instantly. This was on a single dw2.large (SSD) node.
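If you want to check how long such an ALTER actually ran on your own cluster, one option is to look at the DDL history; this is a small sketch, assuming you have access to the STL_DDLTEXT system table:

    -- recent DDL statements and how long they took
    -- (long statements are split across rows; join on xid/sequence if you need the full text)
    SELECT starttime,
           endtime,
           DATEDIFF(millisecond, starttime, endtime) AS duration_ms,
           TRIM(text) AS ddl
    FROM stl_ddltext
    WHERE text ILIKE '%ADD COLUMN%'
    ORDER BY starttime DESC
    LIMIT 10;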


+3

We take a different approach when adding columns to very large tables, without running ALTER TABLE on the original. Roughly (see the SQL sketch at the end of this answer):

  • Create a new table N_OLD_TABLE with the same definition as OLD_TABLE
  • Add / remove the columns you need on it
  • Run insert into N_OLD_TABLE (old_columns) select (old_columns) from OLD_TABLE, then rename OLD_TABLE to OLD_TABLE_BKP
  • Rename N_OLD_TABLE to OLD_TABLE

This is a much faster process. It does not lock any table, and you always have a backup of the old table if something goes wrong.
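A minimal SQL sketch of those steps, assuming an existing table old_table; the column names and the added new_col are placeholders:

    -- 1. create the new table with the same definition as the old one
    CREATE TABLE n_old_table (LIKE old_table);

    -- 2. add / remove the columns you need (new_col is illustrative)
    ALTER TABLE n_old_table ADD COLUMN new_col VARCHAR(64) DEFAULT 'unknown';

    -- 3. copy the existing data (new_col takes its default),
    --    then move the old table out of the way
    INSERT INTO n_old_table (old_col1, old_col2)
    SELECT old_col1, old_col2 FROM old_table;
    ALTER TABLE old_table RENAME TO old_table_bkp;

    -- 4. put the new table in place under the original name
    ALTER TABLE n_old_table RENAME TO old_table;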

+1

Source: https://habr.com/ru/post/1542934/

