Incremental warehouse load; updating dimension tables

I'm trying to figure out how to incrementally load my fact and dimension tables as data arrives in our system.

Is there an easier way than this?

  • dim_id = select id from dim_table where dim_table.value = 'dim value';
  • if rowcount == 0 → insert into dim_table ...
  • insert into fact (dim, measure) values (dim_id, 23131)

If I have 10 dimensions, loading becomes quite cumbersome.

2 answers

Do you really need the IDs to be incremental? Can't you use a UUID?

I do not understand why you need dim_table.

If you are using a star schema, here is how you can make it work.

Fact_table
----------
time_id          character(36)
geographic_id    character(36)
measure          whatyouwant

Dim Time
--------
time_id    character(36) (that matches the time_id inside your fact table)
...
...

Dim Geographic
--------------
geographic_id character(36) (that matches the geographic_id inside your fact table)
....
....

, uuid . , , uuid, .

See also: http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
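
For what it's worth, here is a rough sketch of how UUID keys and an insert-if-missing statement could fit together. It is not the poster's code, just one way to read the idea. Assumptions: the key is derived from the dimension value with uuid5 (so the loader never has to look an id up first), the dimension tables get a hypothetical `value` column, and SQLite's INSERT OR IGNORE stands in for the MySQL statement on the linked page so the example runs without a server. The date and city values are made up for illustration.

```python
import sqlite3
import uuid


def dim_key(dimension, value):
    """Derive a stable 36-character key from the dimension name and value.

    Assumption: uuid5 is used so the same value always yields the same key.
    """
    return str(uuid.uuid5(uuid.NAMESPACE_OID, f"{dimension}:{value}"))


def create_schema(conn):
    # Table and key names follow the schema sketched above;
    # the "value" columns are hypothetical.
    conn.executescript("""
        CREATE TABLE dim_time (
            time_id CHAR(36) PRIMARY KEY,
            value   TEXT NOT NULL
        );
        CREATE TABLE dim_geographic (
            geographic_id CHAR(36) PRIMARY KEY,
            value         TEXT NOT NULL
        );
        CREATE TABLE fact_table (
            time_id       CHAR(36) NOT NULL,
            geographic_id CHAR(36) NOT NULL,
            measure       REAL
        );
    """)


def load_measure(conn, time_value, geo_value, measure):
    time_id = dim_key("time", time_value)
    geographic_id = dim_key("geographic", geo_value)
    # INSERT OR IGNORE skips the row if the primary key already exists,
    # so no SELECT / rowcount check is needed per dimension.
    conn.execute(
        "INSERT OR IGNORE INTO dim_time (time_id, value) VALUES (?, ?)",
        (time_id, time_value),
    )
    conn.execute(
        "INSERT OR IGNORE INTO dim_geographic (geographic_id, value) VALUES (?, ?)",
        (geographic_id, geo_value),
    )
    conn.execute(
        "INSERT INTO fact_table (time_id, geographic_id, measure) VALUES (?, ?, ?)",
        (time_id, geographic_id, measure),
    )


conn = sqlite3.connect(":memory:")
create_schema(conn)
load_measure(conn, "2011-01-01", "Paris", 23131)
load_measure(conn, "2011-01-01", "Berlin", 42)
conn.commit()
```

With ten dimensions the loader would just repeat the same two lines per dimension (derive key, insert if missing), which could be folded into a loop over (table, value) pairs.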


. , ETL, , . DW, - , , .

, , . , EsperTech . ETL, . Kettle (Talend, SSIS,..), .


Source: https://habr.com/ru/post/1779614/

