Custom Data Warehouse - Design Q

What is the best way to store user data against a date/time dimension? Use case: I am trying to store user actions per day, per hour, for example the number of shares, friends, etc. I have a time table and a date table. For time this is easy: each row = user_id and columns = 1 to 24, one for each hour of the day. But the problem is dates. If I give every day its own column, I will have 365 columns per year. I cannot archive the old data either, because the analytics also need past data. What are other strategies?

+3
2 answers

dimDate : 1 row per date
dimTime : 1 row per minute

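First, state the "grain" of the fact table, and then stick to it.

If the grain is one day, TimeKey always points to the key of "23:59".

If the grain is one hour, TimeKey points to the "HH:59" entries.

If the grain is one minute, TimeKey points to the respective "HH:MM".

If the grain is 15 minutes, TimeKey points to the respective "HH:14", "HH:29", "HH:44", "HH:59".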

...
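
A minimal sketch of how a timestamp could be mapped to the TimeKey labels listed above, assuming each bucket (day, hour, minute, 15 minutes) is labelled by its last minute. The function name `time_key_label` and the `grain_minutes` parameter are hypothetical and not part of the answer; a real dimTime would normally use a surrogate integer key and look the label up.

```python
from datetime import datetime


def time_key_label(ts: datetime, grain_minutes: int) -> str:
    """Return the "HH:MM" label of the last minute of the bucket containing ts.

    grain_minutes is the bucket size: 1440 = one day, 60 = one hour,
    15 = a quarter of an hour, 1 = one minute.
    """
    if grain_minutes < 1 or 1440 % grain_minutes != 0:
        raise ValueError("grain must evenly divide the 1440 minutes of a day")
    minute_of_day = ts.hour * 60 + ts.minute
    # Index of the last minute in the bucket that contains minute_of_day.
    bucket_end = (minute_of_day // grain_minutes + 1) * grain_minutes - 1
    return f"{bucket_end // 60:02d}:{bucket_end % 60:02d}"


ts = datetime(2010, 3, 1, 4, 7)
for grain in (1440, 60, 15, 1):
    print(grain, time_key_label(ts, grain))
```

Whatever bucket size is chosen, every fact row for that table should use the same one, so that all TimeKey values are comparable.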

-- How many new friends did specific user gain
-- in first three months of years 2008, 2009 and 2010
-- between hours 3 and 5 in the morning
-- by day of week
-- not counting holidays ?

select
      d.DayOfWeek
    , sum(f.NewFriends) as FriendCount
from dbo.factUserAction as f
join dbo.dimUser    as u on u.UserKey = f.UserKey
join dbo.dimDate    as d on d.DateKey = f.DateKey
join dbo.dimTime    as t on t.TimeKey = f.TimeKey
where d.CalendarYear between 2008 and 2010
  and d.MonthNumberInYear between 1 and 3
  and t.Hour between 3 and 5
  and d.IsHoliday = 'no'
  and u.UserEmail = 'john_doe@gmail.com'
group by d.DayOfWeek
order by d.DayOfWeek ;
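
To try the query without a SQL Server instance, here is a rough sketch that runs it against an in-memory SQLite database from Python. The column lists, the integer key encodings (DateKey as YYYYMMDD, TimeKey as HHMM) and all sample rows are assumptions made up for illustration; only the table names and the columns used in the query come from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table dimUser (UserKey integer primary key, UserEmail text);
create table dimDate (DateKey integer primary key, CalendarYear integer,
                      MonthNumberInYear integer, DayOfWeek text, IsHoliday text);
create table dimTime (TimeKey integer primary key, Hour integer, Minute integer);
create table factUserAction (UserKey integer, DateKey integer, TimeKey integer,
                             NewFriends integer);
""")
conn.execute("insert into dimUser values (1, 'john_doe@gmail.com')")
conn.executemany("insert into dimDate values (?, ?, ?, ?, ?)", [
    (20090105, 2009, 1, "Monday", "no"),
    (20090101, 2009, 1, "Thursday", "yes"),  # holiday: filtered out
    (20090706, 2009, 7, "Monday", "no"),     # July: outside months 1 to 3
])
# Hour grain: each TimeKey points at the "HH:59" row.
conn.executemany("insert into dimTime values (?, ?, ?)",
                 [(359, 3, 59), (459, 4, 59), (1059, 10, 59)])
conn.executemany("insert into factUserAction values (?, ?, ?, ?)", [
    (1, 20090105, 359, 2),
    (1, 20090105, 459, 1),
    (1, 20090105, 1059, 5),  # hour 10: filtered out
    (1, 20090101, 359, 7),   # holiday: filtered out
    (1, 20090706, 359, 4),   # July: filtered out
])

rows = conn.execute("""
    select d.DayOfWeek, sum(f.NewFriends) as FriendCount
    from factUserAction as f
    join dimUser as u on u.UserKey = f.UserKey
    join dimDate as d on d.DateKey = f.DateKey
    join dimTime as t on t.TimeKey = f.TimeKey
    where d.CalendarYear between 2008 and 2010
      and d.MonthNumberInYear between 1 and 3
      and t.Hour between 3 and 5
      and d.IsHoliday = 'no'
      and u.UserEmail = 'john_doe@gmail.com'
    group by d.DayOfWeek
    order by d.DayOfWeek
""").fetchall()
print(rows)
```

Only the two early-morning rows on the non-holiday Monday in January survive the filters, so all the slicing happens through the dimension tables and the fact table never needs a column per day or per hour.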
+5

Store days as rows rather than columns; the time dimension can then carry attributes such as day_of_year.

user_activity_facts(
   time_key references time_dimension(time_key)
  ,user_key references user_dimension(user_key)
  ,measure1
  ,measure2
  ,measure3
  ,primary key(time_key, user_key)
)
partition by range(time_key)(
   ...
)
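
As an illustration of the narrow layout above, here is a small sketch using SQLite from Python. SQLite has no range partitioning, so the `partition by` clause is left out. The column types, the hour-level `time_key` encoding, the extra `year`, `day_of_year` and `hour` columns in `time_dimension`, and the sample rows are all assumptions for the example; `user_dimension` is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table time_dimension (
    time_key    integer primary key,  -- hypothetical: year*100000 + day*100 + hour
    year        integer,
    day_of_year integer,
    hour        integer
);
create table user_activity_facts (
    time_key integer references time_dimension(time_key),
    user_key integer,
    measure1 integer,
    primary key (time_key, user_key)
);
""")

# One dimension row per hour for the first two days of 2010.
conn.executemany(
    "insert into time_dimension values (?, ?, ?, ?)",
    [(2010 * 100000 + day * 100 + hour, 2010, day, hour)
     for day in (1, 2) for hour in range(24)],
)

# One fact row per (hour, user); a new day adds rows, never columns.
conn.executemany("insert into user_activity_facts values (?, ?, ?)", [
    (201000103, 42, 2),  # day 1, hour 3
    (201000104, 42, 1),  # day 1, hour 4
    (201000203, 42, 5),  # day 2, hour 3
])

daily = conn.execute("""
    select t.year, t.day_of_year, sum(f.measure1)
    from user_activity_facts as f
    join time_dimension as t on t.time_key = f.time_key
    where f.user_key = 42
    group by t.year, t.day_of_year
    order by t.year, t.day_of_year
""").fetchall()
print(daily)
```

Per-day totals are derived with a `group by` on the dimension attributes, so the table shape stays the same no matter how many days of history are kept.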
+1

Source: https://habr.com/ru/post/1791021/

