Finding the median in SQL Server up to each date in the table

I use the query below to find the median for each sector:

 SELECT DISTINCT Sector,
        PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Value) OVER (PARTITION BY Sector) AS Median
 FROM TABLE

The table is in the format below:

  Sector  Date        Value
  A       2014-08-01  1
  B       2014-08-01  5
  C       2014-08-01  7
  A       2014-08-02  6
  B       2014-08-02  5
  C       2014-08-02  4
  A       2014-08-03  3
  B       2014-08-03  9
  C       2014-08-03  6
  A       2014-08-04  5
  B       2014-08-04  8
  C       2014-08-04  9
  A       2014-08-05  5
  B       2014-08-05  7
  C       2014-08-05  2
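
To try the queries in this thread against the sample rows above, the data can be loaded with a script along the following lines. This is only a minimal sketch: the table name Table1 (the name one of the answers below happens to use) and the column types are assumptions, since the real schema is not shown.

```sql
-- Assumed schema: the real table definition is not given in the question.
CREATE TABLE Table1 (
    Sector  char(1) NOT NULL,
    [Date]  date    NOT NULL,
    [Value] int     NOT NULL
);

-- The 15 sample rows from the question, one line per date.
INSERT INTO Table1 (Sector, [Date], [Value]) VALUES
    ('A', '2014-08-01', 1), ('B', '2014-08-01', 5), ('C', '2014-08-01', 7),
    ('A', '2014-08-02', 6), ('B', '2014-08-02', 5), ('C', '2014-08-02', 4),
    ('A', '2014-08-03', 3), ('B', '2014-08-03', 9), ('C', '2014-08-03', 6),
    ('A', '2014-08-04', 5), ('B', '2014-08-04', 8), ('C', '2014-08-04', 9),
    ('A', '2014-08-05', 5), ('B', '2014-08-05', 7), ('C', '2014-08-05', 2);
```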

So, I get the expected result below:

  Sector  Median
  A       5
  B       7
  C       6

Now I need to change the process so that, for each row, the median is calculated using only that sector's records up to and including the row's date. The new result should be:

  Sector  Date        Median
  A       2014-08-01  1
  B       2014-08-01  5
  C       2014-08-01  7      (only 1 record each was considered for A, B and C)
  A       2014-08-02  3.5
  B       2014-08-02  5
  C       2014-08-02  5.5    (2 records each were considered for A, B and C)
  A       2014-08-03  3
  B       2014-08-03  5
  C       2014-08-03  6      (3 records each were considered for A, B and C)
  A       2014-08-04  4
  B       2014-08-04  6.5
  C       2014-08-04  6.5    (4 records each were considered for A, B and C)
  A       2014-08-05  5
  B       2014-08-05  7
  C       2014-08-05  6      (all 5 records each were considered for A, B and C)

So, this will be a kind of cumulative median. Can someone please tell me how to achieve this? My table has about 2.3 M records, with approximately 1,100 records each for approximately 1,100 dates.

Please let me know if you need more information.

2 answers

This is complicated by the fact that the following does not work, because SQL Server does not allow an ORDER BY in the OVER clause of PERCENTILE_DISC:

 SELECT DISTINCT Sector,
        Date,
        PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Value) OVER (PARTITION BY sector ORDER BY DATE) AS Median
 FROM TABLE;

Alas. You can use cross apply for this purpose:

 select t.sector, t.date, t.value, m.median
 from table t
 cross apply (
     -- No ORDER BY in the OVER clause (it is not allowed here);
     -- the WHERE clause already limits the rows to this sector up to t.date.
     select top 1
            PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY t2.Value) OVER (PARTITION BY t2.sector) AS Median
     from table t2
     where t2.sector = t.sector
       and t2.date <= t.date
 ) m;

Another way is to create a triangular JOIN that collects all the past values for each day, and to use that as the input data:

 ;WITH T AS (
     SELECT t2.Sector, t2.[Date], t1.[Value]
     FROM Table1 t1
     LEFT JOIN Table1 t2
         ON t1.Sector = t2.Sector
        AND t1.[Date] <= t2.[Date]
 )
 SELECT DISTINCT Sector,
        [Date],
        PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY [Value]) OVER (PARTITION BY Sector, [Date]) AS Median
 FROM T
 ORDER BY [Date], Sector;


In the query, I changed PERCENTILE_DISC to PERCENTILE_CONT to get the correct median when there is an even number of values, as on the second day.
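
The difference can be seen on sector A's first two values from the question, 1 and 6. The following is a minimal sketch that uses an inline VALUES list instead of the real table; the alias v and the column names MedianDisc and MedianCont are chosen only for illustration.

```sql
-- Compare the two functions on the two values sector A has by 2014-08-02.
SELECT DISTINCT
       PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY v.[Value]) OVER () AS MedianDisc,
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY v.[Value]) OVER () AS MedianCont
FROM (VALUES (1), (6)) AS v([Value]);
```

PERCENTILE_DISC always returns a value that actually occurs in the set, so it should return 1 here. PERCENTILE_CONT interpolates between the two middle values and should return 3.5, which is the figure the question expects for sector A on 2014-08-02.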


Source: https://habr.com/ru/post/974548/

