SQL query: group rows with close timestamps

I have a table with a timestamp column. I would like to group by an identifier column (e.g. cusip) and sum another column (e.g. quantity), but only across rows that are within 30 seconds of each other, i.e. not in fixed 30-second buckets. Given the data:

 cusip        | quantity | timestamp
 ============ | ======== | ============
 BE0000310194 | 100      | 16:20:49.000
 BE0000314238 | 50       | 16:38:38.110
 BE0000314238 | 50       | 16:46:21.323
 BE0000314238 | 50       | 16:46:35.323

I would like to write a query that returns:

 cusip        | quantity
 ============ | ========
 BE0000310194 | 100
 BE0000314238 | 50
 BE0000314238 | 100

Edit: It would also greatly simplify things if the query could return MIN(timestamp) for each group.

2 answers

The following may be helpful to you.

This groups rows into 30-second periods measured from a fixed reference time, here "2012-01-01 00:00:00". DATEDIFF counts the number of seconds between the timestamp value and the reference time. Dividing that count by 30 (integer division) gives the grouping column.

 SELECT MIN(TimeColumn) AS TimeGroup,
        SUM(Quantity) AS TotalQuantity
 FROM YourTable
 GROUP BY (DATEDIFF(ss, TimeColumn, '2012-01-01') / 30)

Here, the minimum timestamp of each group is returned as TimeGroup. You could instead return the maximum, or even the value of the grouping column itself, converted back to a time for display.
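As a rough illustration of what this grouping expression does, here is a small Python sketch (not part of the answer). It measures forward from the same reference time and uses the last two rows of the question's data. The calendar date is an assumption, because the question shows only times, and the names REFERENCE, BUCKET_SECONDS and bucket are made up for the example.

```python
from datetime import datetime

REFERENCE = datetime(2012, 1, 1)  # same reference time as in the query
BUCKET_SECONDS = 30

def bucket(ts: datetime) -> int:
    """Whole 30-second periods elapsed between REFERENCE and ts."""
    return int((ts - REFERENCE).total_seconds()) // BUCKET_SECONDS

# Last two rows from the question; the date part is assumed.
a = datetime(2012, 1, 1, 16, 46, 21, 323000)
b = datetime(2012, 1, 1, 16, 46, 35, 323000)

gap_seconds = (b - a).total_seconds()
print(gap_seconds, bucket(a), bucket(b))
```

The two rows are 14 seconds apart but fall on opposite sides of a bucket boundary, so a fixed-bucket GROUP BY would keep them separate. That is the behaviour the question is trying to avoid.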


Looking at the comments above, I assume that Chris's first scenario is the one you need (all 3 rows are grouped together: rows 1 and 3 are not within 30 seconds of each other, but each is within 30 seconds of row 2). I also assume that each row in your table has a unique identifier called "id". You can do the following:

  1. Create a grouping key by checking whether the previous row in the same partition is more than 30 seconds before the current row (i.e. decide whether to start a new group or continue the previous one). We will call this parent_id.
  2. Sum quantity by parent_id (plus any other aggregates).

The code may look like this:

 select
     sub.parent_id,
     sub.cusip,
     min(sub.timestamp) min_timestamp,
     sum(sub.quantity) quantity
 from (
     select
         base_sub.*,
         -- rows that continue a group inherit the id of the most recent group start
         case
             when base_sub.self_parent_id is not null then base_sub.self_parent_id
             else lag(base_sub.self_parent_id) ignore nulls over (
                 partition by base_sub.cusip
                 order by base_sub.timestamp, base_sub.id
             )
         end parent_id
     from (
         select
             my_table.id,
             my_table.cusip,
             my_table.timestamp,
             my_table.quantity,
             lag(my_table.timestamp) over (
                 partition by my_table.cusip
                 order by my_table.timestamp, my_table.id
             ) previous_timestamp,
             -- a row starts a new group when the gap to the previous row exceeds 30 seconds;
             -- reusing the previous_timestamp alias here relies on the dialect allowing it
             case
                 when datediff(
                     second,
                     nvl(previous_timestamp, to_date('1900/01/01', 'yyyy/mm/dd')),
                     my_table.timestamp
                 ) > 30 then my_table.id
                 else null
             end self_parent_id
         from my_table
     ) base_sub
 ) sub
 group by sub.parent_id, sub.cusip
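To try the two steps on the question's sample rows, here is a minimal sketch using Python's standard sqlite3 module. It is not the query above: SQLite has no datediff or ignore nulls, so it swaps in a flag-and-running-sum variant of the same idea (mark each row that starts a new group, then take a running sum of the marks to get a group number). The table name trades, the column ts, the id values and the function name group_close_rows are assumptions for illustration. Window functions need SQLite 3.25 or later.

```python
import sqlite3

# Sample rows from the question; the id values are made up for illustration.
ROWS = [
    (1, "BE0000310194", 100, "16:20:49.000"),
    (2, "BE0000314238", 50, "16:38:38.110"),
    (3, "BE0000314238", 50, "16:46:21.323"),
    (4, "BE0000314238", 50, "16:46:35.323"),
]

QUERY = """
WITH flagged AS (
    -- Step 1a: flag rows whose gap to the previous row (same cusip) exceeds 30 s.
    -- The first row of each cusip has no previous row, so the comparison is NULL
    -- and the ELSE branch flags it as a group start.
    SELECT id, cusip, quantity, ts,
           CASE WHEN (julianday(ts) - julianday(
                        LAG(ts) OVER (PARTITION BY cusip ORDER BY ts, id)
                     )) * 86400.0 <= 30
                THEN 0 ELSE 1 END AS new_group
    FROM trades
),
grouped AS (
    -- Step 1b: a running sum of the flags numbers the groups within each cusip.
    SELECT *,
           SUM(new_group) OVER (
               PARTITION BY cusip ORDER BY ts, id
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS grp
    FROM flagged
)
-- Step 2: aggregate per group.
SELECT cusip, MIN(ts) AS min_timestamp, SUM(quantity) AS quantity
FROM grouped
GROUP BY cusip, grp
ORDER BY cusip, min_timestamp
"""

def group_close_rows(rows):
    """Load rows into an in-memory table and return the grouped result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE trades ("
            "id INTEGER PRIMARY KEY, cusip TEXT, quantity INTEGER, ts TEXT)"
        )
        conn.executemany("INSERT INTO trades VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()

result = group_close_rows(ROWS)
for row in result:
    print(row)
```

On the sample data this yields one row for BE0000310194 and two for BE0000314238 (quantities 50 and 100), each with its minimum timestamp, which matches the result requested in the question. The gap is computed in floating point via julianday, so a gap of exactly 30 seconds may land on either side of the threshold.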

Source: https://habr.com/ru/post/1444472/

