Large number of queries and summary duplicate fields

I would like to rotate the "unitId", "firebase_screen_class" fields so that each one appears in a separate column:

SELECT event.name, event_param.value.string_value AS ad_unit, COUNT(*) AS event_count FROM `app_events_20170510`, UNNEST(event_dim) AS event, UNNEST(event.params) as event_param WHERE event.name in ('Ad_requested', 'Ad_clicked', 'Ad_shown') and event_param.key in ('unitId', 'screen_class') GROUP BY 1,2 

I used the following query using legacy SQL, but it does not show the correct aggregation results:

 SELECT event_name, ad_unit, count(*) FROM ( SELECT event_dim.name as event_name, MAX(IF(event_dim.params.key = "firebase_screen_class", event_dim.params.value.string_value, NULL)) WITHIN RECORD as firebase_screen_class, MAX(IF(event_dim.params.key = "unitId", event_dim.params.value.string_value, NULL)) WITHIN RECORD as ad_unit FROM [app_events_20170510] WHERE event_dim.name in ('Ad_requested','Ad_shown', 'Ad_clicked') and event_dim.params.key in ('unitId','screen_class') ) group by 1,2 

I am looking for the following output:

 _________________________________________________________________________ | event_dim.name | unitId | screen_class | count_events| |__________________|________________|______________________|_____________| | Ad_requested | hpg | socialFeed | 520 | |__________________|________________|______________________|_____________| | Ad_shown | hpg | chat | 950 | |__________________|________________|______________________|_____________| | Ad_requested | hni | chat | 740 | |__________________|________________|______________________|_____________| 

All events Ad_requested , Ad_shown and Ad_clicked have parameters with the same keys ( unitId , screen_class ), as well as the same values ​​for each key ( unitId : hpg , hni / screen_class : socialFeed , chat )

0
source share
1 answer

Below for standard SQLQuery SQL

 #standardSQL WITH `aggregation` AS ( SELECT event.name, event_param.key, COUNT(*) AS event_count FROM `app_events_20170510`, UNNEST(event_dim) AS event, UNNEST(event.params) AS event_param WHERE event.name IN ('Ad_requested', 'Ad_clicked', 'Ad_shown') AND event_param.key IN ('unitId', 'firebase_screen_class','house') GROUP BY 1, 2 ) SELECT name, MAX(IF(key = 'unitId', event_count, NULL)) AS unitId, MAX(IF(key = 'firebase_screen_class', event_count, NULL)) AS firebase_screen_class, MAX(IF(key = 'house', event_count, NULL)) AS house FROM `aggregation` GROUP BY name 

Update based on explanation in comments:

 #standardSQL SELECT event.name, (SELECT value.string_value FROM UNNEST(event.params) WHERE key = 'unitId') AS unitId, (SELECT value.string_value FROM UNNEST(event.params) WHERE key = 'firebase_screen_class') AS firebase_screen_class, (SELECT value.string_value FROM UNNEST(event.params) WHERE key = 'house') AS house, COUNT(1) AS event_count FROM `app_events_20170510`, UNNEST(event_dim) AS event WHERE event.name IN ('Ad_requested', 'Ad_clicked', 'Ad_shown') GROUP BY 1,2,3,4 

... Out of curiosity, I tried to replicate the query using legacy SQL ... -

Added version for BigQuery Legacy SQL (for educational purposes only and in the hope of helping those who are considering switching to standard SQL, because here are two versions of the same task)

 #legacySQL SELECT name, product_id, source, firebase_event_origin, COUNT(1) AS event_count FROM ( SELECT event_dim.name AS name, MAX(IF(event_dim.params.key = 'unitId', event_dim.params.value.string_value, NULL)) WITHIN RECORD AS unitId, MAX(IF(event_dim.params.key = 'firebase_screen_class', event_dim.params.value.string_value, NULL)) WITHIN RECORD AS firebase_screen_class, MAX(IF(event_dim.params.key = 'house', event_dim.params.value.string_value, NULL)) WITHIN RECORD AS house, FROM FLATTEN([project:dataset.app_events_20170510], event_dim) AS event WHERE event_dim.name IN ('Ad_requested', 'Ad_clicked', 'Ad_shown') ) GROUP BY 1, 2, 3, 4 
+3
source

Source: https://habr.com/ru/post/1258089/


All Articles