I am trying to update a column for all rows each time one row has been processed by a UDF.
The example has 3 rows and 6 columns. Column "A" has the same value in all three rows; columns "A" and "B" together identify each row; column "C" holds arrays of letters drawn from a, b, c, d, e; column "D" is the target array to be filled; column "E" holds an integer; column "abcde" is an array of 5 integers giving the counts for the letters a, b, c, d, e.
Each row is passed to the UDF, which updates columns "D" and "abcde" based on columns "C" and "E". The rule: randomly select the number of elements given by "E" from "C" and put them into "D"; after the selection for a row is done, column "abcde" is updated across all rows.
For example, to process the first line, we randomly select one element from ['a','b','c'] to put into "D". Say the system picks 'c' from column "C": the value of "D" for this row becomes ['c'], and "abcde" is updated from [1,3,2,1,1] to [1,3,1,1,1] in all three lines.
Sample data:
with sample as (
select 'y1' as A, 'x1' as B, ['a','b','c'] as C, [] as D, 1 as E, [1,3,2,1,1] as abcde union all
select 'y1','x2',['a','b'],[],2,[1,3,2,1,1] union all
select 'y1','x3',['c','d','e'],[],3,[1,3,2,1,1])
select * from sample order by B
After processing the first line:
with sample as (
select 'y1' as A, 'x1' as B, ['a','b','c'] as C, ['c'] as D, 1 as E, [1,3,1,1,1] as abcde union all
select 'y1','x2',['a','b'],[],2,[1,3,1,1,1] union all
select 'y1','x3',['c','d','e'],[],3,[1,3,1,1,1])
select * from sample order by B
After processing the second line:
with sample as (
select 'y1' as A, 'x1' as B, ['a','b','c'] as C, ['c'] as D, 1 as E, [0,2,1,1,1] as abcde union all
select 'y1','x2',['a','b'],['a','b'],2,[0,2,1,1,1] union all
select 'y1','x3',['c','d','e'],[],3,[0,2,1,1,1])
select * from sample order by B
After processing the third line:
with sample as (
select 'y1' as A, 'x1' as B, ['a','b','c'] as C, ['c'] as D, 1 as E, [0,2,0,0,0] as abcde union all
select 'y1','x2',['a','b'],['a','b'],2,[0,2,0,0,0] union all
select 'y1','x3',['c','d','e'],['c','d','e'],3,[0,2,0,0,0])
select * from sample order by B
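Can this be done with a UDF in BigQuery, given that abcde has to be updated across all rows after each row is processed?
If not with a UDF, is there a way to do it in SQL?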
My attempt so far:
create temporary function selection(A STRING, B STRING, C ARRAY<STRING>, D ARRAY<STRING>, E INT64, abcde ARRAY<INT64>)
returns STRUCT<A STRING, B STRING, C ARRAY<STRING>, D ARRAY<STRING>, E INT64, abcde ARRAY<INT64>>
language js AS """
/*
Pseudocode, not implemented yet.
For each row i in the data:
  randomly select i.E items from i.C, considering only items whose count in i.abcde is greater than 0
  (i.e. only items with a count above 0 in abcde are candidates for the random selection);
  put the selected items in i.D and subtract 1 from the count of each selected item in column 'abcde' FOR ALL ROWS;
  proceed to the next row i+1 until every row is processed.
*/
return {A, B, C, D, E, abcde};
""";
with sample as (
select 'y1' as A, 'x1' as B, ['a','b','c'] as C, CAST([] AS ARRAY<STRING>) as D, 1 as E, [1,3,2,1,1] as abcde union all
select 'y1','x2',['a','b'],[],2,[1,3,2,1,1] union all
select 'y1','x3',['c','d','e'],[],3,[1,3,2,1,1])
select selection(A,B,C,D,E,abcde) from sample order by B
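To make the intended rule concrete, here is a minimal sketch of the pseudocode in plain JavaScript. It is not a working BigQuery solution. It assumes that all rows are handed to one call as an array (a UDF called once per row only sees its own arguments), that the rows are already in processing order, that each item is picked at most once per row, and that the positions of abcde correspond to the letters a to e. The name `selectAll` and the `letters` and `rand` parameters are made up for illustration.

```javascript
// Hypothetical helper, not part of the attempt above.
// rows:    array of {A, B, C, D, E, abcde}, already in processing order.
// letters: the letter counted at each position of abcde (assumed 'a'..'e').
// rand:    source of random numbers in [0, 1); defaults to Math.random.
function selectAll(rows, letters = ['a', 'b', 'c', 'd', 'e'], rand = Math.random) {
  if (rows.length === 0) return [];
  // One shared copy of the counts; every row starts with the same abcde.
  const counts = rows[0].abcde.slice();
  const picked = rows.map(row => {
    // Only letters whose remaining count is above 0 are candidates.
    const candidates = row.C.filter(x => counts[letters.indexOf(x)] > 0);
    const D = [];
    for (let k = 0; k < row.E && candidates.length > 0; k++) {
      // Draw one candidate at random, without replacement within the row.
      const j = Math.floor(rand() * candidates.length);
      const item = candidates.splice(j, 1)[0];
      D.push(item);
      counts[letters.indexOf(item)] -= 1; // visible to all later rows
    }
    return D;
  });
  // Every row reports the final counts, as in the last table of the example.
  return rows.map((row, i) =>
    Object.assign({}, row, { D: picked[i], abcde: counts.slice() }));
}
```

In BigQuery the rows would presumably have to be collected into a single array first (for example with ARRAY_AGG over a STRUCT, ordered by B) and the result unnested again. Whether that is acceptable for the real data size is part of the question.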