We are a gaming company that stores events (up to 1 gig per day) in bigquery. Events overlap throughout the month and applications to reduce request costs.
Now to our problem.
Our current solution supports the addition of new event types, which leads to new versions of the table schema. These versions are also added to the tables.
those. events_app1_v2_201308 and events_app1_v2_201308
If we add events with new column types in September, we also get events_app1_v3_201309
We wrote code that detects the involved tables (for a date range) and combines them into an aqua comma separeted FROM.
But I just realized that this will NOT work when we create unions in different versions of the event tables.
Anyone who has a smart decision how to handle this !?
We are now investigating whether JSON structures can help us. The current solution is just flat columns. [timestamp, eventId, value, value, value, ...]
From https://developers.google.com/bigquery/query-reference#from
Note. Unlike many other SQL-based systems, BigQuery uses comma syntax to indicate table joins, not joins. Does this mean that you can run a query across multiple tables with compatible schemas ?? in the following way:
source share