Can BigQuery be used as the primary query engine?

I would like to know how feasible it is to use BigQuery as the main query engine for an analytical tool we are developing. Our public API would realistically need to execute at least hundreds of simultaneous SELECT queries through the PHP SDK (potentially against 100M+ rows), but the current documentation suggests that BigQuery is geared more toward infrequent queries than toward high-volume, high-load, on-demand queries.

Some of the companies listed on the Google website seem to be doing similar things, but I have also seen rate limit figures of 20 concurrent requests, which would seem to rule out this use case for a product. Is that right?

2 answers

I'm glad you asked. Ordinary BigQuery users are subject to rate limits on concurrent queries, but there is an option that matches the specific use case you describe: reserved capacity.

With reserved capacity, you get your own "separate cluster", which is not subject to the same restrictions; its limits are defined by you.

Learn more at https://developers.google.com/bigquery/pricing#reserved_cap .


This is an architectural decision. My personal opinion: I would not use BigQuery directly if you expect many different users to hit the API at the same time. That would be expensive and risky. I think you should keep the raw data in BigQuery and find a more efficient way to serve customer requests, perhaps by using a cache or by storing some results / snapshots in Datastore or, possibly, CloudSQL.
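To illustrate the caching idea, here is a minimal sketch in Python of a time-limited in-memory cache that sits between the API and the query backend. It is only an assumption about how such a layer might look: `TTLQueryCache` and `run_query` are hypothetical names, and `run_query` stands in for whatever function actually sends SQL to BigQuery. Repeated identical queries within the time-to-live are answered from memory, so only cache misses count against the concurrent query limit.

```python
import threading
import time
from typing import Any, Callable, Dict, Tuple


class TTLQueryCache:
    """Serve repeated queries from memory; only misses reach the backend."""

    def __init__(
        self,
        run_query: Callable[[str], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # run_query is a hypothetical callable that executes SQL on the backend.
        self._run_query = run_query
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        # Maps SQL text to (time stored, result).
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, sql: str) -> Any:
        now = self._clock()
        with self._lock:
            entry = self._entries.get(sql)
            if entry is not None and now - entry[0] < self._ttl:
                return entry[1]  # fresh cached result, no backend call

        # Cache miss or stale entry. The lock is released while querying, so
        # two simultaneous misses for the same SQL may both reach the backend.
        result = self._run_query(sql)
        with self._lock:
            self._entries[sql] = (self._clock(), result)
        return result


if __name__ == "__main__":
    backend_calls = []

    def fake_run_query(sql: str) -> list:
        # Stand-in for a real BigQuery call; records each backend hit.
        backend_calls.append(sql)
        return [("row", 1)]

    cache = TTLQueryCache(fake_run_query, ttl_seconds=60)
    cache.get("SELECT 1")
    cache.get("SELECT 1")  # answered from the cache
    print("backend calls:", len(backend_calls))
```

In a real deployment with several API servers, the dictionary would likely be replaced by a shared store (the Datastore or CloudSQL options mentioned above), but the lookup-then-fallback flow would stay the same.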


Source: https://habr.com/ru/post/971311/
