Refresh or delete streaming buffer tables in BigQuery?

Question

I get the following error when I try to delete records from a table that was created through the GCP console and populated using the table insert function of the BigQuery Node.js client.

UPDATE or DELETE DML statements are not supported over table stackdriver-360-150317:my_dataset.users with streaming buffer

The table was created without any streaming features. From what I read in the documentation: "Tables that have been written to recently via BigQuery Streaming (tabledata.insertall) cannot be modified using UPDATE or DELETE statements."

Does this mean that once a record has been inserted into the table with this function, there is no way to delete records at all? If so, does the table need to be deleted and recreated? If not, could you suggest a workaround for this problem?

Thanks!

Including the newer error message for SEO: "UPDATE or DELETE statement over table ... would affect rows in the streaming buffer, which is not supported" - Fh

+20

google-cloud-platform google-bigquery

Diego Mar 29 '17 at 6:26

2 answers

Make sure your filters do not match data that may still be in the streaming buffer.

For example, this query fails while data is being streamed to the table:

 DELETE FROM `project.dataset.table` WHERE id LIKE '%-%'

 Error: UPDATE or DELETE statement over table project.dataset.table would affect rows in the streaming buffer, which is not supported

This can be fixed by deleting only old entries:

 DELETE FROM `project.dataset.table` WHERE id LIKE '%-%' AND ts < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 40 MINUTE)

 4282 rows affected.
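The same filter can be appended programmatically to any DELETE. The following is only a minimal sketch, not part of the original answer: the helper name `build_safe_delete` is hypothetical, and the `ts` column and 40-minute margin are assumptions carried over from the example above (a larger margin may be needed for your table).

```python
def build_safe_delete(table: str, condition: str,
                      ts_column: str = "ts", margin_minutes: int = 40) -> str:
    """Return a DELETE statement limited to rows older than the margin.

    Hypothetical helper. Assumes `ts_column` is a TIMESTAMP column holding
    the time each row was streamed, so that the age filter excludes rows
    that may still be in the streaming buffer.
    The arguments are interpolated as-is: pass trusted values only.
    """
    if margin_minutes <= 0:
        raise ValueError("margin_minutes must be positive")
    return (
        f"DELETE FROM `{table}` "
        f"WHERE ({condition}) "
        f"AND {ts_column} < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), "
        f"INTERVAL {margin_minutes} MINUTE)"
    )


sql = build_safe_delete("project.dataset.table", "id LIKE '%-%'")
print(sql)
```

The original condition is wrapped in parentheses so that an `OR` inside it cannot bypass the age filter.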

+6

Felipe Hoffa Nov 27 '18 at 8:06

Pentium10 · Accepted Answer · 2017-03-29T06:46:14+0000

To check whether a table has a streaming buffer, look in the tables.get response for a section named streamingBuffer. Alternatively, when streaming to a partitioned table, rows in the streaming buffer have a NULL value in the _PARTITIONTIME pseudo-column, so you can check even with a simple WHERE query.
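Both checks can be sketched in a few lines. This is an illustration under assumptions: `has_streaming_buffer` is a hypothetical helper that takes the tables.get response already parsed into a dict, the two sample responses are hand-written rather than real API output, and the table name in the query is a placeholder.

```python
def has_streaming_buffer(table_resource: dict) -> bool:
    """True if a parsed tables.get response contains a streamingBuffer section."""
    return "streamingBuffer" in table_resource


# Illustrative, hand-written responses (not real API output).
streamed = {
    "id": "project:dataset.table",
    "streamingBuffer": {"estimatedRows": "12", "estimatedBytes": "3456"},
}
loaded = {"id": "project:dataset.table"}

print(has_streaming_buffer(streamed))
print(has_streaming_buffer(loaded))

# For an ingestion-time partitioned table, rows still in the buffer
# can be counted with a plain WHERE filter on the pseudo-column.
BUFFERED_ROWS_SQL = (
    "SELECT COUNT(*) AS buffered_rows "
    "FROM `project.dataset.table` "
    "WHERE _PARTITIONTIME IS NULL"
)
```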

Streamed data is available for real-time analysis within a few seconds of the first streaming insert into a table, but it can take up to 90 minutes to become available for copy, export, and other operations. You will probably have to wait up to 90 minutes for the whole buffer to be persisted on the cluster. You can use queries to check whether the streaming buffer is empty, as you mentioned.
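If waiting is acceptable, the wait can be expressed as a polling loop. This is a rough sketch, not an official recipe: `wait_for_empty_buffer` and its parameters are hypothetical, `fetch_table` stands for whatever call returns the parsed tables.get response in your setup, and the 90-minute default simply mirrors the figure quoted above.

```python
import time
from typing import Callable


def wait_for_empty_buffer(
    fetch_table: Callable[[], dict],
    timeout_s: float = 90 * 60,
    poll_s: float = 60,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the table metadata has no streamingBuffer section.

    Returns True once the buffer is gone, False if the timeout expires first.
    `sleep` and `clock` are injectable so the loop can be exercised without
    real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if "streamingBuffer" not in fetch_table():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


# Demo with canned metadata: the buffer disappears on the third poll.
_responses = iter([{"streamingBuffer": {}}, {"streamingBuffer": {}}, {}])
drained = wait_for_empty_buffer(lambda: next(_responses), sleep=lambda s: None)
print(drained)
```

This is only likely to finish if nothing keeps streaming into the table in the meantime; with continuous inserts, filtering out recent rows in the DML statement is probably the more practical route.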

If you used a load job to create the table, it would not have had a streaming buffer at that point, but you have probably streamed some rows into it since.

See the other answer for how to work with tables that currently have a streaming buffer. Just use WHERE to filter out the last few minutes of data and your queries will work. - fh
