BigQuery standard SQL query returns the wrong result

This query:

SELECT x 
FROM dataset.table_a 
WHERE x NOT IN (SELECT x FROM dataset.table_b)

returns zero records, even though:

  • Field x in table_a contains 1,326,932 distinct string values

  • Field x in table_b contains 18,885 distinct string values

I do not understand why. Moreover, in BigQuery legacy SQL, the same query returns the correct answer.

+4
2 answers

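This is most likely caused by how NULL interacts with NOT IN. Standard SQL follows the SQL standard here, unlike legacy SQL: if the subquery returns even one NULL, the NOT IN condition can never be TRUE, so no row passes the filter.

From the documentation for the IN operator (https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#in-operators):

IN with a NULL in the IN-list can only return TRUE or NULL, never FALSE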
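
The NULL behavior of IN and NOT IN can be checked directly on literal values. This is a minimal sketch; the values 'a' and 'b' and the column aliases are made up for illustration and do not come from the tables in the question:

```sql
#standardSQL
SELECT
  -- No match, but the list holds a NULL: the result is NULL, not FALSE
  'b' IN ('a', CAST(NULL AS STRING))     AS in_no_match,
  -- Negating NULL still gives NULL, so this is never TRUE
  'b' NOT IN ('a', CAST(NULL AS STRING)) AS not_in_no_match,
  -- A real match makes IN TRUE, so NOT IN is FALSE
  'a' NOT IN ('a', CAST(NULL AS STRING)) AS not_in_match;
```

A WHERE clause keeps only rows whose condition is TRUE, so a single NULL in table_b.x is enough for x NOT IN (...) to reject every row of table_a.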

To get the result you expect, use NOT EXISTS instead:

SELECT x
FROM dataset.table_a AS t
WHERE NOT EXISTS (
  SELECT 1 FROM dataset.table_b
  WHERE t.x = x
);
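
To see the difference between the two forms without touching real data, here is a small sketch that uses inline tables. The contents of table_a and table_b, the alias b, and the output column names are assumptions chosen for illustration:

```sql
#standardSQL
WITH table_a AS (
  SELECT 'a' AS x UNION ALL SELECT 'b' UNION ALL SELECT 'c'
),
table_b AS (
  -- One matching value plus one NULL
  SELECT 'a' AS x UNION ALL SELECT CAST(NULL AS STRING)
)
SELECT
  -- NOT IN: the NULL in table_b makes the condition FALSE or NULL for every row
  (SELECT COUNT(*)
   FROM table_a
   WHERE x NOT IN (SELECT x FROM table_b)) AS not_in_rows,
  -- NOT EXISTS: only an actual match excludes a row, so 'b' and 'c' remain
  (SELECT COUNT(*)
   FROM table_a AS t
   WHERE NOT EXISTS (
     SELECT 1 FROM table_b AS b WHERE t.x = b.x
   )) AS not_exists_rows;
```

With this sample data, not_in_rows is 0 and not_exists_rows is 2.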
+3

Alternatively, add WHERE NOT x IS NULL to the subquery so that NULLs are filtered out:

#standardSQL
SELECT x 
FROM `dataset.table_a` 
WHERE x NOT IN (SELECT x FROM `dataset.table_b` WHERE NOT x IS NULL)  

I would also add DISTINCT to the subquery:

#standardSQL
SELECT x 
FROM `dataset.table_a` 
WHERE x NOT IN (SELECT DISTINCT x FROM `dataset.table_b` WHERE NOT x IS NULL)
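
The NOT IN form with the NULL filter and the NOT EXISTS form agree whenever table_a.x is never NULL, but they differ when it is. The sketch below shows this on inline sample data; the table contents and the variant labels are assumptions for illustration only:

```sql
#standardSQL
WITH table_a AS (
  SELECT 'a' AS x UNION ALL SELECT 'b' UNION ALL SELECT CAST(NULL AS STRING)
),
table_b AS (
  SELECT 'a' AS x UNION ALL SELECT CAST(NULL AS STRING)
)
-- Filtered NOT IN: NULL NOT IN ('a') is NULL, so the NULL row of table_a is dropped
SELECT 'not_in_filtered' AS variant, x
FROM table_a
WHERE x NOT IN (SELECT DISTINCT x FROM table_b WHERE NOT x IS NULL)
UNION ALL
-- NOT EXISTS: NULL = b.x is never TRUE, so the NULL row of table_a is kept
SELECT 'not_exists' AS variant, t.x
FROM table_a AS t
WHERE NOT EXISTS (SELECT 1 FROM table_b AS b WHERE t.x = b.x);
```

Here the not_in_filtered variant returns only 'b', while the not_exists variant returns 'b' and the NULL row. Which behavior is right depends on whether NULL values of x in table_a should count as "not present in table_b".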
+3

Source: https://habr.com/ru/post/1676364/

