FAILED: error in semantic analysis: column found in more than one table / subqueries

CREATE EXTERNAL TABLE old_events (day STRING, foo STRING, count STRING, internal_id STRING) PARTITIONED BY (ds string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '${INPUT}'; CREATE EXTERNAL TABLE events (internal_id, foo STRING, count STRING) PARTITIONED BY (ds string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '${OUTPUT}'; INSERT OVERWRITE TABLE events SELECT e2.internal_id, e2.foo, count(e1.foo) FROM old_events e2 LEFT OUTER JOIN old_events e1 ON e1.foo = e2.foo WHERE e1.event = 'event1' AND e2.event = 'event2' AND ds = date_sub('${DAY}',1) GROUP BY e2.internal_id, e2.foo; 

FAILED: semantic analysis error: ds column found in more than one table / subqueries

I get this error when adding a ds variable that retrieves the current date. As I implemented with the above script to work in the date section.

+4
source share
1 answer

You need to add the ds alias to the WHERE . ex, ds = date_sub('${DAY}',1) to e2.ds = date_sub('${DAY}',1) .

To clarify your problem a bit, here is a small example that shows the same behavior

 CREATE EXTERNAL TABLE example (a INT, b INT) LOCATION '${OUTPUT}'; SELECT * FROM example e1 JOIN example e2 ON e1.a = e2.a WHERE b = 5; 

This creates the same error:

 FAILED: SemanticException Column b Found in more than One Tables/Subqueries 

The problem is that column b exists both in example with an alias and in e1 and e2 . You and I might know that if you attach example to yourself by column a , then e1.b will be the same as e2.b , so it does not need an alias, but Hive does not know this, so you need to choose one to eliminate any ambiguity. It doesn’t matter if b section column or not.

+8
source

Source: https://habr.com/ru/post/1481383/


All Articles