Why does predicate pushdown not work?

Program sketch

  • I create a HiveContext hiveContext.
  • In this context, I create a DataFrame df from a JDBC relational table.
  • I register the DataFrame df through df.registerTempTable("TESTTABLE").
  • I start HiveThriftServer2 through HiveThriftServer2.startWithContext(hiveContext).
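
For reference, the four steps above could look roughly like the following. This is a minimal sketch, assuming a Spark 1.4 to 1.6 style API (HiveContext, registerTempTable). The JDBC URL and the object name are hypothetical placeholders, the source table name test is taken from the query log quoted further down, and the matching JDBC driver is assumed to be on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

// Hypothetical object name, for illustration only.
object ThriftServerSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("thrift-server-sketch"))

    // Step 1: create the HiveContext.
    val hiveContext = new HiveContext(sc)

    // Step 2: create a DataFrame from a JDBC relational table.
    // The URL below is an assumption; replace it with the real database.
    val df = hiveContext.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://localhost:5432/testdb")
      .option("dbtable", "test")
      .load()

    // Step 3: register the DataFrame under the name used in the SQL queries.
    df.registerTempTable("TESTTABLE")

    // Step 4: expose the context through the Thrift server (Beeline endpoint).
    HiveThriftServer2.startWithContext(hiveContext)
  }
}
```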

TESTTABLE contains 1,000,000 records with the columns ID (INT) and NAME (VARCHAR):

+-----+--------+
| ID  |  NAME  |
+-----+--------+
| 1   | Hello  |
| 2   | Hello  |
| 3   | Hello  |
| ... | ...    |

With Beeline, I access the SQL endpoint (on port 10000) of the HiveThriftServer and execute a query, for example:

SELECT * FROM TESTTABLE WHERE ID='3'

When I check the database's query log of executed SQL statements, I see

/*SQL #:1000000 t:657*/  SELECT \"ID\",\"NAME\" FROM test;

Thus, no predicate pushdown occurs, since the WHERE clause is missing.

Questions

This leads to the following questions:

  • Why is no predicate pushdown performed?
  • Can this be changed without using registerTempTable?
  • If so, how? Or is this a known limitation of HiveThriftServer?

By contrast, when I filter the DataFrame df directly in the Spark SQLContext:

df.filter( df("ID") === 3).show()

/*SQL #:1*/SELECT \"ID\",\"NAME\" FROM test WHERE ID = 3;

Here the WHERE clause reaches the database, so the predicate is pushed down.
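
One way to narrow the difference down might be to compare the plans Spark builds for the SQL path. The following is a diagnostic sketch only, assuming the hiveContext and the TESTTABLE temp table from the program sketch. The object and method names are hypothetical. Whether the quoted literal '3' (a string compared against the INT column) plays any role here is an assumption to check in the printed plans, not a confirmed cause.

```scala
import org.apache.spark.sql.hive.HiveContext

// Hypothetical helper, for illustration only.
object PlanComparison {
  def comparePlans(hiveContext: HiveContext): Unit = {
    // Query as sent through Beeline: string literal against the INT column.
    hiveContext.sql("SELECT * FROM TESTTABLE WHERE ID = '3'").explain(true)

    // Same query with a numeric literal that matches the column type.
    hiveContext.sql("SELECT * FROM TESTTABLE WHERE ID = 3").explain(true)
  }
}
```

explain(true) prints the logical and physical plans to standard output, so the two variants can be compared side by side with the statements that show up in the database query log.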


Source: https://habr.com/ru/post/1615701/

