Program sketch
- I create a HiveContext, hiveContext.
- In this context, I create a DataFrame df from a JDBC relational table.
- I register the DataFrame df via df.registerTempTable("TESTTABLE").
- I start HiveThriftServer2 via HiveThriftServer2.startWithContext(hiveContext).
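For reference, the steps above can be sketched roughly as follows. This is a minimal sketch against the Spark 1.x API, not the exact program: the JDBC URL, host, and database name are placeholders I made up, the source table name test is taken from the query log shown further down, and the JDBC driver is assumed to be on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

object ThriftServerSketch {
  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val conf = new SparkConf().setAppName("thrift-server-sketch")
    val sc = new SparkContext(conf)

    // Step 1: create the HiveContext.
    val hiveContext = new HiveContext(sc)

    // Step 2: create a DataFrame from a JDBC relational table.
    // Assumption: the URL below is a hypothetical placeholder.
    val df = hiveContext.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://dbhost:5432/testdb")
      .option("dbtable", "test")
      .load()

    // Step 3: register the DataFrame under the name used in the queries.
    df.registerTempTable("TESTTABLE")

    // Step 4: expose the context over the Thrift JDBC/ODBC endpoint.
    HiveThriftServer2.startWithContext(hiveContext)
  }
}
```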
TESTTABLE contains 1,000,000 records with two columns, ID (INT) and NAME (VARCHAR):
+-----+--------+
| ID  | NAME   |
+-----+--------+
| 1   | Hello  |
| 2   | Hello  |
| 3   | Hello  |
| ... | ...    |
+-----+--------+
Using Beeline, I connect to the SQL endpoint of the HiveThriftServer (on port 10000) and run a query, for example:
SELECT * FROM TESTTABLE WHERE ID='3'
When I check the database's query log of executed SQL statements, I see:
SELECT \"ID\",\"NAME\" FROM test;
Thus, no predicate pushdown takes place, since the WHERE clause is missing.
Questions
This leads to the following questions:
- Why is predicate pushdown not performed?
- Can this be changed without using registerTempTable?
- If so, how? Or is this a known limitation of HiveThriftServer?
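By contrast, if I create the DataFrame df in a Spark SQLContext and call
df.filter(df("ID") === 3).show()
the query log shows
SELECT \"ID\",\"NAME\" FROM test WHERE ID = 3;
so in this case the predicate is pushed down.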
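One way to narrow this down might be to print the query plans for both paths and compare which filters stay in Spark and which reach the JDBC source. The following is only a sketch against the Spark 1.x API under stated assumptions: the JDBC URL is a hypothetical placeholder, and it uses a plain SQLContext rather than the running HiveThriftServer2, so the plans may not match what the Thrift server produces. Note that the Beeline query compares the INT column ID with the quoted literal '3', while the DataFrame call uses the integer 3; the extended plan should show whether that difference introduces a cast around ID.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object PushdownPlanSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("pushdown-plan-sketch"))
    val sqlContext = new SQLContext(sc)

    // Assumption: hypothetical placeholder URL; "test" is the table name
    // that appears in the database query log.
    val df = sqlContext.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://dbhost:5432/testdb")
      .option("dbtable", "test")
      .load()
    df.registerTempTable("TESTTABLE")

    // DataFrame API with an integer literal (the case that pushes down).
    df.filter(df("ID") === 3).explain(true)

    // SQL through the temp table with the quoted literal used in Beeline.
    sqlContext.sql("SELECT * FROM TESTTABLE WHERE ID='3'").explain(true)

    // SQL through the temp table with an integer literal, for comparison.
    sqlContext.sql("SELECT * FROM TESTTABLE WHERE ID = 3").explain(true)

    sc.stop()
  }
}
```

Comparing the three plans separates two variables that the original observations mix together: SQL via registerTempTable versus the DataFrame API, and a string literal versus an integer literal.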