Spark AnalysisException cannot resolve "column name" based on input columns

I have a DataFrame with two columns, x and y:

x     y
1     false
1     false
1     true
2     true
2     false
3     null
3     true

I am trying to create a contingency table with the following code, expecting the result below:

myDataFrame.stat.crosstab("x", "y")

x_y  true    false     null
1    1       2         0
2    1       1         0
3    1       0         1

However, I get the following exception: AnalysisException: cannot resolve 'true' given input columns [x, y]

The columns "true", "false", and "null" are created automatically by stat.crosstab at run time. Static analysis cannot detect these new column names without first making a full pass through the data.

I am using Spark 1.6.1.5. Is this a bug? Is there any way to turn off the analyzer?
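For reference, here is a plain-Python sketch (no Spark required) of the contingency table that crosstab is expected to produce from the sample data above. The row data and the y-value order are taken from the question; everything else is illustrative.

```python
from collections import Counter

# Sample (x, y) rows mirroring the DataFrame in the question.
rows = [
    (1, "false"), (1, "false"), (1, "true"),
    (2, "true"), (2, "false"),
    (3, "null"), (3, "true"),
]

# Count each (x, y) pair, then lay the counts out as a contingency
# table: one row per distinct x, one column per distinct y, with 0
# for combinations that never occur.
counts = Counter(rows)
y_values = ["true", "false", "null"]
table = {
    x: {y: counts.get((x, y), 0) for y in y_values}
    for x in sorted({x for x, _ in rows})
}
print(table)
# {1: {'true': 1, 'false': 2, 'null': 0},
#  2: {'true': 1, 'false': 1, 'null': 0},
#  3: {'true': 1, 'false': 0, 'null': 1}}
```

This matches the expected output in the question, so the question is only about why the analyzer rejects the generated column names, not about what crosstab computes.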

Source: https://habr.com/ru/post/1667397/
