Using rank() in Spark SQL

I need some pointers on using rank().

I retrieved a column from the dataset to do the ranking:

Column inputCol = inputDataset.apply("Colname");
Dataset<Row> DSColAwithIndex = inputDSAAcolonly.withColumn("df1Rank", rank());

DSColAwithIndex.show();

I can sort the column and then add an index column to get the rank, but I'm curious about the syntax and usage of rank().

2 answers

A window specification must be provided for rank():

val w = org.apache.spark.sql.expressions.Window.orderBy("date") //some spec    

val leadDf = inputDSAAcolonly.withColumn("df1Rank", rank().over(w))

Edit: here is the Java version of the answer, since the OP is using Java.

import org.apache.spark.sql.expressions.Window;
import org.apache.spark.sql.expressions.WindowSpec;
import static org.apache.spark.sql.functions.rank;

WindowSpec w = Window.orderBy(colName);
Dataset<Row> leadDf = inputDSAAcolonly.withColumn("df1Rank", rank().over(w));
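To clarify what rank() computes once a window ordering is in place: rows that tie on the ordering column share a rank, and the rank after a tie group skips ahead (unlike dense_rank). Spark itself is too heavy to demo inline, so here is a plain-Java sketch of that semantics; the class and method names are illustrative, not part of any Spark API.

```java
import java.util.Arrays;

public class RankDemo {
    // SQL-style rank(): tied values share a rank, and the next distinct
    // value's rank jumps to its 1-based position (leaving gaps after ties).
    static int[] sqlRank(int[] sortedValues) {
        int[] ranks = new int[sortedValues.length];
        for (int i = 0; i < sortedValues.length; i++) {
            if (i > 0 && sortedValues[i] == sortedValues[i - 1]) {
                ranks[i] = ranks[i - 1];  // tie: reuse previous rank
            } else {
                ranks[i] = i + 1;         // rank = 1-based position
            }
        }
        return ranks;
    }

    public static void main(String[] args) {
        // Input is already sorted, as Window.orderBy would arrange it.
        int[] values = {10, 20, 20, 30};
        System.out.println(Arrays.toString(sqlRank(values))); // [1, 2, 2, 4]
    }
}
```

Note the gap: the two 20s both get rank 2, and 30 gets rank 4, not 3 — the same behavior you will see in the df1Rank column above when the ordering column has duplicates.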

I was searching for how to apply a rank to my data frame in Java.

Using the Java version from the answer above,

import org.apache.spark.sql.expressions.Window;
import org.apache.spark.sql.expressions.WindowSpec;
import static org.apache.spark.sql.functions.rank;

WindowSpec w = Window.orderBy(colName);
Dataset<Row> leadDf = inputDSAAcolonly.withColumn("df1Rank", rank().over(w));

worked for me, thanks gaurav.


Source: https://habr.com/ru/post/1671561/
