Sort a Spark Dataset / Result Set

I am trying to get the list of columns of a Hive table and store the result in a Spark DataFrame.

var my_column_list = hiveContext.sql(s""" SHOW COLUMNS IN $my_hive_table""")

But I cannot sort the result of SHOW COLUMNS alphabetically. I tried both sort and orderBy().

How can I sort the result alphabetically?

Update: added a sample of my code

import org.apache.spark.{ SparkConf, SparkContext }
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)
hiveContext.sql("USE my_test_db")

var lv_column_list = hiveContext.sql(s""" SHOW COLUMNS IN MYTABLE""")
//WARN LazyStruct: Extra bytes detected at the end of the row! Ignoring similar problems

lv_column_list.show // works fine
lv_column_list.orderBy("result").show // error arises here
3 answers

Instead of SHOW COLUMNS I used DESC, which returns the column list in a column named col_name that can be sorted:

var lv_column_list = hiveContext.sql(s"""DESC MYTABLE""")
lv_column_list.select("col_name").orderBy("col_name").show
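If a plain Scala collection is all that is needed, the names can also be collected to the driver and sorted there. A minimal sketch, assuming lv_column_list is the DataFrame returned by DESC MYTABLE and that its first column is col_name:

```scala
// Collect the column names to the driver and sort them alphabetically.
// Assumes lv_column_list comes from hiveContext.sql("DESC MYTABLE").
val sortedNames: Array[String] = lv_column_list
  .select("col_name")
  .collect()
  .map(_.getString(0))
  .sorted

sortedNames.foreach(println)
```

Collecting is fine here because a table's column list is always small; for large result sets you would keep the sort in the DataFrame instead.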

SHOW COLUMNS returns its output as a DataFrame with a single column named result, so you can sort on that column:

val df = hiveContext.sql(s""" SHOW COLUMNS IN $my_hive_table """)
df.orderBy("result").show
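If orderBy("result") fails with an unresolved-column error, the output column may simply be named differently in your Spark version (for example col_name instead of result). Printing the schema shows the actual name, and sorting by whatever the first column is called sidesteps the guess; a sketch under that assumption:

```scala
// Inspect the actual schema of the SHOW COLUMNS result, then sort by
// the first (only) column, whatever its name is in this Spark version.
df.printSchema()
df.orderBy(df.columns.head).show()
```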

It is not clear how you called sort or orderBy. Try the following; note that asc must be imported from org.apache.spark.sql.functions:

import org.apache.spark.sql.functions.asc

df.sort(asc("column_name"))
df.orderBy(asc("column_name"))

Source: https://habr.com/ru/post/1660177/

