Spark SQL change number format

Calling show in Spark prints the following:

+-----------------------+---------------------------+
|NameColumn             |NumberColumn               |
+-----------------------+---------------------------+
|name                   |4.3E-5                     |
+-----------------------+---------------------------+

Is there a way to change the format of NumberColumn to something like 0.000043?

+9
4 answers

You can use the format_number function:

import org.apache.spark.sql.functions.format_number
df.withColumn("NumberColumn", format_number($"NumberColumn", 5))

Here, 5 is the number of decimal places to show.

Note that format_number returns a string column, as its documentation states:

format_number(Column x, int d)
Formats numeric column x to a format like '#,###,###.##', rounded to d decimal places, and returns the result as a string column.

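If you don't want the comma separators, you can remove them with regexp_replace, which is defined as:

regexp_replace(Column e, String pattern, String replacement)
Replaces all substrings of the specified string value that match pattern with replacement.

Use it as follows: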

import org.apache.spark.sql.functions.regexp_replace
df.withColumn("NumberColumn", regexp_replace(format_number($"NumberColumn", 5), ",", ""))

This removes the commas (,) that appear in large numbers.
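To get a feel for how the number of decimal places and the comma grouping affect the value from the question, here is a plain-Python analogue. It is not Spark code: it assumes that Python's ",.Nf" format specification behaves like the '#,###,###.##' pattern for these inputs, and the helper name format_like_spark is hypothetical.

```python
def format_like_spark(value: float, d: int) -> str:
    """Rough stand-in for format_number(value, d): group thousands, keep d decimals.

    Assumption: Python's ',.Nf' format spec is used only as an analogue here.
    """
    return format(value, f",.{d}f")


# The value from the question, 4.3E-5, with two different settings of d.
five_places = format_like_spark(4.3e-5, 5)
six_places = format_like_spark(4.3e-5, 6)

# A large value shows the grouping commas that the regexp_replace step strips.
grouped = format_like_spark(1234567.891, 2)
ungrouped = grouped.replace(",", "")

print(five_places, six_places, grouped, ungrouped)
```

In this analogue, five decimal places round the value to 0.00004, so six places are needed to display 0.000043.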

+12

You can also use a cast, as follows:

val df = sc.parallelize(Seq(0.000043)).toDF("num")    

df.createOrReplaceTempView("data")
spark.sql("select CAST (num as DECIMAL(8,6)) from data")

Adjust the precision and scale to fit your data.

+4
df.createOrReplaceTempView("table")
outDF=sqlContext.sql("select CAST (num as DECIMAL(15,6)) from table")

Here 6 is the scale, that is, the number of digits after the decimal point.
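The two cast answers differ only in precision: DECIMAL(8,6) versus DECIMAL(15,6). As a rough illustration of what those two numbers mean, the sketch below uses Python's decimal module rather than Spark. The helper fits_decimal is hypothetical, and it only models the usual reading of DECIMAL(precision, scale): scale digits after the point, and at most precision - scale digits before it.

```python
from decimal import Decimal


def fits_decimal(value: str, precision: int, scale: int) -> bool:
    """Check whether value fits DECIMAL(precision, scale) once rounded to scale digits.

    Hypothetical helper for illustration; it does not call Spark.
    """
    quantum = Decimal(1).scaleb(-scale)  # scale=6 gives Decimal('1E-6')
    rounded = Decimal(value).quantize(quantum)
    # Digits left of the point must fit into precision - scale positions.
    return abs(rounded) < Decimal(10) ** (precision - scale)


# The value from the question fits either type.
small_fits_8_6 = fits_decimal("0.000043", 8, 6)

# A value with three integer digits needs more than 8 - 6 = 2 positions.
large_fits_8_6 = fits_decimal("123.5", 8, 6)
large_fits_15_6 = fits_decimal("123.5", 15, 6)

# Rounding to a scale of 6 renders the value without an exponent.
rendered = str(Decimal("0.000043").quantize(Decimal(1).scaleb(-6)))

print(small_fits_8_6, large_fits_8_6, large_fits_15_6, rendered)
```

Under this reading, DECIMAL(8,6) leaves room for only two digits before the decimal point, which is why a wider precision such as 15 is the safer choice for larger values.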

0

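In PySpark you can use round() or bround(). Both return a numeric column, which avoids the problem with ",".

For example:

from pyspark.sql.functions import bround
df.withColumn("NumberColumn", bround("NumberColumn", 5))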
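The difference between round() and bround() is the rounding mode: round() uses HALF_UP and bround() uses HALF_EVEN. The sketch below illustrates those two modes with Python's decimal module instead of Spark. The helper names are hypothetical, and the inputs are passed as strings so that binary floating-point error does not blur the tie case.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_half_up(value: str, places: int) -> Decimal:
    """Round to `places` decimals with ties going away from zero (the mode of round())."""
    quantum = Decimal(1).scaleb(-places)
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


def round_half_even(value: str, places: int) -> Decimal:
    """Round to `places` decimals with ties going to the even digit (the mode of bround())."""
    quantum = Decimal(1).scaleb(-places)
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)


# A tie at the fifth decimal place: the two modes disagree.
tie_up = round_half_up("0.000045", 5)
tie_even = round_half_even("0.000045", 5)

# The value from the question is not a tie, so both modes agree.
question_up = round_half_up("0.000043", 5)
question_even = round_half_even("0.000043", 5)

print(tie_up, tie_even, question_up, question_even)
```

The modes only differ on exact ties, so for a value like 0.000043 either function gives the same rounded result.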
0

Source: https://habr.com/ru/post/1681094/

