Spark SQL and timezones - How to convert a unix timestamp to a localized timestamp

In a Spark DataFrame, I need to convert a Unix epoch timestamp column (e.g. 1509102527 = GMT: Friday, October 27, 2017 11:08:47 AM) to a localized timestamp, so I can get the local hour in a specific time zone.
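
To make the expectation concrete, this is what I mean by "localized", checked with plain java.time outside of Spark (a sanity-check snippet, not part of the actual job):

import java.time.{Instant, ZoneId}

val instant = Instant.ofEpochSecond(1509102527L)            // 2017-10-27T11:08:47Z
val local = instant.atZone(ZoneId.of("Europe/Amsterdam"))   // 2017-10-27T13:08:47+02:00
val localHour = local.getHour                               // 13 -- the value I want Spark to produce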

Is there a Spark SQL function that can take a Unix timestamp and return a localized java.sql.Timestamp?

I already tried the from_unixtime function, but it returns a timestamp localized to the default system time zone of the machine the code runs on. The only solution I have found so far is to convert that timestamp back to UTC and then from UTC to the target time zone. Below is a unit test that passes with this workaround, but there should be a better way to do it.

test("timezone localization should not change effective unix timestamp") {
  import org.apache.spark.sql.functions._

  val df = Seq(1509102527)
    .toDF("unix_timestamp")
    .withColumn("machine_localised_timestamp", from_unixtime('unix_timestamp))
    .withColumn("utc_timestamp", to_utc_timestamp('machine_localised_timestamp, TimeZone.getDefault().getID()))
    .withColumn("local_time", from_utc_timestamp('utc_timestamp, "Europe/Amsterdam"))
    .withColumn("local_hour", hour('local_time))
    .withColumn("reverted_unix_timestamp", unix_timestamp('local_time))

  df.show(false)

  val row = df.collect()(0)
  row(row.fieldIndex("unix_timestamp")) shouldBe 1509102527
  row(row.fieldIndex("reverted_unix_timestamp")) shouldBe 1509102527
  row(row.fieldIndex("local_hour")) shouldBe 13
}
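
What I would really like is an approach that does not depend on the machine time zone at all. One idea I have not fully verified (a sketch, assuming Spark 2.2 or later, where the spark.sql.session.timeZone setting exists) is to pin the session time zone explicitly, so that from_unixtime and hour both use the target zone instead of the JVM default:

test("localize via an explicit session time zone (sketch, not verified)") {
  import org.apache.spark.sql.functions._
  import spark.implicits._

  // Assumption: spark.sql.session.timeZone (Spark 2.2+) overrides the JVM default
  // for timestamp parsing/formatting functions such as from_unixtime and hour.
  spark.conf.set("spark.sql.session.timeZone", "Europe/Amsterdam")

  val df = Seq(1509102527)
    .toDF("unix_timestamp")
    // from_unixtime now formats the epoch seconds in Europe/Amsterdam, and the
    // cast back to timestamp plus hour() use the same session time zone, so the
    // machine default never comes into play.
    .withColumn("local_time", from_unixtime('unix_timestamp).cast("timestamp"))
    .withColumn("local_hour", hour('local_time))

  df.collect()(0).getAs[Int]("local_hour") shouldBe 13
}

The obvious drawback is that the session time zone is a global setting for the SparkSession, so this only helps when the whole job targets a single time zone, which is why I am still looking for a per-column function.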

Source: https://habr.com/ru/post/1688287/
