I have string data stored in a compressed .gz file that I need to read in PySpark. Below is a snippet of my code:
rdd = sc.textFile("data/label.gz").map(func)
But I could not read this file successfully. How do I read a compressed .gz file? I found a similar question here, but my current Spark version is different from the version in that question. I expect there is some kind of built-in function, as in Hadoop.
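For what it's worth, Spark's `sc.textFile()` normally decompresses files with a recognised extension such as `.gz` transparently, so the snippet above should already yield the decompressed lines. If it does not, one common cause is that the file is not actually a valid gzip archive. The sketch below (plain Python, no Spark required, using a hypothetical sample file) shows how to verify that the file can be decompressed at all:

```python
import gzip
import os
import tempfile

def read_gzip_lines(path):
    """Return the decompressed text lines of a .gz file."""
    # gzip.open in "rt" mode decompresses and decodes in one step,
    # which is roughly what sc.textFile() does per line under the hood.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]

# Build a small sample archive to demonstrate (hypothetical data;
# replace the path with your own data/label.gz to check it).
sample = ["id1\tlabelA", "id2\tlabelB"]
tmp = os.path.join(tempfile.mkdtemp(), "label.gz")
with gzip.open(tmp, "wt", encoding="utf-8") as f:
    f.write("\n".join(sample) + "\n")

print(read_gzip_lines(tmp))
```

If this check fails on your real file, the problem is with the file itself rather than with Spark.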