How to remove unnecessary things like (), [], single quotes from PySpark output

Question

Hi, I'm new to Spark. I joined two key-value (pair) RDDs and got the following output, which I want to reformat using Spark:

 (676747, (['India', 'Telemart', 'North', 'South', 'Region', 'Area', 'States', '1C-iim'], ((0.0, 'North', 17), (0.0, 'South', 22), (1.0, 'East', 21), (3.0, 'west', 9.0), (7.0, 'MAH', 8.0, (3.0, 'AKL', 9.0), (23.0, 'PNB', 67))))

I want to remove all the brackets and quotes so that the clean output looks like this:

676747,India,Telemart,North,South,Region,Area,States,1C-iim,0.0,North,17,0.0,South,22,1.0,East,21 ......

Please help me achieve the desired result.
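One possible approach, sketched here under assumptions: since the joined record is a nested mix of tuples and lists, it can be flattened recursively and its leaves joined with commas. The helper names `flatten` and `to_csv_line` are made up for illustration, and the sample record is a shortened version of the one above. The functions are plain Python, so they can be checked without a cluster.

```python
def flatten(value):
    """Yield the scalar leaves of arbitrarily nested tuples/lists, in order."""
    if isinstance(value, (list, tuple)):
        for item in value:
            # Recurse into nested containers.
            yield from flatten(item)
    else:
        # Strings and numbers are emitted as they are.
        yield value


def to_csv_line(record):
    """Turn one nested record into a single comma-separated string."""
    return ",".join(str(leaf) for leaf in flatten(record))


# Shortened sample record with the same shape as the join output.
sample = (676747, (['India', 'Telemart'], ((0.0, 'North', 17), (0.0, 'South', 22))))
line = to_csv_line(sample)
print(line)  # 676747,India,Telemart,0.0,North,17,0.0,South,22
```

In Spark, the same function could presumably be applied per record with something like `joined_rdd.map(to_csv_line)`, where `joined_rdd` stands for the RDD produced by the join (a hypothetical name), and the resulting strings written out with `saveAsTextFile`. Because each element is then a plain string, the parentheses, brackets, and quotes from Python's tuple and list representation no longer appear.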

apache-spark pyspark

Deno george Jan 22 '16 at 10:21

No one has answered this question yet.
