Reading CSV in Pig when a quoted field contains a comma

So, my data looks something like this.

asdf, asdf, "adsf,qwef", asdf 

When I read this data in Pig using

 PigStorage(',') 

it splits "adsf,qwef" into two fields and stores them as

 { "adsf } { qwef" } 

I want the quoted text to be treated as a single field.

What should I do?

I am trying to write a Pig script for this.

1 answer

You should use CSVExcelStorage from Piggybank instead of PigStorage:

 data = LOAD 'my.csv' USING org.apache.pig.piggybank.storage.CSVExcelStorage() AS (...); 

Here ... stands for your field names (the schema).

NOTE: You need to register the Piggybank jar first. Details here: https://cwiki.apache.org/confluence/display/PIG/PiggyBank
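
Putting the two steps together, a minimal sketch might look like the following. The jar path and the field names f1 to f4 are assumptions for illustration only; replace them with the location of piggybank.jar on your system and with your real column names.

```pig
-- Assumed path: point this at wherever piggybank.jar lives in your installation.
REGISTER /path/to/piggybank.jar;

-- f1..f4 are hypothetical names for the four columns in the sample row.
-- CSVExcelStorage uses a comma as its default delimiter and keeps a
-- double-quoted value together even when it contains a comma.
data = LOAD 'my.csv'
       USING org.apache.pig.piggybank.storage.CSVExcelStorage()
       AS (f1:chararray, f2:chararray, f3:chararray, f4:chararray);

-- Print the parsed rows to check that the quoted value arrives as one field.
DUMP data;
```

The sample row has a space between each comma and the next value, including before the opening quote, so it is worth checking the DUMP output on a few real lines to confirm the loader treats them the way you expect.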


Source: https://habr.com/ru/post/1493005/

