So, I saw a couple of tutorials for this online, but everyone seems to say something different. In addition, none of them makes clear whether you are supposed to be working directly on a remote cluster or interacting with a remote cluster from your local machine, etc.
My goal is to get my local computer (a Mac) running Pig to work with LZO-compressed files that live on a Hadoop cluster that is already configured to handle LZO files. I have Hadoop installed locally and can fetch files from the cluster using hadoop fs -[command].
I also have Pig installed locally, and it connects to the Hadoop cluster whether I run scripts or just poke around in the grunt shell. I can load and play with non-LZO files just fine. My problem is purely figuring out how to load the LZO files. Maybe I just need to process them through Elephant Bird? I have no idea, and have found only minimal information online.
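For what it's worth, the usual pattern with Elephant Bird is to register its jars and use its LZO loader in the LOAD statement. The jar names/paths below are placeholders, and this assumes the native hadoop-lzo libraries are installed locally (on `java.library.path`) and that `com.hadoop.compression.lzo.LzoCodec` is listed in `io.compression.codecs` — a sketch of what I've pieced together, not something I've confirmed works:

```
-- Register Elephant Bird and its dependencies (versions/paths are placeholders)
REGISTER 'elephant-bird-core-4.x.jar';
REGISTER 'elephant-bird-pig-4.x.jar';
REGISTER 'elephant-bird-hadoop-compat-4.x.jar';

-- LzoTextLoader reads line-oriented .lzo files, decompressing transparently
raw = LOAD '/path/on/cluster/data.lzo'
      USING com.twitter.elephantbird.pig.load.LzoTextLoader()
      AS (line:chararray);

-- Peek at a few records to confirm decompression worked
sample = LIMIT raw 10;
DUMP sample;
```

But I don't know if that's the right approach for my local-Pig-against-remote-cluster setup, which is exactly what I'm asking about.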
So, a short tutorial or answer for this would be awesome, and would hopefully help more people than just me.