I'm looking for a framework, a combination of frameworks, best practices, or a tutorial on visualizing large datasets using Hadoop.
I am not looking for tooling to visualize the mechanics of running Hadoop jobs or managing disk space on Hadoop. I am looking for an approach or guide for visualizing the data stored in HDFS itself, using graphs, charts, and the like.
For example, let's say I have a set of data points stored in several files in HDFS, and I would like to show a histogram of that data. Is my only option to write a custom MapReduce job that works out which bucket each point falls into, writes the resulting counts to a file, and then to use a graphics library to visualize that file?
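To make the custom approach concrete, here is a rough sketch of what I imagine, written in plain Python rather than as a real Hadoop job. The names `map_point`, `reduce_counts`, and `render`, the `BUCKET_WIDTH` value, and the sample data are all made up for illustration. It assumes each input line holds a single number. In practice the map and reduce steps would run on the cluster and a charting library would replace the text output.

```python
import math
from collections import Counter

BUCKET_WIDTH = 10.0  # hypothetical bucket size; choose one that suits the data


def map_point(line, width=BUCKET_WIDTH):
    """Map step: turn one text line holding a number into (bucket_start, 1)."""
    value = float(line.strip())
    bucket_start = math.floor(value / width) * width
    return bucket_start, 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each bucket."""
    totals = Counter()
    for bucket_start, count in pairs:
        totals[bucket_start] += count
    return dict(sorted(totals.items()))


def render(histogram, width=BUCKET_WIDTH):
    """Draw a crude text histogram; a charting library would replace this."""
    rows = []
    for start, count in histogram.items():
        rows.append(f"[{start:g}, {start + width:g}) {'#' * count}")
    return "\n".join(rows)


if __name__ == "__main__":
    # Stand-in for lines read from files in HDFS (made-up values).
    sample = ["3.2", "7.9", "12.5", "18.0", "25.1", "11.0"]
    hist = reduce_counts(map_point(line) for line in sample)
    print(render(hist))
```

The point of the sketch is that only the small per-bucket counts ever need to leave the cluster; the plotting step works on a handful of rows no matter how large the input is.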
Do I need to build a custom solution, or is someone else already doing this? I have searched online, but I could not find anything directly related to it.
Thanks for the help.