I have used GlusterFS before. It has some nice features, but in the end I decided to use HDFS as the distributed file system for Hadoop.
The good thing about GlusterFS is that it does not require separate master and client nodes: every node in the cluster is the same, so there is no single point of failure. Another thing that interests me about GlusterFS is the glusterfs-client module (see http://www.jamescoyle.net/how-to/439-mount-a-glusterfs-volume ). If you want to save a file to GlusterFS, you don't need to go through any GlusterFS-specific API; you just copy the file into the volume mounted by glusterfs-client, which keeps the work very simple.
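To illustrate that point, here is a minimal Python sketch. It assumes the volume has already been mounted locally by glusterfs-client at the hypothetical path /mnt/glusterfs (the mount point is just an example, not part of the original post); after that, storing a file is an ordinary filesystem copy.

```python
# Minimal sketch: writing to a GlusterFS volume that glusterfs-client has
# already mounted at a hypothetical local path. No GlusterFS-specific API
# is needed -- it behaves like any other directory.
import os
import shutil

MOUNT_POINT = "/mnt/glusterfs"  # assumed mount point of the Gluster volume


def save_to_gluster(local_path: str, name: str) -> str:
    """Copy a local file into the mounted GlusterFS volume and return its path."""
    dest = os.path.join(MOUNT_POINT, name)
    shutil.copy(local_path, dest)  # plain copy; the Gluster client handles replication
    return dest


if __name__ == "__main__":
    print(save_to_gluster("report.csv", "report.csv"))
```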
However, I find GlusterFS difficult to integrate with the rest of the Hadoop ecosystem (Spark, MapReduce, etc.), whereas HDFS is supported by virtually every component in that ecosystem. I think GlusterFS is a good choice for building clustered file storage on its own, independently of Hadoop.
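As a rough illustration of that native support, here is a hedged PySpark sketch; it assumes pyspark is installed and that an HDFS NameNode is reachable at the hypothetical address namenode:9000 (both are assumptions for the example, not details from the original post).

```python
# Minimal sketch: Spark resolves hdfs:// paths out of the box, with no
# extra connector, which is the kind of ecosystem support GlusterFS lacks.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hdfs-example").getOrCreate()

# Read a text file straight from HDFS and write results back to HDFS.
df = spark.read.text("hdfs://namenode:9000/data/input.txt")
print(df.count())
df.write.mode("overwrite").text("hdfs://namenode:9000/data/output")

spark.stop()
```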