, , Apache Hadoop Apache Spark, . .
HBase , HDFS, , .
HBase Hadoop Spark, , - ! HFiles, .
, SQL, , . , (). NoSQL - , , (, , NoSQL) - . , SSD , - . , .
:
.
I think that if you use Apache Spark for data analysis, you should avoid HBase (or Cassandra, or any other database). Such stores can be useful for holding aggregated data for reporting, or for looking up specific records about users or items, but that comes after processing.