How to store and query very large data sets (beyond relational databases)

We are currently faced with the problem of efficiently storing and retrieving data from very large data sets (billions of records). We have used MySQL and optimized the system, the OS, RAID, queries, indexes, etc., and we have now hit the point where we want to move on.

I need to make an informed decision about which technology to adopt to solve our data problems. I have looked at Map/Reduce over HDFS, and I have also heard good things about HBase, but I cannot help thinking there are other options. Is there a good comparison of the available technologies and their tradeoffs?

If you have links to share, I would appreciate them.

+3
1 answer

There are a few directions worth considering:

- Scale up a single DB. With serious hardware (well-configured RAID arrays) and an engine such as Oracle, one machine can go a long way. The TPC-H benchmark results give a sense of what single-system setups can handle: http://www.tpc.org/tpch/results/tpch_perf_results.asp.
- Hadoop: HDFS + Map/Reduce + Hive. Hive gives you a SQL-like layer that compiles down to MapReduce jobs. Queries run as batch jobs, so latency is high, but it scales out to very large data sets on commodity hardware.
- MPP (massively parallel processing) databases: parallel engines that speak SQL, such as Netezza, Greenplum, Asterdata, and Vertica. They are commercial and not cheap, but they are built for exactly this scale.
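To make the Hadoop option above concrete, here is a minimal single-process sketch of the map/shuffle/reduce model that Hive compiles queries down to. The sample records and phase functions are hypothetical illustrations, not part of any Hadoop API; in a real cluster the framework reads input from HDFS blocks and performs the shuffle across machines.

```python
from collections import defaultdict

# Hypothetical sample records; in a real Hadoop job these would be
# read from HDFS blocks spread across many machines.
records = [
    ("2009-01-01", "clicks", 3),
    ("2009-01-01", "views", 10),
    ("2009-01-02", "clicks", 5),
    ("2009-01-02", "clicks", 2),
]

def map_phase(record):
    """Map: emit (key, value) pairs; here, (metric, count)."""
    date, metric, count = record
    yield (metric, count)

def reduce_phase(key, values):
    """Reduce: aggregate all values that share a key."""
    return (key, sum(values))

# Shuffle: group mapper output by key (the framework does this in Hadoop).
groups = defaultdict(list)
for record in records:
    for key, value in map_phase(record):
        groups[key].append(value)

results = dict(reduce_phase(k, v) for k, v in groups.items())
print(results)  # {'clicks': 10, 'views': 10}
```

The equivalent Hive query would be a plain `SELECT metric, SUM(count) ... GROUP BY metric`; the point of the sketch is that every such query becomes an embarrassingly parallel map step plus a per-key reduce, which is why throughput scales but per-query latency stays high.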

+4

Source: https://habr.com/ru/post/1786561/