In which version does HBase integrate the Spark API?

I read the documentation about Spark and HBase:

http://hbase.apache.org/book.html#spark

I see that the latest stable version is HBase 1.1.2, but I also see that the apidocs are for version 2.0.0-SNAPSHOT and that the Spark section of the apidocs is empty.

I am confused: why do the versions of the apidocs and HBase not match?

My goal is to use Spark with HBase (bulkGet, bulkPut, etc.). How do I know which version of HBase implements these features?

If anyone has additional documentation on this, that would be awesome.

I am on hbase-0.98.13-hadoop1.

2 answers

Spark does not currently ship with the HBase API the way it does for Hive; you have to manually add the HBase jars to Spark's classpath in the spark-defaults.conf file.

See the link below; it has complete information on how to connect to HBase:

http://www.abcn.net/2014/07/lighting-spark-with-hbase-full-edition.html
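As a rough illustration of the classpath setting described above, a spark-defaults.conf fragment might look like the following. The properties spark.driver.extraClassPath and spark.executor.extraClassPath are standard Spark settings, but the directory /opt/hbase/lib is only an assumed install location; replace it with wherever your HBase jars actually live.

```properties
# Assumption: the HBase jars are installed under /opt/hbase/lib (hypothetical path).
# Make them visible to the driver and to every executor.
spark.driver.extraClassPath    /opt/hbase/lib/*
spark.executor.extraClassPath  /opt/hbase/lib/*
```

The same directory must exist on every worker node, since the executor classpath is resolved locally on each machine.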


Below is the main JIRA ticket for integrating Spark into HBase. Its target version is 2.0.0, which is still under development, so you need to either wait for the release or build a version from source yourself.

https://issues.apache.org/jira/browse/HBASE-13992

There are several documentation links inside the ticket.

If you just want to access HBase from a Spark RDD, you can treat it as a regular Hadoop data source, using the HBase-specific TableInputFormat and TableOutputFormat.
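A minimal sketch of the TableInputFormat approach, assuming the HBase jars are on Spark's classpath and an hbase-site.xml is visible to the job. The table name my_table and the object name HBaseReadSketch are made up for illustration. This is a plain Hadoop-style read, not the bulkGet/bulkPut API from the HBase Spark module.

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; not part of HBase or Spark.
object HBaseReadSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("hbase-read-sketch"))

    // Picks up hbase-site.xml from the classpath (ZooKeeper quorum etc.).
    val hbaseConf = HBaseConfiguration.create()
    // "my_table" is an assumed table name; replace it with a real one.
    hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table")

    // Each record is a (row key, row contents) pair.
    val rows = sc.newAPIHadoopRDD(
      hbaseConf,
      classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])

    println(s"Row count: ${rows.count()}")
    sc.stop()
  }
}
```

Writing works along the same lines with TableOutputFormat and saveAsNewAPIHadoopDataset on an RDD of (ImmutableBytesWritable, Put) pairs.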


Source: https://habr.com/ru/post/1237268/

