HBase distributed scanner

The "API usage example" on the Getting Started page in the HBase documentation provides an example of how to use the scanner:

Scanner scanner = table.getScanner(new String[] {"myColumnFamily:columnQualifier1"});

RowResult rowResult = scanner.next();
while (rowResult != null) {
  // ...
  rowResult = scanner.next();
}

As I understand it, this code runs on a single machine (the name node), so the scanning and filtering are not distributed; only data storage and data loading are. How can I use a distributed scanner that runs separately on each node?

What is the best practice for fast data filtering? Thanks.

+3
2 answers

The client API scanner is not distributed. For distributed processing, use MapReduce (hbase.mapred).

+1


See getRegionLocations on HTable: http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#getRegionLocations()
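One plausible reading of this pointer is that region boundaries let the client split a single scan into one scan per region and run them in parallel. Below is a minimal sketch of that pattern only, under loud assumptions: it makes no HBase calls at all, a sorted in-memory map stands in for the table, and `KeyRange`, `scanRange`, and `parallelScan` are hypothetical names invented for illustration. In a real client the ranges would come from the region start and end keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

public class ParallelRegionScan {

    /** Hypothetical stand-in for one region: covers startKey (inclusive) to endKey (exclusive). */
    static final class KeyRange {
        final String startKey;
        final String endKey;

        KeyRange(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Scans one key range of the in-memory "table" and returns the keys whose value passes the filter. */
    static List<String> scanRange(NavigableMap<String, String> table, KeyRange range,
                                  Predicate<String> filter) {
        List<String> hits = new ArrayList<>();
        NavigableMap<String, String> slice =
                table.subMap(range.startKey, true, range.endKey, false);
        for (Map.Entry<String, String> e : slice.entrySet()) {
            if (filter.test(e.getValue())) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    /** Runs one scan task per range on a thread pool and concatenates the results in range order. */
    static List<String> parallelScan(NavigableMap<String, String> table, List<KeyRange> ranges,
                                     Predicate<String> filter) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, ranges.size()));
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (KeyRange range : ranges) {
                Callable<List<String>> task = () -> scanRange(table, range, filter);
                futures.add(pool.submit(task));
            }
            List<String> result = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                result.addAll(f.get()); // blocks until that range has been scanned
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // The table is only read while the tasks run, so sharing the TreeMap is safe here.
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("apple", "keep");
        table.put("kiwi", "drop");
        table.put("pear", "keep");

        List<KeyRange> ranges = new ArrayList<>();
        ranges.add(new KeyRange("a", "h"));
        ranges.add(new KeyRange("h", "p"));
        ranges.add(new KeyRange("p", "{")); // "{" sorts just after "z"

        System.out.println(parallelScan(table, ranges, v -> v.equals("keep")));
    }
}
```

Note that this still filters on the client side; it only parallelizes the scanning. Each task holds its matches in memory until they are merged, so very large result sets would need to be streamed or bounded instead.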


+1

Source: https://habr.com/ru/post/1712448/

