Get HBase Row Keys in a range without data extraction?

Is there a way to get the row keys in a given range without actually fetching the column families (CF) and columns associated with those row keys?

To clarify: in my example, the row keys of the table are stock ticker symbols (for example, GOOG), and in our web application we want to populate an autocomplete widget using only the row keys that we have in the database. Obviously, if we fetch all the data (rather than just the stock names) for every stock between G and H when the user types "G", we will strain our system unnecessarily. Any ideas?

+4
5 answers

Take a look at the filters (http://hbase.apache.org/book/client.filter.html), especially KeyOnlyFilter. From the filter package description (http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/package-summary.html):

A filter that will return only the key component of each KV (the value will be overwritten as empty).

To restrict the keys to a specific range, use the Scan(byte[] startRow, byte[] stopRow) constructor. The start row is inclusive and the stop row is exclusive.
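For the autocomplete case, the stop row for a typed prefix can be derived from the prefix itself. The following is a minimal sketch in plain Java with no HBase dependency; the class and method names (PrefixRange, stopRowForPrefix) are hypothetical. It relies on HBase ordering row keys lexicographically by unsigned byte value and on an empty stop row meaning "scan to the end of the table".

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PrefixRange {
    // Returns the smallest key that sorts after every key starting with the prefix,
    // or an empty array (no stop row) if no such key exists.
    static byte[] stopRowForPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // bump the last remaining byte: "G" becomes "H"
        return stop;
    }

    public static void main(String[] args) {
        byte[] start = "G".getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopRowForPrefix(start);
        System.out.println(new String(stop, StandardCharsets.UTF_8)); // prints H
    }
}
```

The typed prefix would be passed as the start row and the computed array as the stop row, so a user typing "G" scans exactly the keys from "G" up to, but not including, "H".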

+7

According to the official documentation, the most efficient way to retrieve only the row keys is to combine two filters: KeyOnlyFilter and FirstKeyOnlyFilter. (I think FirstKeyOnlyFilter ensures that each key is returned only once, even for large, complex rows.) If you only need the keys in a given range, you can add that range to the scan.

Here is some sample code:

    // Both filters must pass: keep only the first cell of each row, with its value stripped.
    FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ALL,
            new FirstKeyOnlyFilter(),
            new KeyOnlyFilter());
    Scan s = new Scan();
    s.setFilter(filters); // Scan has no constructor that takes only a Filter
    // To limit the scan to a range:
    s.setStartRow(startRowKey); // first key in the range
    s.setStopRow(stopRowKey);   // key just after the last key in the range

Source: https://hbase.apache.org/book.html#perf.hbase.client.rowkeyonly

+4

I would create a column family called "empty:" and store empty values in it for all rows. Then you can simply request only the "empty:" column family. This is not ideal, but it is better than loading column families that hold lots of data.

+1

You can use addFamily(byte[] family) or addColumn(byte[] family, byte[] qualifier) to fetch only the relevant data.

0

One approach would be to maintain a separate index table whose keys are all possible FSA states (that is, typed prefixes) for all stocks. Then, the next time a user enters "G", all you have to do is hit this table, and the value fetched can be a comma-separated list of all the tickers associated with G.
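As a rough illustration of that index layout, here is a minimal in-memory sketch in plain Java. A TreeMap stands in for the index table, and the class and method names (PrefixIndex, build) are hypothetical. In HBase, each map entry would presumably be one row keyed by the prefix, with the comma-separated list stored as the cell value.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PrefixIndex {
    // Maps every prefix of every ticker to a comma-separated list of matching tickers.
    static Map<String, String> build(List<String> tickers) {
        Map<String, String> index = new TreeMap<>();
        for (String ticker : tickers) {
            for (int len = 1; len <= ticker.length(); len++) {
                String prefix = ticker.substring(0, len);
                // Append to the existing list for this prefix, or start a new one.
                index.merge(prefix, ticker, (existing, added) -> existing + "," + added);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, String> index = build(Arrays.asList("GE", "GOOG", "GS", "HPQ"));
        System.out.println(index.get("G")); // prints GE,GOOG,GS
    }
}
```

The trade-off is extra storage and extra write work whenever a ticker is added or removed, in exchange for a single-row lookup per keystroke.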

0

Source: https://habr.com/ru/post/1348163/

