The WAL is designed for recovery, NOT for duplicating data.
Read on for the details.
In HBase, a Store hosts a MemStore and 0 or more StoreFiles (HFiles). A Store corresponds to a column family of a table for a given region.
The Write Ahead Log (WAL) records all changes to data in HBase to file-based storage. If a RegionServer crashes or becomes unavailable before the MemStore is flushed, the WAL ensures that the changes to the data can be replayed.
With a single WAL per RegionServer, the RegionServer must write to the WAL serially, because HDFS files must be sequential. This makes the WAL a performance bottleneck.
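To make the replay idea concrete, here is a minimal toy sketch in plain Java. It is not HBase code: the class WalReplaySketch and all of its methods are hypothetical names made up for illustration, and the real WAL lives in HDFS and is rolled and archived rather than simply cleared. It only shows the ordering (log first, then memory) and why the log is enough to rebuild unflushed edits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: hypothetical names, not part of the HBase API.
class WalReplaySketch {
    // Stand-in for the WAL: an append-only list of edits on durable storage.
    private final List<String[]> wal = new ArrayList<>();
    // Stand-in for the MemStore: in-memory data that is lost on a crash.
    private Map<String, String> memStore = new TreeMap<>();
    // Stand-in for flushed StoreFiles (HFiles): survives a crash.
    private final Map<String, String> storeFiles = new TreeMap<>();

    void put(String row, String value) {
        wal.add(new String[] {row, value}); // 1. record the edit in the log first
        memStore.put(row, value);           // 2. then apply it in memory
    }

    // Persist the in-memory edits; the logged copies are no longer needed.
    // (Simplification: a real WAL is rolled and archived, not cleared in place.)
    void flush() {
        storeFiles.putAll(memStore);
        memStore.clear();
        wal.clear();
    }

    // Simulate a RegionServer crash: everything held only in memory is gone.
    void crash() {
        memStore = new TreeMap<>();
    }

    // Recovery: re-apply every logged edit to rebuild the in-memory state.
    void replay() {
        for (String[] edit : wal) {
            memStore.put(edit[0], edit[1]);
        }
    }

    String get(String row) {
        String value = memStore.get(row);
        return value != null ? value : storeFiles.get(row);
    }

    public static void main(String[] args) {
        WalReplaySketch store = new WalReplaySketch();
        store.put("row1", "v1");
        store.flush();            // row1 is now in a store file
        store.put("row2", "v2");  // row2 exists only in the WAL and the MemStore
        store.crash();
        System.out.println("after crash:  row2=" + store.get("row2"));
        store.replay();
        System.out.println("after replay: row2=" + store.get("row2"));
    }
}
```

Note that the replay only restores edits that were never flushed. It is a recovery path, not a second copy of the table.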
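The WAL can be disabled to relieve this bottleneck. This is done per mutation from the HBase client:
Mutation.writeToWAL(false)
(In older client versions the setter is Mutation.setWriteToWAL(false); newer versions use Mutation.setDurability(Durability.SKIP_WAL) instead.)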
General note: it is common practice to disable the WAL during bulk loads to gain speed. The side effect is that with the WAL turned off there is nothing to replay, so any edits that were still only in memory are lost if the RegionServer fails.
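The trade-off can be sketched with another small toy model in plain Java. Again, this is not HBase code: SkipWalSketch, its put(row, value, writeToWal) signature and crashAndRecover() are hypothetical names used only to illustrate the point. The boolean flag plays the role of the per-mutation WAL switch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: hypothetical names, not part of the HBase API.
class SkipWalSketch {
    private final List<String[]> wal = new ArrayList<>();    // durable log
    private Map<String, String> memStore = new TreeMap<>();  // lost on crash

    // writeToWal=false mimics a mutation that skips the WAL for speed.
    void put(String row, String value, boolean writeToWal) {
        if (writeToWal) {
            wal.add(new String[] {row, value});
        }
        memStore.put(row, value);
    }

    // Simulate a crash followed by WAL replay; returns how many rows came back.
    int crashAndRecover() {
        memStore = new TreeMap<>();
        for (String[] edit : wal) {
            memStore.put(edit[0], edit[1]);
        }
        return memStore.size();
    }

    boolean contains(String row) {
        return memStore.containsKey(row);
    }

    public static void main(String[] args) {
        SkipWalSketch store = new SkipWalSketch();
        // "Bulk load" with the WAL skipped: fast, but nothing is logged.
        for (int i = 0; i < 1000; i++) {
            store.put("bulk-" + i, "value-" + i, false);
        }
        // One ordinary write that does go through the WAL.
        store.put("safe-row", "value", true);

        int recovered = store.crashAndRecover();
        System.out.println("rows recovered after crash: " + recovered);
        System.out.println("bulk-0 recovered: " + store.contains("bulk-0"));
        System.out.println("safe-row recovered: " + store.contains("safe-row"));
    }
}
```

In this sketch only the single logged row comes back after the simulated crash. The 1000 rows written with the flag off are gone, which is exactly the risk you accept when you skip the WAL.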
Moreover, if you use Solr + HBase + Lily, that is, Lily Morphline NRT indexing with HBase, the indexer works off the WAL. If you disable the WAL for performance reasons, Solr NRT indexing will not work, since Lily depends on the WAL.
For more, please browse the architecture section of the HBase documentation.
