How to unit test HBase in a Spark Scala stream

I am trying to unit test doSomethingRdd, which requires reading some reference data from HBase inside an RDD transformation.

def doSomethingRdd(in: DStream[String]): DStream[String] = {
    in.map(i => {
        val cell = HBaseUtil.getCell("myTable", "myRowKey", "myFamily", "myColumn")
        i + cell.getOrElse("")
    })
}

object HBaseUtil {
    def getCell(tableName: String, rowKey: String, columnFamily: String, column: String): Option[String] = {
        val HBaseConn = ConnectionPool.getConnection()
        // the rest of the code uses HBaseConn
        // to get an HBase cell and convert it to a String
    }
}
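
For reference, the kind of test I would like to write looks roughly like the sketch below; the queueStream setup and names such as doSomethingRddTest are just my own placeholders for feeding the DStream locally. As the code stands, the map closure calls the real HBaseUtil singleton, so this test would need a live HBase connection:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import scala.collection.mutable

// Local streaming context; queueStream feeds the DStream without any external source.
val conf = new SparkConf().setMaster("local[2]").setAppName("doSomethingRddTest")
val ssc = new StreamingContext(conf, Seconds(1))

val input = mutable.Queue(ssc.sparkContext.parallelize(Seq("a", "b")))
val result = doSomethingRdd(ssc.queueStream(input))

// Collect each batch so the output can be asserted on;
// today this fails unless HBase is reachable from the test.
result.foreachRDD(rdd => rdd.collect().foreach(println))

ssc.start()
ssc.awaitTerminationOrTimeout(3000)
ssc.stop(stopSparkContext = true, stopGracefully = false)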

I read this article from Cloudera, but I have some problems with their recommended approaches.

The first thing I tried was to use ScalaMock to mock the HBaseUtil.getCell method so that I could bypass the HBase connection. I also applied the workaround for mocking a singleton object suggested in this article, and updated my code as shown below. However, doSomethingRdd failed because the mock of HBaseUtil is not serializable, which is also explained by Paul Butcher in his answer.

def doSomethingRdd(in: DStream[String], hbaseUtil: HBaseUtilBody = HBaseUtil): DStream[String] = {
    in.map(i => {
        val cell = hbaseUtil.getCell("myTable", "myRowKey", "myFamily", "myColumn")
        i + cell.getOrElse("")
    })
}

trait HBaseUtilBody {
    def getCell(tableName: String, rowKey: String, columnFamily: String, column: String): Option[String] = {
        val HBaseConn = ConnectionPool.getConnection()
        // the rest of the code uses HBaseConn
        // to get an HBase cell and convert it to a String
    }
}

object HBaseUtil extends HBaseUtilBody
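
One direction I am considering, instead of a ScalaMock mock, is a plain hand-written stub that extends HBaseUtilBody and is marked Serializable so the map closure can capture it. I am not sure whether this is the right pattern; StubHBaseUtil and its canned-values map are my own placeholders, not part of the real code:

// Hand-written, serializable test double; returns canned values instead of hitting HBase.
class StubHBaseUtil(canned: Map[String, String]) extends HBaseUtilBody with Serializable {
    override def getCell(tableName: String, rowKey: String,
                         columnFamily: String, column: String): Option[String] =
        canned.get(s"$tableName/$rowKey/$columnFamily/$column")
}

// A test could then pass the stub explicitly:
// doSomethingRdd(stream, new StubHBaseUtil(Map("myTable/myRowKey/myFamily/myColumn" -> "ref")))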

I think retrieving HBase data in an RDD transformation is a fairly common pattern, but I still don't know the recommended way to unit test it without a real HBase connection.

Source: https://habr.com/ru/post/1654726/

