How to read from HBase using Spark

A basic example of reading HBase data with Spark (Scala); you can also write this in Java:

    import org.apache.hadoop.hbase.client.{HBaseAdmin, Result}
    import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor}
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.spark._

    object HBaseRead {
      def main(args: Array[String]): Unit = {
        // Run locally with two worker threads
        val sparkConf = new SparkConf().setAppName("HBaseRead").setMaster("local[2]")
        val sc = new SparkContext(sparkConf)
        // Picks up hbase-site.xml from the classpath
        val conf = HBaseConfiguration.create()
        val …

Spark on YARN + Secured HBase

You are not alone in the quest for Kerberos auth to HBase from Spark; cf. SPARK-12279. A little-known fact is that Spark now generates Hadoop “auth tokens” for YARN, HDFS, Hive, and HBase on startup. These tokens are then broadcast to the executors, so that they don’t have to mess with Kerberos auth, keytabs, etc. again. …

HBase Kerberos connection renewal strategy

A Kerberos TGT has a lifetime (e.g. 12h) and a renewable lifetime (e.g. 7 days). As long as the ticket is still valid and still renewable, you can request a “free” renewal (no password required), and the lifetime counter is reset (e.g. 12h to go, again). The Hadoop authentication library spawns a …
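
To make the lifetime vs. renewable-lifetime distinction concrete, here is a minimal sketch in plain Scala that models the rule described above. It does not call any Kerberos or Hadoop API; the names `TgtSketch`, `Tgt`, `issue`, `canRenew` and `renew` are hypothetical. It also assumes that a renewed ticket's expiry is capped at the renewable limit.

```scala
import java.time.{Duration, Instant}

object TgtSketch {
  // Hypothetical model of a TGT: when it expires, and until when it may be renewed.
  final case class Tgt(expiresAt: Instant, renewUntil: Instant) {
    // A "free" renewal is only possible while the ticket is both valid and renewable.
    def canRenew(now: Instant): Boolean =
      now.isBefore(expiresAt) && now.isBefore(renewUntil)

    // Renewal resets the lifetime counter, capped at the renewable limit (assumption).
    def renew(now: Instant, lifetime: Duration): Option[Tgt] =
      if (!canRenew(now)) None
      else {
        val candidate = now.plus(lifetime)
        val newExpiry = if (candidate.isAfter(renewUntil)) renewUntil else candidate
        Some(copy(expiresAt = newExpiry))
      }
  }

  // Issue a fresh ticket with the given lifetime and renewable lifetime.
  def issue(now: Instant, lifetime: Duration, renewableLifetime: Duration): Tgt =
    Tgt(now.plus(lifetime), now.plus(renewableLifetime))

  def main(args: Array[String]): Unit = {
    val lifetime = Duration.ofHours(12)
    val t0 = Instant.parse("2024-01-01T00:00:00Z")
    val ticket = issue(t0, lifetime, Duration.ofDays(7))

    // Renewing after 10h: the ticket is still valid, so the 12h counter restarts.
    println(ticket.renew(t0.plus(Duration.ofHours(10)), lifetime).map(_.expiresAt))
    // prints Some(2024-01-01T22:00:00Z)

    // Trying after 13h: the ticket has already expired, so no free renewal.
    println(ticket.renew(t0.plus(Duration.ofHours(13)), lifetime))
    // prints None
  }
}
```

In this model, once a ticket has expired or passed its renewable limit, `renew` returns `None`, which corresponds to needing a fresh login with a password or keytab.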