Spark on YARN + Secured HBase

You are not alone in the quest for Kerberos auth to HBase from Spark; cf. SPARK-12279. A little-known fact is that Spark now obtains Hadoop "auth tokens" (delegation tokens) for YARN, HDFS, Hive, and HBase on startup. These tokens are then broadcast to the executors, so that they don't have to deal with Kerberos auth, keytabs, etc. again.
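Concretely, passing a principal and keytab at submit time is what lets Spark obtain (and renew) those tokens for the job. A sketch, with illustrative names; the exact property key varies by Spark version (older releases used `spark.yarn.security.credentials.hbase.enabled` instead of `spark.security.credentials.hbase.enabled`), and the HBase token provider only activates when a Kerberos-enabled hbase-site.xml is on the driver classpath:

```shell
# Submit with a principal/keytab so Spark can obtain and renew
# delegation tokens for HDFS/Hive/HBase on behalf of the job.
# user@EXAMPLE.COM, the keytab path, and com.example.MyApp are placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --principal user@EXAMPLE.COM \
  --keytab /path/to/user.keytab \
  --conf spark.security.credentials.hbase.enabled=true \
  --class com.example.MyApp myapp.jar
```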

There are 0 datanode(s) running and no node(s) are excluded in this operation

Two things worked for me.

STEP 1: Stop Hadoop and clean the temp files from hduser:

    sudo rm -R /tmp/*

You may also need to delete and recreate /app/hadoop/tmp (mostly needed when I change the Hadoop version, e.g. from 2.2.0 to 2.7.0):

    sudo rm -r /app/hadoop/tmp
    sudo mkdir -p /app/hadoop/tmp
    sudo chown hduser:hadoop /app/hadoop/tmp
    sudo chmod 750 /app/hadoop/tmp
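After cleaning up and restarting, you can check whether the datanodes actually registered this time (assuming the standard Hadoop 2.x scripts and CLI are on the PATH):

```shell
# Restart HDFS, then ask the namenode which datanodes it can see.
start-dfs.sh
# "Live datanodes (N)" should now report a non-zero N.
hdfs dfsadmin -report | grep -i "live datanodes"
```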

Getting java.io.IOException: Failed to rename while writing to an HDFS path

You can do all the selects in one single job: build one Dataset per item and union them into a single table.

    Dataset<Row> resultDs = givenItemList.parallelStream()
        .map(item -> {
            // NOTE: the original query has no FROM clause; add your table there.
            // Java does not interpolate $item, so build the string explicitly.
            String query = "select " + item + " as itemCol, avg(" + item
                + ") as mean group by year";
            return sparkSession.sql(query);
        })
        .reduce(Dataset::union)
        .get();
    saveDsToHdfs(hdfsPath, resultDs);
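The map/reduce-union pattern itself can be sketched without Spark, using plain lists in place of Datasets; all names here are illustrative stand-ins, not Spark API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class UnionReduce {
    // Stand-in for sparkSession.sql(query): one "result table" per item.
    static List<String> runQuery(String item) {
        return Arrays.asList(item + ":mean");
    }

    // Stand-in for Dataset.union: concatenates two result tables.
    static List<String> union(List<String> a, List<String> b) {
        List<String> out = new ArrayList<>(a);
        out.addAll(b);
        return out;
    }

    public static void main(String[] args) {
        List<String> givenItemList = Arrays.asList("price", "qty", "tax");

        // Same shape as the Spark snippet: map each item to a result,
        // then reduce pairwise with union. union is associative, so
        // reducing a parallel stream still preserves encounter order.
        Optional<List<String>> result = givenItemList.parallelStream()
                .map(UnionReduce::runQuery)
                .reduce(UnionReduce::union);

        System.out.println(result.get());
        // prints [price:mean, qty:mean, tax:mean]
    }
}
```

Because `reduce` requires an associative operation, the same code is correct whether you use `stream()` or `parallelStream()`; with real Datasets the `sql` calls are lazy anyway, so parallelism at this stage buys little.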