java.lang.NoClassDefFoundError: org/apache/spark/Logging

org.apache.spark.Logging is available in Spark version 1.5.2 and earlier; it is not available in 2.0.0. Please change the versions as follows. Note that every Spark artifact must use the same Scala suffix — mixing _2.10 and _2.11 artifacts on one classpath will not work, so _2.11 is used consistently here:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.11</artifactId>
        <version>1.5.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>1.5.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>1.5.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-8_2.11</artifactId>
        <version>1.6.2</version>
    </dependency>

How to query JSON data column using Spark DataFrames?

zero323's answer is thorough, but it misses one approach that is available in Spark 2.1+ and is simpler and more robust than using schema_of_json():

    import org.apache.spark.sql.functions.from_json

    val json_schema = spark.read.json(df.select("jsonData").as[String]).schema
    df.withColumn("jsonData", from_json($"jsonData", json_schema))

Here's the Python equivalent:

    from pyspark.sql.functions import from_json

    json_schema = spark.read.json(df.select("jsonData").rdd.map(lambda x: x[0])).schema
    df.withColumn("jsonData", from_json("jsonData", json_schema))

The problem with schema_of_json(), as zero323 points …
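The key step in the approach above is letting spark.read.json infer a schema from the existing JSON string column before parsing it with from_json. As a rough, plain-Python illustration of what that inference step does (infer_fields is a hypothetical helper written for this sketch, not a Spark API — Spark's real inference also merges nested structures and resolves type conflicts):

```python
import json

def infer_fields(json_strings):
    """Merge the field names and value types seen across a sample of
    JSON strings -- roughly what spark.read.json does when it infers
    a schema from a column of JSON data."""
    fields = {}
    for s in json_strings:
        for key, value in json.loads(s).items():
            # Keep the first type seen for each field; Spark would
            # instead widen conflicting types where possible.
            fields.setdefault(key, type(value).__name__)
    return fields

sample = ['{"a": 1, "b": "x"}', '{"a": 2, "c": true}']
print(infer_fields(sample))  # → {'a': 'int', 'b': 'str', 'c': 'bool'}
```

Because the schema is inferred from the data itself, this works even when the JSON documents do not all share the same set of fields, which is what makes the from_json approach more robust than hard-coding a schema.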