How to query JSON data column using Spark DataFrames?

zero323’s answer is thorough but misses one approach that is available in Spark 2.1+ and is simpler and more robust than using schema_of_json():

import org.apache.spark.sql.functions.from_json

val json_schema = spark.read.json(df.select("jsonData").as[String]).schema
df.withColumn("jsonData", from_json($"jsonData", json_schema))

Here’s the Python equivalent:

from pyspark.sql.functions import from_json

json_schema = spark.read.json(df.select("jsonData").rdd.map(lambda x: x[0])).schema
df.withColumn("jsonData", from_json("jsonData", json_schema))

The problem with schema_of_json(), as zero323 points … Read more
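To make that snippet concrete, here is a minimal self-contained sketch written against a recent Spark version, assuming a toy DataFrame whose jsonData column holds JSON strings (the sample records and field names are invented for illustration):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json

val spark = SparkSession.builder().master("local[*]").appName("json-column").getOrCreate()
import spark.implicits._

// Toy DataFrame with a single string column containing JSON documents.
val df = Seq(
  """{"device": "sensor-1", "reading": 42.0}""",
  """{"device": "sensor-2", "reading": 17.5}"""
).toDF("jsonData")

// Infer a schema from the JSON strings themselves, then parse the column in place.
val json_schema = spark.read.json(df.select("jsonData").as[String]).schema
val parsed = df.withColumn("jsonData", from_json($"jsonData", json_schema))

// Nested fields can now be addressed with ordinary column syntax.
parsed.select($"jsonData.device", $"jsonData.reading").show()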

How to store custom objects in Dataset?

Update: this answer is still valid and informative, although things are now better since 2.2/2.3, which add built-in encoder support for Set, Seq, Map, Date, Timestamp, and BigDecimal. If you stick to making types with only case classes and the usual Scala types, you should be fine with just the implicit in SQLImplicits. Unfortunately, virtually … Read more
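For a sense of what "sticking to case classes and the usual Scala types" looks like in practice, here is a small sketch assuming Spark 2.2+ and an invented Measurement case class; importing spark.implicits._ is what brings the built-in encoders into scope:

import org.apache.spark.sql.SparkSession

// Invented example type: a product of ordinary Scala types, so the implicit
// product encoder provided by SQLImplicits can serialize it.
case class Measurement(id: Long, tags: Seq[String], value: BigDecimal)

val spark = SparkSession.builder().master("local[*]").appName("dataset-encoders").getOrCreate()
import spark.implicits._  // built-in encoders, including Seq and BigDecimal since 2.2/2.3

val ds = Seq(
  Measurement(1L, Seq("a", "b"), BigDecimal(1.5)),
  Measurement(2L, Seq("c"), BigDecimal(2.0))
).toDS()

// Typed operations work directly on the case class.
ds.filter(_.value > BigDecimal(1)).show()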

How do I get around type erasure on Scala? Or, why can’t I get the type parameter of my collections?

You can do this using TypeTags (as Daniel already mentions, but I’ll just spell it out explicitly):

import scala.reflect.runtime.universe._

def matchList[A: TypeTag](list: List[A]) = list match {
  case strlist: List[String @unchecked] if typeOf[A] =:= typeOf[String] => println("A list of strings!")
  case intlist: List[Int @unchecked] if typeOf[A] =:= typeOf[Int] => println("A list of ints!")
}

You … Read more
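A slightly expanded, self-contained version of that snippet, with an added fallback case and a couple of example calls (the fallback and the sample lists are illustrative only), shows the effect: the TypeTag context bound carries the element type to runtime, so the pattern guards can test it even though the List itself is erased:

import scala.reflect.runtime.universe._

def matchList[A: TypeTag](list: List[A]): Unit = list match {
  case strlist: List[String @unchecked] if typeOf[A] =:= typeOf[String] =>
    println("A list of strings!")
  case intlist: List[Int @unchecked] if typeOf[A] =:= typeOf[Int] =>
    println("A list of ints!")
  case _ =>
    println("A list of something else")  // added so other element types don't throw MatchError
}

matchList(List("a", "b"))  // prints: A list of strings!
matchList(List(1, 2, 3))   // prints: A list of ints!
matchList(List(1.0, 2.0))  // prints: A list of something else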

What are all the uses of an underscore in Scala?

The ones I can think of are:

Existential types
  def foo(l: List[Option[_]]) = …
Higher-kinded type parameters
  case class A[K[_], T](a: K[T])
Ignored variables
  val _ = 5
Ignored parameters
  List(1, 2, 3) foreach { _ => println("Hi") }
Ignored names of self types
  trait MySeq { _: Seq[_] => }
Wildcard patterns
  Some(5) match … Read more
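Several of those uses fit in one short compilable sketch; the names foo, A, and MySeq follow the fragments above, and the filled-in bodies are illustrative only:

// Existential type: we only care that each element is an Option of something;
// the lambda also uses placeholder syntax.
def foo(l: List[Option[_]]) = l.count(_.isDefined)

// Higher-kinded type parameter: K abstracts over a one-argument type constructor.
case class A[K[_], T](a: K[T])

// Ignored variable and ignored parameter.
val _ = 5
List(1, 2, 3).foreach { _ => println("Hi") }

// Ignored name of a self type.
trait MySeq { _: Seq[_] => }

// Wildcard pattern.
val opt: Option[Int] = Some(5)
opt match {
  case Some(_) => println("got something")
  case None    => println("got nothing")
}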
