Easy idiomatic way to define Ordering for a simple case class

My personal favorite method is to make use of the provided implicit ordering for Tuples, as it is clear, concise, and correct:

    case class A(tag: String, load: Int) extends Ordered[A] {
      // Required as of Scala 2.11 for reasons unknown – the companion to Ordered
      // should already be in implicit scope
      import scala.math.Ordered.orderingToOrdered
      def
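Because the snippet above is cut off, here is a minimal sketch of the same idea: delegate the comparison to the standard library's ordering for tuples. It is a variant, not the original answer's code. It summons `Ordering[(String, Int)]` explicitly instead of relying on the `orderingToOrdered` import, and the `TupleOrderingDemo` object is hypothetical.

```scala
// Compare instances of A by (tag, load), reusing the built-in tuple ordering.
case class A(tag: String, load: Int) extends Ordered[A] {
  def compare(that: A): Int =
    Ordering[(String, Int)].compare((tag, load), (that.tag, that.load))
}

// Hypothetical demo object, not part of the original answer.
object TupleOrderingDemo {
  def main(args: Array[String]): Unit = {
    val sorted = List(A("b", 1), A("a", 2), A("a", 1)).sorted
    println(sorted) // List(A(a,1), A(a,2), A(b,1))
  }
}
```

Since `A` extends `Ordered[A]`, `sorted` finds an `Ordering[A]` without any further implicit definitions. Ties on `tag` are broken by `load`.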

Scala case class inheritance

My preferred way of avoiding case class inheritance without code duplication is somewhat obvious: create a common (abstract) base class:

    abstract class Person {
      def name: String
      def age: Int
      // address and other properties
      // methods (ideally only accessors since it is a case class)
    }
    case class Employer(val name: String, val age: Int,
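The excerpt stops partway through the `Employer` definition, so the following is only a sketch of how the pattern might look when written out. The extra fields (`company`, `salary`), the `Employee` class, and the `PersonDemo` object are hypothetical and were chosen purely for illustration.

```scala
// Shared accessors live in an abstract base class; no case class extends another.
abstract class Person {
  def name: String
  def age: Int
}

// Constructor parameters of a case class are vals, so they implement the
// abstract accessors directly. The third field of each class is hypothetical.
final case class Employer(name: String, age: Int, company: String) extends Person
final case class Employee(name: String, age: Int, salary: Int) extends Person

// Hypothetical demo object.
object PersonDemo {
  def describe(p: Person): String = p match {
    case Employer(n, _, c) => s"$n runs $c"
    case Employee(n, _, s) => s"$n earns $s"
    case other             => other.name // Person is not sealed, so keep a fallback
  }

  def main(args: Array[String]): Unit = {
    println(describe(Employer("Ann", 40, "Acme")))
    println(describe(Employee("Bob", 30, 1000)))
  }
}
```

Each case class keeps its own generated `equals`, `hashCode`, and `copy`, while code that only needs `name` and `age` can work against `Person`.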

How to define schema for custom type in Spark SQL?

Spark 2.0.0+: UserDefinedType has been made private in Spark 2.0.0 and, as of now, it has no Dataset-friendly replacement. See: SPARK-14155 (Hide UserDefinedType in Spark 2.0). Most of the time a statically typed Dataset can serve as a replacement. There is a pending Jira, SPARK-7768, to make the UDT API public again with target version 2.4. See

What is the difference between Scala’s case class and class?

Case classes can be seen as plain and immutable data-holding objects that should exclusively depend on their constructor arguments. This functional concept allows us to:

- use a compact initialization syntax (Node(1, Leaf(2), None))
- decompose them using pattern matching
- have equality comparisons implicitly defined

In combination with inheritance, case classes are used to mimic algebraic datatypes.
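The three properties above can be sketched with a small tree type. The excerpt only shows the expression `Node(1, Leaf(2), None)`, so the field names and types below, the `Tree` trait, and the `TreeDemo` object are assumptions made to fit that expression.

```scala
// A sealed trait plus case classes mimics an algebraic data type.
sealed trait Tree
final case class Leaf(value: Int) extends Tree
// Assumed shape: a value, a mandatory left child, and an optional right child.
final case class Node(value: Int, left: Tree, right: Option[Tree]) extends Tree

// Hypothetical demo object.
object TreeDemo {
  // Pattern matching decomposes each case into its constructor arguments.
  def sum(t: Tree): Int = t match {
    case Leaf(v)       => v
    case Node(v, l, r) => v + sum(l) + r.map(sum).getOrElse(0)
  }

  def main(args: Array[String]): Unit = {
    val tree = Node(1, Leaf(2), None) // compact initialization, no `new`
    println(sum(tree))                       // 3
    println(tree == Node(1, Leaf(2), None))  // true: equality is structural
  }
}
```

Because `Tree` is sealed, the compiler can warn when a match in `sum` misses a case, which is one reason this encoding is commonly used for algebraic data types.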

Case class equality in Apache Spark

This is a known issue with the Spark REPL. You can find more details in SPARK-2620. It affects multiple operations in the Spark REPL, including most transformations on the PairwiseRDDs. For example:

    case class Foo(x: Int)
    val foos = Seq(Foo(1), Foo(1), Foo(2), Foo(2))
    foos.distinct.size // Int = 2
    val foosRdd = sc.parallelize(foos, 4)
    foosRdd.distinct.count // Long

