Bits of Shapeless part 2: Generic Derivation
This is the second installment in a series of articles about Shapeless. The first post explained heterogeneous lists (HLists) and how to use them to do type-level recursion. Building on this ground, we will now talk about Generic derivation of case classes and sealed trait hierarchies. Along the way we shall cover other related subjects such as singleton-typed literals, products, coproducts (and labelled versions of those) and shapeless syntax goodies.
At the end of this post you will have built a basic Json rendering library for sealed hierarchies of case classes. Actually, if you aim to be a rockstar Scala coder you have to do a Json library one way or another :).
It is assumed that the reader knows about json, shapeless heterogeneous lists and how to work with them (see my last post), knows about implicit resolution and is acquainted with the concept of type classes in Scala.
Setup
Further from Shapeless itself, we will use the JAWN Json parsing and AST
library.
Some imports are assumed:
import shapeless._
import jawn.ast._
HLists and beyond
Let’s look at some very nice syntax shapeless provides us to manipulate objects as if they were HLists:
import syntax.std.tuple._
// import syntax.std.tuple._
(1,2,3).head
// res0: Int = 1
(1,2,3).tail
// res1: (Int, Int) = (2,3)
How is this possible?
(1,"a", 1.0).productElements // give me an HList
// res2: shapeless.::[Int,shapeless.::[String,shapeless.::[Double,shapeless.HNil]]] = 1 :: a :: 1.0 :: HNil
Maybe there is an implicit conversion from tuple to HList involved somehow… But:
case class C(x: Int, y: String)
// defined class C
val c = C(1, "hello")
// c: C = C(1,hello)
c.length
// res3: shapeless.Succ[shapeless.Succ[shapeless._0]] = Succ()
c.head
// res4: Int = 1
c.productElements
// res5: shapeless.::[Int,shapeless.::[String,shapeless.HNil]] = 1 :: hello :: HNil
As you might know, both tuples and case classes are products, so we may assume that the machinery involved is the same.
Let us go back a bit and explain how all these things were magically converted to HLists:
Generic and products
Here I present you the Generic
type class:
trait Generic[T] extends Serializable {
type Repr
def to(t : T) : Repr
def from(r : Repr) : T
}
What this essentially means is that instances of Generic[T]
can convert back and forth from T
to Repr
– a generic representation of T
. However two questions arise:
- How do we get instances of
Generic[T]
, further from implementing them by hand? - And shouldn’t be
Repr
some kind ofHList
?
To answer those:
val gen = Generic[(Int, String)]
// gen: shapeless.Generic[(Int, String)]{type Repr = shapeless.::[Int,shapeless.::[String,shapeless.HNil]]} = anon$macro$24$1@753b3812
gen.to((1, "hello"))
// res6: gen.Repr = 1 :: hello :: HNil
Note the type of the gen
variable: the inner type Repr
is automatically derived for us. This Generic
type might be strange for some. There exists an associated Generic.Aux[T, Repr]
, which is more commonly used, as you will see below. If you are not familiar with the Aux pattern, this is a good explanation, and this answer explains its applicability in shapeless.
val genc = Generic[C]
// genc: shapeless.Generic[C]{type Repr = shapeless.::[Int,shapeless.::[String,shapeless.HNil]]} = anon$macro$27$1@6dee40f3
genc.to(c)
// res7: genc.Repr = 1 :: hello :: HNil
genc.from(1 :: "bye" :: HNil)
// res8: C = C(1,bye)
Hopefully you are now convinced that Generic
instances can be derived for product types, namely tuples and case classes. The exact implementation details are not important for common usage, but for the brave ones here is the code.
Meanwhile, case classes convey more information than a product of elements. They associate a label with each element. How can this concept be represented in a generic form?
Types for All Literals
What if we had specific types for all literals? For instance, a single type for the literal 1
which is a subtype of integer? You might think “ah, nice, good for you”. Believe this will become useful.
Shapeless provides singleton typed literals – Witness
– for this purpose:
Witness(1)
// res9: shapeless.Witness.Aux[Int(1)] = shapeless.Witness$$anon$1@2febff2f
Witness("hello")
// res10: shapeless.Witness.Aux[String("hello")] = shapeless.Witness$$anon$1@dea9afe
Notice the types of these expressions, they refine integers and other literal types.
And they can be put to work:
// A function that only accepts the integer 2:
def just2(arg: Witness.`2`.T) = arg
// just2: (arg: Int(2))Int
just2(2)
// res12: Int = 2
It fails, at compile-time, for every integer except 2.
just2(3)
// <console>:23: error: type mismatch;
// found : Int(3)
// required: Int(2)
// just2(3)
// ^
just2(0)
// <console>:23: error: type mismatch;
// found : Int(0)
// required: Int(2)
// just2(0)
// ^
This Witness.<literal>
weird syntax is just Scala’s dynamic member lookup along with more macro-based machinery. Details here.
Last but not the least look at the signature of the following function:
def singletonValue[K <: Symbol](implicit ev: Witness.Aux[K]): ev.T = ev.value
// singletonValue: [K <: Symbol](implicit ev: shapeless.Witness.Aux[K])ev.T
singletonValue[Witness.`'name`.T]
// res15: Symbol with shapeless.tag.Tagged[String("name")] = 'name
This expression computes the Witness
and singleton type for the symbol 'name
. If we associate a value and its type with a singleton typed symbol, we have a good starting point to handle case classes in a Generic way.
Labelling values with types
As you may already have guessed, shapeless provides exactly what we need to label our values:
Object Labelled {
//...
type FieldType[K, +V] = V with KeyTag[K, V]
trait KeyTag[K, +V]
//...
}
FieldType[K,V]
specialises the type V
, tagging it with the K
type. And K
, can be a symbol, tagged with the literal symbol’s name. Actually K
is a singleton type which covers what we need to make a better Generic representation of case classes (and other types):
case class Person(name: String, age: Int)
// defined class Person
val p1 = Person("John", 30)
// p1: Person = Person(John,30)
val genPerson = LabelledGeneric[Person]
// genPerson: shapeless.LabelledGeneric[Person]{type Repr = shapeless.::[String with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("name")],String],shapeless.::[Int with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("age")],Int],shapeless.HNil]]} = shapeless.LabelledGeneric$$anon$1@48552363
val reprP1 = genPerson.to(p1)
// reprP1: genPerson.Repr = John :: 30 :: HNil
Pay attention to the Repr
type of the second last expression. It is essentially aHList
of the case class elements, each one refined wit the FieldType
tagging type described above. Shapeless calls this a Record
, and it has many interesting operations: by itself
import record._
// import record._
reprP1.get('name)
// res16: String = John
reprP1.get('age)
// res17: Int = 30
reprP1.toMap
// res18: Map[Symbol with shapeless.tag.Tagged[_ >: String("age") with String("name") <: String],Any] = Map('age -> 30, 'name -> John)
reprP1.keys
// res19: shapeless.::[Symbol with shapeless.tag.Tagged[String("name")],shapeless.::[Symbol with shapeless.tag.Tagged[String("age")],shapeless.HNil]] = 'name :: 'age :: HNil
reprP1.values
// res20: shapeless.::[String,shapeless.::[Int,shapeless.HNil]] = John :: 30 :: HNil
reprP1.remove('age)
// res21: (Int, shapeless.::[String with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("name")],String],shapeless.HNil]) = (30,John :: HNil)
note that all the value types are properly retained.
Moreover, LabelledGeneric
is very much like Generic
, but specialised for these labelled product types.
Magic? No… Just even more macros and implicits along the way.
Rendering Json
This post’s motivation is to render case class type instances into Json. For this we need to step back – again – to scaffold the necessary infrastructure. If you know some other type class-based Json library this should be familiar ground to you.
We define the JsonWrites
typeclass bellow:
trait JsonWrites[-T] {
def write(t: T): JValue
}
// Companion object
object JsonWrites {
def apply[T](f: T => JValue) = new JsonWrites[T] {
def write(t: T) = f(t)
}
}
// some default instances
// probably better in the companion object...
implicit val stringWrites = JsonWrites(JString(_:String))
implicit val intWrites = JsonWrites[Int](JNum(_))
implicit val longWrites = JsonWrites[Long](JNum(_))
implicit def optionWrites[T](implicit tWrites: JsonWrites[T]) = JsonWrites[Option[T]](_.fold[JValue](JNull)(t => tWrites.write(t)))
// .......
// Some nice syntax
implicit class ToJsonOps[T](t: T)(implicit writes: JsonWrites[T]) {
def toJson = writes.write(t)
}
Let’s test the basics:
"hello".toJson
// res30: jawn.ast.JValue = "hello"
1.toJson
// res31: jawn.ast.JValue = 1
(Some(1): Option[Int]).toJson
// res32: jawn.ast.JValue = 1
(None: Option[Int]).toJson
// res33: jawn.ast.JValue = null
Ok, we have the basics in place, so let’s get our hands dirty.
Deriving case Classes
To derive JsonWrites
for case classes in a generic way we need to know:
- A Labelled generic representation of the case class
- How to render all the values in the case class
- How to render values along with their tagging symbols, and combine those.
As you may recall from the previous post, we process HLists
recursively and stop on HNil
. So let us have a go there first:
implicit val hnilToJson = JsonWrites[HNil](_ => JObject.empty)
Next we shall derive the recursive case from what we already know and, not surprisingly, from the recursive definition itself:
import labelled.FieldType
implicit def hconsToJson[Key <: Symbol, Head, Tail <: HList](
implicit key: Witness.Aux[Key],
headWrites: JsonWrites[Head],
tailWrites: JsonWrites[Tail])
: JsonWrites[FieldType[Key, Head] :: Tail] =
JsonWrites[FieldType[Key, Head] :: Tail] { l =>
// compute the tail json:
val json = tailWrites.write(l.tail)
// JObject has a mutable map, just add to it:
// (yes, Mutable).
json.set(key.value.name, headWrites.write(l.head))
json
}
To finish we just have to use the labelled product representation of the case class to render it to Json, so we tell exactly that to the compiler, given we can always infer an implicit labelled generic representation for our type T
:
implicit def lgenToJson[T, Repr](
implicit lgen: LabelledGeneric.Aux[T, Repr],
reprWrites: JsonWrites[Repr]) = JsonWrites[T] { t =>
reprWrites.write(lgen.to(t))
}
// lgenToJson: [T, Repr](implicit lgen: shapeless.LabelledGeneric.Aux[T,Repr], implicit reprWrites: JsonWrites[Repr])JsonWrites[T]
Testing it:
Person("John", 30).toJson
// res35: jawn.ast.JValue = {"age":30,"name":"John"}
case class Address(Street: String, Number: Option[Int])
// defined class Address
Address("Small Street", None).toJson
// res36: jawn.ast.JValue = {"Number":null,"Street":"Small Street"}
Address("Big Street", Some(22))
// res37: Address = Address(Big Street,Some(22))
Deriving sealed hierarchies
The canonicall way to represent ADTs – abstract data types – in scala is using sealed traits and case classes/objects, such as the following:
sealed trait Shape extends Product with Serializable
case class Square(side: Long) extends Shape
case class Rectangle(width: Long, eight: Long) extends Shape
case class Circle(radius: Long) extends Shape
We can render a square, circle or rectangle to json, since they are case classes and we already know how to render those:
Circle(2).toJson
// res38: jawn.ast.JValue = {"radius":2}
Square(4).toJson
// res39: jawn.ast.JValue = {"side":4}
Rectangle(2,4).toJson
// res40: jawn.ast.JValue = {"eight":4,"width":2}
However, if we don’t know what kind of shape we have there is a problem:
def shapeToJson(s: Shape) = implicitly[JsonWrites[Shape]].write(s)
// <console>:36: error: could not find implicit value for parameter e: JsonWrites[Shape]
// def shapeToJson(s: Shape) = implicitly[JsonWrites[Shape]].write(s)
// ^
To solve this problem there is the concept of a Coproduct, the dual concept of a product. As oposed to the product, a coproduct represents one element that belong to just one of a set of types. It is like scale’s Either extended to an arbitrary number of alternatives. In Shapeless:
type stringOrIntCP = String :+: Int :+: CNil
// defined type alias stringOrIntCP
This type represents something that can be a string or an integer.
Let us see what the coproduct definition looks like in shapeless:
sealed trait Coproduct extends Product with Serializable
sealed trait :+:[+H, +T <: Coproduct] extends Coproduct {
// Non-recursive fold (like Either#fold)
def eliminate[A](l: H => A, r: T => A): A
}
final case class Inl[+H, +T <: Coproduct](head : H) extends :+:[H, T] {
override def eliminate[A](l: H => A, r: T => A) = l(head)
}
final case class Inr[+H, +T <: Coproduct](tail : T) extends :+:[H, T] {
override def eliminate[A](l: H => A, r: T => A) = r(tail)
}
sealed trait CNil extends Coproduct {
def impossible: Nothing
}
It is a bit more involved than HList
.
- As analogous to HList’s
HNil
,CNil
marks the end of the type definition, and is used to stop recursion and to enable proper type construction. :+:
is analogous to Hlist’s::
type constructor, however it can be either anInl
orInr
.- At runtime if we have an instance of
Inl
we Know we are handling an instance of the “head” of the coproduct. - Else if we come across an
Inr
we’ve got the “tail”, which can have either anInl
or another nestedInr
inside. - We are not supposed to reach
CNil
, ever. - The
eliminate
method works much likeeither
’sfold
. If it is called on an instance ofInl
the first function is used to produce anA
value from the “head” element, else the second one is used to handle the recursive case within the “tail” part of the coproduct, eventually returning anA
value.
The derivation example below will make this all much clearer.
Shapeless LabelledGeneric
is capable of generating a coproduct representation of a sealed trait hierarchy, having all the known children as the possible values, tagged with its class’s name – a labelled coproduct.
LabelledGeneric[Shape]
// res41: shapeless.LabelledGeneric[Shape]{type Repr = shapeless.:+:[Square with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("Square")],Square],shapeless.:+:[Rectangle with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("Rectangle")],Rectangle],shapeless.:+:[Circle with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("Circle")],Circle],shapeless.CNil]]]} = shapeless.LabelledGeneric$$anon$2@46e64391
Note the type of the expression result; all shapes are there, properly tagged.
To make our generic shape json-renderable (and other sealed trait hierarchies) let us tell the compiler how to render Coproduct
instances into Json.
Although CNil
should never be reached, we need an implicit JsonWrites[CNil]
in scope, so the compiler can properly infer the right implicit (stop the type-level recursion):
implicit def cnilJsonWrites: JsonWrites[CNil] =
JsonWrites[CNil](_ => throw new RuntimeException("shall not happen"))
The recursive case is easy, we add a "$type"
field with the class’s simple name so a deserializer can differentiate among types.
implicit def cconsJsonWrites[Key <: Symbol, Head, Tail <: Coproduct](
implicit key: Witness.Aux[Key],
headWrites: JsonWrites[Head],
tailWrites: JsonWrites[Tail])
: JsonWrites[FieldType[Key, Head] :+: Tail] =
JsonWrites[FieldType[Key, Head] :+: Tail] {
_.eliminate({ head =>
val json = headWrites.write(head)
// Add the $type discriminator
json.set("$type", JString(key.value.name))
json
}, tail => tailWrites.write(tail))
}
It is quite similar to what we did with the HList
above.
Instead of using the clever eliminate
method we could certainly have employed pattern matching, something like:
case Inl(head) => ...
case Inr(tail) => ...
We already defined the conversion from T
to its generic representation above, so it will be used as such (the new coproduct-related implicit will be picked automatically).
Now our shapeToJson
method already works:
def shapeToJson(s: Shape) = implicitly[JsonWrites[Shape]].write(s)
// shapeToJson: (s: Shape)jawn.ast.JValue
shapeToJson(Circle(20))
// res42: jawn.ast.JValue = {"$type":"Circle","radius":20}
shapeToJson(Rectangle(2,3))
// res43: jawn.ast.JValue = {"$type":"Rectangle","eight":3,"width":2}
Voi la!
Conclusion
What we have shown is the basic pattern for processing case classes and sealed trait hierarchies. In summary:
- Define how labelled products (case classes) are to be processed. Cases for HNil and
FieldType[Key, Head] :: Tail
. - Define processing of labelled coproducts (sealed traits). Cases for
CNil
andFieldType[Key, Head] :+: Tail
. - Use the automatically derived generic
T
representation to process values of typeT
, i.e. implicit parameters of typeLabelledGeneric.Aux[T, Repr]
.
Further considerations
The Json dumping code we have shown is pretty basic and not that flexible (think, for instance in treatment of absent values, we could have omitted them instead of using null
). Moreover, it would be good to have the reading infrastructure (JsonReads[T]
), which is left as an exercise. It is a bit more verbose and complex to implement due to failure handling, so I chose to present the more simpler example to avoid distractions.
There are proper Scala Json libraries using shapeless. spray-json-shapeless is a very interesting case since the code is very clear and concise, and has some nice ideas and patterns for this sort of implementations. Circe’s generic derivation is more complex, although much more complete and flexible.
In the shapeless realm there is automatic type class derivation which can help a bit with generic derivation, although I find the presented pattern more flexible for intricate problems.
Two nice articles that cover this subject which I also recommend are this and this.
At e.Near we have directly relied on Shapeless generic derivation to implement such things as writing/reading XML or producing messages to Kafka topics.