A bottom up approach to category theory: Functors

This post is the first of a series attempting to give an intuitive idea of some “objects” taken from category theory and applied to programming: Functors, Applicative Functors and Monads. I’ll try to follow a practical, bottom-up approach rather than explaining the abstract concepts first and then how to derive them in programming. I will just follow a very basic programming principle to see the emergence of these objects: generalizing problems. Let’s start with Functors.

Imagine you are writing a Web application to manage contacts. The domain model can be represented by the following Contact datatype:

You are also using a persistent storage which exposes the following API to find contacts (for the sake of brevity, that’s a simplified API):

Ok now let’s start coding the application itself. The first feature you want to code is to expose the list of contacts in XML. To do that you retrieve all the contacts from the persistence system, transform them into a XML representation and return them to the client. Here is a possible implementation of a Play controller doing just that:

The listToXml function computes a list of XML nodes from a list of contacts.

Fine, but you want to support JSON as well as XML, so you write a listToJson function, transforming a list of contacts into a list of JSON objects representing the contacts:

Abstract over list computations

Immediately you notice that listToXml and listToJson are similar and duplicate some logic: applying a given function over a list of elements (in one case a function going from Contact to XML, in the other case a function going from Contact to JSON). Let’s refactor your code to abstract over this function: write a listMap function which takes a List[A] and a function f from A to B and returns a List[B] containing the application of f over each element of the list:

It’s more interesting to use two parameter lists instead of one because it allows to partially apply the function:

Written like this the listMap function can be equivalently read as “a function taking a function from A to B and a List[A] and returning a List[B]” or as “a function taking a function from A to B and returning a function from List[A] to List[B]” (when only the first parameter is supplied).

You can rewrite listToXml and listToJson in terms of listMap:

Where toXml and toJson have just been pulled out of the previous listToXml and listToJson functions:

The partial application of e.g. “listMap(toXml) _” yields a function which takes a List[Contact] and returns a List[Node], exactly as the listToXml function did.

Here is a REPL session showing the use of listToXml and listToJson functions:

scala> val paul = Contact("Paul", "paul@mail.com")
paul: models.Contact = Contact(Paul,paul@mail.com)

scala> val peter = Contact("Peter", "peter@mail.com")
peter: models.Contact = Contact(Peter,peter@mail.com)

scala> listToXml(List(paul, peter))
res0: List[scala.xml.Node] =
List(<contact>
       <name>Paul</name>
       <email>paul@mail.com</email>
     </contact>,
     <contact>
       <name>Peter</name>
       <email>peter@mail.com</email>
     </contact>)

scala> listToJson(List(paul, peter))
res1: List[play.api.libs.json.JsObject] =
List({"name":"Paul","email":"paul@mail.com"},
     {"name":"Peter","email":"peter@mail.com"})

This first refactoring helped to break down the code into smaller pieces that are more reusable: the toXml function just transforms a Contact into a XML tree, the toJson does the same but yields a JSON representation, and by composing these two functions with the listMap function you get functions applying these transformations over lists of contacts rather than just a single contact.

Your Web service also has to expose an entry point allowing to retrieve information about just one contact. So you want to write an action that finds a contact in the database (using the byId API method), transforms it into a representation (XML or JSON) and returns it. The byId method of your persistence engine returns an Option[Contact] so you want to transform an Option[Contact] into an Option[Node] or an Option[JsObject] similarly to what you did for lists of contacts. You can follow the same pattern as with lists and write an optionMap function, taking a function f from A to B and an Option[A] and returning an Option[B] containing the result of the application of f over the element held in the Option[A] (or None if it was empty):

Cool, you can reuse toXml and toJson and compose them with optionMap:

scala> optionMap(toXml)(Some(paul))
res2: Option[scala.xml.Node] =
Some(<contact>
       <name>Paul</name>
       <email>paul@mail.com</email>
     </contact>)

scala> optionMap(toJson)(Some(paul))
res3: Option[play.api.libs.json.JsObject] =
Some({"name":"Paul","email":"paul@mail.com"})

Abstract over computational contexts (a.k.a. Functors)

So your Web service exposes contacts and lists of contacts in JSON and XML. To expose a contact in, say, JSON, it has to find it using the persistence layer and maybe transform it into JSON. Similarly, to expose a list of contacts in JSON, it has to find them using the persistence layer and transform them into JSON. These two transformations are performed by the optionMap and listMap functions. They are similar in a sense that they both map an Option[Contact] or a List[Contact] into an Option[JsObject] or a List[JsObject], respectively.

Look at their type signatures:

What is the common intent shared by these two functions? Just by looking at their type signatures you can give the following answer: to give the opportunity to apply a function in a context. That’s exactly what the type signatures say: from a function from A to B, return a function from List[A] to List[B] (or from Option[A] to Option[B]). Indeed, listMap applies a function over the elements of a List and optionMap applies a function over the element of an Option. Can you write just one toJsonF function applying the toJson function over a context, whatever this context is (List, Option, etc.)? In other words, can you abstract over the type of context? To achieve that you want to write a fmap function with this type signature:

Where F is the type of the context (List, Option or whatever). You can capture this idea in a trait Functor that depends on two things: a type parameter representing the context type, and a method lifting a function into this context:

And you can implement it for List and Option:

Now you can define a function toJsonF computing the JSON representation of a contact in any context (any type for which it exists an implementation of Functor, actually):

You can check that it works well with both lists and options:

scala> toJsonF(listFunctor)(List(paul, peter))
res5: List[play.api.libs.json.JsObject] =
List({"name":"Paul","email":"paul@mail.com"},
     {"name":"Peter","email":"peter@mail.com"})

scala> toJsonF(optionFunctor)(Some(paul))
res6: Option[play.api.libs.json.JsObject] =
Some({"name":"Paul","email":"paul@mail.com"})

You can get some syntactic sugar by marking the functor parameter as implicit and the Functor instances as implicit values (following the typeclass pattern):

So you can just write:

scala> toJsonF(List(paul, peter))
res8: List[play.api.libs.json.JsObject] =
List({"name":"Paul","email":"paul@mail.com"},
     {"name":"Peter","email":"peter@mail.com"})

scala> toJsonF(Some(paul))
res9: Option[play.api.libs.json.JsObject] =
Some({"name":"Paul","email":"paul@mail.com"})

This toJsonF function is interesting because it can be applied in any context for which you have a Functor implementation. For now you only wrote implementations for List and Option but in the future you may support another type of context and it will just work without having to modify toJsonF. What kind of contexts could you use? There are a lot. Here are some examples:

Any Traversable subtype ;
Future is a context representing a value being computed in another execution context (another thread) ;
The type of functions where the type of their argument is fixed, e.g. String:

scala> type StringTo[A] = String => A
defined type alias StringTo

scala> val strToInt: StringTo[Int] = _.size
strToInt: java.lang.String => Int = <function1>

scala> strToInt("foo")
res13: Int = 3

Here the type StringTo[A] can be thought of as an infinite container of values of type A indexed by keys of type String.

Consistency of Functors

Being able to write such generic code is very interesting because of it is more extensible, however when you look at the implementation of the toJsonF function:

It’s not easy to understand what will happen to the context fc because you have no intuitive idea of how the implementation of Functor[F] will behave for a given context F. For instance, a possible implementation of Functor[Option] may always return None:

The above code compiles but isn’t worth using because it always discards the option’s value:

scala> toJsonF(Some(paul))(optionFunctor2)
res15: Option[play.api.libs.json.JsObject] = None

Thus, you want to have some guarantees on the behavior of Functor implementations so you can write code taking functors as parameters and can reason about it. Basically, you want a Functor[List] to map over each element of the list (in order), you want a Functor[Option] to map over the element of the option (if there is such an element), you want a Functor[Future] to map over a successful value (otherwise you want to get back the error), and so on. In other words, you want functors to be structure preserving: if you apply a function which does nothing (the identity function) over a context (e.g. a list), you want to get back exactly the same context (the same list). Here is a function which checks that a Functor implementation does not violate this constraint for a given context:

Having all Functors respecting this constraint (or this law) adds consistency and helps you to reason about the abstract concept of Functor.

You can see in the REPL that optionFunctor2 does not satisfy the identity law:

scala> identity(Some(paul), optionFunctor)
res17: Boolean = true

scala> identity(Some(paul), optionFunctor2)
res18: Boolean = false

scala> identity(List(paul, peter), listFunctor)
res19: Boolean = true

Some bits of category theory

To summarize, the Functor you have written allows to lift a function in a computational context in a way that preserves this context.

In category theory a Functor is a structure preserving mapping between two categories. In your case, the objects of the first category are Scala types and the objects of the second category are “types in a context” (List[Int], Option[String], etc., which are types too by the way). A “mapping” between these categories means being able to:

map a type A into a type F[A], and that’s exactly what the F[_] type constructor does ;
map a function from A to B into a function from F[A] to F[B], and that’s exactly what the fmap function does.

Finally, the fact that this mapping between categories must be structure preserving is reflected by the identity constraint introduced above (although this constraint is not checked by the type system). (Actually there is a second law a Functor must satisfy but in your case this second law is automatically satisfied for every implementation which satisfies the first one)

A runnable application inspired by this post can be found on GitHub.