**NOTE**: I recently learned about Yours. As an experiment, I mirrored this post there. I couldn’t find a way to embed math and didn’t want to write ASCII art, so the version here is much easier to follow.

Sequences of numbers appear everywhere throughout computer science and mathematics. One example is the triangular numbers (0, 1, 3, 6, 10, 15, …), where the \(n\)th term is described by the equation

\[ a_{n+1} = a_n + n + 1 \]

This equation is known as a recurrence. Another example is the Fibonacci numbers (0, 1, 1, 2, 3, 5, …), given by the recurrence

\[ a_{n+2} = a_{n+1} + a_n \]

Given a recurrence, is there a way to find a closed-form expression in \(n\) that describes the \(n\)th term of the sequence? There is more than one answer to this question, but the one we will investigate involves the theory of generating functions.

If a function \(f(x)\) can be represented as a formal power series in \(x\), with \(a_n\) being the coefficient of \(x^n\), then \(f(x)\) is called the (ordinary) generating function of \(a_n\). In other words, \(f(x)\) must satisfy

\[ f(x) = \sum_{n \ge 0} a_n x^n \]

For example, the sequence consisting of all 1’s is represented by the formal sum

\[ \sum_{n \ge 0} x^n = 1 + x + x^2 + \cdots \]

This sum is a geometric series, and can be simplified to the generating function

\[ \sum_{n \ge 0} x^n = \frac{1}{1 - x} \]

**NOTE:** For readers who are familiar with calculus, the fact that we are working with formal power series and not continuous functions means that we don’t have to worry about issues like convergence.

Once we have a generating function, we can use a method such as partial fraction expansion to compute the terms we care about. But is there a more general method for finding a generating function from a recurrence? One way is as follows:

1. Multiply both sides of the equation by \(x^n\)
2. Sum both sides over all \(n \ge 0\)
3. Rewrite the sums in terms of the generating function we are looking for, calling it \(A(x)\)
4. Solve the equation for \(A(x)\)

Let’s try it out on the triangular numbers. To simplify things, let’s apply steps 1 through 3 to each side separately, and then combine them for the final step. Starting with the left-hand side, we have

\[ \begin{align*} \sum_{n \ge 0} a_{n+1}\,x^n &= \frac{\sum_{n \ge 0} a_{n+1}\,x^{n+1}}{x}\\ &= \frac{\sum_{n \ge 0} a_n\,x^n - a_0}{x}\\ &= \frac{A(x) - a_0}{x}\\ &= \frac{A(x)}{x} \end{align*} \]

The right-hand side reduces to

\[ \begin{align*} \sum_{n \ge 0} \left(a_n + n + 1\right)\,x^n &= \sum_{n \ge 0} a_n\,x^n + \sum_{n \ge 0} n\,x^n + \sum_{n \ge 0} x^n \\ &= A(x) + \frac{x}{(1-x)^2} + \frac{1}{1-x} \end{align*} \]

Combining the two and solving for A(x), we find the generating function:

\[ \begin{align*} \frac{A(x)}{x} &= A(x) + \frac{x}{(1-x)^2} + \frac{1}{1-x} \\ A(x) &= x A(x) + \frac{x^2}{(1-x)^2} + \frac{x}{1-x} \\ (1-x) A(x) &= \frac{x^2 + x(1-x)}{(1-x)^2} = \frac{x}{(1-x)^2} \\ A(x) &= \frac{x}{(1-x)^3} \end{align*} \]

which gives the following formula after partial fraction expansion (omitted):

\[ a_n = \frac{n(n+1)}{2} \]
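As a sanity check on both the generating function and the closed form, we can expand \(x/(1-x)^3\) as a formal power series in Haskell, using the fact that multiplying a series by \(1/(1-x)\) takes running sums of its coefficients. This is my own sketch; the names are illustrative:

```haskell
-- Multiplying a formal power series by 1/(1-x) takes running partial
-- sums of its coefficient list.
timesGeom :: [Integer] -> [Integer]
timesGeom = scanl1 (+)

-- Coefficients of x/(1-x)^3: apply 1/(1-x) twice to the series for
-- 1/(1-x) = 1 + x + x^2 + ..., then shift by x (prepend a zero).
triangular :: [Integer]
triangular = 0 : timesGeom (timesGeom (repeat 1))

closedForm :: Integer -> Integer
closedForm n = n * (n + 1) `div` 2

main :: IO ()
main = print (take 6 triangular)  -- [0,1,3,6,10,15]
```

The first six coefficients match the triangular numbers 0, 1, 3, 6, 10, 15, in agreement with \(a_n = n(n+1)/2\).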

If you’re familiar with other methods of solving recurrences, this approach may seem fairly heavy-handed without much benefit. However, not only do generating functions encompass many different kinds of recurrences, later techniques allow solving a wide variety of counting problems in enumerative combinatorics.

Ordinary generating functions are not the only kind of generating function. There are several other families, including exponential and Dirichlet generating functions. Choosing the right family for a problem can make it much easier to solve. If you’re interested in learning more, I recommend checking out the book generatingfunctionology, available for free online and in hard copy at major retailers.

Using the method described in this article, find the generating function for the Fibonacci numbers. Remember that \(a_0 = 0\) and \(a_1 = 1\).

Solution available on Yours.

Parametric polymorphism is a common theme in typed functional programming. It is sometimes shortened to just “polymorphism”, but we will refrain from doing so in this article because of a conflicting definition in object-oriented contexts.

In everyday programming, we often deal with concrete types like strings and integers. Introducing parametric polymorphism allows us to write functions that operate over any type, not just a specific one. This helps to promote code reuse, allowing programmers to write a few generic functions instead of many specialized ones.

One way to view a generic function is that it takes in one or more types as parameters along with the normal parameters. For instance, consider the identity function:

```
{-# LANGUAGE ExplicitForAll #-}
id :: forall a. a -> a
id x = x
```

`id` can be seen as a function of two arguments. The first argument is a type named `a`, and the second is a value of type `a`. There’s just no reason to explicitly represent the type as an argument to the function, because we cannot manipulate types at the term level.

We are quite limited in the number of operations we can perform on values of a generic type. Since the type is not constrained in any way, it is only possible to use the value in one of the following ways:

- return the value as-is
- discard the value
- apply a generic function to the value

This seems very restrictive, but it enables us to come to very strong conclusions about what these generic values look like.

Let’s play a game. In this game, I give you the type of an expression, and you classify all the different values of that type. The goal is to be as specific as possible, with the best answer uniquely characterizing all of the values. We will ignore any “tricky” values like `undefined` and only consider ground terms.

For example, suppose I gave you the type `Bool`. `Bool` has two possible values which uniquely define it: `True` and `False`. If I gave you an arbitrary value of type `Bool`, you would know that it must be one of these two. Any type that is a regular ADT will be easy to describe in a similar manner.

The situation gets a lot more complicated when function types become involved. How do you classify the type `Int -> Int`? This type contains “simple” values like the successor function, but also functions which can be extraordinarily complex. As an example, consider the following function:

```
c :: Int -> Int
c 1 = 1
c n | n `mod` 2 == 0 = c $ n `div` 2
    | otherwise      = c $ 3 * n + 1
```

It is currently an open problem whether this function terminates on every input (this is the Collatz conjecture). It seems our game quickly becomes intractable as we start to make the types even slightly more complicated.

What if we try to classify `forall a. a -> a`? Any implementation is limited by the three available operations listed above, and must evaluate to a value of type `a`. This leaves the identity function as the only possible implementation. Paradoxically, by knowing less about our type, we can say more about what functions over that type do.

What can we say about the following type?

`f :: forall a. a -> a -> a`

Unlike before, there is more than one possible implementation. This corresponds to the fact that we can now choose whether to evaluate to the first or second argument of the function. Thus, the two implementations are `\x y -> x` and `\x y -> y`.

By induction, we can show that any function which accepts `n` values of type `a` and evaluates to an `a` has `n` possible implementations. The `i`th implementation corresponds to the function which evaluates to its `i`th parameter. Since any two types with the same number of inhabitants are isomorphic, this provides a way to encode values of finite types into a generic function.
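For instance, `forall a. a -> a -> a` has exactly two implementations, just as `Bool` has exactly two values; the resulting isomorphism is the classic Church encoding of booleans. A minimal sketch (the names `CBool`, `encode`, and `decode` are my own):

```haskell
{-# LANGUAGE RankNTypes #-}

-- A generic two-argument function is isomorphic to Bool: the two
-- possible implementations pick the first or the second argument.
type CBool = forall a. a -> a -> a

true, false :: CBool
true  x _ = x
false _ y = y

encode :: Bool -> CBool
encode True  = true
encode False = false

decode :: CBool -> Bool
decode c = c True False

main :: IO ()
main = print (decode true, decode false)  -- (True,False)
```

`encode` and `decode` witness the two directions of the isomorphism.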

A final example involves functions over tuples. Consider the type:

`f :: forall a. (a, a) -> (a, a)`

Functions of this type can leave the tuple intact, swap its components, or duplicate either component into both positions. In general, a generic function from `n`-tuples to `n`-tuples has `n^n` possible implementations: each output position independently chooses an input position to project. Generalizing further, a generic function from `n`-tuples to `m`-tuples has `n^m` possible implementations. Other formulas exist involving generic functions in more than one type parameter — try to figure some out!

While we have restricted our analysis to combinatoric methods, there are other results that come from parametricity. In the paper Theorems for free!, Philip Wadler describes even stronger “theorems” that every polymorphic type must satisfy. This allows him to derive results about familiar functions such as map and fold.

This past weekend, I participated in Ludum Dare 38 with Kyle Sammons, a friend of mine. Kyle handled the programming while I took on some of the creative aspects. The “story” revolves around a child hopping between paintings, solving puzzles, and jumping between platforms.

I figured it would be fun to break out of my comfort zone and try my hand at pixel art since I spend most of my time programming. While I didn’t do all of the art for the game, there was enough to give me main credit. One amusing symptom is that you can infer the order in which I drew the art — tiles designed at the end are *slightly* more intricate than what I started with.

If you’d like to give the game a shot, it’s available here.

Java is the de facto choice for my daily work. As a result, I often miss useful features from other languages. Introducing non-idiomatic code in a team-based context can do more harm than good, but it doesn’t stop me from wondering: what would higher-kinded types look like in a language that does not have built-in support for them? In these posts, I take on this quixotic task.

**NOTE**: This is an ongoing research topic, so I have no definite solutions. For brevity’s sake, most examples will be in Haskell.

Before we can tackle the problem of implementing higher-kinded types, let’s take a step back and talk about what a higher-kinded type is. Just as a type classifies a collection of values, a *kind* classifies a collection of types. It is the same concept, but “one level higher” in the hierarchy of abstraction.

Since most languages have relatively simple type systems, the concept of kind may be new. In these languages, all types have the same kind, denoted ★ (or `*` in ASCII). Higher-kinded types extend this system by introducing a new way to form kinds: given any two kinds `k` and `k'`, we can form the *function kind* `k -> k'`. We can form kinds such as ★, ★ → ★, (★ → ★) → (★ → ★), and so on.

This is an analogue of the *function type*, which allows us to form a new type `a -> b` from any two types `a` and `b`. The only difference is that instead of dealing with types of values, we are now dealing with kinds of types.

A common type in Haskell is `Identity a`, which is just a wrapper over the type `a`. Usually it is defined like so:

`newtype Identity a = Identity { runIdentity :: a }`

We can construct a value such as `Identity 10`, which has the type `Identity Int`. Since `Identity Int` is the type of a value, it has kind ★. However, the type constructor `Identity` can be seen as a function over types, taking in an input type `a` and producing an output type `Identity a`. Thus, the type constructor `Identity` has kind ★ → ★.

`Identity` can be defined in many languages. For instance, it can be defined in Java:

```
public final class Identity<A> {
    private final A value;

    public Identity(A value) {
        this.value = value;
    }

    public A value() {
        return this.value;
    }
}
```

Given this, it may be confusing when someone says that a language does not support higher-kinded types. Haven’t we just defined one here? Usually, they mean that the language does not support *abstraction* over these types, i.e. we cannot be polymorphic over an arbitrary higher-kinded type. A more accurate description of the missing feature is *higher-kinded polymorphism*. If we had support for higher-kinded polymorphism, we would be able to define an interface for `Functor` in Java:

```
public interface Functor<F> {
    <A, B> F<B> fmap(Function<A, B> f, F<A> x);
}
```

The issue here is that `F` must be of kind ★ → ★ in order for `F<A>` and `F<B>` to be well-kinded. Without the ability to instantiate `F` at a particular parameter, we cannot implement this.

This topic is heavily inspired by Lightweight Higher-Kinded Polymorphism, which describes an implementation in OCaml. However, most languages do not have support for ML functors, which limits the applicability of these results.

The core idea of this paper is to convert a type `F<A>` into `App<F, A>` by giving `F` the kind ★. Since all polymorphic types now have kind ★, we no longer run into the limitations of the host language. In this approach, we also apply a type family to map `App<F, A>` back to the underlying type.

What is a type family? The previous link describes it in much better detail than I could hope to, but for the sake of self-containment we will go over it briefly here.

Previously, we mentioned that a type constructor could be interpreted as a function over types. However, these functions are very restricted in what we allow them to map to; e.g. `Identity` can only produce `Identity a`s. A *type family* allows us to relax this restriction and map input types to arbitrary types. For example, this type family will map `Int` to `String`, but `Float` to `Maybe String`:

```
type family F a where
  F Int   = String
  F Float = Maybe String
```

**NOTE:** The `TypeFamilies` extension must be enabled in GHC for this to work.
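To see the family reduce, we can give values the mapped types directly. A small self-contained sketch; the value names and their contents are my own:

```haskell
{-# LANGUAGE TypeFamilies #-}

type family F a where
  F Int   = String
  F Float = Maybe String

-- F Int reduces to String, and F Float to Maybe String,
-- so these definitions type check.
intResult :: F Int
intResult = "hello"

floatResult :: F Float
floatResult = Just "world"

main :: IO ()
main = print (intResult, floatResult)  -- ("hello",Just "world")
```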

This is a *closed* type family, which means that we have exhaustively defined all of the input types `F` will map. In an *open* type family, we leave the definition open for extension by others. To allow arbitrary implementations of functors, we will use an open type family, initially defining a mapping for list types:

```
data List
type family Apply f a
type instance Apply List a = [a]
```

Ideally we would continue in our quest, defining `Functor` and providing an implementation:

```
class Functor f where
  fmap :: (a -> b) -> Apply f a -> Apply f b

instance Functor List where
  fmap = map
```

However, this fails because `Apply` does not represent an injective function over types. Without this knowledge, the type checker rejects our definition of `Functor`. To solve this, we can introduce a new type that wraps `Apply f a`:

`newtype App f a = Inject { project :: Apply f a }`

Since all type constructors are injective, the type checker is suitably convinced and we can complete our implementation:

```
class Functor f where
  fmap :: (a -> b) -> App f a -> App f b

instance Functor List where
  fmap f = Inject . map f . project
```
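Assembled into one self-contained module (hiding the Prelude’s `Functor` to avoid a name clash), the encoding looks like this; note that the use site needs a type annotation, precisely because `Apply` is not injective:

```haskell
{-# LANGUAGE TypeFamilies #-}

import Prelude hiding (Functor, fmap)

data List

type family Apply f a
type instance Apply List a = [a]

-- Wrapping Apply in a newtype restores injectivity for the type checker.
newtype App f a = Inject { project :: Apply f a }

class Functor f where
  fmap :: (a -> b) -> App f a -> App f b

instance Functor List where
  fmap f = Inject . map f . project

main :: IO ()
main = print (project (fmap (+ 1) (Inject [1, 2, 3] :: App List Int)))
-- [2,3,4]
```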

This approach is straightforward enough to warrant an automated solution. One implementation is the higher library in OCaml.

Let’s take the Haskell implementation and naïvely attempt to transform it into Java. A Comparative Study of Language Support for Generic Programming lays out an approach to associated types in Java, which allows us to recover behavior similar to type families:

```
public interface Apply<F, A> {
}

public class App<F, A, T extends Apply<F, A>> {
    private final T value;

    public App(T value) {
        this.value = value;
    }
}

public interface Functor<F> {
    <A, T extends Apply<F, A>, B, U extends Apply<F, B>>
    App<F, B, U> fmap(Function<A, B> f, App<F, A, T> x);
}
```

Everything looks good so far; we have managed to define a `Functor` interface. Since Java does not have full support for associated types, we introduce the type `T` (resp. `U`) as a parameter, using `F` and `A` (resp. `B`) to constrain what is allowed.

However, the problem surfaces when you attempt to implement `Functor`. In its definition, we allow `T` and `U` to be any subtypes of `Apply<F, A>` and `Apply<F, B>`, respectively. But in our implementation, we wish for them to be a specific concrete subtype. Without this, we will not be able to extract a `List<B>` after performing `fmap`, since we do not have enough information to conclude that `U = List<B>`.

We’ve hit our first show stopper. In the next post, I will describe another approach based on GADTs which runs into its own set of challenges during implementation.

In dependently typed programming, I find it’s common to start with some “imprecise” type and refine it to something that contains more information. For example, I might receive a `String` from user input and want to turn that into a `Vect n Char` (for some given `n`) when I write proofs about my code:

```
fromString : String -> (n : Nat ** Vect n Char)
```

Generally this refinement is accompanied by proofs that the refinement was correct. For instance, we might want to show that length is preserved:

```
lengthPreserved : (s : String)
               -> getWitness (fromString s) = length s
```

and use this to build a copy of the refined value with the existential unpacked:

```
fromString' : (s : String) -> Vect (length s) Char
fromString' s = let prf = lengthPreserved s in
                rewrite sym prf in
                getProof $ fromString s
```

**Exercise**: Implement `fromString` and `lengthPreserved`.

This factorization of `fromString'` into the simpler components `lengthPreserved` and `fromString` makes you wonder if this process can be generalized. To start off, we are looking for a function that takes `length` and the two components and produces `fromString'`:

```
fromString' : (length : String -> Nat)
           -> (fromString : String -> (n : Nat ** Vect n Char))
           -> (lengthPreserved : (s : String) -> getWitness (fromString s) = length s)
           -> (s : String)
           -> Vect (length s) Char
```

Generalizing the dependent pair to the `Sigma` type (and making the substitution `Vect' n = Vect n Char`), we get:

```
fromString' : (length : String -> Nat)
           -> (fromString : String -> Sigma Nat Vect')
           -> (lengthPreserved : (s : String) -> getWitness (fromString s) = length s)
           -> (s : String)
           -> Vect' (length s)
```

and we’re basically there. Now, all that’s left is to parameterize over our unrefined type `String`, our refined type `Vect'`, and its index `Nat` (along with some generous renaming and rearranging):

```
sko : {a b : Type}
   -> {pred : b -> Type}
   -> (iexists : a -> Sigma b pred)
   -> (f : a -> b)
   -> (witness : (s : a) -> getWitness (iexists s) = f s)
   -> (s : a)
   -> pred (f s)
```

Now, to figure out what this thing really is, let’s turn to the Curry-Howard interpretation of it, going parameter by parameter:

`iexists : a -> Sigma b pred`

If we think of `a` as an index, this describes an indexed family of existence proofs on the predicate `pred`, with witnesses taking type `b`.

`f : a -> b`

`f` tells us that `a` can be interpreted as something other than an index, by relating it to `b`.

`(witness : (s : a) -> getWitness (iexists s) = f s)`

Now, `f` has been related to the existence proofs by being identical to the witness over `a`. This tells us that `f` perfectly describes the witnesses of the indexed family `iexists`.

`(s : a) -> pred (f s)`

Given all these conditions, we can think of this as a new family of indexed proofs, except the witness has been replaced with an application of `f`. The existential type has been eliminated and replaced with a universally quantified type instead.

Now, whenever we want to prove that a refinement into another type has a certain property, we don’t have to do it all in one go. We also don’t have to worry about unpacking existentials afterwards. Curiously, this process looks a lot like skolemization.
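Classical skolemization makes the same move in logic: an existential quantifier under a universal one is replaced by a function of the universally quantified variable,

\[ \forall s\, \exists n\, P(s, n) \quad \rightsquigarrow \quad \exists f\, \forall s\, P(s, f(s)) \]

In `sko`, the role of the Skolem function \(f\) is played by `length`, and `witness` certifies that it really does compute the witnesses of `iexists`.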

Consider a basic language containing integer and boolean literals along with addition. The first approach to modeling this language could be a type like the following:

```
data Exp
  = I Int
  | B Bool
  | A Exp Exp
```

with terms such as `I 10`, `B True`, `A (I 5) (I 6)`, and so on. Since this is a language, we will want to write some interpreters for it, such as the following function to pretty-print a term:

```
pprint :: Exp -> String
pprint (I i) = show i
pprint (B b) = show b
pprint (A l r) = "(" ++ pprint l ++ " + " ++ pprint r ++ ")"
```

The kind of recursion being performed here can be expressed easily as a catamorphism, so let’s go ahead and generalize it out. We’ll use the language of F-algebras, introducing a new type parameter `a` and replacing all of the recursive references with it.

```
{-# LANGUAGE DeriveFunctor #-}

newtype Fix f = Fix { unFix :: f (Fix f) }

cata :: Functor f => (f a -> a) -> Fix f -> a
cata h = h . fmap (cata h) . unFix

data ExpF a
  = I Int
  | B Bool
  | A a a
  deriving (Functor)

type Exp = Fix ExpF
```

Then, we can restate our pretty-printer without doing explicit recursion:

```
pprint :: Exp -> String
pprint = cata go
  where go (I i) = show i
        go (B b) = show b
        go (A l r) = "(" ++ l ++ " + " ++ r ++ ")"
```

But what happens when we want to write an evaluator for our language? The existence of the `B` constructor “messes everything up”: we can’t just implement `eval :: Exp -> Int`, because not all of our terms will evaluate to an integer. Enter GADTs:

```
data Exp a where
  I :: Int -> Exp Int
  B :: Bool -> Exp Bool
  A :: Exp Int -> Exp Int -> Exp Int
```

Now the `Exp` type is indexed by the type of the expression it represents, so we can implement a well-typed evaluator:

```
eval :: Exp a -> a
eval (I i) = i
eval (B b) = b
eval (A l r) = eval l + eval r
```
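As a quick sanity check, the evaluator can be exercised on a small term; this is a minimal self-contained sketch of the definitions above:

```haskell
{-# LANGUAGE GADTs #-}

-- The type index on Exp guarantees eval is total and well-typed.
data Exp a where
  I :: Int -> Exp Int
  B :: Bool -> Exp Bool
  A :: Exp Int -> Exp Int -> Exp Int

eval :: Exp a -> a
eval (I i)   = i
eval (B b)   = b
eval (A l r) = eval l + eval r

main :: IO ()
main = print (eval (A (I 5) (I 6)))  -- 11
```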

But now we’re back to where we started in terms of expressing recursive functions. Must we really implement our recursion manually for *every* GADT we define?

As you might have guessed from the title, the answer is no! We can find fixed points of GADTs in a way that is very similar to the recursive ADTs we are used to.

`HFix`

We can’t simply replace every recursive reference with a new type parameter `a` of kind ★ — since `Exp` is indexed by a type, we would lose the fact that the argument to the constructor should also be indexed by a type. Instead, let’s introduce a new type parameter `f :: * -> *`, which will allow us to retain the indexed type:

```
data ExpF a f where
  I :: Int -> ExpF Int f
  B :: Bool -> ExpF Bool f
  A :: f Int -> f Int -> ExpF Int f
```

Now we need an analogue to `Fix` that works for this kind of type. The kind of `ExpF` is `* -> (* -> *) -> *`, and our output should have the same kind as our original `Exp`, so we want a type of kind `(* -> (* -> *) -> *) -> (* -> *)`.

```
newtype HFix h a = HFix { unHFix :: h a (HFix h) }
type Exp a = HFix ExpF a
```

Just like in `Fix`, we instantiate the last type parameter of our input to the fixed-point type, recursively tying the knot. To build up an `Exp a`, we just apply `HFix` after applying each constructor from `ExpF`. Some examples of such terms include `HFix $ I 1` and `HFix $ A (HFix $ I 2) (HFix $ I 2)`.

It’s getting rather unwieldy to construct terms of our language, so let’s go ahead and define some smart constructors which use `HFix` for us:

```
i :: Int -> Exp Int
i x = HFix $ I x
b :: Bool -> Exp Bool
b x = HFix $ B x
a :: Exp Int -> Exp Int -> Exp Int
a l r = HFix $ A l r
```

These aren’t necessary to perform our evaluation, but would come in handy in a real-life implementation scenario.

Now that we have a fixed-point representation of our data type, we need an analogue of `cata` for types of this kind. Recall the type of `cata`:

`cata :: Functor f => (f a -> a) -> Fix f -> a`

Since `Exp` can’t be a `Functor`, we will need a higher-order analogue that fills the same role. Instead of changing our `a`, however, we will want to change `f`:

```
class HFunctor h where
  hmap :: (f a -> g a) -> h a f -> h a g
```

With an `HFunctor`, we can lift a natural transformation `f a -> g a` into a map over `h`. We can implement `HFunctor` for `ExpF` easily:

```
instance HFunctor ExpF where
  hmap m (I x) = I x
  hmap m (B x) = B x
  hmap m (A l r) = A (m l) (m r)
```

Now, we should be able to express our analogue `cataH`. With `cata`, the caller is able to choose the folding function, along with the carrier type `a`. Continuing the analogy, in `cataH` we will be able to choose the carrier **functor** `f`:

```
cataH :: HFunctor h => (h a f -> f a) -> HFix h a -> f a
cataH m = m . hmap (cataH m) . unHFix
```

The definition of `cataH`

is identical to our previous definition of `cata`

, modulo the choices of typeclass and fixed-point operator. This should give us some confidence that the definition is correct.

Finally, let’s define evaluation for our new fixed type:

```
-- the carrier functor must send a to a itself, so we use an Id wrapper
newtype Id a = Id { runId :: a }

eval :: Exp a -> a
eval = runId . cataH go
  where go :: ExpF a Id -> Id a
        go (I x) = Id x
        go (B x) = Id x
        go (A l r) = Id (runId l + runId r)
```

And we’re done! If your type can be an `HFunctor`, you can use `cataH` to perform catamorphisms over it. This same approach can be generalized to types of other kinds in a very similar fashion.
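Putting all the pieces together, here is a self-contained sketch of the whole construction; I use an auxiliary `Id` newtype as the carrier functor, since the carrier must map `a` to `a` itself:

```haskell
{-# LANGUAGE GADTs #-}

data ExpF a f where
  I :: Int -> ExpF Int f
  B :: Bool -> ExpF Bool f
  A :: f Int -> f Int -> ExpF Int f

newtype HFix h a = HFix { unHFix :: h a (HFix h) }

type Exp a = HFix ExpF a

class HFunctor h where
  hmap :: (f a -> g a) -> h a f -> h a g

instance HFunctor ExpF where
  hmap _ (I x)   = I x
  hmap _ (B x)   = B x
  hmap m (A l r) = A (m l) (m r)

cataH :: HFunctor h => (h a f -> f a) -> HFix h a -> f a
cataH m = m . hmap (cataH m) . unHFix

-- Id plays the role of the identity carrier functor.
newtype Id a = Id { runId :: a }

eval :: Exp a -> a
eval = runId . cataH go
  where go :: ExpF a Id -> Id a
        go (I x)   = Id x
        go (B x)   = Id x
        go (A l r) = Id (runId l + runId r)

main :: IO ()
main = print (eval (HFix (A (HFix (I 2)) (HFix (I 3)))))  -- 5
```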

Equality is one of the most fundamental concepts in mathematics. But what is it? It boils down to one rule:

`for all x, x = x.`

If you are familiar with the concept of relations: for any given set S, the equality relation (let’s call it S₌) is the least reflexive relation on the set. That means for all reflexive relations R on S, S₌ ⊆ R. S₌ is also the least symmetric and transitive relation, which allows equality to be generalized to equivalence relations.

Topics in mathematics usually play out like this:

- Define the underlying mathematical structure.
- Define what it means for two of these structures to be equivalent.
- Come up with a way of deciding whether any given two structures are equivalent or not.

When conducting arguments during this process, this local concept of equivalence is often treated as if it were equality, with equivalent things being substituted for each other. How is this justified?

That’s where homotopy type theory (or HoTT) comes in. In HoTT, you are able to convert an equivalence proof into an equality proof with the so-called “univalence axiom”, which fills in the gaps during reasoning.

But HoTT doesn’t stop at logical niceties. Through a famous isomorphism, type theories are known to describe certain logics. A dependent type theory corresponds to “normal” first-order logic. HoTT is an extension of dependent type theory, so the logic it describes is more expressive. Given all this, it remains a possibility that mathematics itself could be encoded in this logic. If so, HoTT stands as a potential new foundation to mathematics.

With type theory being computational in nature, it’s natural to want HoTT to be computational as well. “The” open problem is to determine a computational interpretation of the univalence axiom. Initial attempts at encoding the axiom quickly ran into issues of decidability and computational complexity. As far as I understand it, it seems like cubical type theory may solve these problems.

As the relationship between type theory and logic runs deep, it’s natural to build automated proof checkers using type theory. The computational nature of types provides all the necessary rules for building up a proof of a given logical proposition. There is some excitement that formulating these proof checkers using HoTT may improve the reasoning power available; traditionally, if you wanted to reason about equivalent structures, you had to use awkward constructions like setoids.

Finally, HoTT may have an impact on programming languages as well. Some of the known consequences of HoTT include things like quotient types, which would let us build much more fine-grained types into our programs. Imagine a type which represents some piece of state. Often, multiple values of this type represent the same equivalent piece of state. “Quotienting” the type would allow us to focus only on the truly distinct pieces when writing functions over it.

I’ve had a lot of opportunity to get some reading in while spending time in the beautiful Chania. One of the books I’m reading is Logic as algebra, written by Halmos himself.

One of the examples in the book really stuck out for me — this post is about that example.

A formal language, described informally, is just a set of words composed of letters from an alphabet. For example, suppose we have the following alphabet:

$$
\Sigma = \left\{ D, N, E, A \right\}
$$

and that our words are built up inductively from *DED*, *NEN*, and the following rewrite rules:

$$
D \to DAN \qquad D \to NAD \qquad N \to DAD \qquad N \to NAN
$$

Some examples of words in our language are *DED*, *DANED*, *NANADADEN*, and so on.

When we have a language, we generally want to create a parser for it. A parser has one task: given a word, does it exist in the language or not?

Given a BNF description of the language’s grammar, it’s very easy to create a parser. The BNF for our language is as follows:

```
<eqn> ::= <d> "e" <d>
        | <n> "e" <n>
<d>   ::= <d> "a" <n>
        | <n> "a" <d>
        | "d"
<n>   ::= <n> "a" <n>
        | <d> "a" <d>
        | "n"
```

**Exercise:** Remove the left recursion from this grammar and implement a parser for it using a method of your choice.

So far, everything is all fine and dandy. But the syntax of a language has no meaning on its own — there is an additional step of denoting words in our language with an interpretation.

Suppose that we interpret our language as a logical theory. Every word in the language represents a provable theorem. If that were the case, then a parser for this language would effectively be a theorem prover — given a theorem, is it provable in the theory or not?

The language we described above has one such interpretation. It can be thought of as equations over natural numbers involving [parity](https://en.wikipedia.org/wiki/Parity_%28mathematics%29); the addition of two odd numbers is an even number, and so on.

Given that:

- *D* means an od**d** number
- *N* means an eve**n** number
- *E* means **e**quals
- *A* means **a**ddition

So when we have the word *DANED*, that’s really saying:

odd + even = odd
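This interpretation can be checked semantically without building a parser at all (so the exercise above still stands): reduce each side of the equation to a parity and compare. A sketch of my own, assuming well-formed words over the alphabet:

```haskell
data Parity = Odd | Even deriving (Eq, Show)

-- Addition of parities: odd + odd = even, odd + even = odd, etc.
addP :: Parity -> Parity -> Parity
addP p q = if p == q then Even else Odd

letter :: Char -> Parity
letter 'D' = Odd
letter _   = Even  -- only 'N' remains after dropping 'A's

-- A well-formed word has exactly one 'E'; it is a theorem when both
-- sides reduce to the same parity.
valid :: String -> Bool
valid w = case break (== 'E') w of
  (lhs, 'E' : rhs) -> side lhs == side rhs
  _                -> False
  where side = foldl1 addP . map letter . filter (/= 'A')

main :: IO ()
main = print (map valid ["DED", "DANED", "NANADADEN", "DEN"])
-- [True,True,True,False]
```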

Not many logical theories can be captured so nicely in this way, but it is an interesting and useful way to think about them.

I’ve been helping a friend out with a compiler that he’s building for his awesome product, ShaderFrog. Andy was running into a serious performance issue as the number of shaders increased, so we sat down and brainstormed potential solutions.

It turned out that the biggest bottleneck was a phase that was performing a substitution of one shader AST into another. Since this requires looking up names and performing α-conversion, it can get costly quickly — especially if you are traversing the entire AST.

Since substitution is well understood in the lambda calculus, I thought through the usual solutions and which would be the easiest to apply. Ultimately I decided on a gensym-like approach, where top-level declarations were prefixed with an AST-unique value to ensure uniqueness during composition. I chose it because of its simplicity compared to the other approaches.

This worked great! The performance issue disappeared and Andy could go back to focusing on more important things (he has some really cool stuff in the pipeline that I can’t wait to see released). But I was unsatisfied — this approach, while effective, seemed very antiquated. What if we parsed the AST into a nameless representation, à la De Bruijn indices? Then the issue of renaming would be abolished forever.

With De Bruijn indices, we replace names with numbers. These numbers represent how many binders “deep” they are, relative to the binder they were introduced at. For example, *λx. x* is converted to *λ* 0, and *λx. λy. x* to *λ λ* 1.

With an imperative C-like language like GLSL, the “binders” are just the block of code that the identifier is declared in. Given that shadowing exists, these behave very much like a lambda would. But these languages add one extra dimension — you can have multiple declarations within one block.

If you simply use a *pair* of numbers to track the names, where the second number is the relative ordering of the declaration within the current block, you get a nameless representation. For example:

```
int foo = 7;   /* (0, 0) */
char bar;      /* (0, 1) */
void baz       /* (0, 2) */
  (int x) {    /* (1, 0) */
  int y = 3;   /* (1, 1) */
  if (x != 1) {
    int z = 5; /* (2, 0) */
    return z;  /* (2, 0) */
  }
  return foo /* (0, 0) */ + x /* (1, 0) */ + y /* (1, 1) */;
}
```

I haven’t spent the time proving that this is a valid representation, but since it seems like a fairly natural extension of De Bruijn indices, I’m confident in it.

This makes substitution very simple:

- Let `n` be the number of top-level declarations in the first AST — that is, one more than the highest `k` for which `(0, k)` occurs in it.
- For every pair `(0, k)` in the second AST, replace it with `(0, k + n)`.

All of the pairs are “shifted down”, just like they would be if the two sources had been concatenated before parsing into an AST.
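This shift can be sketched in Haskell over a hypothetical, minimal expression type — `Expr`, `Var`, `Add`, and `shift` are illustrative names for this post, not ShaderFrog’s actual AST:

```haskell
-- A minimal nameless expression type: a variable is a
-- (binder depth, declaration index) pair.
data Expr
  = Var Int Int
  | Add Expr Expr
  deriving (Eq, Show)

-- Shift every top-level reference (0, k) in the second AST by n,
-- the number of top-level declarations in the first AST.
shift :: Int -> Expr -> Expr
shift n (Var 0 k) = Var 0 (k + n)
shift _ (Var d k) = Var d k
shift n (Add l r) = Add (shift n l) (shift n r)
```

Only depth-0 pairs move; references bound inside a block are untouched, since their binder travels with them.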

This is very nice and pretty, but there is an issue of performance: every time we compose two ASTs, we have to traverse all of the functions defined in the second one to replace `(0, k)` with `(0, k + n)`. If we extend this pair to a triple, where the third value represents an “offset”, we only have to traverse the top-level declarations. That offset can then propagate from the top-level definition into its block of code. For example:

```
// Source from first shader
int a;       /* (0, 0, 0) */
char b;      /* (0, 1, 0) */

// Source from second shader, after composition
float c;     /* (0, 0, 2) */
void bar() { /* (0, 1, 2) */
  int y = 3; /* (1, 0) */
  return y /* (1, 0) */ + c /* (0, 0) */;
}
```

Since `bar` has an offset of 2, we can ensure that during interpretation the enclosed reference to `c` will resolve to `(0, 2)` instead of `(0, 0)`, which is what `a` is mapped to. Essentially, we delay the computation of `c`’s real index until it is actually necessary.
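The delayed computation amounts to a tiny resolution step, sketched here with an illustrative `Ref` alias and `resolve` helper (not names from an actual implementation):

```haskell
-- A sketch of delayed index computation: a stored top-level pair
-- (0, k) is only combined with the enclosing definition's offset
-- at the moment the reference is actually resolved.
type Ref = (Int, Int)  -- (binder depth, declaration index)

resolve :: Int -> Ref -> Ref
resolve offset (0, k) = (0, k + offset)
resolve _      ref    = ref
```

References at deeper binder levels pass through unchanged, since only top-level declarations are renumbered by composition.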

I’m not sure how often substitution comes up in the implementation of imperative compilers, but this approach could be effective depending on the scenario.

Folds are a pretty big deal. I’ve written a lot about them in the past. One thing I haven’t covered, however, is their sister: the unfold, or anamorphism.

In `Data.List` there is the `unfoldr` function, which has the following type:

`unfoldr :: (b -> Maybe (a, b)) -> b -> [a]`

This function can be thought of as a very simple state machine. You pass in the initial state `b`, and for each iteration you feed the current state into the supplied function. If the result is `Nothing`, we’re done. Otherwise, we have an `a` along with a new state `b`. It should be clear how repeated application can generate a list.
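For example, a countdown falls out directly from this state-machine reading (`countdown` is an illustrative name):

```haskell
import Data.List (unfoldr)

-- The state is the current number; the step function emits it and
-- decrements, returning Nothing at zero to end the list.
countdown :: Int -> [Int]
countdown = unfoldr step
  where step 0 = Nothing
        step n = Just (n, n - 1)
```

So `countdown 5` produces `[5, 4, 3, 2, 1]`, and `countdown 0` produces `[]`.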

**Exercise:** Implement `unfoldr`.

However, viewed under the lens of fixed points, the type signature of `unfoldr` is a bit deceptive. Recall the definition of `ListF`:

`data ListF a b = NilF | ConsF a b`

But this type is actually isomorphic to `Maybe (a, b)`! This is given by the following isomorphism:

```
to :: ListF a b -> Maybe (a, b)
to NilF        = Nothing
to (ConsF a b) = Just (a, b)

from :: Maybe (a, b) -> ListF a b
from Nothing       = NilF
from (Just (a, b)) = ConsF a b
```

This means that we can rewrite the type of `unfoldr` to use `ListF a b` instead of `Maybe (a, b)`:

`unfoldr :: (b -> ListF a b) -> b -> [a]`

Now, recalling that `[a] ~ Fix (ListF a)`, we can start to see the true generality of the unfold come out:

`unfoldr :: (b -> ListF a b) -> b -> Fix (ListF a)`

Of course, as with `cata`, the only thing we truly need to assume about `ListF a` is that it is a `Functor`. So we end up with this new function:

```
ana :: Functor f => (b -> f b) -> b -> Fix f
ana h = Fix . fmap (ana h) . h
```
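To see `ana` do some work, here is a self-contained sketch that restates `Fix` and the `ListF` functor from above; `downFrom` and `toList` are illustrative helpers added for this example:

```haskell
newtype Fix f = Fix (f (Fix f))

data ListF a b = NilF | ConsF a b

instance Functor (ListF a) where
  fmap _ NilF        = NilF
  fmap g (ConsF a b) = ConsF a (g b)

ana :: Functor f => (b -> f b) -> b -> Fix f
ana h = Fix . fmap (ana h) . h

-- Unfold a countdown directly into the fixed point.
downFrom :: Int -> Fix (ListF Int)
downFrom = ana coalg
  where coalg 0 = NilF
        coalg n = ConsF n (n - 1)

-- Collapse back to an ordinary list so the result can be inspected.
toList :: Fix (ListF a) -> [a]
toList (Fix NilF)        = []
toList (Fix (ConsF a r)) = a : toList r
```

Here `toList (downFrom 3)` yields `[3, 2, 1]` — the same state machine as before, but now targeting `Fix (ListF Int)` rather than `[Int]`.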

The type signature of `ana` is different from `cata` in two ways:

- The `f b -> b` has been flipped to `b -> f b`
- `b` and `Fix f` have been flipped

This corresponds to the fact that `ana` and `cata` are duals in a precise categorical sense that I won’t go into.

`ana` is a very handy function to use any time a list is being built up inductively. But `cata` seems somehow more special: any value of a data type can be uniquely represented as a fold on the value itself. Is there a similar representation property for unfolds?

For the case of lists, the question is answered (at least somewhat) positively. Recall that when encoding a list as a fold, we “partially apply” the list value and leave everything else free. Here, we will do the opposite: we will supply the builder function and the seed value, but not the list itself. I chose `Int` for `b` for reasons that will become clear soon.

`type List1 a = (Int -> Maybe (a, Int), Int)`

Decoding the list is as easy as calling `unfoldr`:

```
decode :: List1 a -> [a]
decode (f, n) = unfoldr f n
```

But how do we encode an arbitrary list? Well, we can use the `Int` to describe the length of the list. When it reaches 0, we know that there are no more elements in our list to produce. Given a list encoded in this way, all we have to do is increment the `Int` and wrap the current builder function with one that produces the supplied value as our new head.

```
encode :: [a] -> List1 a
encode []    = (const Nothing, 0)
encode (h:t) = (go, n + 1)
  where (f, n) = encode t
        go x = if x == n + 1
                 then Just (h, n)
                 else f x
```

Encoding a list and decoding it again will certainly produce the same list. (**Exercise:** why?) However, if we go in the other direction — that is, decode a list, and then encode it — we could very well end up with a different value than what we had before. `encode xs` only describes one particular method of encoding `xs`; there are plenty of others. This process can be generalized to other (recursive) data types, but extra data will have to be included, proportional to the number of constructors and their arguments.
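To make the non-uniqueness concrete, here is a sketch restating `List1` and `decode` from above, plus a hand-written encoding (`other` is a hypothetical name) that `encode` would never produce, yet that decodes to the same list:

```haskell
import Data.List (unfoldr)

type List1 a = (Int -> Maybe (a, Int), Int)

decode :: List1 a -> [a]
decode (f, n) = unfoldr f n

-- An alternative encoding of [1, 2]: it counts down from the seed
-- like the canonical one, but computes its elements arithmetically
-- from the state instead of capturing them in nested closures.
other :: List1 Int
other = (step, 2)
  where step 0 = Nothing
        step x = Just (3 - x, x - 1)
```

`decode other` yields `[1, 2]`, the same list as `decode (encode [1, 2])` — two distinct `List1 Int` values with the same decoding.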

There’s still one more interesting duality that I’d like to mention. If you have a fold on a data type, e.g. a list, the type signature looks like this, modulo some equivalences:

```
foldr :: b -> (a -> b -> b) -> [a] -> b
      :: (() -> b) -> (a -> b -> b) -> [a] -> b
      :: (() -> b, a -> b -> b) -> [a] -> b
```

That is, it takes a *product* of destructors, a built-up value, and produces a *destroyed* result. Now, if we think about unfolds the same way, we get:

```
unfoldr :: (b -> Maybe (a, b)) -> b -> [a]
        :: (b -> ListF a b) -> b -> [a]
```

which takes a (function to a) *coproduct* of constructors, a seed value, and produces a built-up result. I think there’s a more precise way of stating this. Maybe the answer is to use the language of final coalgebras, but I’m not sure.

**Exercise:** Figure this out for me :)