
Writing an Interpreter in TypeScript, Part 1

The title might be a bit misleading here - I am not talking about writing a TypeScript program that compiles into a JavaScript program which in turn is an interpreter. Instead this blog post explores the use of the type system itself as a programming language, purely as an experiment. There isn't really any practical use case for this experiment, but exploring the details of the type system helped me get a deeper understanding of how it works under the hood and how to define concise types for APIs.

The type system of TypeScript is pretty powerful and has even been shown to be Turing-complete. As a result we can use TypeScript types as a programming language of their own: in essence the TypeScript type system is a functional programming language - we don't have any sort of global state and have to rely solely on functions. In the case of TypeScript our functions are generic type aliases. There is however one small detail: we cannot use our generic type aliases as values. In other words there is no way to return functions from functions - a pattern that is used quite often in JavaScript.

The Basics

I am going to assume some basic knowledge about the type system here. The TypeScript type system really is a good choice for this task as it makes it very easy to construct new types and manipulate them. One interesting property is that most values have a literal type that accepts solely this single value: the integer literal 42 has the type 42, which has only a single value. Similarly we can represent abstract data structures with more complex literal types such as object types.

Conditional types are also essential: given a type and a pattern we can check whether the type "matches" that pattern (strictly speaking, whether the type is a subtype of the pattern). This makes them similar to if statements, ternary expressions or match statements. Using them we can define a type alias Not:

type Not<X extends boolean> = X extends true ? false : true;

We always have to be careful here: if X extends boolean that doesn't necessarily mean that it is either true or false - it could very well be one of two other types: boolean and true | false. Interestingly Not<boolean> isn't equivalent to Not<true | false> in this case (due to the distributive nature of conditional types). To make the two behave the same we could define Not as:

type Not<X extends boolean> = [X] extends [true] ? false : true;

On the other hand Not<boolean> = true doesn't sound correct either, so we have to special-case it here:

type Not<X extends boolean> = boolean extends X
  ? boolean
  : X extends true
  ? false
  : true;
type X0 = Not<true>; // false
type X1 = Not<false>; // true
type X2 = Not<true | false>; // boolean
type X3 = Not<boolean>; // boolean

Natural Numbers

Let's start with the basics. Every interpreter needs some sort of data type; as a data type common to most programming languages we will implement integers here. The TypeScript type system already has number literal types, e.g. 1 and 2 are already types. However we run into a problem if we use them as the integer type of our interpreter: applying arithmetic operations to number literal types is no easy feat, as there is currently no way to add or subtract number literal types. Instead we have to define a model for our set of natural numbers from scratch. Tuples are an easy way to model them: we simply say that the number of elements in a tuple is the number that the tuple represents. Let's define a type alias for this kind of type:

type Num = 42[];

We could choose any other number or even any other type as a replacement for 42 here; we simply need an array of "something", as the length is the only thing that matters. But didn't we say that we wanted to use tuples? Yes we did: if we allowed our Num type to be an array of any length, the result of that type would actually be an array. To recap: the type [] is our zero, the type [42] is our one and so on. Every single one of these "numbers" is a subtype of Num, i.e. X extends Num, but Num itself is also a Num, which might not be something we want - the length attribute of Num itself is number, not a number literal type. We'll use Num as our infinity; the TypeScript type-checker will automatically decide to widen a type to 42[] if there are too many elements in a tuple. To check if a Num X is "infinity" we can simply test whether Num extends X holds: n-tuples of the same element are subtypes of their respective array types, but the other way around it doesn't hold.
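To make the infinity check concrete, here's a small sketch; the IsInfinity helper is my own addition, not something we'll need later:

```typescript
type Num = 42[]; // as defined above

// `Num extends X` only holds when X is the unbounded 42[] type itself,
// i.e. our "infinity" - no finite tuple can absorb an arbitrary-length
// array.
type IsInfinity<X extends Num> = Num extends X ? true : false;

type Finite = IsInfinity<[42, 42, 42]>; // false
type Infinite = IsInfinity<Num>; // true
```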

To convert this weird Num type into an actual number we can simply read the length attribute of the tuple:

type ToNumber<A extends Num> = A["length"];
type X = ToNumber<[42, 42, 42]>; // 3

Let's also define a Zero type so that empty arrays don't look as weird in the rest of our program:

type Zero = [];

We can now start to define a few arithmetic functions:

type Inc<A extends Num> = [...A, 42];
type Dec<A extends Num> = A extends [42, ...infer R] ? R : Zero;

Inc simply appends a 42 to a tuple (which means that its length gets incremented) and Dec removes an element from a Num. If we try to decrease Zero we simply get another Zero back. Now that we have these primitives we can define Add, Sub and Mul on top of them. Add can actually be defined using two spread operations, eliminating the need for recursion here:

type Add<A extends Num, B extends Num> = [...A, ...B];
type Sub<A extends Num, B extends Num> = Num extends B
  ? never
  : B extends [42, ...infer _R]
  ? Sub<Dec<A>, Dec<B>>
  : A;
type Mul<A extends Num, B extends Num> = A extends [42, ...infer _R]
  ? Add<B, Mul<Dec<A>, B>>
  : Zero;

The Add function simply concatenates the two tuples; the length of the concatenation of two tuples is obviously equal to the sum of the lengths of the individual tuples, so our Num model works here. We can think of Sub as the function sub = (x, y) => y > 0 ? sub(x - 1, y - 1) : x - note that it clamps at zero instead of going negative, since Dec<Zero> is Zero. There is however one small detail to be aware of: the TypeScript compiler doesn't like it at all if we define types that could lead to an infinite recursion during type-checking. That is why we need the Num extends B check at the beginning. Without it the compiler would have to repeatedly remove an element from an array if we were to pass it the type 42[] itself as a parameter (and not a finitely sized subtype of it). The Mul function adds B to Zero A times, implementing multiplication as a recursive function. Since 42[] doesn't match the pattern [42, ...R], Mul<Num, B> simply evaluates to Zero and we don't need a similar check here.

Division is a lot harder to implement, so we'll leave it out for now.

Parsing numbers

Now on to the next task - our abstract representation isn't that useful if there is no way to convert normal numbers into our model. But how? I previously mentioned that there's no good way to apply arithmetic operations to number literal types, but there's one operation we can use: we can convert a number literal type into a string literal type by including it in a template literal type. I.e. the first step towards converting 3 into [42, 42, 42] could be to convert 3 into "3". Once we have the string we can convert it into a Num digit by digit, going through the string character by character just like we would in any other functional language. The template literal type expression ${X} gives us an easy way to convert from a number literal type to a string literal type.
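For example, a (hypothetical) Stringify alias shows the conversion in isolation:

```typescript
// A template literal type turns a number literal type into the
// corresponding string literal type.
type Stringify<N extends number> = `${N}`;

type S = Stringify<3>; // "3"
type T = Stringify<574>; // "574"
```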

Let's see what this function has to do: given a string x we first check whether it is empty. If it is, we simply return 0 - this is our base case. If the string is not empty we remove the first digit and call the function recursively. Once we have the result we can "put the digit back in at the front" by multiplying it with the correct power of ten and adding it to the result.

This gives us a simple algorithm for an imperative language; in functional languages however there's a slightly more elegant way: instead of defining a function with one parameter, we define one with two, the first being the part of the string we still have to parse and the second being the integer representation of the part we already parsed. Let's illustrate this: to parse 574 we start out with parse("574", 0) and then incrementally remove one digit from the front and append it to the "parsed" part (this second parameter is often called the accumulator or simply acc). The call stack would look like this:

parse("574", 0) > parse("74", 5) > parse("4", 57) > parse("", 574) > 574
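For illustration, here is the same accumulator recursion written as an ordinary value-level function (a sketch, not part of the type-level code):

```typescript
// Value-level sketch of the accumulator-based parse described above.
function parse(s: string, acc: number = 0): number {
  if (s === "") return acc; // base case: nothing left to parse
  const digit = s.charCodeAt(0) - "0".charCodeAt(0);
  return parse(s.slice(1), acc * 10 + digit);
}

parse("574", 0); // 574
```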

In our base case (when the first parameter is empty) we simply return the second parameter. Implemented as a TypeScript type alias our function would look like this (Atoi, named after C's atoi function):

type Atoi<S extends string, Acc extends Num> = S extends `${infer R}${infer B}`
  ? R extends DecDigit
    ? Atoi<B, Add<Mul<Acc, Ten>, DecDigitMap[R]>>
    : never
  : Acc;

This generic type alias uses some types we haven't defined yet, but the function it implements is just what I described previously: if the string is non-empty (we check this by trying to split it into two parts) we "call" Atoi again, appending the digit we popped off the front to our numeric interpretation; otherwise we return our accumulator. The DecDigitMap type is simply a constant object type that acts as a lookup table from a single digit to the corresponding Num:

type DecDigitMap = {
  "0": [];
  "1": [42];
  "2": [42, 42];
  "3": [42, 42, 42];
  "4": [42, 42, 42, 42];
  "5": [42, 42, 42, 42, 42];
  "6": [42, 42, 42, 42, 42, 42];
  "7": [42, 42, 42, 42, 42, 42, 42];
  "8": [42, 42, 42, 42, 42, 42, 42, 42];
  "9": [42, 42, 42, 42, 42, 42, 42, 42, 42];
};
type DecDigit = keyof DecDigitMap;
// `Ten` is the Num representation of ten, used by `Atoi` above:
type Ten = [42, 42, 42, 42, 42, 42, 42, 42, 42, 42];

We also could've implemented it recursively (i.e. setting "1" to Inc<DecDigitMap["0"]>), but there really is no point in making it more complicated than it needs to be.

Having the Atoi function implemented we can now easily define a FromNumber<T> type alias:

type FromNumber<X extends number> = Atoi<`${X}`, Zero>;

If we now define type X = FromNumber<123> we get back a tuple type containing 123 copies of the number literal type 42 - exactly what we want.

Expression Tree

Let's see how we can use the types we wrote to evaluate a simple abstract syntax tree. Our AST consists mostly of 3-tuples representing binary operations, i.e. [12, "+", [54, "-", 5]] represents the expression 12 + (54 - 5). The type for our AST nodes is pretty simple:

type Expression =
  | [Expression, "+", Expression]
  | [Expression, "-", Expression]
  | [Expression, "*", Expression]
  | number;

We can easily see that [12, "+", [54, "-", 5]] is a subtype of Expression. Our eval function is now pretty straightforward: we extract the left-hand-side and right-hand-side parameters of the binary expression, evaluate both recursively and compute the result using the Add, Sub and Mul functions we defined previously. This time we have to fight the type inference engine a bit as it doesn't seem to believe that the return type of Eval really is a Num. Nevertheless, using a few simple conditional types we can finally index into our Binop interface.

When the expression is a literal (i.e. we get a number literal type) we simply use our FromNumber function we defined previously.

interface Binop<A extends Num, B extends Num> {
  "+": Add<A, B>;
  "-": Sub<A, B>;
  "*": Mul<A, B>;
}
type Eval<E extends Expression> = E extends number
  ? FromNumber<E>
  : E extends [infer A, infer Op, infer B]
  ? A extends Expression
    ? B extends Expression
      ? Eval<A> extends infer AVal
        ? AVal extends Num
          ? Eval<B> extends infer BVal
            ? BVal extends Num
              ? Op extends "+" | "-" | "*"
                ? Binop<AVal, BVal>[Op]
                : never
              : never
            : never
          : never
        : never
      : never
    : never
  : never;

Now that we have an Eval function we can finally interpret some simple arithmetic expressions by chaining the ToNumber function with our Eval:

// Checks whether `B` is a subtype of `A` and results in an error if that
// isn't the case
type Is<A, B extends A> = 0;
type X = Is<20, ToNumber<Eval<[[3, "+", 2], "*", [5, "-", 1]]>>>;
// I want `a` to be (2 + 6) - (5 - 1), but I am too lazy to check it
const a: ToNumber<Eval<[[2, "+", 6], "-", [5, "-", 1]]>> = 4;

With the utility types we defined previously we can also define type-safe alternatives for the standard operators +, - and *:

function add<A extends number, B extends number>(a: A, b: B) {
  return ((a + b) as unknown) as ToNumber<Add<FromNumber<A>, FromNumber<B>>>;
}

const x = 12 + 13; // number :(
const y = add(12, 13); // 25 :)

This last example concludes this blog post - I hope no one uses this in a real codebase. In part 2 we'll explore different ways to store the state of our interpreter and see where the limits of the different approaches are.