Uns☁und

An extensible and unsound programming languages framework


Architecture

Unsound is at its core a framework for implementing programming languages whose syntax and semantics are extensible. Its design is one in which multiple language extensions are composed to produce a more or less traditional (though simplified) compiler pipeline:

Extensions can extend each step in that pipeline.

Beyond extending the compiler, Unsound allows extending the runtime semantics of programs. Languages developed with Unsound compile to code whose evaluation is parameterized by a "semantics" that defines what evaluation actually "is". In the usual evaluation semantics this is essentially the identity.

Phases

The core built-in compilation phases are:

The result is a JS file that is parameterized by a semantics that can be applied to the output to produce an interpretation. The standard semantics is $interpret, a "concrete interpretation" of the program in the universe of JS values.

One can also provide other semantics; for instance $type can implement an "abstract interpretation" in a universe of simple nominal types.
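As a rough illustration of what "parameterized by a semantics" can mean, here is a minimal sketch. The shape of the compiled output and the operation names (num, add, mul) are assumptions made for this example, not Unsound's actual emitted code; only the names $interpret and $type come from the text above.

```javascript
// Hypothetical shape of compiled output for the source expression `1 + 2 * 3`.
// The operation names `num`, `add`, and `mul` are assumptions of this sketch.
const program = ($) => $.add($.num(1), $.mul($.num(2), $.num(3)));

// Concrete semantics: every operation acts directly on JS values,
// so `num` is simply the identity.
const $interpret = {
  num: (n) => n,
  add: (a, b) => a + b,
  mul: (a, b) => a * b,
};

// Abstract semantics: values are replaced by the names of their types.
const $type = {
  num: (_n) => "Number",
  add: (a, b) => expectNumbers("add", a, b),
  mul: (a, b) => expectNumbers("mul", a, b),
};

// Shared check for the abstract semantics: both operands must be "Number".
function expectNumbers(op, a, b) {
  if (a !== "Number" || b !== "Number") {
    throw new TypeError(`${op}: expected Number operands, got ${a} and ${b}`);
  }
  return "Number";
}

console.log(program($interpret)); // prints 7
console.log(program($type)); // prints Number
```

The same program text is run twice. Only the object passed in decides whether the result is a value or a type.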

Extension System

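Extensions provide hooks for each phase, allowing the phase to be extended with additional functionality. The key idea allowing extension is to implement the various phases (parsing, compilation, evaluation) with a similar "open-recursive" approach: the functions of a phase call one another through the shared $ object rather than directly, so a later extension can replace one of them and every caller picks up the replacement.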

Extensions have the following form:

{
  $parse: ($) => { /* extend parser */ },
  $compile: ($) => { /* extend compiler */ },
  $emit: ($) => { /* extend emitter */ },
  $interpret: ($) => { /* extend default interpreter */ },
  $type: ($) => { /* add type checker */ },
  // ... any $-prefixed key for additional interpreters
}

All hooks mutate $, adding functionality extension by extension. The starting point is the "empty" language's implementation of each hook, which is generally () => {}.
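The loading process described above might look roughly like the following sketch. The function loadPhase and the two toy extensions are hypothetical and exist only for illustration; Unsound's real loader may differ.

```javascript
// Hypothetical loader: builds the `$` object for one phase by running each
// extension's hook for that phase, in order, against the same object.
function loadPhase(extensions, phase) {
  const $ = {}; // the "empty" language contributes nothing
  for (const ext of extensions) {
    const hook = ext[phase];
    if (typeof hook === "function") hook($); // an extension may skip a phase
  }
  return $;
}

// Two toy extensions; the second wraps what the first installed.
const base = {
  $parse: ($) => {
    $.parse = (src) => ({ type: "Text", value: src });
  },
};
const trimming = {
  $parse: ($) => {
    const inner = $.parse; // keep the previous implementation
    $.parse = (src) => inner(src.trim());
  },
};

const $ = loadPhase([base, trimming], "$parse");
console.log($.parse("  hello  ")); // { type: 'Text', value: 'hello' }
```

Because every hook receives the same object, load order matters: an extension can only wrap functions that earlier extensions have already installed.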

To see how this plays out, consider parsing. The framework expects that, after all extensions have been loaded, the $parse phase will result in an object $ that exports a function $.parse that takes the contents of a file and outputs a value that the next phase, $compile, will understand. A simple language might implement $.parse as, for example, parsing a simple numeric expression:

{
  $parse: ($) => {
    // Parse addition: term (('+' | '-') term)*
    $.parse = (s) => {
      let result = $.term(s);
      while (s.peek() === "+" || s.peek() === "-") {
        let op = s.next();
        let right = $.term(s);
        result = { type: op === "+" ? "Add" : "Sub", left: result, right };
      }
      return result;
    };

    // Parse multiplication: number (('*' | '/') number)*
    $.term = (s) => {
      let result = $.number(s);
      while (s.peek() === "*" || s.peek() === "/") {
        let op = s.next();
        let right = $.number(s);
        result = { type: op === "*" ? "Mul" : "Div", left: result, right };
      }
      return result;
    };

    // Parse a number literal
    $.number = (s) => {
      let n = "";
      while (s.peek() >= "0" && s.peek() <= "9") n += s.next();
      return { type: "Num", value: parseInt(n, 10) };
    };
  },
}

A subsequent extension can override parsing to add exponentiation with higher precedence:

{
  $parse: ($) => {
    // Save the base number parser
    let baseNumber = $.number;

    // Exponentiation: number ('^' number)*, grouping to the left
    $.number = (s) => {
      let result = baseNumber(s);
      while (s.peek() === "^") {
        s.next();
        let right = baseNumber(s);
        result = { type: "Exp", base: result, power: right };
      }
      return result;
    };
  },
}

Now 2^3*4+1 parses with the correct precedence: ((2^3)*4)+1.
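To check that claim, the two extensions can be applied in order and run against a small input stream. In the sketch below, makeStream and show are hypothetical helpers (the text does not say how s is constructed); the parsers are condensed copies of the ones above.

```javascript
// Hypothetical character stream with the peek()/next() interface the parsers use.
const makeStream = (text) => {
  let i = 0;
  return { peek: () => text[i], next: () => text[i++] };
};

// Condensed copy of the base arithmetic extension.
const arithmetic = {
  $parse: ($) => {
    $.parse = (s) => {
      let result = $.term(s);
      while (s.peek() === "+" || s.peek() === "-") {
        const type = s.next() === "+" ? "Add" : "Sub";
        result = { type, left: result, right: $.term(s) };
      }
      return result;
    };
    $.term = (s) => {
      let result = $.number(s);
      while (s.peek() === "*" || s.peek() === "/") {
        const type = s.next() === "*" ? "Mul" : "Div";
        result = { type, left: result, right: $.number(s) };
      }
      return result;
    };
    $.number = (s) => {
      let n = "";
      while (s.peek() >= "0" && s.peek() <= "9") n += s.next();
      return { type: "Num", value: parseInt(n, 10) };
    };
  },
};

// Condensed copy of the exponentiation extension: it replaces $.number,
// and $.term picks up the replacement because it looks up $.number at call time.
const exponent = {
  $parse: ($) => {
    const baseNumber = $.number;
    $.number = (s) => {
      let result = baseNumber(s);
      while (s.peek() === "^") {
        s.next();
        result = { type: "Exp", base: result, power: baseNumber(s) };
      }
      return result;
    };
  },
};

const $ = {};
for (const ext of [arithmetic, exponent]) ext.$parse($);

// Render the tree with explicit parentheses to make the grouping visible.
const ops = { Add: "+", Sub: "-", Mul: "*", Div: "/" };
const show = (node) => {
  if (node.type === "Num") return String(node.value);
  if (node.type === "Exp") return `(${show(node.base)}^${show(node.power)})`;
  return `(${show(node.left)}${ops[node.type]}${show(node.right)})`;
};

console.log(show($.parse(makeStream("2^3*4+1")))); // (((2^3)*4)+1)
```

Note that the loop in the exponentiation extension groups to the left, so 2^3^2 parses as (2^3)^2 rather than the conventional right-associative 2^(3^2).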

Languages

The Unsound framework comes with several languages and language extensions, all building on the lowest-level "empty" language extension, which does nothing and returns nothing. That's not very useful, so Unsound also comes with the core language extension, written in TypeScript, which actually implements a simple, untyped, expression-based programming language:

Other extensions are provided that build on core. For fun, these extensions form a bootstrappable "tower" -- each extension is written in a simpler language. So e.g. the meso extension adds infix, prefix, and postfix operators, providing a more usable "layer" over core:

Then thermo adds an imperative, JS-like layer over meso:

In addition, example extensions show how other programming features can be composed atop an existing language arbitrarily. For instance, const adds the classic const x = y; syntax, raising a parsing error for subsequent assignments to x. And dyn implements dynamic scoping for a language; note that dyn actually extends the evaluation semantics as well as the parsing and compilation phases.