Yet another LISP dialect by a university undergraduate. This is another one of my older projects; originally most of this text was written back in 2012.
This page is intended to serve as a reference and semi-formal specification. Some confusing terminology may be used (mostly because I am neither a highly educated computer scientist nor a native speaker of English).
The current implementation is called filiokus to continue with the naming theme. Currently this reference is a lot more complete than the implementation, but the implementation does work well enough to parse and execute code.
This project started with me putting together a very minimalistic LISP implementation, MISP, mostly for educational purposes. But though I am quite fond of minimalism I found that MISP would be more interesting to work with if it had some more interesting features, homoiconicity in particular (see below). So the name was changed to Hokus (the Swedish spelling of "Hocus" as in hocus-pokus, mostly a reference to how lambda-calculus feels a lot like magic) and here we are. And although the original pure minimalism is gone some effort has still been made to keep the required definitions and symbols to a minimum.
I believe much of the original inspiration was from doing some lambda calculus in LaTeX. Turns out it can be implemented entirely in the TeX "mouth" which allows for some neat tricks.
Everything in Hokus is represented as atoms. An atom may be one of the following: an integer, a symbol, or a pair.
Integers are (32-bit) signed integers. Booleans are represented as a subset of integers. The integer zero is false and all non-zero integers are true. All native procedures return the integer one to represent true.
Symbols are names and may either be bound to a value (to an atom) or unbound (free).
Pairs contain two other atoms and are used to represent arbitrary data structures. To facilitate the use of pairs there is a built in symbol called
null which is used to represent an empty atom.
Something to note is that Hokus does not have a type for procedures; procedures are equivalent to their abstract syntax trees. This property is more formally known as homoiconicity (wikipedia link). Usually this just means that code and its AST is homomorphic but I was going for the more "pure" version of lambda calculus where code and data are literally the same thing.
An expression in the Hokus language is a non-empty, whitespace delimited list of atoms and/or expressions surrounded by one pair of parentheses. For the implementation this is a string of utf-8 coded text, with implicit parenthesis surrounding it.
Expressions are simply parsed into abstract syntax trees containing the contents of the expression. For example parsing
a b (x y) c results in the tree
a → b → (x → y → null) → c → null, or
The notation in this figure will be used throughout. A circle represents a pair, with the single arrow being the first child and the double arrow being the second. A solid box is a symbol and a dotted box is an integer. The term expression may refer to the entire tree, or specifically to the root atom.
Evaluating an expression means applying the following evaluation rules until only an integer atom remains.
- Integers evaluate to themselves.
- Symbols usually evaluate to their bound values. Some symbols carry special meaning (see below) and are not evaluated the same way as user-defined symbols. Attempting to evaluate a free value is illegal.
- Pairs evaluate differently depending on their content. This constitutes the main form of computation.
How a pair is evaluated depends on the type of the first atom:
If the pair contains only a single integer then the pair evaluates to that integer. A pair with and integer and a second non-null atom can not be evaluated.
Evaluating a pair beginning with a symbol is akin to computing the function that the symbol represents. This first symbol may be called the procedure or operator, and the rest of the pair is its argument list. The pair (or more correctly the root of the pair) is called the context. How evaluation proceeds depends on the type of symbol: user defined symbols simply cause replacement by the bound value, while Hokus' built-in symbols have their own special behaviours.
Evaluating pair A when beginning with a pair B is done by "rotating" the expression to make B the new root. This means that chains of expressions behave properly, and is best explained through the cunning use of graphs:
These are the special symbols defined by Hokus and their evaluation rules. At the moment only a few exist; enough to implement a factorial function.
defineThis globaly binds a symbol to an expression, and evaluates to
lambdasymbol corresponds to the $\lambda$ of lambda calculus and is the workhorse of any LISP-like language.
ifsymbol basically behaves like a ternary if similar to most languages.
=Compares a sequence of integer and evaluates to true of false.
+Evaluates to the sum of a sequence of integers.
+but all atoms past the first is subtracted instead of added.
+but all atoms are multiplied instead.
Below is shown an example factorial program that can actually be executed by the current implementation.
( define fact lambda (n) ( if (= n 1) 1 (* n (fact (- n 1))) ) ) ( fact 5 )