Announcing SteelSeries/GoLisp v1.0

05 Aug 2015

by Dave Astels

Coming of age

GoLisp is maturing nicely, and I’ve recently made it a lot more like standard Scheme. This includes some breaking changes where I played fast and loose with the language earlier on. Now I’m bringing some of the more core things into line with how standard Scheme works. Because of that, I’ve decided that it’s time to slap a label on it. v1.0 is now what’s on the master branch, and the previous master branch has been renamed legacy.

So what’s new?

There are several improvements to the runtime, and the standard set of primitives has been expanded.

Forms of let

Golisp initially supported let which allowed you to cascade bindings, just as let* does in standard Scheme. As GoLisp got more and more like standard Scheme, I decided to switch to a Scheme-style let. So now there is let, let*, and letrec. Typically you will want to use let, though in cases where you want or need to cascade, you can use let*. Note that the syntax of each are identical; only the evaluation environment of the binding values varries. Finally, named let has been added.

(let ((name value)…) sexpr…)

Create a local scope and bindings for evaluating a body of code. The first argument is a list of bindings. Each binding is a raw symbol (doesn’t get evaluated) that is the name to be bound, and a value (which is evaluated). These bindings are added to a scope that is local to the let. The body is evaluated in this local scope. The value of the let is the last evaluation result. E.g.

(let ((x 1)
  (y 2))
  (+ x y)) ⇒ 3

Bindings DO NOT cascade. I.e. each binding’s value is evaluated in the context of the scope where the let appears. E.g.

(define x 39)
(let ((x 1)
      (y (+ x 2)))
  (+ x y)) ⇒ 42

When the value of y is computed, x is taken from the outside environment, and there it is bound to 39, so y is bound to 41. When the addition is evaluated in the body of the let, the let-bound value of x (which is 1) is used, so the result is 42.

As mentioned above, let takes a sequence of sexprs as it’s body. In effect the body of the let is an implicit begin.

(let* ((name value)…) sexpr…)

NOTE: this used to be let and there was no let* In many cases this won’t matter, but in some it will. You might want to check any existing use of let to see if it should be a let* now.

This is the same as let EXCEPT that bindings DO cascade. I.e. each binding’s value is evaluated in the context of the local scope. Because of this, the value expression for each binding is evaluated with all preceeding bindings available.E.g.

(define x 39)
(let ((x 1)
      (y (+ x 2)))
  (+ x y)) ⇒ 4

(letrec ((name value)…) sexpr…)

The variables are bound to fresh locations holding unassigned values, then the inits are evaluated in the extended environment (in some unspecified order) Each variable is then assigned to the result of the corresponding init, and the expressions are evaluated sequentially in the extended environment. Each binding of a variable has the entire letrec expression as its region, making it possible to define mutually recursive procedures.

GoLisp allows any of the inits to be omitted, in which case the corresponding variables are unassigned.

(letrec ((even? (lambda (n)
                  (if (zero? n)
                      #t
                      (odd? (- n 1)))))
         (odd? (lambda (n)
                 (if (zero? n)
                     #f
                     (even? (- n 1))))))
  (even? 88))

One restriction on letrec is very important: it shall be possible to evaluate each init without assigning or referring to the value of any variable. If this restriction is violated, then it is an error. The restriction is necessary because Scheme passes arguments by value rather than by name. In the most common uses of letrec, all the inits are lambda expressions and the restriction is satisfied automatically.

(let name ((name value)…) sexpr…)

MIT/GNU Scheme permits a variant on the syntax of let called named let which provides a more general looping construct than do, and may also be used to express recursions. GoLisp now also supports this syntax.

Named let has the same syntax and semantics as ordinary let except that name is bound within the expressions to a procedure whose formal arguments are the variables and whose body is the expressions. Thus the execution of the expressions may be repeated by invoking the procedure named by name.

Here is an example:

(let loop
     ((numbers ’(3 -2 1 6 -5))
      (nonneg ’())
      (neg ’()))
  (cond ((null? numbers)
         (list nonneg neg))
        ((>= (car numbers) 0)
         (loop (cdr numbers)
               (cons (car numbers) nonneg)
               neg))
        (else
         (loop (cdr numbers)
               nonneg
               (cons (car numbers) neg)))))
⇒ ((6 1 3) (-5 -2))

Special forms

Standard Scheme has three types of functions:

procedures which are written in GoLisp
primitives which are part of the runtime, written in Go in this case
special forms which are part of the syntax of the language

Pre-v1.0 GoLisp conflated primitives and special forms. In v1.0 they are handled differently. Both are implemented in Go, using the primitive framework. The primary difference is that primitives, like procedures, use applicative-order. That is, arguments to a function are evaluated when the function is called, and the results of those evaluations are passed into the function. In contrast, special forms use normal-order: the unevaluated argument expressions are passed into the form to be evaluated if and when the form decides.

For example, consider if. The condition expression, the then-clause, and possibly an else-clause, are passed to if. The condition is then evaluated, and if the result is #t the then-clause is evaluated, otherwise the else-clause is.

Pre-v1.0 all primitives used normal-order. While this was reasonable enough at first, the overhead in evaluating and check all arguments in the majority of cases became onerous as the runtime library grew. With v1.0 only special forms use normal-order and primitives have their arguments evaluated and error checked. The resulting values are passed into the primitive.

For example, here is the pre-v1.0 implementation of memp:

func MempImpl(args *Data, env *SymbolTableFrame) (result *Data, err error) {
  f, err := Eval(First(args), env)
  if err != nil {
    return
  }
  if !FunctionP(f) {
    err = ProcessError("memp needs a function as its first argument", env)
    return
  }

  l, err := Eval(Second(args), env)
  if err != nil {
    return
  }

  var found *Data
  for c := l; NotNilP(c); c = Cdr(c) {
    found, err = ApplyWithoutEval(f, InternalMakeList(Car(c)), env)
    if BooleanValue(found) {
      return c, nil
    }
  }

  return LispFalse, nil
}

Notice the evaluation of each argument and the associated error checking. Now look at the v1.0 version:

func MempImpl(args *Data, env *SymbolTableFrame) (result *Data, err error) {
  f := First(args)
  if !FunctionP(f) {
    err = ProcessError("memp needs a function as its first argument", env)
    return
  }

  l := Second(args)

  var found *Data
  for c := l; NotNilP(c); c = Cdr(c) {
    found, err = ApplyWithoutEval(f, InternalMakeList(Car(c)), env)
    if BooleanValue(found) {
      return c, nil
    }
  }

  return LispFalse, nil
}

While the time complexity of calling a priitive is the same, the code for each primitive is simpler and shorter. Multiply that by the number of primitives, and the space savings is significant. Also adding primitives becomes much less work.

The other difference is that special forms are declared using the MakeSpecialForm function instead of MakePrimitiveFunction. See the next section for an example.

Function arity

Pre-v1.0 primitive declarations allowed you to specify how many arguments were expected (if it were a fixed value), and allowed a wildcard of -1 to denote a variable number. This was quite limited: you could specify a fixed number or any number. In the latter case, it was then up to the primitive implementation to check the number of arguments provided to assure its validity. V1.0 expands on that, adding a great deal of control. Now, instead of a numeric arity, the primitive (or special form) declaration specifies allowable arity as a string having the follow syntax:

* specifies the wildcard: any number of arguments is allowed, the implementation has to validate it. This was exactly like the case of specifying -1 pre-v.10. If this is used, it has to be the entirity of the arity specification.
x|y (where x and y are a one of the forms below) either x or y are valid argument counts.
n (where n is a positive integer) specifies an exact number of arguments.
>=n (where n is a positive integer) specifies that at least n arguments are required.
(x,y) (where x & y are a positive integers) specifies that at least x, and no more than y argument are allowed.

Some examples:

MakePrimitiveFunction("list", "*", ListImpl)
MakePrimitiveFunction("make-list", "1|2", MakeListImpl)
MakePrimitiveFunction("length", "1", ListLengthImpl)
MakePrimitiveFunction("map", ">=2", MapImpl)

Here is the pre-v1.0 declaration and implemetation of if:

MakePrimitiveFunction("if", -1, IfImpl)

func IfImpl(args *Data, env *SymbolTableFrame) (result *Data, err error) {
  if Length(args) < 2 || Length(args) > 3 {
    err = ProcessError(fmt.Sprintf("IF requires 2 or 3 arguments.
                                    Received %d.", Length(args)), env)
    return
  }

  c, err := Eval(Car(args), env)
  if err != nil {
    return
  }
        
  if BooleanValue(c) {
    return Eval(Second(args), env)
  } else {
    return Eval(Third(args), env)
  }
}

And here it is in v1.0:

MakeSpecialForm("if", "2|3", IfImpl)

func IfImpl(args *Data, env *SymbolTableFrame) (result *Data, err error) {
  c, err := Eval(Car(args), env)
  if err != nil {
    return
  }

  if BooleanValue(c) {
    return Eval(Second(args), env)
  } else {
    return Eval(Third(args), env)
  }
}

I/O

Pre-v1.0 GoLisp has some basic functions that output to stdout.

V1.0 provides a more complete I/O system. To start, it introduces ports to GoLisp. All of the new I/O fucntions can take a port as a source/destination.

You create a port by using the open-input-file and open-output-file functions that take a filename and return a port. Ports (either input or output) are closed using the close-port functions.

There is currently only one input function: read that requires a port argument and reads the textual representation of a Lisp object fro the port, returning the corresponding object.

The function write is the counterpoint to read it takes a lisp object and writes it (as much as possible) in a readable format to it’s destination. That destination is a provided port or STDOUT (if a port is not provided).

The write-string function is simplet, writing a string to it’s destination (a port or STDOUT).

Neither write nor write-string add a line terminator to their output. Use the newline function for that. Again, it takes a port or will use STDOUT.

The write-bytes function is a bit different: it requires a byte-array and a port and writes the raw bytes to the port. This was added as a way to send commands/data to external devices connected via a USB/serial port. Specifically Arduino based devices. See My post on driving Arduino based hardware from GameSense.

V1.0 also includes Scheme’s format function.

List operations

V1.0 adds a few handy list operations:

find and find-tail both take a predicate and a list and finds the first element for which the predicate returns true. They return that element and the list whose car is that element, respectively.

for-each operates just like map except that it doesn’t accumulate and return the values returned by the function being mapped over the list. You would use for-each when you just want the effect of mapping, not the result (e.g. printing/outputing a sequence of values).

To go along with filter (which returns elements from a list for which a predicate returns true), remove returns the elements for which the predicate returns false.

Pre v1.0 included the partition function that broke a list up into sublists of a certain size. V1.0 extends partition to accept a predicate function (it still works as before if given a number) to split the list into 2 lists: elements for which the predicate returns true, and those for which it returns false.

String functions

String support has been expanded.

split has been renamed string-split.

The ‘trim’ function has been renamed to string-trim and has new versions for only trimming on the right (string-trim-right)or left (string-trim-left) of a string.

In addition to string-upcase and string-downcase, v1.0 also includes string-capitalize. All three now have a mutating version (e.g. string-upcase!).

There is now a substring function to extract portions of a string.

The functions substring?, string-prefix?, and string-suffix? check for a matching substring, prefix, or suffix, respectively.

A new string-length function returns the length of a string. This uses the underlying Go function and so accounts for multi-myte characters correctly.

Finally, string-null? check for an empty string.

Tools

V1.0 sees the beginnings of a profiler. Data collection is done in the runtime, but processing of that data is done in GoLisp, itself. I see this as a concrete sign that the system has reached a milestone. Post to follow.

I’ve also revisited the testing framework and made several improvements. Post to follow

Summary

With v1.0, GoLisp takes a large step forward in terms of capability and Scheme compliance. With a standard form of let forms, GoLisp code is that much more compatible with Scheme. Applicative-order evaluation for primitives coupled with a more capable arity specification system make expanding the runtime far easier, efficient, and flexible. The future of GoLisp is bright.

technology blog