GoLisp: Same great taste, less filling
by Dave Astels
When I first wrote GoLisp I wasn’t overly concerned with space efficiency as the immediate use was quite small scale. I was also fairly new to Go at the time. As time has gone on, we have been writing more and more production code in GoLisp. Space and time performance is something that we would now like to improve as we go forward.
Brute force
A data element in GoLisp can be one of several types:
- ConsCell
- Integer
- Float
- Boolean
- String
- Symbol
- Function
- Macro
- Primitive
- BoxedObject
- Frame
To keep things simple and easy to work with as I was evolving the language, I went with a simple, brute force approach to storage management:
type Data struct {
Type int
Car *Data
Cdr *Data
String string
Integer int64
Float float32
Func *Function
Mac *Macro
Prim *PrimitiveFunction
Frame *FrameMap
ObjType string
Obj unsafe.Pointer
}
The type field is used by every data element, and indicates the type of the element. This in effect defines what other field (or fields) are used. For most types that’s just a single other field. Conscells and boxed objects each take two fields.
In 64-bit Go with 8 byte alignment this takes 112 bytes per data element. For any given element, most of this is unused.
Safety last
If I were using C, I could just use a union and do away with the unused space. If I were using Ruby (and I have see http://daveastels.com/rubylisp/language-ref.html) I could simply use a single field for the element’s value.
But, I’m using Go and those aren’t on the table. What Go does provide
is a way to step outside the type system and use what amounts to a
plain old void*
.
Let’s look back at the Data
structure for a second. As I mention,
most types require a single field in addition to the type field. There
are two exceptions: cons cells which require car
and cdr
pointers
to a Data
, and boxed objects which have a type and value.
Go provides the unsafe
package for letting us avoid the constraints
of the type system. Of particular interest to us is unsafe.Pointer
which is basically a way of having void*
in a Go friendly way.
You convert an object (called something
) to an unsafe.Pointer
very simply:
unsafe.Pointer(&something)
The result is a generic pointer with no attached type information. If
you later want to get the data that is being pointed to, you must cast
the unsafe.Pointer
back to the appropriate type and dereference it:
x := *((*int)(my_unsafe_pointer))
Note that we have to put the *int
in parens to make it clear that
we’re casting to a pointer.
Putting GoLisp on a diet
Using unsafe pointers we can change the Data
struct to:
type Data struct {
Type uint8
Value unsafe.Pointer
}
Noticed the Type
field has been constrained to 8 bits, although we
don’t realize much of a savings due to field alignment.
For the cons cell and boxed object types I added structs to group the required fields:
type ConsCell struct {
Car *Data
Cdr *Data
}
type BoxedObject struct {
ObjType string
Obj unsafe.Pointer
}
There’s one other slight wrinkle: you can’t take the address of a
boolean literal. That means you can’t convert it directly to an
unsafe.Pointer
. To get around this I simply made a variable for each
value, and used these to make the boolean constants:
var b_true = true
var b_false = false
var True *Data = &Data{Type: BooleanType, Value: unsafe.Pointer(&b_true)}
var False *Data = &Data{Type: BooleanType, Value: unsafe.Pointer(&b_false)}
With these changes, the size of the core data entity is 16 bytes.
That’s 14% of the original size. Granted, cons cells (which are very
common) require another 16 bytes for the Car
and Cdr
pointers.
That’s a total of 32 bytes for a cons cell compared to the
previous 112. Still a significant savings.
Busy work
So now data has to be converted to and from unsafe.Pointer
s as it
goes in and out of Lisp data elements. Thankfully that work is all in
the data.go
file. For each type of Lisp data there is a set of
functions that check the type, wrap, and unwrap raw values. Code
outside of data.go
knows nothing about how storage is managed. For
example, here are the function for integers (ignoring coercion):
func IntegerP(d *Data) bool {
return d != nil && TypeOf(d) == IntegerType
}
func IntegerWithValue(n int64) *Data {
return &Data{Type: IntegerType, Value: unsafe.Pointer(&n)}
}
func IntegerValue(d *Data) int64 {
if d == nil {
return 0
}
if IntegerP(d) {
return *((*int64)(d.Value))
}
return 0
}
Each type of Lisp data element will have a similar set of functions.
Conclusion
This project proved the value of two aspects of the GoLisp project.
First, data storage encapsulated in one file. Actually there had been some
leakage of that, but now all data access is done via functions in
data.go
. Second, a quite exhaustive suite of tests, split between Go
and Lisp level testing. Getting all the tests passing after what was
fairly major surgery, resulted in a fully functioning system.
Initially this will live on the smaller
branch. It looks solid so
far, and once it’s proved to be ready for production I’ll move it to
master
.
Update: This has been merged into the master branch and is now just how GoLisp works.