txus/terrorvm
{ "createdAt": "2012-12-29T21:34:42Z", "defaultBranch": "master", "description": "Lightweight, fast Virtual Machine for dynamic, object-oriented languages.", "fullName": "txus/terrorvm", "homepage": null, "language": "C", "name": "terrorvm", "pushedAt": "2013-12-23T19:49:48Z", "stargazersCount": 42, "topics": [], "updatedAt": "2024-02-25T17:01:17Z", "url": "https://github.com/txus/terrorvm"}
TerrorVM
A lightweight Virtual Machine for dynamic, object-oriented languages. It aims to
be fast, as simple as possible, easily optimizable, with LLVM support, and
easily targetable for language designers and implementors. That’s why its
interface (instruction set and bytecode format) is extensively documented and an
example compiler is provided under the compiler folder.
Before anything, I want to give special thanks to my awesome mentors [Jeremy Tregunna][jtregunna], [Brian Ford][brixen], [Dirkjan Bussink][dbussink] and [Evan Phoenix][evanphx]. Without their teachings and patience I would never have started this in the first place.
Disclaimer
I’d love to discuss literally anything about my choices regarding the design and the implementation of TerrorVM. Feel free to ping me [on twitter][twitter], drop me [an email][email], or if you are in Berlin, let’s just grab some beers together :)

Object model
In TerrorVM, everything is an object, and every object may have a prototype. The basic value types that the VM provides are:
- Number: Double-precision floating point numbers.
- String: Immutable strings.
- Vector: Dynamically sized vectors that may contain any type.
- Map: Hashmaps (for now only strings are supported as keys).
- Closure: A first-class function.
- True: True boolean.
- False: False boolean.
- Nil: Represents nothingness. It is falsy just like false.
These basic types are objects themselves (of type Object). Each one is the
prototype for all objects of its own kind, and provides all the functionality
that those objects need. This is done in the preludes ([alpha][alpha] and
[beta][beta]), which I’ll explain a bit further ahead.
Objects are simply collections of slots that may contain any kind of object.
I’m considering adding Traits, although I’ll wait until I see a need for it. In the simplicity of Terror lies its power.
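To make the idea of slots and prototypes concrete, here is a minimal sketch in plain Ruby. It is not TerrorVM's C implementation: `ProtoObject`, `set_slot` and `get_slot` are hypothetical names, and the rule that a missing slot is looked up along the prototype chain is an assumption made for illustration.

```ruby
# Hypothetical model of an object: a collection of slots plus an optional prototype.
class ProtoObject
  attr_reader :slots, :prototype

  def initialize(prototype = nil)
    @prototype = prototype
    @slots = {}
  end

  def set_slot(name, value)
    @slots[name] = value
  end

  # Assumed lookup rule: own slots first, then each prototype in turn.
  def get_slot(name)
    object = self
    while object
      return object.slots[name] if object.slots.key?(name)
      object = object.prototype
    end
    raise KeyError, "slot not found: #{name}"
  end
end

base = ProtoObject.new
base.set_slot("greeting", "hello")

child = ProtoObject.new(base)
child.set_slot("name", "terror")
```

Here `child` answers `get_slot("name")` from its own slots and `get_slot("greeting")` through its prototype.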
The VM runtime object
TerrorVM exposes as much of itself as possible at runtime. The goal of this is
to make it easily targetable and flexible. For example, the toplevel object VM
exposes two subobjects (types and primitives). VM.types is a map of all
the VM types like this:

    { :object => Object, :vector => Vector, :number => Number, ...}

Primitives contains all the native functions that the VM exposes (such as
puts, print, clone, arithmetic primitives, etc):

    { :clone => #<Closure ...>, :puts => #<Closure ...>, ...}

Preludes
As you already know, TerrorVM tries to implement as much as possible in its own code, rather than C. This makes it a perfect candidate as a multi-language VM to implement any language on top of it. You can find the high-level [alpha][alpha] and [beta][beta] preludes and their respective compiled versions [alpha][alpha_native] and [beta][beta_native].
The alpha prelude wires up the VM primitives to the real objects at runtime, so that your code can use them conveniently. This is our current prelude in high-level Ruby (interpreted by the VM in the bootstrap phase):

    VM.types[:object].clone = VM.primitives[:clone]
    VM.types[:object].print = VM.primitives[:print]
    VM.types[:object].puts = VM.primitives[:puts]

    VM.types[:number][:+] = VM.primitives[:'number_+']
    VM.types[:number][:-] = VM.primitives[:'number_-']
    VM.types[:number][:/] = VM.primitives[:'number_/']
    VM.types[:number][:*] = VM.primitives[:'number_*']

    VM.types[:vector][:[]] = VM.primitives[:'vector_[]']
    VM.types[:vector][:to_map] = VM.primitives[:vector_to_map]

Beautiful, isn’t it? :)
In later stages, such as [beta][beta], we define other high-level functions like
Vector#map.
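The wiring done by the alpha prelude can be mimicked outside the VM. The sketch below runs in plain Ruby and models `VM.primitives` and `VM.types` as ordinary hashes. The lambdas and the `dispatch` helper are hypothetical stand-ins, not the VM's real closures or its message-sending code.

```ruby
# Hypothetical stand-ins for VM.primitives and VM.types, modelled as plain hashes.
primitives = {
  :'number_+' => ->(receiver, other) { receiver + other },
  :'number_*' => ->(receiver, other) { receiver * other },
}
types = { number: {} }

# The same kind of wiring the alpha prelude performs.
types[:number][:+] = primitives[:'number_+']
types[:number][:*] = primitives[:'number_*']

# Assumed dispatch: find the closure in the type's slots and call it
# with the receiver as its first argument.
def dispatch(types, type_name, message, receiver, *args)
  types.fetch(type_name).fetch(message).call(receiver, *args)
end

sum = dispatch(types, :number, :+, 2, 3)
product = dispatch(types, :number, :*, 4, 5)
```

Once a primitive sits in a slot of the type, every object of that kind can reach it by name, which is what lets the rest of the runtime be written in high-level code.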
If you wish to change any kernel files such as the prelude, you’ll have to recompile the files to the native TVM format, like this:

    $ make kernel

And if you add more high-level examples (in Ruby) under the compiler/examples
folder, you must recompile them as well:

    $ make examples

Debugging your TerrorVM programs
TerrorVM ships with a debugger that you can use to debug your programs. The debugger can set breakpoints at specific lines and step through either high-level lines of code or low-level bytecode instructions.
To use the debugger, pass the -d flag to terror before the program path:

    $ bin/terror -d examples/functions.tvm

Here’s an example session:

    /Users/txus/Code/terrorvm/compiler/examples/functions.rb:1
    1 > a = 123
    2   foo = 123
    3   self.fn = -> foo {
    4   # foo is shadowed because it's a local argument

    > n
    DEBUG src/terror/vm.c:82: PUSH 0
    DEBUG src/terror/vm.c:284: SETLOCAL 0

    /Users/txus/Code/terrorvm/compiler/examples/functions.rb:2
    1   a = 123
    2 > foo = 123
    3   self.fn = -> foo {
    4   # foo is shadowed because it's a local argument
    5   a + foo

    >

The debugger will always show you the high-level code (in
compiler/examples/functions.rb) so you know where you are at every point.
The commands for the debugger are:
- h: show help
- s: step to the next bytecode instruction
- n: step to the next line of code
- c: continue execution
- d: show the stack
- l: show locals
- t: show backtrace
- b: set breakpoint in a line. Example: b 30
Implementing your own dynamic language running on TerrorVM
TerrorVM is designed to run dynamic languages. You can easily implement a compiler of your own that compiles your favorite dynamic language down to TVM bytecode.
I’ve written a demo compiler in Ruby under the compiler/ folder, just to
show how easy it is to write your own. This demo compiler compiles a subset of
Ruby down to TerrorVM bytecode, so you can easily peek at the source code or
just copy and modify it.
You can write your compiler in whatever language you prefer, of course.
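To give a feel for how small a code generator can start, here is a sketch of one that emits the hello world block described in the Bytecode format section. It is not the demo compiler from the compiler/ folder. `compile_puts` is a hypothetical helper, the `_0_main` header prefix and the opcode numbers are copied verbatim from that hello world example, and the message is assumed to fit on a single line.

```ruby
# Emits a main block that sends `puts` to self with one string argument.
def compile_puts(message)
  literals = [message, "puts"]
  code = [
    "16 PUSHSELF",
    "17 PUSH", "0",        # push literal 0 (the message)
    "128 SEND", "1", "1",  # send literal 1 ("puts") with 1 argument
    "20 PUSHNIL",
    "144 RET",
  ]
  # The header carries the literal count and the number of instruction lines.
  header = "_0_main:#{literals.size}:#{code.size}"
  lines = [header] + literals.map { |literal| "\"#{literal}" } + code
  lines.join("\n") + "\n"
end

bytecode = compile_puts("hello world")
puts bytecode
```

A real compiler would walk a syntax tree and collect literals as it goes, but the output is the same kind of text.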
Garbage collection
The algorithm of choice for TerrorVM was [Baker’s treadmill][treadmill], an incremental, real-time, non-moving GC algorithm, implemented in [libtreadmill][libtreadmill].
Unfortunately I couldn’t make it work, so for now I’m using a simple Mark and Sweep collector, implemented in [libsweeper][libsweeper] as a separate library and included via Git submodules.
Concurrency
This is a really important topic these days, not to be overlooked. Concurrency support is not in place yet. The plan is for TerrorVM to feature forking, threads and coroutines, but I might change my mind as I learn more.
Bytecode format
The bytecode format might change to be more compact, but I’ll describe what it is for now. A file must contain a main block, and may contain other blocks (functions defined there). This is what a block looks like (if you’re curious, it’s just a hello world):

    _0_main:2:8
    "hello world
    "puts
    16 PUSHSELF
    17 PUSH
    0
    128 SEND
    1
    1
    20 PUSHNIL
    144 RET

As you can see, _main defines the entry point of the file. The
mysterious numbers :2:8 mean that this block has two literals and eight lines
of instructions. There are actually only 5 instructions, but the operands for
these instructions count as well, which gives a total of 8.
Right after these counts come the literals, each one on its own line. There
are two kinds of literals: numbers and strings. Numbers are just numbers, but
strings must be preceded by a ".
Finally we get to the eight lines of numbers, namely the instructions and their
operands. The labels you see beside every instruction (such as PUSHSELF) are totally
optional. The VM doesn’t read them, but they help when inspecting a
bytecode file manually.
After that there might be more functions. If our hello world defined an
empty closure, we’d have this right after 144 RET:

    _4_block_153:0:2
    20 PUSHNIL
    144 RET

That’s it! :)
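The layout above is simple enough to read back with a few lines of Ruby. The following is a sketch of a reader, not the VM's actual loader: `parse_block` is a hypothetical helper, the part of the header before the two counts is kept verbatim as an opaque label, and number literals are read as floats because Number is a double.

```ruby
# Parses one block from an array of lines.
# Returns the parsed block and the lines that follow it.
def parse_block(lines)
  *label_parts, literal_count, code_count = lines.first.split(":")
  literal_count = Integer(literal_count, 10)
  code_count = Integer(code_count, 10)

  literal_lines = lines[1, literal_count]
  code_lines = lines[1 + literal_count, code_count]

  # Strings start with a double quote; anything else is a number.
  literals = literal_lines.map do |line|
    line.start_with?('"') ? line[1..-1] : Float(line)
  end
  # Only the leading number matters; the label after it is ignored.
  code = code_lines.map { |line| Integer(line.split.first, 10) }

  block = { label: label_parts.join(":"), literals: literals, code: code }
  [block, lines.drop(1 + literal_count + code_count)]
end

source = <<~TVM
  _0_main:2:8
  "hello world
  "puts
  16 PUSHSELF
  17 PUSH
  0
  128 SEND
  1
  1
  20 PUSHNIL
  144 RET
  _4_block_153:0:2
  20 PUSHNIL
  144 RET
TVM

blocks = []
remaining = source.lines.map(&:chomp)
until remaining.empty?
  block, remaining = parse_block(remaining)
  blocks << block
end
```

Because each header states how many lines belong to the block, the reader never needs to look ahead to find where the next block starts.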
Examples (high-level Ruby code and its Terror compiled counterpart)
- Hello world (Ruby code, TVM bytecode)
- Maps (Ruby code, TVM code)
- Vectors (Ruby code, TVM code)
- Numbers (Ruby code, TVM code)
- Objects with prototypal inheritance (Ruby code, TVM bytecode)
- Functions and closures (Ruby code, TVM bytecode)
Instruction set
- NOOP: no operation; does nothing.
Values
- PUSHSELF: pushes the current self to the stack.
- PUSHLOBBY: pushes the Lobby (toplevel object) to the stack.
- PUSH A: pushes the literal at index A to the stack.
- PUSHTRUE: pushes the true object to the stack.
- PUSHFALSE: pushes the false object to the stack.
- PUSHNIL: pushes the nil object to the stack.
Local variables
- PUSHLOCAL A: pushes the local at index A to the stack.
- SETLOCAL A: sets the local variable A to the current top of the stack. Does not consume any stack.
- PUSHLOCALDEPTH A, B: pushes the local at index B from an enclosing scope (at depth A) to the stack.
- SETLOCALDEPTH A, B: sets the local variable B in an enclosing scope (at depth A) to the current top of the stack. Does not consume any stack.
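The depth-based instructions are easier to picture with a small model. This Ruby sketch is only an illustration: the `Scope` class and the helper methods are hypothetical, and it assumes that depth 0 means the current scope and depth 1 its immediate parent.

```ruby
# Hypothetical scope: an array of locals plus a link to the enclosing scope.
class Scope
  attr_reader :locals, :parent

  def initialize(parent = nil)
    @locals = []
    @parent = parent
  end

  # Assumed convention: depth 0 is this scope, depth 1 its parent, and so on.
  def ancestor(depth)
    scope = self
    depth.times { scope = scope.parent }
    scope
  end
end

def setlocal(stack, scope, index)
  scope.locals[index] = stack.last # the value stays on the stack
end

def pushlocaldepth(stack, scope, depth, index)
  stack.push(scope.ancestor(depth).locals[index])
end

def setlocaldepth(stack, scope, depth, index)
  scope.ancestor(depth).locals[index] = stack.last # the value stays on the stack
end

outer = Scope.new
inner = Scope.new(outer) # for example, the scope of a closure body

stack = [123]
setlocal(stack, outer, 0)          # outer local 0 becomes 123
pushlocaldepth(stack, inner, 1, 0) # the closure reads it from one scope up
stack.push(7)
setlocaldepth(stack, inner, 1, 0)  # the closure overwrites it with 7
```

Since neither set instruction consumes the stack, the stack ends up holding 123, 123 and 7, while the outer local now holds 7.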
Branching
- JMP A: Skips A instructions.
- JIF A: Pops the top of the stack and skips A instructions if it is falsy (false or nil).
- JIT A: Pops the top of the stack and skips A instructions if it is truthy (any value other than false or nil).
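The branching rules can be tried out with a toy interpreter. This sketch is not the VM's dispatch loop: it uses symbolic instruction names instead of opcode numbers, stores each instruction together with its operand so that "skip A instructions" is unambiguous, and assumes the skip is counted from the instruction that follows the jump. Ruby's own truthiness happens to match the rule above, since only false and nil are falsy.

```ruby
# Runs a list of [name, operand] pairs and returns the final stack.
def run(program)
  stack = []
  ip = 0
  while ip < program.length
    op, operand = program[ip]
    ip += 1 # by default, continue with the next instruction
    case op
    when :PUSHTRUE  then stack.push(true)
    when :PUSHFALSE then stack.push(false)
    when :PUSHNIL   then stack.push(nil)
    when :JMP
      ip += operand
    when :JIF
      value = stack.pop
      ip += operand unless value
    when :JIT
      value = stack.pop
      ip += operand if value
    else
      raise ArgumentError, "unknown instruction #{op}"
    end
  end
  stack
end

program = [
  [:PUSHNIL],
  [:JIF, 1],   # nil is falsy, so the next instruction is skipped
  [:PUSHTRUE], # skipped
  [:PUSHFALSE],
  [:JIT, 1],   # false is not truthy, so nothing is skipped
  [:PUSHTRUE],
]
result = run(program)
```

Both conditional jumps pop the value they test, so only the final `PUSHTRUE` leaves something on the stack.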
Slots (attributes)
- GETSLOT A: Pops the object at the top of the stack and asks for its slot with name A (a literal), pushing it to the stack if found. If not, it raises an error.
- SETSLOT A: Pops a value to be set, then pops the object at the top of the stack and sets its slot with name A (a literal) to the value that was first popped. Then pushes that value back to the stack.
- POP N: pops N values off the stack.
- DEFN A: takes the closure with the name A (a literal) and pushes it to the stack.
- MAKEVEC A: Pops A elements off the stack and pushes a vector with all of them in the order they were popped (the reverse of the order in which they were pushed).
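The element order produced by MAKEVEC is easy to get backwards, so here is a tiny Ruby sketch of the rule as described above. The `makevec` helper is hypothetical and uses a Ruby array to stand in for both the stack and the vector.

```ruby
# Pops `count` elements and pushes them back as one vector, in popping order.
def makevec(stack, count)
  vector = Array.new(count) { stack.pop }
  stack.push(vector)
end

stack = [:a, :b, :c] # :c was pushed last
makevec(stack, 2)
```

After the call the stack holds `:a` followed by a vector containing `:c` and then `:b`.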
Call frames
- SEND A, B: Pops B arguments off the stack, then the receiver, and sends it the message with the name A (a literal) with those arguments.
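To see SEND working together with the value instructions, the sketch below interprets the hello world block from the Bytecode format section in plain Ruby. It is an illustration, not the VM's dispatch loop. The opcode numbers are the ones shown in that example. The `run` method and the `primitives` hash are hypothetical, and two behaviours are assumptions: SEND pushes the result of the call, and RET returns the top of the stack.

```ruby
PUSHSELF = 16
PUSH     = 17
PUSHNIL  = 20
SEND     = 128
RET      = 144

# Executes a flat list of opcodes and operands against a list of literals.
def run(code, literals, current_self, primitives)
  stack = []
  ip = 0
  loop do
    case code[ip]
    when PUSHSELF
      stack.push(current_self)
      ip += 1
    when PUSH
      stack.push(literals[code[ip + 1]])
      ip += 2
    when PUSHNIL
      stack.push(nil)
      ip += 1
    when SEND
      name = literals[code[ip + 1]]
      argument_count = code[ip + 2]
      arguments = stack.pop(argument_count) # arguments first...
      receiver = stack.pop                  # ...then the receiver
      stack.push(primitives.fetch(name).call(receiver, *arguments))
      ip += 3
    when RET
      return stack.pop
    else
      raise ArgumentError, "unknown opcode #{code[ip]}"
    end
  end
end

output = []
primitives = { "puts" => ->(_receiver, value) { output << value; nil } }

literals = ["hello world", "puts"]
code = [16, 17, 0, 128, 1, 1, 20, 144]
result = run(code, literals, :lobby, primitives)
```

The fake `puts` records its argument in `output` instead of printing, which keeps the example easy to check.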
Debugging
- DUMP: Prints the contents of the value stack to the standard output.
Building the VM
Since TerrorVM makes use of Clang’s block extension, you’ll need at least clang 3.4 to compile it.

    $ git clone git://github.com/txus/terrorvm.git
    $ cd terrorvm
    $ make

To run the tests:

    $ make dev

And to clean the mess:

    $ make clean

If you want to run the tests under Valgrind to ensure there are no memory leaks:

    $ make valgrind

Running programs
TerrorVM runs .tvm bytecode files such as the numbers.tvm under the
examples directory.

    $ bin/terror examples/numbers.tvm

It ships with a simple compiler written in Ruby (Rubinius) that compiles a
tiny subset of Ruby to .tvm files. Check out the compiler directory, which
has its own Readme, and the compiler/examples where we have the
hello_world.rb file used to produce the hello_world.tvm.
TerrorVM doesn’t need Ruby to run; even the example compiler is a proof of concept and could be written in any language (even in C obviously).
The terror executable acts as a wrapper for the example compiler as well. To
compile and run on the fly a file written in our subset of Ruby:

    $ bin/terror -x compiler/examples/numbers.rb

To compile a file yourself:

    $ bin/terror -c output.tvm path/to/my/input.rb

In case of doubt, -h is your friend :)

    $ bin/terror -h

Who’s this
This was made by Josep M. Bach (Txus) under the MIT license. I’m [@txustice][twitter] on twitter (where you should probably follow me!).
Contributing
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Added some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request
[twitter]: https://twitter.com/txustice
[email]: mailto:josep.m.bach@gmail.com
[alpha]: https://github.com/txus/terrorvm/blob/master/compiler/kernel/alpha.rb
[beta]: https://github.com/txus/terrorvm/blob/master/compiler/kernel/beta.rb
[alpha_native]: https://github.com/txus/terrorvm/blob/master/kernel/alpha.tvm
[beta_native]: https://github.com/txus/terrorvm/blob/master/kernel/beta.tvm
[treadmill]: http://www.pipeline.com/~hbaker1/NoMotionGC.html
[libtreadmill]: https://github.com/txus/libtreadmill
[libsweeper]: https://github.com/txus/libsweeper
[jtregunna]: https://twitter.com/jtregunna
[brixen]: https://twitter.com/brixen
[dbussink]: https://twitter.com/dbussink
[evanphx]: https://twitter.com/evanphx