Welcome back to the “Compiling a Lisp” series. Last time we added support for let
expressions. This time we’re going to compile if
expressions.
Compiling if
expressions will allow us to write code that performs decision making. For example, we can write code that does something based on the result of some imaginary function coin-flip
:
(if (= (coin-flip) heads) 123 456)
If the call to coin-flip
returns heads
, then this whole if-expression will evaluate to 123
. Otherwise, it will evaluate to 456
. To determine if an expression is truthy, we’ll check if it is not equal to #f
.
Note that the iftrue and iffalse expressions (consequent and alternate, respectively) are only evaluated if their branch is reached.
People normally compile if
expressions by taking the following structure in Lisp:
(if condition consequent alternate)
and rewriting it to the following pseudo-assembly (where ...compile(X)
is replaced with compiled code from the expressions):
...compile(condition) compare result, #f jump-if-equal alternate ...compile(consequent) jump end alternate: ...compile(alternate) end:
This will evaluate the condition
expression. If it’s falsey, jump to the code for the alternate
expression. Otherwise, continue on to the code for the consequent
expression. So that the program does not also execute the code for the alternate
, jump over it.
This transformation requires a couple of new pieces of infrastructure.
First, we’ll need two types of jump instructions! We have a conditional jump (jcc
in x86-64) and an unconditional jump (jmp
in x86-64). These are relatively straightforward to emit.
Somewhat more complicated are the targets of those jump instructions. We’ll need to supply each of the instructions with some sort of destination code address.
When emitting text assembly, this is not so hard: make up names for your labels (as with alternate
and end
above), and keep the names consistent between the jump instruction and the jump target. Sure, you need to generate unique labels, but the assembler will at least do address resolution for you. This address resolution transparently handles backward jumps (where the label is already known) and forward jumps (where the label comes after the jump instruction).
Since we’re not emitting text assembly, we’ll need to calculate both forward and backward jump offsets by hand. This ends up not being so bad in practice once we come up with an ergonomic way to do it. Let’s take a look at some production-grade assemblers for inspiration.
I read some source code for assemblers like the Dart assembler. Dart is a language runtime developed by Google and part of their infrastructure includes a Just-In-Time compiler, sort of like what we’re making here. Part of their assembler is some slick C++-y RAII infrastucture for emitting code and doing cleanup. Their implementation of compiling if
expressions might look something like:
// Made-up APIs to make the Dart code look like our code int Compile_if(Buffer *buf, ASTNode *cond, ASTNode *consequent, ASTNode *alternate) { Label alternate; Label end; compile(buf, cond); buf->cmp(kRax, Object::false()); buf->jcc(kEqual, &alternate); compile(buf, consequent); buf->jmp(&end); buf->bind(&alternate); compile(buf, alternate); buf->bind(&end); }
Their Label
objects store information about where in the emitted machine code they are bound with bind
. If they are bound before they are used by jcc
or jmp
or something, then the emitter will just emit the destination address. If they are not yet bound, however, then the Label
will keep track of where it has to go back and patch the machine code once the label is bound to a location.
When the labels are destructed — meaning they can no longer be referenced by C++ code — their destructors have code to go back and patch all the instructions that referenced the label before it was bound.
While x86-64 has multiple jump widths available (for speed, I guess), it is a little tricky to use them for forward jumps. Because we don’t know in advance how long the intermediate code will be, we’ll just stick to generating 32-bit relative jumps always.
Virtual Machines like ART, OpenJDK Hotspot, SpiderMonkey, V8, HHVM, and Strongtalk also use this approach. So do the VM-agnostic AsmJit and GNU lightning assemblers. If I didn’t link an implementation, it’s either because I found the it too complicated to reproduce or couldn’t quickly track it down. Or maybe I don’t know about it!
Basically what I am trying to tell you is that this bind-and-backpatch approach is tried and tested and that we’re going to implement it in C. I hope you enjoyed the whirlwind tour of assemblers in various other runtimes along the way.
if
-expressions, finallyAlright, so we finally get the big idea about how to do this transformation. Let’s put it into practice.
First, as with let
, we’re going to need to handle the if
case in Compile_call
.
int Compile_call(Buffer *buf, ASTNode *callable, ASTNode *args, word stack_index, Env *varenv) { if (AST_is_symbol(callable)) { // ... if (AST_symbol_matches(callable, "if")) { return Compile_if(buf, /*condition=*/operand1(args), /*consequent=*/operand2(args), /*alternate=*/operand3(args), stack_index, varenv); } } // ... }
As usual, we’ll pull apart the expression so Compile_if
has less work to do. Since we now have more than two operands (!), I’ve added operand3
. It works just like you would think it does.
For Compile_if
, we’re going to largely replicate the pseudocode C++ from above. I think you’ll find that if you squint it looks similar enough.
int Compile_if(Buffer *buf, ASTNode *cond, ASTNode *consequent, ASTNode *alternate, word stack_index, Env *varenv) { _(Compile_expr(buf, cond, stack_index, varenv)); Emit_cmp_reg_imm32(buf, kRax, Object_false()); word alternate_pos = Emit_jcc(buf, kEqual, kLabelPlaceholder); // je alternate _(Compile_expr(buf, consequent, stack_index, varenv)); word end_pos = Emit_jmp(buf, kLabelPlaceholder); // jmp end Emit_backpatch_imm32(buf, alternate_pos); // alternate: _(Compile_expr(buf, alternate, stack_index, varenv)); Emit_backpatch_imm32(buf, end_pos); // end: return 0; }
Instead of having a Label
struct, though, I opted to just have a function to backpatch forward jumps explicitly. If you prefer to port Label
to C, be my guest. I found it very finicky1.
Also, instead of bind
, I opted for a more explicit backpatch
. This makes it clearer what is happening, I think.
This explicit backpatch approach requires manually tracking the offsets (like alternate_pos
and end_pos
) inside the jump instructions. We’ll need those offsets to backpatch them later. This means functions like Emit_jcc
and Emit_jmp
should return the offsets inside buf
where they write placeholder offsets.
Let’s take a look inside these helper functions’ internals.
jcc
and jmp
implementationsThe implementations for jcc
and jmp
are pretty similar, so I will only reproduce jcc
here.
word Emit_jcc(Buffer *buf, Condition cond, int32_t offset) { Buffer_write8(buf, 0x0f); Buffer_write8(buf, 0x80 + cond); word pos = Buffer_len(buf); Buffer_write32(buf, disp32(offset)); return pos; }
This function is like many other Emit
functions except for its return value. It returns the start location of the 32-bit offset for use in patching forward jumps. In the case of backward jumps, we can ignore this, since there’s no need to patch it after-the-fact.
Here is the implementation of Emit_backpatch_imm32
. I’ll walk through it and explain.
void Emit_backpatch_imm32(Buffer *buf, int32_t target_pos) { word current_pos = Buffer_len(buf); word relative_pos = current_pos - target_pos - sizeof(int32_t); Buffer_at_put32(buf, target_pos, disp32(relative_pos)); }
The input target_pos
is the location inside the jmp
(or similar) instruction that needs to be patched. Since we need to patch it with a relative offset, we compute the distance between the current position and the target position. We also need to subtract 4 bytes (sizeof(int32_t)
) because the jump offset is relative to the end of the jmp
instruction (the beginning of the next instruction).
Then, we write that value in. Buffer_at_put32
and disp32
are similar to their 8-bit equivalents.
Congratulations! You have implemented if
.
Radare2 has a tool called Cutter for reverse engineering and binary analysis. I decided to use it on the compiled output of a function containing an if
expression. It produced this pretty graph!
It’s prettier in the tool, trust me.
I added two trivial tests for the condition being true and the condition being false. I also added a nested if case as a smoke test but I did not foresee that being troublesome with our handy recursive approach.
TEST compile_if_with_true_cond(Buffer *buf) { ASTNode *node = Reader_read("(if #t 1 2)"); int compile_result = Compile_function(buf, node); ASSERT_EQ(compile_result, 0); byte expected[] = { // mov rax, 0x9f 0x48, 0xc7, 0xc0, 0x9f, 0x00, 0x00, 0x00, // cmp rax, 0x1f 0x48, 0x3d, 0x1f, 0x00, 0x00, 0x00, // je alternate 0x0f, 0x84, 0x0c, 0x00, 0x00, 0x00, // mov rax, compile(1) 0x48, 0xc7, 0xc0, 0x04, 0x00, 0x00, 0x00, // jmp end 0xe9, 0x07, 0x00, 0x00, 0x00, // alternate: // mov rax, compile(2) 0x48, 0xc7, 0xc0, 0x08, 0x00, 0x00, 0x00 // end: }; EXPECT_FUNCTION_CONTAINS_CODE(buf, expected); Buffer_make_executable(buf); uword result = Testing_execute_expr(buf); ASSERT_EQ_FMT(Object_encode_integer(1), result, "0x%lx"); AST_heap_free(node); PASS(); } TEST compile_if_with_false_cond(Buffer *buf) { ASTNode *node = Reader_read("(if #f 1 2)"); int compile_result = Compile_function(buf, node); ASSERT_EQ(compile_result, 0); byte expected[] = { // mov rax, 0x1f 0x48, 0xc7, 0xc0, 0x1f, 0x00, 0x00, 0x00, // cmp rax, 0x1f 0x48, 0x3d, 0x1f, 0x00, 0x00, 0x00, // je alternate 0x0f, 0x84, 0x0c, 0x00, 0x00, 0x00, // mov rax, compile(1) 0x48, 0xc7, 0xc0, 0x04, 0x00, 0x00, 0x00, // jmp end 0xe9, 0x07, 0x00, 0x00, 0x00, // alternate: // mov rax, compile(2) 0x48, 0xc7, 0xc0, 0x08, 0x00, 0x00, 0x00 // end: }; EXPECT_FUNCTION_CONTAINS_CODE(buf, expected); Buffer_make_executable(buf); uword result = Testing_execute_expr(buf); ASSERT_EQ_FMT(Object_encode_integer(2), result, "0x%lx"); AST_heap_free(node); PASS(); }
I made sure to test the generated code because we added some new instructions and also because I had trouble getting the offset computations perfectly right initially.
Anyway, that’s all for today. This post was made possible by contributions2 to my blog from Viewers Like You. Thank you.
Next time on PBS, heap allocation.
Maybe it would be less finicky with __attribute__((cleanup))
, but that is non-standard. This StackOverflow question and associated answers have some good information.
By “contributions” I mean thoughtful comments, questions, and appreciation. Feel free to chime in on Twitter, HN, Reddit, lobste.rs, the mailing list… ↩