DEV Community

Gealber Morales
Gealber Morales

Posted on • Originally published at gealber.com

Challenge RE #8

At this point we don't need further introduction, I'll bring you my solution to the 8th challenge.

Description

The description of the challenge can be found in Challenge RE #8. In the previous ones I forgot the share the link, my bad.

Introduction

Assembly code to understand

 <f>: 0: push r12 2: test rsi,rsi 5: mov r12,rdi 8: push rbp 9: mov rbp,rdx c: push rbx d: mov rbx,rsi 10: je 32 12: nop WORD PTR [rax+rax*1+0x0] 18: mov rsi,QWORD PTR [rbx] 1b: mov rdi,rbp 1e: call r12 21: test eax,eax 23: je 56 25: js 40 27: mov rbx,QWORD PTR [rbx+0x18] 2b: test rbx,rbx 2e: xchg ax,ax 30: jne 18 32: pop rbx 33: pop rbp 34: xor eax,eax 36: pop r12 38: ret 39: nop DWORD PTR [rax+0x0] 40: mov rbx,QWORD PTR [rbx+0x10] 44: test rbx,rbx 47: je 32 49: mov rsi,QWORD PTR [rbx] 4c: mov rdi,rbp 4f: call r12 52: test eax,eax 54: jne 25 56: mov rax,rbx 59: pop rbx 5a: pop rbp 5b: pop r12 5d: ret 
Enter fullscreen mode Exit fullscreen mode

This time f it's way more longer than in our previous challenges, instead of complaining about this and give up at the first moment. It's better to try to understand small pieces of this code.

Let's start from instructions in memory 0 to instruction in memory 0x10. From these instructions we can notice, that our parameters came from registers rdi, rsi and rdx. We perform a check on one of them, to see if it's equal to 0.

Now interesting enough, we have that in rdi we receive a callback function. If you take a look, rdi it's moved to r12 who later it's been called on instruction in 1e. For that reason we can infer, that one of the arguments passed to f it's a callback function.

What can we tell about the signature of this callback function? Well tracking the use of register r12 you can notice that the function takes two arguments, rsi and rdi and return the value int eax. Every call r12 it's followed by a check on eax, where if we have zero or NULL value we jump to the end of the program, otherwise we jump to position 0x25. Looking at the use of its arguments, we can take a save assumption that the first one can be number while the second...well the second one is not so straight forward, let's take a look at its use in the program.

Identifying rbx layout

The second argument it's been taken from rbx, this register it's only modified once, when we copied rsi into it on instruction mov rbx,rsi. After that we've been calling it in this three times:

;; first time mov rsi,QWORD PTR [rbx] ;; more down in position 0x27 mov rbx,QWORD PTR [rbx+0x18] ;; more down in position 0x40 mov rbx,QWORD PTR [rbx+0x10] 
Enter fullscreen mode Exit fullscreen mode

In the last two cases, we are overwriting the value of rbx, with QWORD PTR [rbx+0x18] and QWORD PTR [rbx+0x10]. The padding or offset from rbx it's 0x18 = 24 and 0x10 = 16, so we can assume the data stored in rbx has to following arrangement.

[0......|0x10.....|0x18....] 
Enter fullscreen mode Exit fullscreen mode

And the data in [rbx+0x10] and [rbx+0x18] can overwrite rbx. I don't know you, but two me this smells like a binary tree, where [rbx+0x10] and [rbx+0x18] are the left and right children of the node rbx.

This is a huge insight. If we are on the right path, we can be looking at a binary search here, but's let's not take hasty conclusions.

First resume

At the moment, we have that:

  1. r12 contains a callback function
  2. rbx contains a binary tree

With this in mind we can start declaring our Node structure and our callback function.

typedef struct Node { int data; struct Node *left; struct Node *right; } Node; // signature of callback function int g_callback(int val, Node *n); 
Enter fullscreen mode Exit fullscreen mode

Signature of f

We already know that the arguments for f comes from registers:

  1. rdi, which is a callback function
  2. rsi, which is our binary tree
  3. rdx, which it's data of a node of our binary tree, let's assume the data are just numbers, for simplicity. In any case later we can check if this is the case.

What it's missing it's our returned value. For this, looking at the last ret you can notice that what we return it's actually the value in rbx which hold a Node of our binary tree.

With this in mind we can assume the following signature for f

 Node *f(int (int, Node *), Node *, int); 
Enter fullscreen mode Exit fullscreen mode

Main logic

With the signature, will be easier to infer the logic in the program. Having a clear idea of what are the structures involved also help us a lot. I mean, we haven't even looked at the logic, and you might be guessing that this has to be some kind of search in a binary tree. Maybe I'm the only one imagining that, given that I'm my only user in my blog 😟. Anyway let's continue.

Let's gather all our finding and put it in C code

typedef struct Node { int data; struct Node *left; struct Node *right; } Node; int g_callback(int val, Node *n); Node *f(int cb(int, Node *), struct Node *n, int data) { return NULL; } 
Enter fullscreen mode Exit fullscreen mode

Ok, this looks good. Does it? Don't doubt on yourself great Gealber, this is amazing.

Let's analyze the logic by small chunks, first chunk

 18: mov rsi,QWORD PTR [rbx] 1b: mov rdi,rbp 1e: call r12 21: test eax,eax 23: je 56 25: js 40 ;; in 56 we just return the Node 
Enter fullscreen mode Exit fullscreen mode

Nothing fancy, now that we know more about the layout of rbx, we know it's a Node, so this code can be interpreted as a call to to the callback function we passed as an argument. In C code would be:

int ret = cb(n, data); if (ret == 0) return n; if (ret < 0) { // do something } 
Enter fullscreen mode Exit fullscreen mode

Second chunk of code...maybe chunk is not the right word Gealber 🤔

 27: mov rbx,QWORD PTR [rbx+0x18] 2b: test rbx,rbx 2e: xchg ax,ax 30: jne 18 
Enter fullscreen mode Exit fullscreen mode

In this case, we first overwrite our current node for our right child, and later perform a check on it, followed by a jump to 18 if it's not NULL. Interesting, the code in 0x18 it's our previous chunk of code. This seems as a recursion call, because the code in 0x18 it's actually the start of our f method. Given the nature of binary trees and the operations we normally perform with them, is not a crazy assumption to say that this is a recursion call.

Taking this into C, would be like this

Node *f(int cb(int, Node *), struct Node *n, int data) { int ret = cb(n, data); if (ret == 0) return n; if (ret < 0) { // do something } n = n->right; if (n != NULL) n = f(cb, n, data); return NULL; } 
Enter fullscreen mode Exit fullscreen mode

The next most important part, is the code in 0x40. This code it's quite similar to our previous one, just that in this case we check with our left node. The way to reach this code is from instruction 0x25, with js 40.

A way to put this in C would be

n = n->left; if (n != NULL) n = f(cb, n, data); 
Enter fullscreen mode Exit fullscreen mode

We already can make a correct forecast of what this function does, in a simple way this code performs a search in a binary tree, with a supplied comparison callback function.

Let's put all this together in a clean way

typedef struct Node { int data; struct Node *left; struct Node *right; } Node; // f performs a search in a binary tree // using cb callback function as our comparison data Node *f(int cb(int, Node *), struct Node *n, int data) { int ret = cb(n, data); if (ret == 0) return n; if (ret < 0) { return f(cb, n->left, data); } return f(cb, n->right, data); } 
Enter fullscreen mode Exit fullscreen mode

Final notes

Take into account that data doesn't has to be an int, given that we are supplying a comparison function we can declare data as void *. As long as the comparison function can handle it, won't be a problem.

As a personal note, I found this challenge quite instructive, not because it taught me a new data structure. What I found valuable it's that it showed me how a structure is defined in assembly, basically there's just an offset between field and another. The more I do this challenges the more I realize that C is so close to assembly 😅.

Top comments (0)