Be Hai Nguyen

rlox: A Rust Implementation of “Crafting Interpreters” – Scanner


I am attempting a Rust implementation of Robert Nystrom's Lox language, as discussed in Crafting Interpreters. This post describes my Rust equivalent of the code in the Scanning chapter.

🦀 Index of the Complete Series.

There is a long list of existing Rust Lox implementations. I downloaded and ran the first two, but I did not look at their code. I would like to take on this project as a challenge. If I complete it, I want it to reflect my own independent effort.


🚀 Please note, code for this post can be downloaded from GitHub with:

git clone -b v0.1.0 https://github.com/behai-nguyen/rlox.git 


● To run interactively, first change to the rlox/ directory, then run the following command:

$ cargo run 

Enter something like var str2 = "秋の終わり";, and press Enter — you will see the tokens printed out. Please refer to the screenshot below for an illustration.

[Screenshot 139-01.png: tokens printed for the interactive input above]

At the moment, inputs are processed independently, meaning each new input does not retain any connection to previous inputs.

To exit, simply press Enter without entering anything.
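The interactive behaviour described above can be sketched roughly as follows. This is a minimal sketch and not the repository's code: `run_prompt` and this version of `run` are hypothetical, and `run` simply splits on whitespace as a stand-in for the real scanner. Each line is handed to `run` on its own, so nothing carries over between inputs, and an empty line ends the loop.

```rust
use std::io::{self, BufRead, Write};

// Hypothetical stand-in for the real scanner: splits a line on whitespace.
fn run(source: &str) -> Vec<String> {
    source.split_whitespace().map(String::from).collect()
}

// Reads lines until an empty one (or end of input).
// Every line is processed independently of the previous ones.
fn run_prompt<R: BufRead, W: Write>(mut input: R, mut output: W) -> io::Result<()> {
    let mut line = String::new();
    loop {
        write!(output, "> ")?;
        output.flush()?;
        line.clear();
        // read_line returns Ok(0) at end of input.
        if input.read_line(&mut line)? == 0 {
            break;
        }
        // Strip the trailing line ending only.
        let source = line.trim_end_matches(|c: char| c == '\r' || c == '\n');
        if source.is_empty() {
            break; // Pressing Enter on an empty line exits.
        }
        for token in run(source) {
            writeln!(output, "{token}")?;
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    run_prompt(stdin.lock(), io::stdout())
}
```

Taking the reader and writer as generic parameters is a design choice for this sketch only; it lets the loop be exercised with an in-memory buffer instead of a real terminal.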

● To run with a Lox script file, first change to the rlox/ directory, then run the following command:

$ cargo run ./tests/data/scanning/numbers.lox 

If there are no errors, you will see the tokens printed out.


● To run existing tests, first change to the rlox/ directory, then run the following command:

$ cargo test 


❶ Repository Layout

.
├── Cargo.toml
├── README.md
├── src
│   ├── lib.rs
│   ├── lox_error.rs
│   ├── main.rs
│   ├── scanner_index.rs
│   ├── scanner.rs
│   ├── token.rs
│   └── token_type.rs
└── tests
    ├── data
    │   └── scanning
    │       ├── identifiers.lox
    │       ├── keywords.lox
    │       ├── numbers.lox
    │       ├── punctuators.lox
    │       ├── README.md
    │       ├── sample.lox
    │       ├── strings.lox
    │       ├── utf8_text.lox
    │       └── whitespace.lox
    ├── test_common.rs
    └── test_scanner.rs


❷ Let's briefly describe the project.

● Identifier names follow Rust conventions. Method names from the Scanning chapter such as scanTokens() and peekNext() become scan_tokens() and peek_next() in Rust, respectively.

● Identifier names that are keywords in Rust simply have an underscore (_) suffix appended. For example, match() becomes match_(), and type becomes type_.

● The src/scanner_index.rs module is not in the original Java version. It holds the equivalents of the Java variables start, current, and line, plus some additional fields to support UTF-8 text scanning and slicing; please refer to this post for a full discussion on supporting UTF-8 text slicing.

● In the src/token.rs module, I am not sure whether the literal field of the Token struct will be needed later. I am leaving it in place for the time being.

● 💥 In the src/scanner.rs module, the scan_tokens() method returns a vector of Token. The run() function in the src/main.rs module consumes this vector and then drops it, so the vector is local to each run. In the Java implementation, the token list is an instance field of the Scanner class. This part of the implementation might change in the future.

● The src/lox_error.rs module is also not in the original Java version. It implements a Rust-specific error struct.

● In the tests/data/scanning/ directory, utf8_text.lox is my own; the README.md lists the original addresses of all the other test data files.

● The tests/test_scanner.rs module implements a test for each of the test data files in the tests/data/scanning/ directory.
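To make some of the points above concrete, here is a minimal sketch; it is not the repository's code. It assumes a far smaller token set than Lox, and the `Scanner` fields, the `TokenType` variants, and the `peek()`, `advance()`, and `token()` helpers are illustrative guesses. Only the names `scan_tokens()`, `match_()`, and `type_` follow the conventions described above. Positions are kept as byte offsets and advanced by each character's UTF-8 length, so slicing the source stays on character boundaries, and `scan_tokens()` hands back a local `Vec<Token>` that the caller owns.

```rust
#[derive(Debug, Clone, PartialEq)]
enum TokenType { Equal, EqualEqual, String, Identifier, Eof }

#[derive(Debug, Clone, PartialEq)]
struct Token {
    type_: TokenType, // `type` is a Rust keyword, hence the underscore suffix
    lexeme: String,
    line: usize,
}

// Illustrative scanner state; the field layout is an assumption.
struct Scanner<'a> {
    source: &'a str,
    start: usize,   // byte offset where the current lexeme begins
    current: usize, // byte offset of the next unread character
    line: usize,
}

impl<'a> Scanner<'a> {
    fn new(source: &'a str) -> Self {
        Scanner { source, start: 0, current: 0, line: 1 }
    }

    fn peek(&self) -> Option<char> {
        self.source[self.current..].chars().next()
    }

    // Consumes one character, advancing by its UTF-8 byte length.
    fn advance(&mut self) -> Option<char> {
        let c = self.peek()?;
        self.current += c.len_utf8();
        Some(c)
    }

    // Counterpart of Java's match(); `match` is a Rust keyword.
    fn match_(&mut self, expected: char) -> bool {
        if self.peek() == Some(expected) {
            self.current += expected.len_utf8();
            true
        } else {
            false
        }
    }

    fn token(&self, type_: TokenType) -> Token {
        let lexeme = self.source[self.start..self.current].to_string();
        Token { type_, lexeme, line: self.line }
    }

    // Counterpart of Java's scanTokens(); the vector is local and returned to the caller.
    fn scan_tokens(&mut self) -> Result<Vec<Token>, String> {
        let mut tokens = Vec::new();
        loop {
            self.start = self.current;
            let Some(c) = self.advance() else { break };
            match c {
                ' ' | '\t' | '\r' => {}
                '\n' => self.line += 1,
                '=' => {
                    let type_ = if self.match_('=') { TokenType::EqualEqual } else { TokenType::Equal };
                    tokens.push(self.token(type_));
                }
                '"' => {
                    loop {
                        match self.advance() {
                            Some('"') => break,
                            Some('\n') => self.line += 1,
                            Some(_) => {}
                            None => return Err(format!("[line {}] Unterminated string.", self.line)),
                        }
                    }
                    tokens.push(self.token(TokenType::String));
                }
                c if c.is_alphabetic() || c == '_' => {
                    while matches!(self.peek(), Some(n) if n.is_alphanumeric() || n == '_') {
                        self.advance();
                    }
                    tokens.push(self.token(TokenType::Identifier));
                }
                other => return Err(format!("[line {}] Unexpected character: {other}", self.line)),
            }
        }
        self.start = self.current;
        tokens.push(self.token(TokenType::Eof));
        Ok(tokens)
    }
}

fn main() {
    let tokens = Scanner::new("str2 = \"秋の終わり\"").scan_tokens().expect("scan failed");
    for token in &tokens {
        println!("{:?} {} (line {})", token.type_, token.lexeme, token.line);
    }
}
```

Returning `Result<Vec<Token>, String>` keeps the sketch short; a dedicated error struct, as the post describes for src/lox_error.rs, would take the place of the `String` in a fuller version.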


❸ The points above are specific to this implementation. Otherwise, the code adheres to the Scanning chapter of Crafting Interpreters.

Thank you for reading. I hope you find this post helpful. Stay safe, as always.

✿✿✿


