Lexers, tokenizers, parsers, compilers, renderers, stringifiers... What's the difference, and how do they work?
Lexer
Sifts, or "tokenizes", the characters in a string to create an array of objects referred to as a "token stream".
Returns: Token stream.
Concepts
- token stream
- token
- lexical scoping
- lexical context
Description
A token stream is an array of "tokens", where each "token" is an object that contains details about a specific substring that was "captured", such as its line number and column position.
Example token
```js
{
  type: 'text',
  value: 'abc',
  position: {
    start: { column: 1, line: 1 },
    end: { column: 3, line: 1 }
  }
}
```
A token should also (and only) attempt to describe basic lexical context, such as character "type", which might be something like `text`, `number`, `escaped`, `delimiter`, or something similar.
Lexers should not attempt to describe dynamic scope, such as where a bracketed section begins or ends; that kind of work is left to the parser and is better represented by an Abstract Syntax Tree (AST).
Example token stream
A JavaScript token stream for the string `abc{foo}xyz` might look something like this:
```js
[
  { type: 'text', value: 'abc', position: { start: { column: 1, line: 1 }, end: { column: 3, line: 1 } } },
  { type: 'left-brace', value: '{', position: { start: { column: 4, line: 1 }, end: { column: 4, line: 1 } } },
  { type: 'text', value: 'foo', position: { start: { column: 5, line: 1 }, end: { column: 7, line: 1 } } },
  { type: 'right-brace', value: '}', position: { start: { column: 8, line: 1 }, end: { column: 8, line: 1 } } },
  { type: 'text', value: 'xyz', position: { start: { column: 9, line: 1 }, end: { column: 11, line: 1 } } }
]
```
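A minimal lexer that produces a stream like this might look as follows. This is a sketch, not any particular library's API: it only recognizes braces and plain text, and it assumes single-line input for position tracking.

```js
// Minimal lexer sketch: scans a string and emits a flat token stream.
// Only handles '{', '}', and runs of plain text; assumes one line of input.
function lex(input) {
  const tokens = [];
  let column = 1;
  let i = 0;
  while (i < input.length) {
    const ch = input[i];
    if (ch === '{' || ch === '}') {
      tokens.push({
        type: ch === '{' ? 'left-brace' : 'right-brace',
        value: ch,
        position: { start: { column, line: 1 }, end: { column, line: 1 } }
      });
      i++;
      column++;
    } else {
      // consume a run of plain text characters
      const startColumn = column;
      let value = '';
      while (i < input.length && input[i] !== '{' && input[i] !== '}') {
        value += input[i];
        i++;
        column++;
      }
      tokens.push({
        type: 'text',
        value,
        position: {
          start: { column: startColumn, line: 1 },
          end: { column: column - 1, line: 1 }
        }
      });
    }
  }
  return tokens;
}
```

Note that the lexer never asks whether a `{` is ever closed; it just labels characters and moves on.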
Parser
Parses a stream of tokens into an Abstract Syntax Tree (AST).
Returns: AST object.
Concepts
- Abstract Syntax Tree (AST)
- nodes
- node
- dynamic scoping
- dynamic context
Description
Whereas a token stream is a "flat" array, the Abstract Syntax Tree generated by a parser gives the tokens a dynamic, or global, context. Thus, an AST is represented as an object rather than an array.
Example
A JavaScript AST for the string `abc{foo}xyz` might look something like this:
```js
{
  type: 'root',
  nodes: [
    { type: 'text', value: 'abc', position: { start: { column: 1, line: 1 }, end: { column: 3, line: 1 } } },
    {
      type: 'brace',
      nodes: [
        { type: 'left-brace', value: '{', position: { start: { column: 4, line: 1 }, end: { column: 4, line: 1 } } },
        { type: 'text', value: 'foo', position: { start: { column: 5, line: 1 }, end: { column: 7, line: 1 } } },
        { type: 'right-brace', value: '}', position: { start: { column: 8, line: 1 }, end: { column: 8, line: 1 } } }
      ]
    },
    { type: 'text', value: 'xyz', position: { start: { column: 9, line: 1 }, end: { column: 11, line: 1 } } }
  ]
}
```
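A sketch of a parser that builds this nesting from a flat token stream, using a stack to track which node is currently "open" (names here are illustrative, not a specific library):

```js
// Minimal parser sketch: folds a flat token stream into an AST where
// brace-delimited sections become nested "brace" nodes.
function parse(tokens) {
  const ast = { type: 'root', nodes: [] };
  const stack = [ast]; // the last element is the node we are currently inside

  for (const token of tokens) {
    const current = stack[stack.length - 1];
    if (token.type === 'left-brace') {
      const brace = { type: 'brace', nodes: [token] };
      current.nodes.push(brace);
      stack.push(brace); // descend into the new scope
    } else if (token.type === 'right-brace') {
      current.nodes.push(token);
      stack.pop(); // close the current scope
    } else {
      current.nodes.push(token);
    }
  }
  return ast;
}
```

This is exactly the "dynamic scope" work the lexer avoided: the stack decides where each bracketed section begins and ends.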
Compiler
Creates a function by converting an AST into a string of function statements and wrapping it with a boilerplate function body that defines the arguments the function can take. This generated function is then cached for re-use before being returned.
Returns: Function.
Concepts
- function body
- function statements
- caching
Notes
The goal of a compiler is to create a cached function that can be invoked one or more times, on-demand, with the same or different arguments. The arguments passed to a compiled function are referred to as "context".
Renderer
Invokes the function returned by a compiler with a given "context", producing a string in which any placeholders or variables that may have been defined are replaced with actual values.
Returns: String.
Concepts
- context
- variables
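Rendering then reduces to a single call. In this sketch, `compiled` stands in for whatever function a compiler produced for the template `abc{foo}xyz` (the function body here is hypothetical, hand-written for illustration):

```js
// Stand-in for a compiler's output for the template 'abc{foo}xyz':
// a plain function of one "context" argument.
const compiled = function (context) {
  return 'abc' + context['foo'] + 'xyz';
};

// Minimal renderer sketch: invoke a compiled function with a context,
// producing the final string. Different contexts yield different output.
function render(fn, context) {
  return fn(context);
}
```

The same compiled function can be rendered any number of times with different contexts, which is the payoff of caching it in the compile step.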
Related
- the-super-tiny-compiler
- stringify: typically refers to converting an object to a string representation of the object. For example, `{foo: 'bar'}` would convert to the string `'{"foo":"bar"}'`.
- assembler: todo
- interpreter: todo
- translator: todo
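The stringify conversion above is what `JSON.stringify` does in JavaScript (its default output uses no spaces after colons):

```js
// Convert an object to its JSON string representation.
const str = JSON.stringify({ foo: 'bar' });
```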