DEV Community

Tomasz Wegrzanowski
Tomasz Wegrzanowski

Posted on

100 Languages Speedrun: Episode 78: Better Whenever Interpreter with Python and SLY

A while back I covered Whenever - a language where the program is a todo list. Whenever has no functions, no variables, and the state is just a todo list. As long as there's something on it, it picks an item at random, and performs it.

For this episode, I want to create a better version of Whenever language. It's mostly backwards compatible with the original and can run every Whenever program I could find. However, behavior for edge cases isn't documented anywhere, and absolute compatibility isn't the goal.

I made the following changes to make it more convenient, while keeping the challenge level just high as before:

  • todo items can be identified by name, not just line number
  • todo items don't need to be in any specific order
  • empty lists are allowed
  • comments with // are allowed
  • semicolons are optional
  • you can mix print actions and counter actions in the same todo
  • if you name one of the todo items as start, program will begin with it as the only active action on the todo list
  • it uses much smarter execution model, so it can run much more ambitious programs than the original interpreter
  • it detects some infinite loops if no action can possibly happen and throws an error
  • it uses infinite precision integers
  • ! is supported
  • I didn't add forget as it's listed as deprecated

How it works

There's a lot of code coming, and it's only somewhat commented, so here's the big picture overview:

  • you can run it with whenever.py whenever_file.txt
  • lexer and parser are implemented with SLY, I recently wrote about it, check it out if you need some help understanding how SLY works
  • WheneverLexer.py implements the lexer, which turns Whenever program into a stream of tokens like ["print", "(", '"Hello, World!"', ")"] - it also handles trivial things like removing comments
  • WheneverParser.py implements the parser, which turns that stream of tokens into a program
  • WheneverEval.py is all the code that evaluates Whenever expressions, dealing with all the operators, type conversions, and so on
  • whenever.py runs the program using the previous classes

I'll go into a bit of detail when I deal with each of the files.

Examples of Better Whenever Programs

I included all examples from previous episode, as well as all examples from official documentation, but since I want to improve the language I also included a few more.

math.txt, just a quick test script to check that operators have correct precedence:

a print(300 + 50 * 4 + 80 / 4 - (80 - 30) * 2) 
Enter fullscreen mode Exit fullscreen mode

fizzbuzz.txt, a simple program simplified to take advantage of new features like comments, no-semicolons, and start, but still using numbers for todo items:

// start start 10,11 // variables 2 2 3 3 4 4 // printing 5 defer((N(6) + N(7) + N(8) + N(9)) != 3) 6#-N(6),7#-N(7),8#-N(8),9#-N(9) 6 defer(N(3) == 0 || N(4) == 0) print(N(2)) 7 defer(3 || N(4) == 0) print("Fizz") 8 defer(N(3) == 0 || 4) print("Buzz") 9 defer(3 || 4) print("FizzBuzz") // loop logic 10 defer(5 || N(2) >= 100) 2,3,-3#((N(3)/3)*3),4,-4#((N(4)/5)*5),5,6,7,8,9,10 11 defer(5 || N(2) < 100) -2#N(2),-3#N(3),-4#N(4),-10#N(10) 
Enter fullscreen mode Exit fullscreen mode

fizzbuzz2.txt, same, but now with all todo items named, and ! used:

// start start loop,loopend // variables a a t t f f // printing prn defer((N(pn) + N(pf) + N(pb) + N(pfb)) != 3) -pn#N(pn),-pf#N(pf),-pb#N(pb),-pfb#N(pfb) pn defer(!t || !f) print(N(a)) pf defer(t || !f) print("Fizz") pb defer(!t || f) print("Buzz") pfb defer(t || f) print("FizzBuzz") // loop logic loop defer(prn || N(a) >= 100) a,t,-t#((N(t)/3)*3),f,-f#((N(f)/5)*5),prn,pn,pf,pb,pfb,loop loopend defer(prn || N(a) < 100) -a#N(a),-t#N(t),-f#N(f),-loop#N(loop) 
Enter fullscreen mode Exit fullscreen mode

WheneverLexer.py

This is the simplest file. Possibly improvements would be support for escape codes like \n in strings (but then print would need to change to stop autoprinting \n), and maybe some extra features like read or forget. I also didn't bother tracking line numbers, so error messages won't be great.

from sly import Lexer class WheneverLexer(Lexer): tokens = { PLUS, MINUS, TIMES, DIVIDE, LPAREN, RPAREN, COMMA, HASH, REL, NUM, ID, EOS, STRING, AND, OR, NOT, PRINT, DEFER, AGAIN, N, TRUE, FALSE, MOD } ignore = " \t\r" EOS = r"(\n|;|//[^\n]*\n)+" PLUS = r"\+" MINUS = r"-" TIMES = r"\*" DIVIDE = r"/" MOD = r"%" LPAREN = r"\(" RPAREN = r"\)" COMMA = "," HASH = "#" REL = r"<=|<|>=|>|==|!=" OR = r"\|\|" AND = "&&" NOT = "!" STRING = r'"[^"]*"' NUM = r"[0-9]+" ID = r"[a-zA-Z_][a-zA-Z0-9_]*" ID['again'] = AGAIN ID['defer'] = DEFER ID['false'] = FALSE ID['N'] = N ID['print'] = PRINT ID['true'] = TRUE def NUM(self, t): t.value = int(t.value) return t def STRING(self, t): t.value = t.value[1:-1] return t 
Enter fullscreen mode Exit fullscreen mode

SLY makes defining lexers really easy.

WheneverParser.py

The parser is long, but not really complicated. For AST I just use Python dictionaries, and I use explicit rules like exprm instead of precedence tables, as I'm more used to this style. SLY by default has really weird error handling (print error to stderr, do nothing), so we need to override it to throw proper exceptions. I used the same messages as would be printed.

For simplicity, I don't distinguish between identifier (basically symbols) and strings, so you can do N("loop") or print(Fizz), even though that shouldn't really be supported. It's largely harmless for now.

Most of the syntax is fairly obvious. One somewhat unusual thing I did was explicitly disallowing multiple comparison at the same level without parentheses, like (1 == 2 == 3 or 1 < 2 == 3 < 4), as that's just really confusing.

from sly import Parser from WheneverLexer import WheneverLexer class WheneverParser(Parser): tokens = WheneverLexer.tokens @_("") def program(self, p): return {} @_("EOS program") def program(self, p): return p.program @_("todo program") def program(self, p): program = p.program id, todo = p.todo if id in program: raise Exception(f"Duplicate todo id: {id}") program[id] = todo return program @_("id modifiers actions EOS") def todo(self, p): again = [] defer = [] for mod in p.modifiers: if "again" in mod: again.append(mod["again"]) if "defer" in mod: defer.append(mod["defer"]) return (p.id, {"again": again, "defer": defer, "actions": p.actions}) @_("") def modifiers(self, p): return [] @_("modifier modifiers") def modifiers(self, p): return [p.modifier] + p.modifiers @_("DEFER LPAREN expr RPAREN") def modifier(self, p): return {"defer": p.expr} @_("AGAIN LPAREN expr RPAREN") def modifier(self, p): return {"again": p.expr} @_("action") def actions(self, p): return [p.action] @_("action COMMA actions") def actions(self, p): return [p.action] + p.actions @_("id") def action(self, p): return {"action": "change", "id": p.id, "count": {"value": 1}} @_("MINUS id") def action(self, p): return {"action": "change", "id": p.id, "count": {"value": -1}} @_("id HASH expr") def action(self, p): return {"action": "change", "id": p.id, "count": p.expr} @_("MINUS id HASH expr") def action(self, p): return {"action": "change", "id": p.id, "count": {"op": "uminus", "a": p.expr}} @_("PRINT LPAREN expr RPAREN") def action(self, p): return {"action": "print", "arg": p.expr} # expr, going through all the priority levels  @_("exprlo") def expr(self, p): return p.exprlo ### expr logical or  @_("exprla") def exprlo(self, p): return p.exprla @_("exprlo OR exprla") def exprlo(self, p): return {"op": "||", "a": p.exprlo, "b": p.exprla} ### expr logical and  @_("exprr") def exprla(self, p): return p.exprr @_("exprla AND exprr") def exprla(self, p): return {"op": "&&", "a": p.exprla, "b": p.exprr} ### expr relational  ### do not support chaining them without parentheses  @_("expra") def exprr(self, p): return p.expra @_("expra REL expra") def exprr(self, p): return {"op": p.REL, "a": p.expra0, "b": p.expra1} ### expr additive  @_("expra PLUS exprm") def expra(self, p): return {"op": "+", "a": p.expra, "b": p.exprm} @_("expra MINUS exprm") def expra(self, p): return {"op": "-", "a": p.expra, "b": p.exprm} @_("exprm") def expra(self, p): return p.exprm ### expr multiplicative  @_("exprm TIMES expru") def exprm(self, p): return {"op": "*", "a": p.exprm, "b": p.expru} @_("exprm DIVIDE expru") def exprm(self, p): return {"op": "/", "a": p.exprm, "b": p.expru} @_("exprm MOD expru") def exprm(self, p): return {"op": "%", "a": p.exprm, "b": p.expru} #### expr unary  @_("expru") def exprm(self, p): return p.expru @_("value") def expru(self, p): return p.value @_("NOT expru") def expru(self, p): return {"op": "!", "a": p.expru} @_("MINUS expru") def expru(self, p): return {"op": "uminus", "a": p.expru} ### primitive value  @_("LPAREN expr RPAREN") def value(self, p): return p.expr # No reason to distinguish between foo and "foo" yet  @_("ID") def value(self, p): return {"value": p.ID} @_("NUM") def value(self, p): return {"value": p.NUM} @_("N LPAREN id RPAREN") def value(self, p): return {"N": p.id} @_("STRING") def value(self, p): return {"value": p.STRING} @_("TRUE") def value(self, p): return {"value": True} @_("FALSE") def value(self, p): return {"value": False} @_("ID") def id(self, p): return p.ID @_("NUM") def id(self, p): return p.NUM def error(self, token): if token: lineno = getattr(token, "lineno", 0) if lineno: raise Exception(f"sly: Syntax error at line {lineno}, token={token.type}") else: raise Exception(f"sly: Syntax error, token={token.type}") else: raise Exception("sly: Parse error in input. EOF") 
Enter fullscreen mode Exit fullscreen mode

WheneverEval.py

Evaluating expressions involves so much code I put it into a separate file. Whenever has unusual type coercions - for example 3 || !4 means N(3) > 0 || !(N(4) > 0). Whenever also converts everything to string if added to a string, and converts strings (just initial digits, or 0 if there aren't any initial digits) and booleans to ints when used in int context.

I make comparison operators >, <, >=, <=, ==, and != also convert to integers. Whenever doesn't really do anything useful with strings except printing them anyway, so there's never any point doing "foo" == "bar" anyway.

Things like type coercions are among things that could be improved, but that would cost backwards compatibility.

class WheneverEval: def __init__(self, program): self.program = program def int_eval_node(self, node): a = self.int_eval(node["a"]) b = self.int_eval(node["b"]) return (a, b) def int_eval(self, node): return self.int(self.eval(node)) def bool_eval(self, node): return self.bool(self.eval(node)) def str_eval(self, node): return self.str(self.eval(node)) # A lot of aggressive casting to int here  def eval(self, node): if "value" in node: return node["value"] if "N" in node: return self.program.counters[node["N"]] if "op" not in node: raise Exception(f"Neither op nor value? {node}") op = node["op"] if op == "-": (a, b) = self.int_eval_node(node) return a - b if op == "*": (a, b) = self.int_eval_node(node) return a * b if op == "/": (a, b) = self.int_eval_node(node) return a // b if op == "%": (a, b) = self.int_eval_node(node) return a % b # Not even clear about this one:  if op == "+": a = self.eval(node["a"]) b = self.eval(node["b"]) if isinstance(a, str) or isinstance(b, str): return self.str(a) + self.str(b) else: return self.int(a) + self.int(b) if op == "uminus": a = self.int_eval(node["a"]) return -a if op == "||": a = self.bool_eval(node["a"]) b = self.bool_eval(node["b"]) return a or b if op == "&&": a = self.bool_eval(node["a"]) b = self.bool_eval(node["b"]) return a and b if op == "!": a = self.bool_eval(node["a"]) return not a if op == "<": (a, b) = self.int_eval_node(node) return a < b if op == "<=": (a, b) = self.int_eval_node(node) return a <= b if op == ">": (a, b) = self.int_eval_node(node) return a > b if op == ">=": (a, b) = self.int_eval_node(node) return a >= b if op == "==": (a, b) = self.int_eval_node(node) return a == b if op == "!=": (a, b) = self.int_eval_node(node) return a != b raise Exception(f"TODO: Operation {op} not supported yet") def int(self, value): if isinstance(value, bool): if value == True: return 1 if value == False: return 0 if isinstance(value, int): return value if isinstance(value, str): if re.match(r"^-?\d+", value): return int(value) else: return 0 raise Exception(f"Invalid type {value}") def bool(self, value): if isinstance(value, bool): return value return self.program.counters[value] > 0 def str(self, value): if isinstance(value, bool): if value == True: return "true" if value == False: return "false" if isinstance(value, str): return value if isinstance(value, int): return str(value) raise Exception(f"Invalid type {value}") 
Enter fullscreen mode Exit fullscreen mode

whenever.py

And finally the main program.

The main execution loop is like this:

  • if there's start symbol, start with counters at {start: 1, evertyhing_else: 0}. Otherwise give everything starting counter of 1.
  • mark any rule like n n as trivial - these do nothing when executed, we still need to track their value, but we never need to actually run them, as program state is same before and after running it (we could ignore defer here etc. for more optimization)
  • as long as there's a nontrivial rule with nonzero counter, pick something at random at run it

Trivial next todo selection algorithm would result in exceedingly slow execution. For example fib from official documentation is so slow in it that it never finishes. With this improved interpreter it finishes in almost no time.

  • every time, we copy list of all nontrivial rules as actionable
  • we pick a rule at random from among actionable
  • we check if rule is deferred, if yes, we remove it from actionable set and try again
  • if program isn't finished, but all rules are trivial or deferred, we raise exception instead of doing infinite loop

This means if we have situation like this:

  • N(1) - 17167680177566 times, deferred while N(3) > 0
  • N(2) - 27777890035289 times, deferred while N(3) > 0
  • N(3) - 1 times, actionable

We don't have to keep re-rolling 17167680177566+27777890035289+1 times until we get to run todo item 3.

The algorithm is lazy, so we only check deferred rule if we rolled it, so normally it shouldn't slow us down. The interpreter is still not very optimized, and it's definitely possible to write very slow programs for it, but at least it avoids those exponential slowdowns for such very simple programs.

#!/usr/bin/env python3  import sys from WheneverLexer import WheneverLexer from WheneverParser import WheneverParser from WheneverEval import WheneverEval from random import randint class WheneverProgram: def __init__(self, path): self.path = path with open(path) as f: self.text = f.read() lexer = WheneverLexer() parser = WheneverParser() self.program = parser.parse(lexer.tokenize(self.text)) self.eval = WheneverEval(self) # A lot more optimizations are possible  # total is the sum of all nontrivial counters  def init_counters(self): self.total = 0 self.trivial = set() self.nontrivial = set() self.counters = {} has_start = ("start" in self.program) for id in self.program.keys(): if id == "start" or not has_start: starting_value = 1 else: starting_value = 0 self.counters[id] = starting_value is_trivial = (self.program[id] == {'again': [], 'defer': [], 'actions': [{'action': 'change', 'id': id, 'count': {'value': 1}}]}) if is_trivial: self.trivial.add(id) else: self.nontrivial.add(id) self.total += starting_value def is_deferred(self, id): for expr in self.program[id]["defer"]: if self.eval.bool_eval(expr): return True return False def random_todo(self, actionable, actionable_total): i = randint(0, actionable_total - 1) for id in actionable: i -= self.counters[id] if i < 0: return id raise Exception("Math failure") def execute_random_todo(self): actionable = self.nontrivial.copy() actionable_total = self.total while True: if actionable_total == 0: raise Exception("All actions are deferred") id = self.random_todo(actionable, actionable_total) if self.is_deferred(id): actionable.remove(id) actionable_total -= self.counters[id] else: self.execute_todo(id) return # defer is checked before we call this  def execute_todo(self, id): todo = self.program[id] again = todo["again"] remove_todo = True for expr in todo["defer"]: if self.eval.bool_eval(expr): remove_todo = False for action in todo["actions"]: self.execute_action(action) if remove_todo: self.update_counter(id, -1) def execute_action(self, action): action_type = action["action"] if action_type == "print": s = self.eval.str_eval(action["arg"]) print(s) elif action_type == "change": id = action["id"] count = self.eval.int_eval(action["count"]) self.update_counter(id, count) else: raise Exception(f"Unknown action: {action_type}") def update_counter(self, id, change): old_value = self.counters[id] new_value = old_value + change if new_value < 0: new_value = 0 self.counters[id] = new_value if id not in self.trivial: self.total += (new_value - old_value) def run(self): self.init_counters() while self.total > 0: self.execute_random_todo() if sum(self.counters.values()) != 0: raise Exception("Program entered do-nothing infinite loop") program = WheneverProgram(sys.argv[1]) program.run() 
Enter fullscreen mode Exit fullscreen mode

Should you try Better Whenever?

I think the improved version is much more friendly, as things like comments, variable names, start rule just make the whole coding process more pleasant, and you can focus on doing crazy things in a language with esoteric execution model without all the hassle of turning everything into numbers.

It also definitely helps that this interpreter avoids a lot of exponential slowdown.

Like everything else in this series, I hacked this program during one evening, so it's not perfect. If you're interested in improving it further, let me know, and I can setup proper repository, documentation, specs etc. for it.

This interpreter follows the original idea very closely, but it's definitely possible to do something more creative with the "program as todo list" idea, just like AsciiDots really evolved Befunge idea.

Code

All code examples for the series will be in this repository.

Code for the Better Whenever Interpreter with Python and SLY episode is available here.

Top comments (0)