@gnarz gnarz commented Oct 2, 2025

This patchset implements regular expression matching for abs.

This adds a new REGEX type and a set of functions that perform the regular expression matching. These functions accept either a STRING or a REGEX object as their first (pattern) argument. A STRING pattern is compiled to a REGEX on every call; a REGEX object is used as-is.

Because Go's regexp package is used, the regular expression syntax is that of RE2, as described here: https://github.com/google/re2/wiki/Syntax, except for \C, which is not supported.
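As a quick check of what the RE2-style syntax means in practice, the sketch below asks Go's regexp package directly whether a few patterns compile. `compiles` is a hypothetical helper used only for this illustration; the second and third patterns (`\C` and a backreference) are ones the Go package rejects.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// An ordinary pattern, the unsupported \C, and a backreference.
	for _, p := range []string{`a(x*)b`, `\C`, `(a)\1`} {
		fmt.Printf("%q compiles: %v\n", p, compiles(p))
	}
}
```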

Tests have been added to evaluator/builtin_functions_test.go for all new functions.

Functions:

regex(string)
compile string to a regular expression

regex("a(x*)b") # "a(x*)b"
regex("a(x*)b").type() # REGEX

matches(pattern, string)
returns true if the string matches the pattern, false otherwise. pattern can be either a regex or a string.

"a(x*)b".matches("ayb") # false
r = regex("a(x*)b"); r.matches("axxxb") # true

match(pattern, string)
match the pattern against a string. pattern can be either a regex or a string. If the pattern does not match the string, returns null. Otherwise returns an array whose first item is the full match and whose remaining items are the submatches of the capture groups, in order.

r = regex("a(x*)b"); r.match("axxxb") # ["axxxb", "xxx"]
"a(x*)b".match("axb") # ["axb", "x"]
"a(x*)b".match("ayb") # null

match_all(pattern, string)
match the pattern against a string. pattern can be either a regex or a string. If the pattern does not match the string, returns null. Otherwise returns an array with one item per match of the pattern in the string, each item being an array as returned by match().

r = regex("a(x*)b"); r.match_all("a ab axb") # [["ab", ""], ["axb", "x"]]
"a(x*)b".match_all("axb ayb azb ab") # [["axb", "x"], ["ab", ""]]
"a(x*)b".match_all("x y z") # null

replace_match(pattern, string, string|function(s))
replace all occurrences of the pattern in a string with either an expanded string (as explained here: https://pkg.go.dev/regexp#Regexp.Expand), or with the result of calling a one-argument function with the matched text that is to be replaced. The result of this function will not be expanded. Be aware that if you use a string substitution parameter with expansions, you need to escape the $ with a backslash (as in "\$1") in order to protect it from abs' string interpolation.

r = regex("."); r.replace_match("abx", ".") # "..."
"a(x*)b".replace_match("a axb axxb", "\$1") # "a x xx"
"x+".replace_match("ab axb axxb", f(x) { len(x) }) # "ab a1b a2b"
"x+".replace_match("ab axb axxb", len) # "ab a1b a2b"
gnarz added 9 commits October 1, 2025 20:12
implemented REGEX object type and regex(string) function
also factored out the distinction between REGEX and STRING object, as this will be used a few more times
added tests for regex() and replace_match() functions
fixed bug with returning errors from match functions
shortened error returned by regex()
used isError() function in preference to comparing with object.ERROR_OBJ
replace_match can now be called with a builtin as the third argument
also, a test for this
initial test for regex() now returns the type so we can see that the correct type is actually created
Collaborator

odino commented Oct 3, 2025

hot damn! 🙏
