Closed
Description
Some regular expression engines support (*FAIL)
as a pattern which fails to match anything. (?!)
is an idiomatic way to write this in engines which do not support (*FAIL)
.
It works pretty well, but it can be optimized. Instead of compiling it as ASSERT_NOT opcode
>>> re.compile(r'12(?!)|3', re.DEBUG) BRANCH LITERAL 49 LITERAL 50 ASSERT_NOT 1 OR LITERAL 51 0. INFO 9 0b100 1 2 (to 10) in 5. LITERAL 0x31 ('1') 7. LITERAL 0x33 ('3') 9. FAILURE 10: BRANCH 11 (to 22) 12. LITERAL 0x31 ('1') 14. LITERAL 0x32 ('2') 16. ASSERT_NOT 3 0 (to 20) 19. SUCCESS 20: JUMP 7 (to 28) 22: branch 5 (to 27) 23. LITERAL 0x33 ('3') 25. JUMP 2 (to 28) 27: FAILURE 28: SUCCESS re.compile('12(?!)|3', re.DEBUG)
it can be compiled as FAILURE opcode.
>>> re.compile(r'12(?!)|3', re.DEBUG) BRANCH LITERAL 49 LITERAL 50 FAILURE OR LITERAL 51 0. INFO 9 0b100 1 2 (to 10) in 5. LITERAL 0x31 ('1') 7. LITERAL 0x33 ('3') 9. FAILURE 10: BRANCH 8 (to 19) 12. LITERAL 0x31 ('1') 14. LITERAL 0x32 ('2') 16. FAILURE 17. JUMP 7 (to 25) 19: branch 5 (to 24) 20. LITERAL 0x33 ('3') 22. JUMP 2 (to 25) 24: FAILURE 25: SUCCESS re.compile('12(?!)|3', re.DEBUG)
Unfortunately I do not know good examples of using (*FAIL)
in regular expressions (without using (*SKIP)
) to include them in the documentation. Perhaps other patterns of using (*FAIL)
could be optimized future, but I do not know what to optimize.