
DescribeGrammar v10

Viktor Chernev edited this page Nov 10, 2024 · 6 revisions

Describe version 1.0, codenamed Lines, is the first official version of the language. It introduces two new concepts: the new-lining of operators, and tildes.

New-lining of operators means that production arrows, separators (commas) and terminators must be followed by a new line (\n or \r\n) to act as operators. A doubled operator can be used to escape the new-lining: ;; is equivalent to ;\n or ; \r\n, and likewise >> and ,, stand for a production arrow and a separator without a new line.
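The rule above can be sketched with a few regular expressions. This is only a rough Python approximation of the PRODUCTION_ARROW, SEPARATOR and TERMINATOR lexer rules from the grammar given later on this page, not the actual lexer: comments between the operator and the new line are not modeled, only ASCII whitespace is recognized, and the helper names `operator_pattern` and `is_operator` are made up for illustration.

```python
import re

# ASCII-only stand-in for the grammar's WHITESPACE fragment (note: no '\n').
WS = r"[ \r\t\v\f]*"


def operator_pattern(symbol, allow_eof=False):
    """Build a regex for `symbol` used as an operator (hypothetical helper).

    The operator form is either the symbol followed by optional whitespace
    and a new line, or the symbol doubled. Comments are not modeled.
    """
    s = re.escape(symbol)
    line_end = r"(?:\n|\Z)" if allow_eof else r"\n"
    return re.compile(s + WS + line_end + "|" + s + s)


PRODUCTION_ARROW = operator_pattern(">")
SEPARATOR = operator_pattern(",")
# In the grammar, only the terminator may also be closed by the end of input.
TERMINATOR = operator_pattern(";", allow_eof=True)


def is_operator(pattern, text):
    """Return True if `text` starts with the operator form of the symbol."""
    return pattern.match(text) is not None


print(is_operator(TERMINATOR, ";\n"))     # True
print(is_operator(TERMINATOR, "; \r\n"))  # True
print(is_operator(TERMINATOR, ";;"))      # True
print(is_operator(TERMINATOR, "; next"))  # False
```

In the last case the semicolon is followed by more text on the same line, so under this sketch it is not a terminator. The grammar has a separate SEMICOLON token that can appear inside ordinary item text.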

Tildes, on the other hand, are items that are prefixed with a tilde symbol (~). They are used to add additional data to the title entry of a production, in scenarios where a decorator would be problematic. You can read more about tildes here.
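As a minimal sketch of the syntax only: the grammar's item rule begins with an optional TILDE, and an escaped tilde (\~) is ordinary text. The Python helper below, whose name `split_tilde` is hypothetical, checks a raw item string for that prefix. It does not say anything about how the tilde data is attached to the title entry, which is described on the tildes page.

```python
from typing import Tuple


def split_tilde(item: str) -> Tuple[bool, str]:
    """Return (is_tilde_item, text) for one raw item string (illustrative only)."""
    # An escaped tilde ('\~') starts with a backslash, so it falls through
    # to the regular-item branch and is kept as plain text.
    if item.startswith("~"):
        return True, item[1:]
    return False, item


print(split_tilde("~note"))   # (True, 'note')
print(split_tilde("plain"))   # (False, 'plain')
```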

The ANTLR4 parser grammar is given next.

Describe 1.0 - Official

/* Describe Markup Language
 * version 1.0 (Lines)
 * Created by DemonOfReason and ChatGPT
 * Finished on 03 Aug 2024
 */

grammar Describe10;

// Define lexer rules for white spaces. Linespace is the same but with new line - '\n'
// ----------------------------------------------------------------------------------------------------------
// ' '              : A space character.
// '\r'             : A carriage return character (ASCII 13).
// '\n'             : A newline character (ASCII 10).
// '\t'             : A tab character (ASCII 9).
// '\u000B'         : A vertical tab character (ASCII 11).
// '\u000C'         : A form feed character (ASCII 12).
// '\u0085'         : A next line (NEL) character (Unicode character U+0085).
// '\u00A0'         : A non-breaking space (Unicode character U+00A0).
// '\u1680'         : An ogham space mark (Unicode character U+1680).
// '\u2000-\u200A'  : A range of en space to hair space (Unicode characters U+2000 to U+200A, inclusive).
// '\u2028'         : A line separator (Unicode character U+2028).
// '\u2029'         : A paragraph separator (Unicode character U+2029).
// '\u202F'         : A narrow no-break space (Unicode character U+202F).
// '\u205F'         : A medium mathematical space (Unicode character U+205F).
// '\u3000'         : An ideographic space (Unicode character U+3000).
// ----------------------------------------------------------------------------------------------------------
fragment WHITESPACE : [ \r\t\u000B\u000C\u0085\u00A0\u1680\u2000-\u200A\u2028\u2029\u202F\u205F\u3000] ;
fragment LINESPACE  : [ \r\n\t\u000B\u000C\u0085\u00A0\u1680\u2000-\u200A\u2028\u2029\u202F\u205F\u3000] ;

// Define lexer rules for comments
PROTO_SLASHES       : '://' ;
LINE_COMMENT        : '//' .*? ('\r'? '\n' LINESPACE* | EOF) -> skip ;
BLOCK_COMMENT       : '/*' .*? ('*/' LINESPACE* | EOF) -> skip ;
TAG                 : '<' .+? '>' LINESPACE* ;
LINK                : '[' .*? ']' LINESPACE* ;
DECORATOR           : '{' .*? '}' LINESPACE* ;

// Define lexer rules for other tokens
HYPHEN              : '-' ;
TILDE               : '~' ;
PRODUCTION_ARROW    : '>' WHITESPACE* BLOCK_COMMENT* '\n' LINESPACE*
                    | '>' WHITESPACE* BLOCK_COMMENT* LINE_COMMENT
                    | '>>' LINESPACE*
                    ;
SEPARATOR           : ',' WHITESPACE* BLOCK_COMMENT* '\n' LINESPACE*
                    | ',' WHITESPACE* BLOCK_COMMENT* LINE_COMMENT
                    | ',,' LINESPACE*
                    ;
TERMINATOR          : ';' WHITESPACE* BLOCK_COMMENT* ('\n' | EOF) LINESPACE*
                    | ';' WHITESPACE* BLOCK_COMMENT* LINE_COMMENT
                    | ';;' LINESPACE*
                    ;
FORWARD_SLASH       : '/' LINESPACE* ;
COMMA               : ',' LINESPACE* ;
SEMICOLON           : ';' LINESPACE* ;
COLON               : ':' LINESPACE* ;
RIGHT_ARROW         : '>' LINESPACE* ;
RIGHT_SQUARE        : ']' LINESPACE* ;
RIGHT_CURL          : '}' LINESPACE* ;
STAR                : '*' LINESPACE* ;
ESCAPE_ESCAPE       : '\\\\' LINESPACE* ;
ESCAPE_HYPHEN       : '\\-' LINESPACE* ;
ESCAPE_TILDE        : '\\~' LINESPACE* ;
ESCAPE_RIGHT_ARROW  : '\\>' LINESPACE* ;
ESCAPE_LEFT_ARROW   : '\\<' LINESPACE* ;
ESCAPE_RIGHT_SQUARE : '\\]' LINESPACE* ;
ESCAPE_LEFT_SQUARE  : '\\[' LINESPACE* ;
ESCAPE_RIGHT_CURL   : '\\}' LINESPACE* ;
ESCAPE_LEFT_CURL    : '\\{' LINESPACE* ;
ESCAPE_SEPARATOR    : '\\,' LINESPACE* ;
ESCAPE_TERMINATOR   : '\\;' LINESPACE* ;
ESCAPE_COLON        : '\\:' LINESPACE* ;
ESCAPE_LCOMMENT     : '\\//' LINESPACE* ;
ESCAPE_BCOMMENT     : '\\/*' LINESPACE* ;
ESCAPE              : '\\' LINESPACE* ;

// Define lexer rule for data
// Note: For some reason we don't need to escape '[' and '|'
// and ANTLR does not like when we try to escape them
fragment DATA_CHAR  : ~[{}[\]\-<>,:;*~/\\] ;
DATA                : DATA_CHAR+ ;

// Define parser rules
producer
    : HYPHEN PRODUCTION_ARROW
    ;

text_chunk
    : ESCAPE_ESCAPE
    | ESCAPE_HYPHEN
    | ESCAPE_TILDE
    | ESCAPE_RIGHT_ARROW
    | ESCAPE_LEFT_ARROW
    | ESCAPE_RIGHT_SQUARE
    | ESCAPE_LEFT_SQUARE
    | ESCAPE_RIGHT_CURL
    | ESCAPE_LEFT_CURL
    | ESCAPE_SEPARATOR
    | ESCAPE_TERMINATOR
    | ESCAPE_COLON
    | ESCAPE_LCOMMENT
    | ESCAPE_BCOMMENT
    | ESCAPE
    | HYPHEN
    | COMMA
    | SEMICOLON
    | RIGHT_ARROW
    | PRODUCTION_ARROW
    | RIGHT_SQUARE
    | RIGHT_CURL
    | FORWARD_SLASH
    | PROTO_SLASHES
    | COLON
    | STAR
    | DATA
    ;

item
    : TILDE? (text_chunk)+ (TAG)?
    | TILDE? (text_chunk)+ (LINK)+
    | TILDE? (text_chunk)+ (DECORATOR)+
    | TILDE? (text_chunk)+ (LINK)+ TAG
    | TILDE? (text_chunk)+ TAG (LINK)+
    | TILDE? (text_chunk)+ (DECORATOR)+ TAG
    | TILDE? (text_chunk)+ TAG (DECORATOR)+
    | TILDE? (text_chunk)+ (DECORATOR)+ (LINK)+
    | TILDE? (text_chunk)+ (LINK)+ (DECORATOR)+
    | TILDE? (text_chunk)+ TAG (DECORATOR)+ (LINK)+
    | TILDE? (text_chunk)+ (DECORATOR)+ TAG (LINK)+
    | TILDE? (text_chunk)+ (DECORATOR)+ (LINK)+ TAG
    | TILDE? (text_chunk)+ TAG (LINK)+ (DECORATOR)+
    | TILDE? (text_chunk)+ (LINK)+ TAG (DECORATOR)+
    | TILDE? (text_chunk)+ (LINK)+ (DECORATOR)+ TAG
    ;

expression
    : item producer item_or_expression_list TERMINATOR
    | item producer item TERMINATOR
    | item producer expression TERMINATOR
    | item producer TERMINATOR
    ;

item_or_expression_part
    : item SEPARATOR
    | expression (SEPARATOR)?
    ;

item_or_expression_list
    : (item_or_expression_part)+ item
    | (item_or_expression_part)+ expression
    ;

expression_list
    : (expression)+ expression
    ;

scripture
    : expression_list EOF
    | expression EOF
    ;

Links

Back
