Lecture 2: Declarative Syntax Definition CS4200 Compiler Construction Eelco Visser TU Delft September 2018
This Lecture !2 Java Type Check JVM bytecode Parse CodeGenOptimize Specification of syntax definition from which we can derive parsers
Reading Material 3
The perspective of this lecture on declarative syntax definition is explained more elaborately in this Onward! 2010 essay. It uses an on older version of SDF than used in these slides. Production rules have the form X1 … Xn -> N {cons(“C”)} instead of N.C = X1 … Xn http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2010-019.pdf https://doi.org/10.1145/1932682.1869535
The SPoofax Testing (SPT) language used in the section on testing syntax definitions was introduced in this OOPSLA 2011 paper. https://doi.org/10.1145/2076021.2048080 http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2011-011.pdf
The SDF3 syntax definition formalism is documented at the metaborg.org website. http://www.metaborg.org/en/latest/source/langdev/meta/lang/sdf3/index.html
Syntax 7
What is Syntax? !8 https://en.wikipedia.org/wiki/Syntax In mathematics, syntax refers to the rules governing the behavior of mathematical systems, such as formal languages used in logic. (See logical syntax.) In linguistics, syntax (/ˈsɪntæks/[1][2]) is the set of rules, principles, and processes that govern the structure of sentences in a given language, specifically word order and punctuation. The term syntax is also used to refer to the study of such principles and processes.[3] The goal of many syntacticians is to discover the syntactic rules common to all languages. The word syntax comes from Ancient Greek: σύνταξις "coordination", which consists of σύν syn, "together," and τάξις táxis, "an ordering".
Syntax (Programming Languages) !9 In computer science, the syntax of a computer language is the set of rules that defines the combinations of symbols that are considered to be a correctly structured document or fragment in that language. This applies both to programming languages, where the document represents source code, and markup languages, where the document represents data. The syntax of a language defines its surface form.[1] Text-based computer languages are based on sequences of characters, while visual programming languages are based on the spatial layout and connections between symbols (which may be textual or graphical). Documents that are syntactically invalid are said to have a syntax error. https://en.wikipedia.org/wiki/Syntax_(programming_languages)
Syntax - The set of rules, principles, and processes that govern the structure of sentences in a given language - The set of rules that defines the combinations of symbols that are considered to be a correctly structured document or fragment in that language How to describe such a set of rules? !10 That Govern the Structure
The Structure of Programs 11
What do we call the elements of programs? !12 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; }
What kind of program element is this? !13 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Program Compilation Unit
What kind of program element is this? !14 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Preprocessor Directive
What kind of program element is this? !15 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Function Declaration Function Prototype
What kind of program element is this? !16 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Comment
What kind of program element is this? !17 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Function Definition
What kind of program element is this? !18 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Variable Declaration
What kind of program element is this? !19 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Statement For Loop
What kind of program element is this? !20 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Statement Function Call
What kind of program element is this? !21 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Expression
What kind of program element is this? !22 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Formal Function Parameter
What kind of program element is this? !23 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Type
Syntactic Categories !24 Program Compilation Unit Preprocessor Directive Function Declaration Function Prototype Function Definition Variable Declaration Statement For Loop Expression Formal Function Parameter Type Function Call Programs consist of different kinds of elements
Hierarchy of Syntactic Categories !25 Program Compilation Unit Preprocessor Directive Function Declaration Function Prototype Function Definition Variable Declaration Statement For LoopExpression Formal Function Parameter Type Function Call Some kinds of constructs are contained in others
The Tiger Language !26 Example language used in lectures https://www.lrde.epita.fr/~tiger/tiger.html Documentation https://github.com/MetaBorgCube/metaborg-tiger Spoofax project
!27 let var N := 8 type intArray = array of int var row := intArray[N] of 0 var col := intArray[N] of 0 var diag1 := intArray[N + N - 1] of 0 var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n") ); print("n")) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 & diag2[r + 7 - c] = 0 then ( row[r] := 1; diag1[r + c] := 1; diag2[r + 7 - c] := 1; col[c] := r; try(c + 1); row[r] := 0; diag1[r + c] := 0; diag2[r + 7 - c] := 0)) in try(0) end A Tiger program that solves the n-queens problem
!28 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end A mutilated n-queens Tiger program with ‘redundant’ elements removed What are the syntactic categories of Tiger? What are the language constructs of Tiger called?
!29 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Program Expression Let Binding
!30 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Variable Declaration
!31 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Type Declaration
!32 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Type Expression
!33 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Expression Array Initialization
!34 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Function Definition
!35 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Function Name
!36 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Function Body Expression For Loop
!37 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Expression Sequential Composition
!38 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Formal Parameter
!39 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Expression Assignment
!40 let … function prettyprint(tree : tree) : string = let var output := "" function write(s : string) = output := concat(output, s) function show(n : int, t : tree) = let function indent(s : string) = ( write("n"); for i := 1 to n do write(" ") ; output := concat(output, s)) in if t = nil then indent(".") else ( indent(t.key); show(n + 1, t.left); show(n + 1, t.right)) end in show(0, tree); output end … in … end Functions can be nested
Structure - Programs have structure Categories - Program elements come in multiple categories - Elements cannot be arbitrarily interchanged Constructs - Some categories have multiple elements Hierarchy - Categories are organized in a hierarchy !41 Elements of Programs
Decomposing Programs 42
Decomposing a Program into Elements !43 let function fact(n : int) : int = if n < 1 then 1 else n * fact(n - 1) in fact(10) end
Decomposing a Program into Elements !44 function fact(n : int) : int = if n < 1 then 1 else n * fact(n - 1) fact(10) Exp.Let
Decomposing a Program into Elements !45 fact(10) Exp.Let n : int if n < 1 then 1 else n * fact(n - 1) FunDec fact int
Decomposing a Program into Elements !46 fact(10) Exp.Let int if n < 1 then 1 else n * fact(n - 1) FunDec fact n FArg Tid int Tid
Decomposing a Program into Elements !47 fact(10) Exp.Let int n < 1 FunDec fact n FArg Tid Exp.If 1 n * fact(n - 1) int Tid
Decomposing a Program into Elements !48 fact(10) Exp.Let int n FunDec fact n FArg Tid Exp.If 1 n * fact(n - 1) int Tid Exp.Lt Exp.Int 1 Exp.Var Exp.Int
Decomposing a Program into Elements !49 fact(10) Exp.Let int n FunDec fact n FArg Tid Exp.If 1 fact int Tid Exp.Lt Exp.Int 1 Exp.Var Exp.Int Exp.Times n Exp.Var Exp.Call n - 1 Etc. let function fact(n : int) : int = if n < 1 then 1 else n * fact(n - 1) in fact(10) end
Tree Structure Represented as (First-Order) Term !50 Mod( Let( [ FunDecs( [ FunDec( "fact" , [FArg("n", Tid("int"))] , Tid("int") , If( Lt(Var("n"), Int("1")) , Int("1") , Times( Var("n") , Call("fact", [Minus(Var("n"), Int("1"))]) ) ) ) ] ) ] , [Call("fact", [Int("10")])] ) ) let function fact(n : int) : int = if n < 1 then 1 else n * fact(n - 1) in fact(10) end
Textual representation - Convenient to read and write (human processing) - Concrete syntax / notation Structural tree/term representation - Represents the decomposition of a program into elements - Convenient for machine processing - Abstract syntax !51 Decomposing Programs
What are well-formed textual programs? What are well-formed terms/trees? How to decompose programs automatically? !52 Formalizing Program Decomposition
Abstract Syntax: Formalizing Program Structure 53
Algebraic Signatures !54 signature sorts S0 S1 S2 … constructors C : S1 * S2 * … -> S0 … Sorts: syntactic categories Constructors: language constructs
Well-Formed Terms !55 The family of well-formed terms T(Sig) defined by signature Sig is inductively defined as follows: If C : S1 * S2 * … -> S0 is a constructor in Sig and 
 if t1, t2, … are terms in T(Sig)(S1), T(Sig)(S2), …, 
 then C(t1, t2, …) is a term in T(Sig)(S0)
Well-Formed Terms: Example !56 signature sorts Exp constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp if n < 1 then 1 else n * fact(n - 1) If( Lt(Var("n"), Int("1")) , Int("1") , Times( Var("n") , Call("fact", [Minus(Var("n"), Int("1"))]) ) ) decompose well-formed wrt
Lists of Terms !57 signature sorts Exp Var constructors … Call : ID * List(Exp) -> Exp [Minus(Var("n"), Int(“1”))] [Minus(Var("n"), Int(“1”)), Lt(Var("n"), Int("1"))] []
Well-Formed Terms with Lists !58 The family of well-formed terms T(Sig) defined by signature Sig is inductively defined as follows: If C : S1 * S2 * … -> S0 is a constructor in Sig and 
 if t1, t2, … are terms in T(Sig)(S1), T(Sig)(S2), …, 
 then C(t1, t2, …) is a term in T(Sig)(S0) If t1, t2, … are terms in T(Sig)(S), Then [t1, t2, …] is a term in T(Sig)(List(S))
Abstract syntax of a language - Defined by algebraic signature - Sorts: syntactic categories - Constructors: language constructs Program structure - Represented by (first-order) term - Well-formed with respect to abstract syntax - (Isomorphic to tree structure) !59 Abstract Syntax
From Abstract Syntax to Concrete Syntax 60
What does Abstract Syntax Abstract from? !61 signature sorts Exp Var constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp Signature does not define ‘notation’
What is Notation? !62 signature sorts Exp Var constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp Notation: literals, keywords, delimiters, punctuation n x e1 * e2 e1 - e2 e1 < e2 if e1 then e2 else e3 f(e1, e2, …) if n < 1 then 1 else n * fact(n - 1)
How can we couple notation to abstract syntax? !63 signature sorts Exp Var constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp n x e1 * e2 e1 - e2 e1 < e2 if e1 then e2 else e3 f(e1, e2, …) if n < 1 then 1 else n * fact(n - 1) Notation: literals, keywords, delimiters, punctuation
Context-Free Grammars !64 grammar non-terminals N0 N1 N2 … terminals T0 T1 T2 … productions N0 = S1 S2 … … Non-terminals (N): syntactic categories Terminals (T): words of sentences Productions: rules to create sentences Symbols (S): non-terminals and terminals
Well-Formed Sentences !65 The family of sentences L(G) defined by context-free grammar G is inductively defined as follows: A terminal T is a sentence in L(G)(T) If N0 = S1 S2 … is a production in G and 
 if w1, w2, … are sentences in L(G)(S1), L(G)(S2), …, 
 then w1 w2 … is a sentence in L(G)(N0)
Well-Formed Sentences !66 if n < 1 then 1 else n * fact(n - 1) grammar non-terminals Exp Var productions Exp = IntConst Exp = Id Exp = Exp "*" Exp Exp = Exp "-" Exp Exp = Exp "<" Exp Exp = "if" Exp "then" Exp "else" Exp Exp = Id "(" {Exp ","}* ")"
What is the relation between concrete and abstract syntax? !67 if n < 1 then 1 else n * fact(n - 1) If( Lt(Var("n"), Int("1")) , Int("1") , Times( Var("n") , Call("fact", [Minus(Var("n"), Int("1"))]) ) ) grammar non-terminals Exp Var productions Exp = IntConst Exp = Id Exp = Exp "*" Exp Exp = Exp "-" Exp Exp = Exp "<" Exp Exp = "if" Exp "then" Exp "else" Exp Exp = Id "(" {Exp ","}* ")" ? ? signature sorts Exp Var constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp
Context-Free Grammars with Constructor Declarations !68 sorts Exp Var context-free syntax Exp.Int = IntConst Exp.Var = Id Exp.Times = Exp "*" Exp Exp.Minus = Exp "-" Exp Exp.Lt = Exp "<" Exp Exp.If = "if" Exp "then" Exp "else" Exp Exp.Call = Id "(" {Exp ","}* ")" signature sorts Exp Var constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp grammar non-terminals Exp Var productions Exp = IntConst Exp = Id Exp = Exp "*" Exp Exp = Exp "-" Exp Exp = Exp "<" Exp Exp = "if" Exp "then" Exp "else" Exp Exp = Id "(" {Exp ","}* ")"
Context-Free Grammars with Constructor Declarations !69 sorts Exp Var context-free syntax Exp.Int = IntConst Exp.Var = Id Exp.Times = Exp "*" Exp Exp.Minus = Exp "-" Exp Exp.Lt = Exp "<" Exp Exp.If = "if" Exp "then" Exp "else" Exp Exp.Call = Id "(" {Exp ","}* ")" Abstract syntax: productions define constructor and sorts of arguments Concrete syntax: productions define notation for language constructs
Abstract syntax - Production defines constructor, argument sorts, result sort - Abstract from notation: lexical elements of productions Concrete syntax - Productions define context-free grammar rules Some details to discuss - Sequences - Lexical syntax - Ambiguities - Converting text to tree and back (parsing, unparsing) !70 CFG with Constructors defines Abstract and Concrete Syntax
Sequences (Lists) 71
Encoding Sequences (Lists) !72 sorts Exp Var context-free syntax Exp.Int = IntConst Exp.Var = Id Exp.Times = Exp "*" Exp Exp.Minus = Exp "-" Exp Exp.Lt = Exp "<" Exp Exp.If = "if" Exp "then" Exp "else" Exp Exp.Call = Id "(" ExpList ")" ExpList.Nil = ExpList = ExpListNE ExpListNE.One = Exp ExpListNE.Snoc = ExpListNE “," Exp printlist(merge(list1,list2)) Call("printlist" , [Call("merge", [Var("list1") , Var("list2")])] )
Sugar for Sequences and Optionals !73 context-free syntax Exp.Call = Id "(" {Exp ","}* ")" printlist(merge(list1,list2)) Call("printlist" , [Call("merge", [Var("list1") , Var("list2")])] ) context-free syntax // automatically generated {Exp “,"}*.Nil = // empty list {Exp “,”}* = {Exp ","}+ {Exp “,"}+.One = Exp {Exp “,"}+.Snoc = {Exp ","}+ "," Exp Exp*.Nil = // empty list Exp* = Exp+ Exp+.One = Exp Exp+.Snoc = Exp+ Exp Exp?.None = // no expression Exp?.Some = Exp // one expression
Normalizing Lists !74 rules Snoc(Nil(), x) -> Cons(x, Nil()) Snoc(Cons(x, xs), y) -> Cons(x, Snoc(xs, y)) One(x) -> Cons(x, Nil()) Nil() -> [] Cons(x, xs) -> [x | xs] context-free syntax // automatically generated {Exp “,"}*.Nil = // empty list {Exp “,”}* = {Exp ","}+ {Exp “,"}+.One = Exp {Exp “,"}+.Snoc = {Exp ","}+ "," Exp Exp*.Nil = // empty list Exp* = Exp+ Exp+.One = Exp Exp+.Snoc = Exp+ Exp Exp?.None = // no expression Exp?.Some = Exp // one expression
Using Sugar for Sequences !75 module Functions imports Identifiers imports Types context-free syntax Dec = FunDec+ FunDec = "function" Id "(" {FArg ","}* ")" "=" Exp FunDec = "function" Id "(" {FArg ","}* ")" ":" Type "=" Exp FArg = Id ":" Type Exp = Id "(" {Exp ","}* ")" let function power(x: int, n: int): int = if n <= 0 then 1 else x * power(x, n - 1) in power(3, 10) end module Bindings imports Control-Flow Identifiers Types Functions Variables sorts Declarations context-free syntax Exp = "let" Dec* "in" {Exp ";"}* "end" Declarations = "declarations" Dec*
Lexical Syntax 76
Context-Free Syntax vs Lexical Syntax !77 let function power(x: int, n: int): int = if n <= 0 then 1 else x * power(x, n - 1) in power(3, 10) end Mod( Let( [ FunDecs( [ FunDec( "power" , [FArg("x", Tid("int")), FArg("n", Tid("int"))] , Tid("int") , If( Leq(Var("n"), Int("0")) , Int("1") , Times( Var("x") , Call( "power" , [Var("x"), Minus(Var("n"), Int("1"))] ) ) ) ) ] ) ] , [Call("power", [Int("3"), Int("10")])] ) ) phrase structure lexeme / token separated by layout structure not relevant not separated by layout
Character Classes !78 lexical syntax // character codes Character = [65] Range = [65-90] Union = [65-90] / [97-122] Difference = [0-127] / [1013] Union = [0-911-1214-255] Character class represents choice from a set of characters
Sugar for Character Classes !79 lexical syntax // sugar CharSugar = [a] = [97] CharClass = [abcdefghijklmnopqrstuvwxyz] = [97-122] SugarRange = [a-z] = [97-122] Union = [a-z] / [A-Z] / [0-9] = [48-5765-9097-122] RangeCombi = [a-z0-9_] = [48-579597-122] Complement = ~[nr] = [0-255] / [1013] = [0-911-1214-255]
Literals are Sequences of Characters !80 lexical syntax // literals Literal = "then" // case sensitive sequence of characters CaseInsensitive = 'then' // case insensitive sequence of characters syntax "then" = [116] [104] [101] [110] 'then' = [84116] [72104] [69101] [78110] syntax "then" = [t] [h] [e] [n] 'then' = [Tt] [Hh] [Ee] [Nn]
Identifiers !81 lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* Id = a Id = B Id = cD Id = xyz10 Id = internal_ Id = CamelCase Id = lower_case Id = ...
Lexical Ambiguity: Longest Match !82 lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} // curried function call ab Mod( amb( [Var("ab"), Call(Var("a"), Var("b"))] ) )
abc Lexical Ambiguity: Longest Match !83 Mod( amb( [ amb( [ Var("abc") , Call( amb( [Var("ab"), Call(Var("a"), Var("b"))] ) , Var("c") ) ] ) , Call(Var("a"), Var("bc")) ] ) ) lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} // curried function call
abc def ghi Lexical Restriction => Longest Match !84 lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* lexical restrictions Id -/- [a-zA-Z0-9_] // longest match for identifiers context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} // curried function call Call(Call(Var("abc"), Var("def")), Var("ghi")) Lexical restriction: phrase cannot be followed by character in character class
if def then ghi Lexical Ambiguity: Keywords overlap with Identifiers !85 lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* lexical restrictions Id -/- [a-zA-Z0-9_] // longest match for identifiers context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} Exp.IfThen = "if" Exp "then" Exp amb( [ Mod( Call( Call(Call(Var("if"), Var("def")), Var("then")) , Var("ghi") ) ) , Mod(IfThen(Var("def"), Var("ghi"))) ] )
ifdef then ghi Lexical Ambiguity: Keywords overlap with Identifiers !86 amb( [ Mod( Call(Call(Var("ifdef"), Var("then")), Var("ghi")) ) , Mod(IfThen(Var("def"), Var("ghi"))) ] ) lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* lexical restrictions Id -/- [a-zA-Z0-9_] // longest match for identifiers context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} Exp.IfThen = "if" Exp "then" Exp
Reject Productions => Reserved Words !87 lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* Id = "if" {reject} Id = "then" {reject} lexical restrictions Id -/- [a-zA-Z0-9_] // longest match for identifiers "if" "then" -/- [a-zA-Z0-9_] context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} Exp.IfThen = "if" Exp "then" Exp if def then ghi IfThen(Var("def"), Var("ghi"))
ifdef then ghi Reject Productions => Reserved Words !88 parse error lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* Id = "if" {reject} Id = "then" {reject} lexical restrictions Id -/- [a-zA-Z0-9_] // longest match for identifiers "if" "then" -/- [a-zA-Z0-9_] context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} Exp.IfThen = "if" Exp "then" Exp
Real Numbers !89 lexical syntax RealConst.RealConstNoExp = IntConst "." IntConst RealConst.RealConst = IntConst "." IntConst "e" Sign IntConst Sign = "+" Sign = "-"
Ambiguity & Disambiguation 90
Ambiguity !91 t1 != t2 / format(t1) = format(t2) Add VarRef VarRef “y”“x” Mul Int “3” Add VarRef VarRef “y” “x” Mul Int “3” 3 * x + y
!92
Disambiguation Filters !93 parse filter Disambiguation Filters [Klint & Visser; 1994], [Van den Brand, Scheerder, Vinju, Visser; CC 2002]
Associativity and Priority !94 context-free syntax Expr.Int = INT Expr.Add = "Expr" + "Expr" {left} Expr.Mul = "Expr" * "Expr" {left} context-free priorities Expr.Mul > Expr.Add Recent improvement: safe disambiguation of operator precedence [SLE13, Onward15, Programming18] Add VarRef VarRef “y”“x” Mul Int “3” Add VarRef VarRef “y” “x” Mul Int “3” 3 * x + y
context-free syntax Exp.Add = Exp "+" Exp Exp.Sub = Exp "-" Exp Exp.Mul = Exp "*" Exp Exp.Div = Exp "/" Exp Exp.Eq = Exp "=" Exp Exp.Neq = Exp "<>" Exp Exp.Gt = Exp ">" Exp Exp.Lt = Exp "<" Exp Exp.Gte = Exp ">=" Exp Exp.Lte = Exp "<=" Exp Exp.And = Exp "&" Exp Exp.Or = Exp "|" Exp Ambiguous Operator Syntax !95
Disambiguation by Associativity and Priority Rules !96 context-free syntax Exp.Add = Exp "+" Exp {left} Exp.Sub = Exp "-" Exp {left} Exp.Mul = Exp "*" Exp {left} Exp.Div = Exp "/" Exp {left} Exp.Eq = Exp "=" Exp {non-assoc} Exp.Neq = Exp "<>" Exp {non-assoc} Exp.Gt = Exp ">" Exp {non-assoc} Exp.Lt = Exp "<" Exp {non-assoc} Exp.Gte = Exp ">=" Exp {non-assoc} Exp.Lte = Exp "<=" Exp {non-assoc} Exp.And = Exp "&" Exp {left} Exp.Or = Exp "|" Exp {left} context-free priorities { left: Exp.Mul Exp.Div } > { left: Exp.Add Exp.Sub } > { non-assoc: Exp.Eq Exp.Neq Exp.Gt Exp.Lt Exp.Gte Exp.Lte } > Exp.And > Exp.Or
Disambiguation by Associativity and Priority Rules !97 context-free syntax Exp.Add = Exp "+" Exp {left} Exp.Sub = Exp "-" Exp {left} Exp.Mul = Exp "*" Exp {left} Exp.Div = Exp "/" Exp {left} Exp.Eq = Exp "=" Exp {non-assoc} Exp.Neq = Exp "<>" Exp {non-assoc} Exp.Gt = Exp ">" Exp {non-assoc} Exp.Lt = Exp "<" Exp {non-assoc} Exp.Gte = Exp ">=" Exp {non-assoc} Exp.Lte = Exp "<=" Exp {non-assoc} Exp.And = Exp "&" Exp {left} Exp.Or = Exp "|" Exp {left} context-free priorities { left: Exp.Mul Exp.Div } > { left: Exp.Add Exp.Sub } > { non-assoc: Exp.Eq Exp.Neq Exp.Gt Exp.Lt Exp.Gte Exp.Lte } > Exp.And > Exp.Or 1 + 3 <= 4 - 35 & 12 > 16 And( Leq( Plus(Int("1"), Int("3")) , Minus(Int("4"), Int("35")) ) , Gt(Int("12"), Int("16")) )
Testing Syntax Definitions with SPT 98 Are you defining the right language?
Test Suites !99 module example-suite language Tiger start symbol Start test name [[...]] parse succeeds test another name [[...]] parse fails SPT: SPoofax Testing language
Success is Failure, Failure is Success, … !100 module success-failure language Tiger test this test succeeds [[ 1 + 3 * 4 ]] parse succeeds test this test fails [[ 1 + 3 * 4 ]] parse fails test this test fails [[ 1 + 3 * ]] parse succeeds test this test succeeds [[ 1 + 3 * ]] parse fails
Test Cases: Valid !101 Language valid program test valid program [[...]] parse succeeds
Test Cases: Invalid !102 test invalid program [[...]] parse fails invalid program Language
Test Cases !103 module syntax/identifiers language Tiger start symbol Id test single lower case [[x]] parse succeeds test single upper case [[X]] parse succeeds test single digit [[1]] parse fails test single lc digit [[x1]] parse succeeds test single digit lc [[1x]] parse fails test single uc digit [[X1]] parse succeeds test single digit uc [[1X]] parse fails test double digit [[11]] parse fails
Test Corner Cases !104 Language
Testing Structure !105 module structure language Tiger test add times [[ 21 + 14 + 7 ]] parse to Mod(Plus(Int(“21"),Times(Int("14"),Int("7")))) test times add [[ 3 * 7 + 21 ]] parse to Mod(Plus(Times(Int(“3"),Int("7")),Int("21"))) test if [[ if x then 3 else 4 ]] parse to Mod(If(Var("x"),Int("3"),Int("4"))) The fragment did not parse to the expected ATerm. Parse result was: Mod(Plus(Plus(Int("21"),Int("14")),Int("7"))) Expected result was: Mod(Plus(Int(21), Times(Int(14), Int(7))))
Testing Ambiguity !106 Note : this does not actually work in Tiger, since () is a sequencing construct; but works fine in Mini-Java module precedence language Tiger start symbol Exp test parentheses [[(42)]] parse to [[42]] test left-associative addition [[21 + 14 + 7]] parse to [[(21 + 14) + 7]] test precedence multiplication [[3 * 7 + 21]] parse to [[(3 * 7) + 21]]
Testing Ambiguity !107 module precedence language Tiger start symbol Exp test plus/times priority [[ x + 4 * 5 ]] parse to Plus(Var("x"), Times(Int("4"), Int("5"))) test plus/times sequence [[ (x + 4) * 5 ]] parse to Times( Seq([Plus(Var("x"), Int("4"))]) , Int("5") )
Syntax Engineering in Spoofax 108
Developing syntax definition - Define syntax of language in multiple modules - Syntax checking, colouring - Checking for undefined non-terminals Testing syntax definition - Write example programs in editor for language under def - Inspect abstract syntax terms ‣ Spoofax > Syntax > Show Parsed AST - Write SPT test for success and failure cases ‣ Updated after build of syntax definition !110 Syntax Engineering in Spoofax
Declarative Syntax Definition: Summary 111
Syntax definition - Define structure (decomposition) of programs - Define concrete syntax: notation - Define abstract syntax: constructors Using syntax definitions (next) - Parsing: converting text to abstract syntax term - Pretty-printing: convert abstract syntax term to text - Editor services: syntax highlighting, syntax checking, completion !112 Declarative Syntax Definition
Next: Parsing 113
Except where otherwise noted, this work is licensed under

Compiler Construction | Lecture 2 | Declarative Syntax Definition

  • 1.
    Lecture 2: DeclarativeSyntax Definition CS4200 Compiler Construction Eelco Visser TU Delft September 2018
  • 2.
    This Lecture !2 Java TypeCheck JVM bytecode Parse CodeGenOptimize Specification of syntax definition from which we can derive parsers
  • 3.
  • 4.
    The perspective ofthis lecture on declarative syntax definition is explained more elaborately in this Onward! 2010 essay. It uses an on older version of SDF than used in these slides. Production rules have the form X1 … Xn -> N {cons(“C”)} instead of N.C = X1 … Xn http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2010-019.pdf https://doi.org/10.1145/1932682.1869535
  • 5.
    The SPoofax Testing(SPT) language used in the section on testing syntax definitions was introduced in this OOPSLA 2011 paper. https://doi.org/10.1145/2076021.2048080 http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2011-011.pdf
  • 6.
    The SDF3 syntaxdefinition formalism is documented at the metaborg.org website. http://www.metaborg.org/en/latest/source/langdev/meta/lang/sdf3/index.html
  • 7.
  • 8.
    What is Syntax? !8 https://en.wikipedia.org/wiki/Syntax Inmathematics, syntax refers to the rules governing the behavior of mathematical systems, such as formal languages used in logic. (See logical syntax.) In linguistics, syntax (/ˈsɪntæks/[1][2]) is the set of rules, principles, and processes that govern the structure of sentences in a given language, specifically word order and punctuation. The term syntax is also used to refer to the study of such principles and processes.[3] The goal of many syntacticians is to discover the syntactic rules common to all languages. The word syntax comes from Ancient Greek: σύνταξις "coordination", which consists of σύν syn, "together," and τάξις táxis, "an ordering".
  • 9.
    Syntax (Programming Languages) !9 Incomputer science, the syntax of a computer language is the set of rules that defines the combinations of symbols that are considered to be a correctly structured document or fragment in that language. This applies both to programming languages, where the document represents source code, and markup languages, where the document represents data. The syntax of a language defines its surface form.[1] Text-based computer languages are based on sequences of characters, while visual programming languages are based on the spatial layout and connections between symbols (which may be textual or graphical). Documents that are syntactically invalid are said to have a syntax error. https://en.wikipedia.org/wiki/Syntax_(programming_languages)
  • 10.
    Syntax - The setof rules, principles, and processes that govern the structure of sentences in a given language - The set of rules that defines the combinations of symbols that are considered to be a correctly structured document or fragment in that language How to describe such a set of rules? !10 That Govern the Structure
  • 11.
    The Structure ofPrograms 11
  • 12.
    What do wecall the elements of programs? !12 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; }
  • 13.
    What kind ofprogram element is this? !13 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Program Compilation Unit
  • 14.
    What kind ofprogram element is this? !14 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Preprocessor Directive
  • 15.
    What kind ofprogram element is this? !15 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Function Declaration Function Prototype
  • 16.
    What kind ofprogram element is this? !16 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Comment
  • 17.
    What kind ofprogram element is this? !17 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Function Definition
  • 18.
    What kind ofprogram element is this? !18 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Variable Declaration
  • 19.
    What kind ofprogram element is this? !19 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Statement For Loop
  • 20.
    What kind ofprogram element is this? !20 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Statement Function Call
  • 21.
    What kind ofprogram element is this? !21 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Expression
  • 22.
    What kind ofprogram element is this? !22 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Formal Function Parameter
  • 23.
    What kind ofprogram element is this? !23 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Type
  • 24.
    Syntactic Categories !24 Program Compilation Unit PreprocessorDirective Function Declaration Function Prototype Function Definition Variable Declaration Statement For Loop Expression Formal Function Parameter Type Function Call Programs consist of different kinds of elements
  • 25.
    Hierarchy of SyntacticCategories !25 Program Compilation Unit Preprocessor Directive Function Declaration Function Prototype Function Definition Variable Declaration Statement For LoopExpression Formal Function Parameter Type Function Call Some kinds of constructs are contained in others
  • 26.
    The Tiger Language !26 Examplelanguage used in lectures https://www.lrde.epita.fr/~tiger/tiger.html Documentation https://github.com/MetaBorgCube/metaborg-tiger Spoofax project
  • 27.
    !27 let var N :=8 type intArray = array of int var row := intArray[N] of 0 var col := intArray[N] of 0 var diag1 := intArray[N + N - 1] of 0 var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n") ); print("n")) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 & diag2[r + 7 - c] = 0 then ( row[r] := 1; diag1[r + c] := 1; diag2[r + 7 - c] := 1; col[c] := r; try(c + 1); row[r] := 0; diag1[r + c] := 0; diag2[r + 7 - c] := 0)) in try(0) end A Tiger program that solves the n-queens problem
  • 28.
    !28 let var N :=8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end A mutilated n-queens Tiger program with ‘redundant’ elements removed What are the syntactic categories of Tiger? What are the language constructs of Tiger called?
  • 29.
    !29 let var N :=8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Program Expression Let Binding
  • 30.
    !30 let var N :=8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Variable Declaration
  • 31.
    !31 let var N :=8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Type Declaration
  • 32.
    !32 let var N :=8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Type Expression
  • 33.
    !33 let var N :=8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Expression Array Initialization
  • 34.
    !34 let var N :=8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Function Definition
  • 35.
    !35 let var N :=8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Function Name
  • 36.
    !36 let var N :=8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Function Body Expression For Loop
  • 37.
    !37 let var N :=8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Expression Sequential Composition
  • 38.
    !38 let var N :=8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Formal Parameter
  • 39.
    !39 let var N :=8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Expression Assignment
  • 40.
    !40 let … function prettyprint(tree :tree) : string = let var output := "" function write(s : string) = output := concat(output, s) function show(n : int, t : tree) = let function indent(s : string) = ( write("n"); for i := 1 to n do write(" ") ; output := concat(output, s)) in if t = nil then indent(".") else ( indent(t.key); show(n + 1, t.left); show(n + 1, t.right)) end in show(0, tree); output end … in … end Functions can be nested
  • 41.
    Structure - Programs havestructure Categories - Program elements come in multiple categories - Elements cannot be arbitrarily interchanged Constructs - Some categories have multiple elements Hierarchy - Categories are organized in a hierarchy !41 Elements of Programs
  • 42.
  • 43.
    Decomposing a Programinto Elements !43 let function fact(n : int) : int = if n < 1 then 1 else n * fact(n - 1) in fact(10) end
  • 44.
    Decomposing a Programinto Elements !44 function fact(n : int) : int = if n < 1 then 1 else n * fact(n - 1) fact(10) Exp.Let
  • 45.
    Decomposing a Programinto Elements !45 fact(10) Exp.Let n : int if n < 1 then 1 else n * fact(n - 1) FunDec fact int
  • 46.
    Decomposing a Programinto Elements !46 fact(10) Exp.Let int if n < 1 then 1 else n * fact(n - 1) FunDec fact n FArg Tid int Tid
  • 47.
    Decomposing a Programinto Elements !47 fact(10) Exp.Let int n < 1 FunDec fact n FArg Tid Exp.If 1 n * fact(n - 1) int Tid
  • 48.
    Decomposing a Programinto Elements !48 fact(10) Exp.Let int n FunDec fact n FArg Tid Exp.If 1 n * fact(n - 1) int Tid Exp.Lt Exp.Int 1 Exp.Var Exp.Int
  • 49.
    Decomposing a Programinto Elements !49 fact(10) Exp.Let int n FunDec fact n FArg Tid Exp.If 1 fact int Tid Exp.Lt Exp.Int 1 Exp.Var Exp.Int Exp.Times n Exp.Var Exp.Call n - 1 Etc. let function fact(n : int) : int = if n < 1 then 1 else n * fact(n - 1) in fact(10) end
  • 50.
    Tree Structure Representedas (First-Order) Term !50 Mod( Let( [ FunDecs( [ FunDec( "fact" , [FArg("n", Tid("int"))] , Tid("int") , If( Lt(Var("n"), Int("1")) , Int("1") , Times( Var("n") , Call("fact", [Minus(Var("n"), Int("1"))]) ) ) ) ] ) ] , [Call("fact", [Int("10")])] ) ) let function fact(n : int) : int = if n < 1 then 1 else n * fact(n - 1) in fact(10) end
  • 51.
    Textual representation - Convenientto read and write (human processing) - Concrete syntax / notation Structural tree/term representation - Represents the decomposition of a program into elements - Convenient for machine processing - Abstract syntax !51 Decomposing Programs
  • 52.
    What are well-formedtextual programs? What are well-formed terms/trees? How to decompose programs automatically? !52 Formalizing Program Decomposition
  • 53.
  • 54.
    Algebraic Signatures !54 signature sorts S0S1 S2 … constructors C : S1 * S2 * … -> S0 … Sorts: syntactic categories Constructors: language constructs
  • 55.
    Well-Formed Terms !55 The familyof well-formed terms T(Sig) defined by signature Sig is inductively defined as follows: If C : S1 * S2 * … -> S0 is a constructor in Sig and 
 if t1, t2, … are terms in T(Sig)(S1), T(Sig)(S2), …, 
 then C(t1, t2, …) is a term in T(Sig)(S0)
  • 56.
    Well-Formed Terms: Example !56 signature sortsExp constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp if n < 1 then 1 else n * fact(n - 1) If( Lt(Var("n"), Int("1")) , Int("1") , Times( Var("n") , Call("fact", [Minus(Var("n"), Int("1"))]) ) ) decompose well-formed wrt
  • 57.
    Lists of Terms !57 signature sortsExp Var constructors … Call : ID * List(Exp) -> Exp [Minus(Var("n"), Int(“1”))] [Minus(Var("n"), Int(“1”)), Lt(Var("n"), Int("1"))] []
  • 58.
    Well-Formed Terms withLists !58 The family of well-formed terms T(Sig) defined by signature Sig is inductively defined as follows: If C : S1 * S2 * … -> S0 is a constructor in Sig and 
 if t1, t2, … are terms in T(Sig)(S1), T(Sig)(S2), …, 
 then C(t1, t2, …) is a term in T(Sig)(S0) If t1, t2, … are terms in T(Sig)(S), Then [t1, t2, …] is a term in T(Sig)(List(S))
  • 59.
    Abstract syntax ofa language - Defined by algebraic signature - Sorts: syntactic categories - Constructors: language constructs Program structure - Represented by (first-order) term - Well-formed with respect to abstract syntax - (Isomorphic to tree structure) !59 Abstract Syntax
  • 60.
    From Abstract Syntaxto Concrete Syntax 60
  • 61.
    What does AbstractSyntax Abstract from? !61 signature sorts Exp Var constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp Signature does not define ‘notation’
  • 62.
    What is Notation? !62 signature sortsExp Var constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp Notation: literals, keywords, delimiters, punctuation n x e1 * e2 e1 - e2 e1 < e2 if e1 then e2 else e3 f(e1, e2, …) if n < 1 then 1 else n * fact(n - 1)
  • 63.
    How can wecouple notation to abstract syntax? !63 signature sorts Exp Var constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp n x e1 * e2 e1 - e2 e1 < e2 if e1 then e2 else e3 f(e1, e2, …) if n < 1 then 1 else n * fact(n - 1) Notation: literals, keywords, delimiters, punctuation
  • 64.
    Context-Free Grammars !64 grammar non-terminals N0N1 N2 … terminals T0 T1 T2 … productions N0 = S1 S2 … … Non-terminals (N): syntactic categories Terminals (T): words of sentences Productions: rules to create sentences Symbols (S): non-terminals and terminals
  • 65.
    Well-Formed Sentences !65 The familyof sentences L(G) defined by context-free grammar G is inductively defined as follows: A terminal T is a sentence in L(G)(T) If N0 = S1 S2 … is a production in G and 
 if w1, w2, … are sentences in L(G)(S1), L(G)(S2), …, 
 then w1 w2 … is a sentence in L(G)(N0)
  • 66.
    Well-Formed Sentences !66 if n< 1 then 1 else n * fact(n - 1) grammar non-terminals Exp Var productions Exp = IntConst Exp = Id Exp = Exp "*" Exp Exp = Exp "-" Exp Exp = Exp "<" Exp Exp = "if" Exp "then" Exp "else" Exp Exp = Id "(" {Exp ","}* ")"
  • 67.
    What is therelation between concrete and abstract syntax? !67 if n < 1 then 1 else n * fact(n - 1) If( Lt(Var("n"), Int("1")) , Int("1") , Times( Var("n") , Call("fact", [Minus(Var("n"), Int("1"))]) ) ) grammar non-terminals Exp Var productions Exp = IntConst Exp = Id Exp = Exp "*" Exp Exp = Exp "-" Exp Exp = Exp "<" Exp Exp = "if" Exp "then" Exp "else" Exp Exp = Id "(" {Exp ","}* ")" ? ? signature sorts Exp Var constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp
  • 68.
    Context-Free Grammars withConstructor Declarations !68 sorts Exp Var context-free syntax Exp.Int = IntConst Exp.Var = Id Exp.Times = Exp "*" Exp Exp.Minus = Exp "-" Exp Exp.Lt = Exp "<" Exp Exp.If = "if" Exp "then" Exp "else" Exp Exp.Call = Id "(" {Exp ","}* ")" signature sorts Exp Var constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp grammar non-terminals Exp Var productions Exp = IntConst Exp = Id Exp = Exp "*" Exp Exp = Exp "-" Exp Exp = Exp "<" Exp Exp = "if" Exp "then" Exp "else" Exp Exp = Id "(" {Exp ","}* ")"
  • 69.
    Context-Free Grammars withConstructor Declarations !69 sorts Exp Var context-free syntax Exp.Int = IntConst Exp.Var = Id Exp.Times = Exp "*" Exp Exp.Minus = Exp "-" Exp Exp.Lt = Exp "<" Exp Exp.If = "if" Exp "then" Exp "else" Exp Exp.Call = Id "(" {Exp ","}* ")" Abstract syntax: productions define constructor and sorts of arguments Concrete syntax: productions define notation for language constructs
  • 70.
    Abstract syntax - Productiondefines constructor, argument sorts, result sort - Abstract from notation: lexical elements of productions Concrete syntax - Productions define context-free grammar rules Some details to discuss - Sequences - Lexical syntax - Ambiguities - Converting text to tree and back (parsing, unparsing) !70 CFG with Constructors defines Abstract and Concrete Syntax
  • 71.
  • 72.
    Encoding Sequences (Lists) !72 sortsExp Var context-free syntax Exp.Int = IntConst Exp.Var = Id Exp.Times = Exp "*" Exp Exp.Minus = Exp "-" Exp Exp.Lt = Exp "<" Exp Exp.If = "if" Exp "then" Exp "else" Exp Exp.Call = Id "(" ExpList ")" ExpList.Nil = ExpList = ExpListNE ExpListNE.One = Exp ExpListNE.Snoc = ExpListNE “," Exp printlist(merge(list1,list2)) Call("printlist" , [Call("merge", [Var("list1") , Var("list2")])] )
  • 73.
    Sugar for Sequencesand Optionals !73 context-free syntax Exp.Call = Id "(" {Exp ","}* ")" printlist(merge(list1,list2)) Call("printlist" , [Call("merge", [Var("list1") , Var("list2")])] ) context-free syntax // automatically generated {Exp “,"}*.Nil = // empty list {Exp “,”}* = {Exp ","}+ {Exp “,"}+.One = Exp {Exp “,"}+.Snoc = {Exp ","}+ "," Exp Exp*.Nil = // empty list Exp* = Exp+ Exp+.One = Exp Exp+.Snoc = Exp+ Exp Exp?.None = // no expression Exp?.Some = Exp // one expression
  • 74.
    Normalizing Lists !74 rules Snoc(Nil(), x)-> Cons(x, Nil()) Snoc(Cons(x, xs), y) -> Cons(x, Snoc(xs, y)) One(x) -> Cons(x, Nil()) Nil() -> [] Cons(x, xs) -> [x | xs] context-free syntax // automatically generated {Exp “,"}*.Nil = // empty list {Exp “,”}* = {Exp ","}+ {Exp “,"}+.One = Exp {Exp “,"}+.Snoc = {Exp ","}+ "," Exp Exp*.Nil = // empty list Exp* = Exp+ Exp+.One = Exp Exp+.Snoc = Exp+ Exp Exp?.None = // no expression Exp?.Some = Exp // one expression
  • 75.
    Using Sugar forSequences !75 module Functions imports Identifiers imports Types context-free syntax Dec = FunDec+ FunDec = "function" Id "(" {FArg ","}* ")" "=" Exp FunDec = "function" Id "(" {FArg ","}* ")" ":" Type "=" Exp FArg = Id ":" Type Exp = Id "(" {Exp ","}* ")" let function power(x: int, n: int): int = if n <= 0 then 1 else x * power(x, n - 1) in power(3, 10) end module Bindings imports Control-Flow Identifiers Types Functions Variables sorts Declarations context-free syntax Exp = "let" Dec* "in" {Exp ";"}* "end" Declarations = "declarations" Dec*
  • 76.
  • 77.
    Context-Free Syntax vsLexical Syntax !77 let function power(x: int, n: int): int = if n <= 0 then 1 else x * power(x, n - 1) in power(3, 10) end Mod( Let( [ FunDecs( [ FunDec( "power" , [FArg("x", Tid("int")), FArg("n", Tid("int"))] , Tid("int") , If( Leq(Var("n"), Int("0")) , Int("1") , Times( Var("x") , Call( "power" , [Var("x"), Minus(Var("n"), Int("1"))] ) ) ) ) ] ) ] , [Call("power", [Int("3"), Int("10")])] ) ) phrase structure lexeme / token separated by layout structure not relevant not separated by layout
  • 78.
    Character Classes !78 lexical syntax// character codes Character = [65] Range = [65-90] Union = [65-90] / [97-122] Difference = [0-127] / [1013] Union = [0-911-1214-255] Character class represents choice from a set of characters
  • 79.
    Sugar for CharacterClasses !79 lexical syntax // sugar CharSugar = [a] = [97] CharClass = [abcdefghijklmnopqrstuvwxyz] = [97-122] SugarRange = [a-z] = [97-122] Union = [a-z] / [A-Z] / [0-9] = [48-5765-9097-122] RangeCombi = [a-z0-9_] = [48-579597-122] Complement = ~[nr] = [0-255] / [1013] = [0-911-1214-255]
  • 80.
    Literals are Sequencesof Characters !80 lexical syntax // literals Literal = "then" // case sensitive sequence of characters CaseInsensitive = 'then' // case insensitive sequence of characters syntax "then" = [116] [104] [101] [110] 'then' = [84116] [72104] [69101] [78110] syntax "then" = [t] [h] [e] [n] 'then' = [Tt] [Hh] [Ee] [Nn]
  • 81.
    Identifiers !81 lexical syntax Id =[a-zA-Z] [a-zA-Z0-9_]* Id = a Id = B Id = cD Id = xyz10 Id = internal_ Id = CamelCase Id = lower_case Id = ...
  • 82.
    Lexical Ambiguity: LongestMatch !82 lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} // curried function call ab Mod( amb( [Var("ab"), Call(Var("a"), Var("b"))] ) )
  • 83.
    abc Lexical Ambiguity: LongestMatch !83 Mod( amb( [ amb( [ Var("abc") , Call( amb( [Var("ab"), Call(Var("a"), Var("b"))] ) , Var("c") ) ] ) , Call(Var("a"), Var("bc")) ] ) ) lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} // curried function call
  • 84.
    abc def ghi LexicalRestriction => Longest Match !84 lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* lexical restrictions Id -/- [a-zA-Z0-9_] // longest match for identifiers context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} // curried function call Call(Call(Var("abc"), Var("def")), Var("ghi")) Lexical restriction: phrase cannot be followed by character in character class
  • 85.
    if def thenghi Lexical Ambiguity: Keywords overlap with Identifiers !85 lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* lexical restrictions Id -/- [a-zA-Z0-9_] // longest match for identifiers context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} Exp.IfThen = "if" Exp "then" Exp amb( [ Mod( Call( Call(Call(Var("if"), Var("def")), Var("then")) , Var("ghi") ) ) , Mod(IfThen(Var("def"), Var("ghi"))) ] )
  • 86.
    ifdef then ghi LexicalAmbiguity: Keywords overlap with Identifiers !86 amb( [ Mod( Call(Call(Var("ifdef"), Var("then")), Var("ghi")) ) , Mod(IfThen(Var("def"), Var("ghi"))) ] ) lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* lexical restrictions Id -/- [a-zA-Z0-9_] // longest match for identifiers context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} Exp.IfThen = "if" Exp "then" Exp
  • 87.
    Reject Productions =>Reserved Words !87 lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* Id = "if" {reject} Id = "then" {reject} lexical restrictions Id -/- [a-zA-Z0-9_] // longest match for identifiers "if" "then" -/- [a-zA-Z0-9_] context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} Exp.IfThen = "if" Exp "then" Exp if def then ghi IfThen(Var("def"), Var("ghi"))
  • 88.
    ifdef then ghi RejectProductions => Reserved Words !88 parse error lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* Id = "if" {reject} Id = "then" {reject} lexical restrictions Id -/- [a-zA-Z0-9_] // longest match for identifiers "if" "then" -/- [a-zA-Z0-9_] context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} Exp.IfThen = "if" Exp "then" Exp
  • 89.
    Real Numbers !89 lexical syntax RealConst.RealConstNoExp= IntConst "." IntConst RealConst.RealConst = IntConst "." IntConst "e" Sign IntConst Sign = "+" Sign = "-"
  • 90.
  • 91.
    Ambiguity !91 t1 != t2/ format(t1) = format(t2) Add VarRef VarRef “y”“x” Mul Int “3” Add VarRef VarRef “y” “x” Mul Int “3” 3 * x + y
  • 92.
  • 93.
    Disambiguation Filters !93 parse filter DisambiguationFilters [Klint & Visser; 1994], [Van den Brand, Scheerder, Vinju, Visser; CC 2002]
  • 94.
    Associativity and Priority !94 context-freesyntax Expr.Int = INT Expr.Add = "Expr" + "Expr" {left} Expr.Mul = "Expr" * "Expr" {left} context-free priorities Expr.Mul > Expr.Add Recent improvement: safe disambiguation of operator precedence [SLE13, Onward15, Programming18] Add VarRef VarRef “y”“x” Mul Int “3” Add VarRef VarRef “y” “x” Mul Int “3” 3 * x + y
  • 95.
    context-free syntax Exp.Add =Exp "+" Exp Exp.Sub = Exp "-" Exp Exp.Mul = Exp "*" Exp Exp.Div = Exp "/" Exp Exp.Eq = Exp "=" Exp Exp.Neq = Exp "<>" Exp Exp.Gt = Exp ">" Exp Exp.Lt = Exp "<" Exp Exp.Gte = Exp ">=" Exp Exp.Lte = Exp "<=" Exp Exp.And = Exp "&" Exp Exp.Or = Exp "|" Exp Ambiguous Operator Syntax !95
  • 96.
    Disambiguation by Associativityand Priority Rules !96 context-free syntax Exp.Add = Exp "+" Exp {left} Exp.Sub = Exp "-" Exp {left} Exp.Mul = Exp "*" Exp {left} Exp.Div = Exp "/" Exp {left} Exp.Eq = Exp "=" Exp {non-assoc} Exp.Neq = Exp "<>" Exp {non-assoc} Exp.Gt = Exp ">" Exp {non-assoc} Exp.Lt = Exp "<" Exp {non-assoc} Exp.Gte = Exp ">=" Exp {non-assoc} Exp.Lte = Exp "<=" Exp {non-assoc} Exp.And = Exp "&" Exp {left} Exp.Or = Exp "|" Exp {left} context-free priorities { left: Exp.Mul Exp.Div } > { left: Exp.Add Exp.Sub } > { non-assoc: Exp.Eq Exp.Neq Exp.Gt Exp.Lt Exp.Gte Exp.Lte } > Exp.And > Exp.Or
  • 97.
    Disambiguation by Associativityand Priority Rules !97 context-free syntax Exp.Add = Exp "+" Exp {left} Exp.Sub = Exp "-" Exp {left} Exp.Mul = Exp "*" Exp {left} Exp.Div = Exp "/" Exp {left} Exp.Eq = Exp "=" Exp {non-assoc} Exp.Neq = Exp "<>" Exp {non-assoc} Exp.Gt = Exp ">" Exp {non-assoc} Exp.Lt = Exp "<" Exp {non-assoc} Exp.Gte = Exp ">=" Exp {non-assoc} Exp.Lte = Exp "<=" Exp {non-assoc} Exp.And = Exp "&" Exp {left} Exp.Or = Exp "|" Exp {left} context-free priorities { left: Exp.Mul Exp.Div } > { left: Exp.Add Exp.Sub } > { non-assoc: Exp.Eq Exp.Neq Exp.Gt Exp.Lt Exp.Gte Exp.Lte } > Exp.And > Exp.Or 1 + 3 <= 4 - 35 & 12 > 16 And( Leq( Plus(Int("1"), Int("3")) , Minus(Int("4"), Int("35")) ) , Gt(Int("12"), Int("16")) )
  • 98.
    Testing Syntax Definitions withSPT 98 Are you defining the right language?
  • 99.
    Test Suites !99 module example-suite languageTiger start symbol Start test name [[...]] parse succeeds test another name [[...]] parse fails SPT: SPoofax Testing language
  • 100.
    Success is Failure,Failure is Success, … !100 module success-failure language Tiger test this test succeeds [[ 1 + 3 * 4 ]] parse succeeds test this test fails [[ 1 + 3 * 4 ]] parse fails test this test fails [[ 1 + 3 * ]] parse succeeds test this test succeeds [[ 1 + 3 * ]] parse fails
  • 101.
    Test Cases: Valid !101 Language validprogram test valid program [[...]] parse succeeds
  • 102.
    Test Cases: Invalid !102 testinvalid program [[...]] parse fails invalid program Language
  • 103.
    Test Cases !103 module syntax/identifiers languageTiger start symbol Id test single lower case [[x]] parse succeeds test single upper case [[X]] parse succeeds test single digit [[1]] parse fails test single lc digit [[x1]] parse succeeds test single digit lc [[1x]] parse fails test single uc digit [[X1]] parse succeeds test single digit uc [[1X]] parse fails test double digit [[11]] parse fails
  • 104.
  • 105.
    Testing Structure !105 module structure languageTiger test add times [[ 21 + 14 + 7 ]] parse to Mod(Plus(Int(“21"),Times(Int("14"),Int("7")))) test times add [[ 3 * 7 + 21 ]] parse to Mod(Plus(Times(Int(“3"),Int("7")),Int("21"))) test if [[ if x then 3 else 4 ]] parse to Mod(If(Var("x"),Int("3"),Int("4"))) The fragment did not parse to the expected ATerm. Parse result was: Mod(Plus(Plus(Int("21"),Int("14")),Int("7"))) Expected result was: Mod(Plus(Int(21), Times(Int(14), Int(7))))
  • 106.
    Testing Ambiguity !106 Note :this does not actually work in Tiger, since () is a sequencing construct; but works fine in Mini-Java module precedence language Tiger start symbol Exp test parentheses [[(42)]] parse to [[42]] test left-associative addition [[21 + 14 + 7]] parse to [[(21 + 14) + 7]] test precedence multiplication [[3 * 7 + 21]] parse to [[(3 * 7) + 21]]
  • 107.
    Testing Ambiguity !107 module precedence languageTiger start symbol Exp test plus/times priority [[ x + 4 * 5 ]] parse to Plus(Var("x"), Times(Int("4"), Int("5"))) test plus/times sequence [[ (x + 4) * 5 ]] parse to Times( Seq([Plus(Var("x"), Int("4"))]) , Int("5") )
  • 108.
  • 110.
    Developing syntax definition -Define syntax of language in multiple modules - Syntax checking, colouring - Checking for undefined non-terminals Testing syntax definition - Write example programs in editor for language under def - Inspect abstract syntax terms ‣ Spoofax > Syntax > Show Parsed AST - Write SPT test for success and failure cases ‣ Updated after build of syntax definition !110 Syntax Engineering in Spoofax
  • 111.
  • 112.
    Syntax definition - Definestructure (decomposition) of programs - Define concrete syntax: notation - Define abstract syntax: constructors Using syntax definitions (next) - Parsing: converting text to abstract syntax term - Pretty-printing: convert abstract syntax term to text - Editor services: syntax highlighting, syntax checking, completion !112 Declarative Syntax Definition
  • 113.
  • 114.
    Except where otherwisenoted, this work is licensed under