Compiler Construction (Introduction)
Lecture 1
By Shahid Akbar, MS (Computer Science)

Goals of This Course
• Know how to build a compiler for a (simplified) programming language
• Know how to use compiler construction tools, such as generators for scanners (lexical analyzers) and parsers (syntax analyzers)
• Be able to write LL(1), LR(1), and LALR(1) grammars (for new languages)
• Be familiar with compiler analysis and optimization techniques
8-Apr-17

Grading Policy
• Mid Term: 20
• Final Term: 60
• Quizzes / Assignments: 20
Total: 100
Note: 75% attendance is compulsory to sit the exams.

Textbook Compilers: Principles, Techniques, and Tools, 2/E. Alfred V. Aho, Columbia University Monica S. Lam, Stanford University Ravi Sethi, Avaya Labs Jeffrey D. Ullman, Stanford University Dragon
What is a Compiler?
• The world as we know it depends on programming languages: C, C++, Java, web programming languages, and so on.
• All the software running on all the computers was written in some programming language.
• Before a program can be run, it must first be translated into a form in which it can be executed by a computer. The software systems that do this translation are called compilers.

Different Compilers
• Visual C / Visual C++ (Microsoft products)
• GCC, the compiler used by Dev-C++, where Dev-C++ is an integrated development environment (IDE)
• javac, the Java compiler
• FORTRAN compilers
• Pascal compilers
• Visual Basic (Microsoft)

Why Compilation?
Compilation is the conversion of a high-level language into low-level object code and machine code. A high-level language is understandable by humans and contains English words and phrases, but a computer can only understand machine language (a low-level language). That is why we need compilation: it bridges the gap between human and computer.

Computer Languages-1
• Machine language (low-level language)
  • The only language a computer directly understands
  • Defined by the hardware design
    • Machine-dependent
  • Generally consists of strings of numbers
    • Ultimately 0s and 1s
  • Instructs computers to perform elementary operations
    • One at a time
  • Cumbersome for humans
  • Example:
      +1300042774
      +1400593419
      +1200274027

Computer Languages-2
• Assembly language
  • English-like abbreviations representing elementary computer operations
  • Clearer to humans
  • Incomprehensible to computers until translated
    • Translator programs (assemblers) convert it to machine language
  • Example:
      LOAD BASEPAY
      ADD OVERPAY
      STORE GROSSPAY

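As a rough illustration of what an assembler does with the three instructions above, the sketch below translates each mnemonic into a number and then runs the numeric program on a toy one-accumulator machine. The opcode numbers, the memory addresses, and the names `assemble` and `run` are assumptions invented for this example; they do not correspond to any real machine or to the numeric example on the previous slide.

```python
# Toy assembler for a hypothetical one-accumulator machine.
# Opcode numbers and addresses are invented for illustration only.
OPCODES = {"LOAD": 1, "ADD": 2, "STORE": 3}


def assemble(source, symbols):
    """Translate 'MNEMONIC OPERAND' lines into (opcode, address) pairs."""
    program = []
    for line in source.strip().splitlines():
        mnemonic, operand = line.split()
        program.append((OPCODES[mnemonic], symbols[operand]))
    return program


def run(program, memory):
    """Execute the numeric program one elementary operation at a time."""
    acc = 0
    for opcode, address in program:
        if opcode == 1:      # LOAD: copy a memory cell into the accumulator
            acc = memory[address]
        elif opcode == 2:    # ADD: add a memory cell to the accumulator
            acc += memory[address]
        elif opcode == 3:    # STORE: copy the accumulator into a memory cell
            memory[address] = acc
    return memory


symbols = {"BASEPAY": 0, "OVERPAY": 1, "GROSSPAY": 2}
machine_code = assemble("LOAD BASEPAY\nADD OVERPAY\nSTORE GROSSPAY", symbols)
memory = run(machine_code, [1000, 250, 0])
print(machine_code)  # [(1, 0), (2, 1), (3, 2)]
print(memory[2])     # 1250
```

The symbol names disappear after assembly: the machine only ever sees numbers, which is why machine language is cumbersome for humans.
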
Computer Languages-3
• High-level languages
  • Similar to everyday English; use common mathematical notations
  • Single statements accomplish substantial tasks
    • Assembly language requires many instructions to accomplish simple tasks
  • Translator programs (compilers)
    • Convert to machine language
  • Interpreter programs
    • Directly execute high-level language programs
  • Example:
      grossPay = basePay + overTimePay

Compiler vs Assembler
A compiler is a program that converts a high-level language, typically into assembly language; similarly, an assembler is a program that converts assembly language into machine language.

Assembler vs Compiler
An assembler is used to translate a program written in assembly language into machine language. An assembler performs the translation in a similar way to a compiler, but an assembler is a translator for a low-level programming language, while a compiler is a translator for high-level languages.

Compilers and Interpreters
"Compilation"
• Translation of a program written in a source language into a semantically equivalent program written in a target language.
• An important role of the compiler is to report any errors in the source program; these errors are detected during the translation process.
Compilation: Source Program (HLL) -> Compiler -> Target Program (all error checking happens here; the compiler emits error messages)
Execution: Input -> Target Program -> Output

Compilers and Interpreters (cont'd)
"Interpretation"
• An interpreter is a language processor that performs the operations implied by the source program on input supplied by users.
Execution: Source Program + Input -> Interpreter -> Output (and error messages)

Compiler vs Interpreter
• In an interpreter, translation and execution are combined: each statement is first translated and then executed.
• With a compiler, a program is compiled once and can then be executed again and again, as often as needed.
• With an interpreter, a program (or statement) is translated and executed every time it runs, so the translation cost depends on the number of executions.
• Example: if you need to execute a program 100 times, an interpreter performs 100 translations and 100 executions, whereas a compiler performs 1 translation and 100 executions.
• Note:
  1) Compiled programs generally run faster than interpreted ones (why?)
  2) Error diagnosis in an interpreter is often better than in a compiler

Interpreter vs Compiler (comparison)
Interpreter | Compiler
Translates the program one statement at a time. | Scans the entire program and translates it as a whole into machine code.
Takes less time to analyze the source code, but the overall execution time is slower. | Takes more time to analyze the source code, but the overall execution time is comparatively faster.
Continues translating the program until the first error is met, at which point it stops; hence debugging is easy. | Generates error messages only after scanning the whole program; hence debugging is comparatively hard.
Programming languages such as Python and Ruby use interpreters. | Programming languages such as C and C++ use compilers.

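The "translate once, run many times" versus "translate on every run" distinction can be sketched with Python's built-in `compile` and `eval`. This is only an analogy under stated assumptions: Python's `compile` produces bytecode rather than machine code, and the helper names `interpret` and `compile_once` are made up for this example.

```python
import ast

SOURCE = "basePay + overTimePay"


def interpret(source, env):
    """Interpreter-style: the source text is analyzed on every execution."""
    tree = ast.parse(source, mode="eval")       # analysis repeated each run
    code = compile(tree, "<expr>", "eval")
    return eval(code, {"__builtins__": {}}, env)


def compile_once(source):
    """Compiler-style: analyze once, return something that can run many times."""
    code = compile(source, "<expr>", "eval")    # analysis happens here, once
    return lambda env: eval(code, {"__builtins__": {}}, env)


env = {"basePay": 1000, "overTimePay": 250}

run = compile_once(SOURCE)                                   # 1 translation
results_compiled = [run(env) for _ in range(100)]            # 100 executions

results_interpreted = [interpret(SOURCE, env) for _ in range(100)]  # 100 translations
```

Both lists hold the same results; the difference is that the second approach pays the analysis cost 100 times, which is the reason behind note 1 on the previous slide.
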
Compiler History
• 1952: First compiler (linker/loader) written by Grace Hopper for the A-0 programming language
• 1957: First complete compiler, for FORTRAN, by John Backus and team
• 1960: COBOL compilers for multiple architectures
• 1962: First self-hosting compiler, for LISP

Compiler Construction Requirements
• Programming languages (parameter passing, variable scoping, memory allocation, etc.)
• Theory (automata, context-free languages, etc.)
• Algorithms and data structures (hash tables, graph algorithms, dynamic programming, etc.)
• Computer architecture (assembly programming)
• Software engineering

Compiler Qualities
• Generates correct (accurate) code
• The compiler runs fast
• Compile time is proportional to program size
• Supports separate compilation
• Checks syntax errors efficiently
• Works well with the debugger
• Gives good diagnostics for flow anomalies
• Supports cross-language calls (interface compatibility)
• Performs consistent, predictable optimization

A General Language Processing System
HLL -> Preprocessor -> Pure HLL -> Compiler -> Assembly Language -> Assembler -> Relocatable Machine Code -> Linker/Loader -> Target Machine Code
• HLL: contains preprocessor directives such as #include < ?? . h > and #define (constants)
• Preprocessor: file inclusion and macro expansion
• Pure HLL: no preprocessor directives
• Assembly language, e.g. for b := a + 2:
    MOV a, R1
    ADD #2, R1
    MOV R1, b
• Relocatable machine code:
    0001 01 00 00000000
    0011 01 10 00000010
    0010 01 00 00000100
• Linker/Loader: loads the data and instructions (relocatable machine code) to the proper locations

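The preprocessor stage (file inclusion and macro expansion) can be sketched as below. This is a deliberately minimal toy, not how a real C preprocessor works: it handles only `#include <name>` and object-like `#define NAME value` lines, and the function name `preprocess`, the header name `config.h`, and its contents are assumptions made for the example.

```python
import re


def preprocess(source, headers):
    """Toy preprocessor: file inclusion plus object-like macro expansion."""
    macros = {}
    output = []
    pending = source.splitlines()
    while pending:
        line = pending.pop(0)
        stripped = line.strip()
        if stripped.startswith("#include"):
            # File inclusion: splice the header's lines in place of the directive.
            name = re.search(r"<\s*(.+?)\s*>", stripped).group(1)
            pending = headers[name].splitlines() + pending
        elif stripped.startswith("#define"):
            # Macro definition: remember NAME -> replacement text.
            _, name, value = stripped.split(None, 2)
            macros[name] = value
        else:
            # Macro expansion: replace whole-word occurrences of each macro name.
            for name, value in macros.items():
                line = re.sub(r"\b%s\b" % re.escape(name), lambda m: value, line)
            output.append(line)
    return "\n".join(output)


headers = {"config.h": "int max_items;"}   # hypothetical header contents
source = "#include <config.h>\n#define STEP 2\nb = a + STEP;"
pure_hll = preprocess(source, headers)
print(pure_hll)
```

The output contains no directives, which is what the slide calls "pure HLL": the included declaration followed by `b = a + 2;`.
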
Phases of a Compiler
Character Stream (HLL)
  -> Lexical Analyzer -> Token Stream
  -> Syntax Analyzer -> Syntax Tree
  -> Semantic Analyzer -> Syntax Tree (semantically verified, meaningful)
  -> Intermediate Code Generator -> 3-Address Code
  -> Code Optimizer (reduces the size of the program, i.e. the number of lines)
  -> Target Code Generator -> Assembly Code
The Symbol Table and the Error Handler interact with all phases.

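The first phase, lexical analysis, can be sketched as a small regular-expression scanner that turns a character stream into a token stream and records identifiers in a symbol table. The token class names (`NUMBER`, `ID`, `OP`, `SEMI`) and the function `tokenize` are assumptions chosen for this illustration; a generated scanner would be driven by a similar specification.

```python
import re

# Token classes for a tiny hypothetical language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[=+\-*/]"),
    ("SEMI",   r";"),
    ("SKIP",   r"\s+"),
]
TOKEN_RE = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))


def tokenize(text):
    """Return (tokens, symbol_table) for the given character stream."""
    tokens, symbol_table = [], {}
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            # Lexical error: no token class matches at this position.
            raise SyntaxError("unexpected character %r at position %d" % (text[pos], pos))
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue                      # whitespace produces no token
        if kind == "ID":
            # Enter each identifier into the symbol table once.
            symbol_table.setdefault(lexeme, len(symbol_table))
        tokens.append((kind, lexeme))
    return tokens, symbol_table


tokens, symbols = tokenize("A = B + C;")
print(tokens)
print(symbols)
```

For the statement `A = B + C;` this yields six tokens (three identifiers, two operators, one semicolon) and a symbol table with entries for A, B, and C.
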
Phase / Output / Sample
• Programmer
    Output: source string
    Sample: A=B+C;
• Scanner (performs lexical analysis)
    Output: token string, and symbol table for identifiers
    Sample: ‘A’, ‘=’, ‘B’, ‘+’, ‘C’, ‘;’
• Parser (performs syntax analysis based on the grammar of the programming language)
    Output: parse tree or abstract syntax tree
    Sample:
          ;
          |
          =
         / \
        A   +
           / \
          B   C
• Semantic analyzer (type checking, etc.)
    Output: parse tree or abstract syntax tree
• Intermediate code generator
    Output: three-address code
    Sample:
        int2fp B t1
        + t1 C t2
        := t2 A
• Optimizer
    Output: three-address code
    Sample:
        int2fp B t1
        + t1 #2.3 A
• Code generator
    Output: assembly code
    Sample:
        MOVF #2.3,r1
        ADDF2 r1,r2
        MOVF r2,A
• Peephole optimizer
    Output: assembly code
    Sample:
        ADDF2 #2.3,r2
        MOVF r2,A

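The intermediate code generation step can be sketched as a walk over a syntax tree that emits one three-address instruction per operator. As a shortcut, this sketch borrows Python's own `ast` module as the parser, so it only accepts statements that are also valid Python; it has no type information and therefore emits no `int2fp` conversions. The function name `three_address_code` and the tuple layout `(op, arg1, arg2, result)` are assumptions made for the example.

```python
import ast

OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def three_address_code(statement):
    """Emit (op, arg1, arg2, result) tuples for one assignment statement."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return "t%d" % counter

    def gen(node):
        # Leaves: variables and constants need no instruction of their own.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return "#%r" % node.value
        # Interior node: evaluate both operands, then store into a fresh temporary.
        if isinstance(node, ast.BinOp):
            left = gen(node.left)
            right = gen(node.right)
            temp = new_temp()
            code.append((OPS[type(node.op)], left, right, temp))
            return temp
        raise NotImplementedError(type(node).__name__)

    assign = ast.parse(statement).body[0]
    value = gen(assign.value)
    code.append((":=", value, None, assign.targets[0].id))
    return code


simple = three_address_code("A = B + C")
nested = three_address_code("A = B + C * 2")
for instruction in nested:
    print(instruction)
```

Each temporary (`t1`, `t2`, ...) holds the value of one subtree, so nested expressions are flattened into a sequence of instructions with at most three addresses each.
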
