Programming Language

A programming language is a vocabulary and set of grammatical rules for instructing a computer or computing device to perform specific tasks. The term programming language usually refers to high-level languages, such as BASIC, C, C++, COBOL, Java, FORTRAN, Ada, and Pascal. Each programming language has a unique set of keywords (words that it understands) and a special syntax for organizing program instructions.

High-Level Programming Languages

High-level programming languages, while simple compared to human languages, are more complex than the languages the computer actually understands, called machine languages. Each different type of CPU has its own unique machine language.

Lying between machine languages and high-level languages are languages called assembly languages. Assembly languages are similar to machine languages, but they are much easier to program in because they allow a programmer to substitute names for numbers. Machine languages consist of numbers only.

Lying above high-level languages are languages called fourth-generation languages (usually abbreviated 4GL). 4GLs are far removed from machine languages and represent the class of computer languages closest to human languages.

Converting to Machine Language

Regardless of what language you use, you eventually need to convert your program into machine language so that the computer can understand it. There are two ways to do this:

• Compile the program.
• Interpret the program.

The question of which language is best is one that consumes a lot of time and energy among computer professionals. Every language has its strengths and weaknesses. For example, FORTRAN is a particularly good language for processing numerical data, but it does not lend itself very well to organizing large programs. Pascal is very good for writing well-structured and readable programs, but it is not as flexible as the C programming language.
C++ embodies powerful object-oriented features, but it is complex and difficult to learn.

The Top Programming Languages?

According to IEEE Spectrum's interactive ranking, Python was the top programming language of 2017, followed by C, Java, and C++. Of course, the choice of which language to use depends on the type of computer the program is to run on, what sort of program it is, and the expertise of the programmer.

Interpreter Versus Compiler

An interpreter translates high-level instructions into an intermediate form, which it then executes. In contrast, a compiler translates high-level instructions directly into machine language. Compiled programs generally run faster than interpreted programs. The advantage of an interpreter, however, is that it does not need to go through the compilation stage during which machine instructions are generated. This process can be time-consuming if the program is long. The interpreter, on the other hand, can immediately execute high-level programs. For this reason, interpreters are sometimes used during the development of a program, when a programmer wants to add small sections at a time and test them quickly. In addition, interpreters are often used in education because they allow students to program interactively.

Both interpreters and compilers are available for most high-level languages. However, BASIC and LISP are especially designed to be executed by an interpreter. In addition, page description languages, such as PostScript, use an interpreter. Every PostScript printer, for example, has a built-in interpreter that executes PostScript instructions.

What Is a Compiler?

A compiler is a program that translates source code into object code that a specific central processing unit (CPU) can execute. The act of translating source code into object code is known as compilation. The term usually refers to translating source code from a high-level programming language (such as C++) into a low-level language (such as machine code) to create an executable program. Conversely, when a low-level language is converted into a high-level language, the process is called decompilation.

Phases of a compiler

A compiler executes its processes in phases to promote efficient design and correct transformations of source input to target output. The phases are as follows:

1. Lexical Analyzer
Also called a scanner, the lexical analyzer converts the sequence of characters that appear in the source code into a series of tokens. These tokens are defined by regular expressions which are understood by the lexical analyzer. It also reports lexical errors and discards comments and whitespace.

2. Syntax Analyzer
The syntax analyzer takes the tokens one by one and uses a context-free grammar to construct the parse tree, which shows whether the token sequence conforms to the grammar. A syntax error is detected if the input is not in accordance with the grammar.

3. Semantic Analyzer
The semantic analyzer verifies the parse tree constructed by the syntax analyzer. It also does type checking, label checking, and flow control checking.

4. Intermediate Code Generator
The intermediate code generator produces intermediate code, a representation that lies between the source language and machine language. Intermediate code is converted into machine language using the last two phases, which are platform dependent.

5. Code Optimizer
The code optimizer transforms the code so that it consumes fewer resources and runs faster. The meaning of the code that is being transformed is not altered.

6. Target Code Generator
This is the final stage of compilation. The target code generator writes code that a machine can understand and also performs register allocation and instruction selection. The output is dependent on the type of assembler. The optimized code is then converted into machine code, forming the input to the linker and loader.

Types of compilers

There are many types of compilers, such as:

• Cross compiler: A compiler capable of creating code for a platform other than the one on which it is running. The compiled program runs on a computer that has a different operating system or CPU from the one the compiler runs on.
• Source-to-source compiler: Also known as a transcompiler, it translates source code written in one programming language into source code of another programming language.
• Just-in-time (JIT) compiler: A compiler that defers compilation until runtime. JIT compilers are used by implementations of languages such as JavaScript and Python (for example, PyPy), and they generally run inside an interpreter.

Language Types

Machine and assembly languages

A machine language consists of the numeric codes for the operations that a particular computer can execute directly. The codes are strings of 0s and 1s, or binary digits (“bits”), which are frequently converted both from and to hexadecimal (base 16) for human viewing and modification. Machine language instructions typically use some bits to represent operations, such as addition, and some to represent operands, or perhaps the location of the next instruction. Machine language is difficult to read and write, since it does not resemble conventional mathematical notation or human language, and its codes vary from computer to computer.

Assembly language is one level above machine language. It uses short mnemonic codes for instructions and allows the programmer to introduce names for blocks of memory that hold data.
One might thus write “add pay, total” instead of “0110101100101000” for an instruction that adds two numbers. Assembly language is designed to be easily translated into machine language. Although blocks of data may be referred to by name instead of by their machine addresses, assembly language does not provide more sophisticated means of organizing complex information. Like machine language, assembly language requires detailed knowledge of internal computer architecture. It is useful when such details are important, as in programming a computer to interact with peripheral devices (printers, scanners, storage devices, and so forth).

Algorithmic languages

Algorithmic languages are designed to express mathematical or symbolic computations. They can express algebraic operations in notation similar to mathematics and allow the use of subprograms that package commonly used operations for reuse. They were the first high-level languages.
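
As a rough illustration of the two ideas in the paragraph on algorithmic languages (algebraic notation and reusable subprograms), here is a minimal sketch in Python. Python is used only for convenience and is not one of the early algorithmic languages; the function name `hypotenuse` and the sample values are assumptions made up for this example.

```python
import math


def hypotenuse(a, b):
    """Subprogram that packages the formula c = sqrt(a^2 + b^2) for reuse."""
    # The expression reads much like the mathematical notation it encodes.
    return math.sqrt(a**2 + b**2)


# The same subprogram is reused with different arguments.
first = hypotenuse(3, 4)
second = hypotenuse(5, 12)
print(first, second)
```

The formula is written once and called wherever it is needed, which is the kind of reuse that subprograms provide.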
Object-oriented languages

Object-oriented languages help to manage complexity in large programs. Objects package data and the operations on them so that only the operations are publicly accessible and internal details of the data structures are hidden. This information hiding made large-scale programming easier by allowing a programmer to think about each part of the program in isolation. In addition, objects may be derived from more general ones, “inheriting” their capabilities. Such an object hierarchy made it possible to define specialized objects without repeating all that is in the more general ones.

Object-oriented programming began with the Simula language (1967), which added information hiding to ALGOL. Another influential object-oriented language was Smalltalk (1980), in which a program was a set of objects that interacted by sending messages to one another.

Document formatting languages

Document formatting languages specify the organization of printed text and graphics. They fall into several classes: text formatting notation that can serve the same functions as a word processing program, page description languages that are interpreted by a printing device, and, most generally, markup languages that describe the intended function of portions of a document.

Scripting languages

Scripting languages are sometimes called little languages. They are intended to solve relatively small programming problems that do not require the overhead of data declarations and other features needed to make large programs manageable. Scripting languages are used for writing operating system utilities, for special-purpose file-manipulation programs, and, because they are easy to learn, sometimes for considerably larger programs.

Declarative languages

Declarative languages, also called nonprocedural or very high level, are programming languages in which (ideally) a program specifies what is to be done rather than how to do it.
In such languages there is less difference between the specification of a program and its implementation than in the procedural languages described so far. The two common kinds of declarative languages are logic and functional languages.

Business-oriented languages

COBOL: Common Business Oriented Language has been heavily used by businesses since its inception in 1959. A committee of computer manufacturers and users and U.S. government organizations established CODASYL (Committee on Data Systems Languages) to develop and oversee the language standard in order to ensure its portability across diverse systems. COBOL uses an English-like notation, which was novel when it was introduced. Business computations organize and manipulate large quantities of data, and COBOL introduced the record data structure for such tasks. A record clusters heterogeneous data (such as a name, an ID number, an age, and an address) into a single unit. This contrasts with scientific languages, in which homogeneous arrays of numbers are common. Records are an important example of “chunking” data into a single object, and they appear in nearly all modern languages.

SQL: Structured Query Language is a language for specifying the organization of databases (collections of records). Databases organized with SQL are called relational, because SQL provides the ability to query a database for information that falls in a given relation. For example, a query might be “find all records with both last name Smith and city New York.” Commercial database programs commonly use an SQL-like language for their queries.
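
The example query above ("find all records with both last name Smith and city New York") can be sketched with Python's standard `sqlite3` module and an in-memory database. The table name `people`, its columns, the sample rows, and the helper `find_people` are all hypothetical, chosen only for this illustration; a real database would have its own schema.

```python
import sqlite3

# In-memory database with a hypothetical "people" table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (first_name TEXT, last_name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Ann", "Smith", "New York"),
        ("Bob", "Smith", "Chicago"),
        ("Cara", "Jones", "New York"),
    ],
)


def find_people(connection, last_name, city):
    """Return every record that matches both the last name and the city."""
    cursor = connection.execute(
        "SELECT first_name, last_name, city FROM people "
        "WHERE last_name = ? AND city = ? ORDER BY first_name",
        (last_name, city),
    )
    return cursor.fetchall()


matches = find_people(conn, "Smith", "New York")
print(matches)
```

The `SELECT` statement states which records are wanted, not how to search for them, which is why SQL is usually counted among the declarative languages described earlier.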
