RDBMS - Unit III, Chapter 20: Database System Architectures. Prepared by Dr. S. Murugan, Associate Professor, Department of Computer Science, Alagappa Government Arts College, Karaikudi (affiliated to Alagappa University). Mail ID: muruganjit@gmail.com. Reference book: Database System Concepts by Abraham Silberschatz, Henry F. Korth, and S. Sudarshan.
Database System Architectures – Client-Server Database System ➢ Networking of computers allows some tasks to be executed on a server system and others on client systems. ➢ This division of work has led to client-server database systems.
Database System Architectures – Parallel Processing System ➢ Parallel processing is a method of breaking a program into tasks and running those tasks simultaneously on multiple processors, thereby reducing processing time. ➢ Parallel processing may be accomplished with a single computer that has two or more processors, or with several computers connected by a network. ➢ Parallel processing is also called parallel computing.
Database System Architectures – Distributed Data Processing System ➢ Distributed data processing is a computer-networking method in which multiple computers across different locations share computer-processing capability.
20.1 Centralized and Client-Server Architectures ➢ A modern, general-purpose computer system consists of one to a few CPUs and a number of device controllers that are connected through a common bus that provides access to shared memory, as shown in Figure 20.1.
20.1 Centralized and Client-Server Architectures ➢ A computer system may be single-user or multiuser. ➢ A typical single-user system is a desktop unit with only one CPU and one or two hard disks, used by one person at a time. ➢ A typical multiuser system, on the other hand, has more disks and more memory, may have multiple CPUs, and runs a multiuser operating system. ➢ It serves a large number of users who are connected to the system via terminals.
20.1.2 Client-Server Systems ➢ Centralized systems today act as server systems that satisfy requests generated by client systems. Figure 20.2 shows the general structure of a client-server system.
20.1.2 Client-Server Systems ➢ The functionality of a database system can be broadly divided into two parts, the front end and the back end, as shown in Figure 20.3. ➢ The front end consists of tools such as the SQL user interface, forms interfaces, and report-generation tools. ➢ The back end manages access structures and query evaluation. ➢ The interface between the front end and the back end is through SQL, or through an application program.
20.2 Server System Architecture ➢ Server systems can be broadly categorized as transaction servers and data servers. ➢ Transaction-server systems, also called query-server systems, provide an interface to which clients send requests to perform an action; the server executes the action and sends the results back to the client. ➢ Data-server systems allow clients to interact with the servers by making requests to read or update data.
20.2.1 Transaction-Server Process Structure ➢ A transaction-server system consists of multiple processes accessing data in shared memory, as in Figure 20.4. The processes that form part of the database system include the following. ➢ Server processes: These receive user queries (transactions), execute them, and send the results back to the client. ➢ Lock manager process: This implements lock-manager functionality, which includes lock grant, lock release, and deadlock detection.
20.2.1 Transaction-Server Process Structure ➢ Database writer process: One or more processes that write modified buffer blocks back to disk on a continuous basis. ➢ Log writer process: This outputs log records from the log record buffer to stable storage. ➢ Process monitor process: This monitors the other processes and, if any of them fails, takes recovery actions for that process.
20.2 Server System Architecture ➢ Data-server systems are used in local-area networks, where there is a high-speed connection between the clients and the server, the client machines are comparable in processing power to the server machine, and the tasks to be executed are computation intensive. ➢ In such an environment, it makes sense to ship data to client machines, to perform all processing at the client machine (which may take a while), and then to ship the data back to the server machine. ➢ Note that this architecture requires full back-end functionality at the clients. ➢ Data-server architectures have been particularly popular in object-oriented database systems.
20.3 Parallel Systems ➢ Parallel systems improve processing and I/O speeds by using multiple CPUs and disks in parallel. ➢ In parallel processing, many operations are performed simultaneously, as opposed to serial processing, in which the computational steps are performed sequentially. ➢ A coarse-grain parallel machine consists of a small number of powerful processors. ➢ A massively parallel or fine-grain parallel machine uses thousands of smaller processors.
20.3 Parallel Systems ➢ There are two main measures of performance of a database system: (1) throughput, the number of tasks that can be completed in a given time interval, and (2) response time, the amount of time it takes to complete a single task from the time it is submitted.
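The two measures above can be computed from a log of task timestamps. The following is a minimal sketch, assuming each task is recorded as a hypothetical (submit_time, finish_time) pair in seconds; the function names and the sample data are illustrative and do not come from the reference book.

```python
def throughput(tasks, interval_start, interval_end):
    """Number of tasks completed per second within the given time interval."""
    completed = sum(
        1 for _, finish in tasks if interval_start <= finish <= interval_end
    )
    return completed / (interval_end - interval_start)


def response_times(tasks):
    """Elapsed time from submission to completion, one value per task."""
    return [finish - submit for submit, finish in tasks]


# Hypothetical (submit_time, finish_time) pairs, in seconds.
tasks = [(0.0, 2.0), (1.0, 4.0), (3.0, 5.0), (6.0, 10.0)]

print(throughput(tasks, 0.0, 10.0))  # tasks per second over a 10-second window
print(response_times(tasks))         # per-task response times in seconds
```

A system can have high throughput while individual response times remain long, which is why the two measures are reported separately.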
20.3.1 Speedup and Scaleup ➢ Two important issues in studying parallelism are speedup and scaleup. ➢ Running a given task in less time by increasing the degree of parallelism is called speedup. ➢ Handling larger tasks by increasing the degree of parallelism is called scaleup.
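Speedup and scaleup are usually quantified as ratios of elapsed times. The sketch below assumes the common definitions: speedup is the time on the smaller system divided by the time on the larger system for the same task, and scaleup is the time for a task on the smaller system divided by the time for an N-times-larger task on an N-times-larger system. The timing values are hypothetical.

```python
def speedup(time_on_small_system, time_on_large_system):
    """Same task on both systems; linear speedup equals the resource ratio N."""
    return time_on_small_system / time_on_large_system


def scaleup(time_small_task_small_system, time_large_task_large_system):
    """Task and system both grow by a factor of N; linear scaleup equals 1."""
    return time_small_task_small_system / time_large_task_large_system


# Hypothetical measurements, in seconds, with N = 4 times the resources.
print(speedup(100.0, 25.0))   # equals N, so the speedup is linear
print(speedup(100.0, 40.0))   # less than N, so the speedup is sublinear
print(scaleup(100.0, 125.0))  # less than 1, so the scaleup is sublinear
```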
20.3.2 Interconnection Networks ➢ Parallel systems consist of a set of components (processors, memory, and disks) that can communicate with each other via an interconnection network. ➢ Figure 20.7 shows three commonly used types of interconnection networks: (i) bus, (ii) mesh, (iii) hypercube.
20.3.2 Interconnection Networks - Bus ➢ All the system components can send data on and receive data from a single communication bus. This type of interconnection is shown in Figure 20.7a. ➢ The bus could be an Ethernet or a parallel interconnect. ➢ Bus architectures work well for small numbers of processors.
20.3.2 Interconnection Networks - Mesh ➢ The components are nodes in a grid, and each component connects to all its adjacent components in the grid. ➢ In a two-dimensional mesh, each node connects to four adjacent nodes. Figure 20.7b shows a two-dimensional mesh.
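The adjacency rule for a two-dimensional mesh can be sketched as follows. This is an illustration only: it assumes nodes are addressed by hypothetical (row, column) coordinates and that the grid does not wrap around, so nodes on the boundary have fewer than four neighbours.

```python
def mesh_neighbors(row, col, rows, cols):
    """Return the grid positions directly connected to (row, col)."""
    candidates = [
        (row - 1, col),  # above
        (row + 1, col),  # below
        (row, col - 1),  # left
        (row, col + 1),  # right
    ]
    # Keep only the positions that fall inside the grid.
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


print(mesh_neighbors(1, 1, 3, 3))  # interior node of a 3 x 3 mesh
print(mesh_neighbors(0, 0, 3, 3))  # corner node of a 3 x 3 mesh
```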
20.3.2 Interconnection Networks - Hypercube ➢ The components are numbered in binary, and a component is connected to another if the binary representations of their numbers differ in exactly one bit.
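The one-bit rule translates directly into bit operations. The sketch below is illustrative (the function names are not from the reference book): flipping each bit of a node number in turn yields its neighbours, and two nodes are connected when the XOR of their numbers has exactly one bit set.

```python
def hypercube_neighbors(node, dimensions):
    """Flip each of the `dimensions` bits in turn to get the neighbours."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def are_connected(a, b):
    """True if the binary representations of a and b differ in exactly one bit."""
    diff = a ^ b
    # A nonzero number with a single set bit satisfies diff & (diff - 1) == 0.
    return diff != 0 and (diff & (diff - 1)) == 0


# Node 000 in a 3-dimensional hypercube (8 nodes) connects to 001, 010, 100.
print([format(n, "03b") for n in hypercube_neighbors(0b000, 3)])
print(are_connected(0b101, 0b111))  # differ only in the middle bit
print(are_connected(0b000, 0b011))  # differ in two bits
```

With n components, each component has log2(n) neighbours under this rule.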
20.3.3 Parallel Database Architectures ➢ There are several architectural models for parallel machines. ➢ Among the most prominent ones are those in Figure 20.8 (in the figure, M denotes memory, P denotes a processor, and disks are shown as cylinders):
20.3.3 Parallel Database Architectures ➢ Shared memory. All the processors share a common memory (Figure 20.8a). ➢ Shared disk. All the processors share a common set of disks (Figure 20.8b). Shared-disk systems are sometimes called clusters. ➢ Shared nothing. The processors share neither a common memory nor a common disk (Figure 20.8c). ➢ Hierarchical. This model is a hybrid of the preceding three architectures (Figure 20.8d).
20.4 Distributed Systems ➢ In a distributed database system, the database is stored on several computers. ➢ The computers in a distributed system communicate with one another through various communication media, such as high-speed networks or telephone lines. The general structure of a distributed system appears in Figure 20.9.
20.4 Distributed Systems: shared-nothing parallel databases compared with distributed databases
S.No. | Shared-Nothing Parallel DB | Distributed DB
1. | The sites are not geographically separated, are not separately administered, and have a faster interconnection. | The sites are geographically separated, are separately administered, and have a slower interconnection.
2. | No distinction is made between local and global transactions. | Local and global transactions are distinguished. A local transaction accesses data only at the site where the transaction was initiated. A global transaction accesses data at a site other than the one where it was initiated, or at several different sites.
20.4 Distributed Systems ➢ Distributed database systems are built for three main reasons: sharing of data, autonomy, and availability. ➢ Sharing data: Users at one site may be able to access data residing at other sites. ➢ Autonomy: Each site has a local database administrator and retains a degree of control over its locally stored data. ➢ Availability: If one site fails in a distributed system, the remaining sites may be able to continue operating.
20.4.1 An Example of a Distributed Database ➢ Consider a banking system consisting of four branches in four different cities. ➢ Each branch has its own computer, with a database of all the accounts maintained at that branch. ➢ There also exists one single site that maintains information about all the branches of the bank. ➢ Each branch maintains a relation account (Account_schema), where Account_schema = (account_number, branch_name, balance).
20.4.1 An Example of a Distributed Database ➢ The site containing information about all the branches of the bank maintains the relation branch (Branch_schema), where Branch_schema = (branch_name, branch_city, assets). ➢ A transaction that accesses data at only a single site, the one where it was initiated, is called a local transaction. For example, a fund transfer from account A to account B when both accounts are held at the same branch. ➢ A transaction that accesses data at multiple sites is called a global transaction. For example, a fund transfer from account A to account B when the two accounts are held at different branches in different cities.
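The local/global distinction in the banking example can be sketched in a few lines. The account numbers, branch names, and the `ACCOUNT_SITE` mapping below are hypothetical and stand in for the rule that each account is stored at its own branch site.

```python
# Hypothetical mapping from account number to the branch site that stores it.
ACCOUNT_SITE = {
    "A-101": "Hillside",
    "A-102": "Hillside",
    "A-215": "Valleyview",
}


def classify_transaction(initiating_site, accounts):
    """Return 'local' if every accessed account is stored at the initiating site."""
    sites_accessed = {ACCOUNT_SITE[account] for account in accounts}
    return "local" if sites_accessed == {initiating_site} else "global"


# Transfer between two accounts at the same branch.
print(classify_transaction("Hillside", ["A-101", "A-102"]))
# Transfer between accounts held at different branches.
print(classify_transaction("Hillside", ["A-101", "A-215"]))
```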
20.4.2 Implementation Issues ➢ Atomicity of transactions is an important issue in building a distributed database system. ➢ If a transaction runs across two sites, it may commit at one site and abort at another, leading to an inconsistent state. ➢ This problem is addressed by the two-phase commit protocol (2PC). ➢ The coordinator decides to commit the transaction only if the transaction reaches the ready state at every site where it executed; otherwise (for example, if the transaction aborts at any site), the coordinator decides to abort the transaction.
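The coordinator's decision rule in 2PC can be sketched as follows. This covers only the decision step described above, not the message exchange, logging, or recovery that a real protocol needs. The `Vote` type and the site names are hypothetical, and treating a missing reply (`None`) as a reason to abort is an assumption of this sketch.

```python
from enum import Enum


class Vote(Enum):
    READY = "ready"
    ABORT = "abort"


def coordinator_decision(votes):
    """Commit only if every participating site reported the ready state.

    `votes` maps a site name to a Vote, or to None if the site did not reply
    (assumed here to force an abort).
    """
    if votes and all(vote is Vote.READY for vote in votes.values()):
        return "commit"
    return "abort"


print(coordinator_decision({"site1": Vote.READY, "site2": Vote.READY}))
print(coordinator_decision({"site1": Vote.READY, "site2": Vote.ABORT}))
print(coordinator_decision({"site1": Vote.READY, "site2": None}))
```

Every site then applies the same decision, so the transaction cannot commit at one site and abort at another.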
20.4.2 Implementation Issues ➢ Concurrency control is another issue in a distributed database. ➢ Since a transaction may access data items at several sites, transaction managers at several sites may need to coordinate to implement concurrency control. ➢ If locking is used, locking can be performed locally at the sites containing accessed data items, but there is also a possibility of deadlock involving transactions originating at multiple sites. ➢ Therefore, deadlock detection needs to be carried out across multiple sites.
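One common way to detect deadlocks across sites is to merge the per-site wait-for graphs and look for a cycle. The slides do not name a technique, so the wait-for-graph approach here is an assumption of this sketch, and the transaction names and graph contents are hypothetical.

```python
def merge_wait_for_graphs(local_graphs):
    """Union the per-site graphs; each maps a waiting transaction to lock holders."""
    merged = {}
    for graph in local_graphs:
        for waiter, holders in graph.items():
            merged.setdefault(waiter, set()).update(holders)
    return merged


def has_deadlock(graph):
    """Depth-first search for a cycle in a wait-for graph."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}

    def visit(node):
        color[node] = GRAY
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GRAY:  # back edge to the current path: a cycle
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(graph))


site1 = {"T1": {"T2"}}  # at site 1, T1 waits for a lock held by T2
site2 = {"T2": {"T1"}}  # at site 2, T2 waits for a lock held by T1

print(has_deadlock(site1), has_deadlock(site2))             # each site alone
print(has_deadlock(merge_wait_for_graphs([site1, site2])))  # merged graph
```

Neither local graph contains a cycle on its own, which is why detection limited to a single site would miss this deadlock.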
20.4.2 Implementation Issues ➢ The primary disadvantage of distributed database systems is the added complexity required to coordinate the sites, which shows up as: ➢ Higher software-development cost. ➢ Greater potential for bugs. ➢ Increased processing overhead.
20.5 Network Types ➢ Distributed databases and client-server systems are built around communication networks. ➢ There are basically two types of networks: local-area networks and wide-area networks.
S.No. | Local-Area Network | Wide-Area Network
1. | Processors are distributed over small geographical areas, such as a single building or a number of adjacent buildings. | A number of autonomous processors are distributed over a large geographical area (such as the United States or the entire world).
20.5.1 Local-Area Networks ➢ Local-area networks (LANs) (Figure 20.10) emerged in the early 1970s as a way for computers to communicate and to share data with one another.
20.5.1 Local-Area Networks ➢ LANs are generally used in an office environment. ➢ LANs have higher speeds and lower error rates than WANs. ➢ The most common links in a local-area network are twisted pair, coaxial cable, and fiber optics. ➢ Communication speeds range from a few megabits per second to gigabits per second.
20.5.1 Local-Area Networks ➢ A storage-area network (SAN) is a special type of high-speed local-area network designed to connect large banks of storage devices (disks) to computers that use the data (see Figure 20.11).
20.5.2 Wide-Area Networks ➢ The first WAN to be designed and developed was the Arpanet. Work on the Arpanet began in 1968. ➢ The Arpanet has grown from a four-site experimental network to a worldwide network of networks, the Internet, comprising hundreds of millions of computer systems. ➢ Data rates for wide-area links typically range from a few megabits per second to hundreds of gigabits per second. ➢ The last link, to end-user sites, is often based on digital subscriber line (DSL) technology, a cable modem, or a dial-up modem.
20.5.2 Wide-Area Networks ➢ WANs can be classified into two types: ➢ In discontinuous connection WANs, such as those based on wireless connections, hosts are connected to the network only part of the time. ➢ In continuous connection WANs, such as the wired Internet, hosts are connected to the network at all times.
