Agreement Protocols, distributed File Systems, Distributed Shared Memory

DISTRIBUTED SYSTEMS Shikha Gautam Assistant Professor KIET, Ghaziabad

UNIT-3 • Agreement Protocols • Distributed Resource Management: Issues in distributed File Systems, Mechanism for building distributed file systems, Design issues in Distributed Shared Memory, Algorithm for Implementation of Distributed Shared Memory.

Agreement Protocols • Agreement is a form of cooperation or accord among processes. • All the nodes in a distributed system work together to achieve a common goal, so cooperation among the processes is essential. • An agreement protocol ensures that the distributed system (DS) can still achieve the common goal after various failures occur. • There are several standard agreement problems in distributed computing. We will look at each problem and at protocols or algorithms that solve it.

Why Agreement Protocol • Agreement is always required to achieve a common goal in a distributed system. • When faulty processes are present, the system must still produce correct output, so agreement protocols (algorithms) are used to keep the faulty processes from affecting the result. • Agreement protocols help make the distributed system reliable. So, agreement protocols are mainly used for fault (failure) tolerance in DS.

Problems which require Agreement • Leader election • Distributed transactions • Mutual exclusion • And many more …

System Model Agreement problems have been studied under the following system model: 1. There are ‘n’ processors, and at most ‘m’ of them can be faulty. 2. Processors can communicate directly with other processors by message passing. 3. The receiver knows the identity of the sender. 4. The communication medium is reliable.

Model of processor Failures A processor can fail in three modes: 1. Crash Failure: The processor stops and never resumes operation. 2. Send/Receive Omission: The processor omits to send messages to, or receive messages from, some processors. 3. Malicious Failure: the most dangerous one. (i) Also known as Byzantine Failure. (ii) The processor may send fictitious values/messages to other processes to confuse them. (iii) Tough to detect/correct.

Classification of Messages in DS 1. Authenticated Messages: Also known as signed messages. A processor cannot forge or change a received message. A processor can verify the authenticity of a message. It is easier to reach an agreement in this case because faulty processors can do less damage. 2. Non-Authenticated Messages: Also known as oral messages. A processor can forge or change a received message and claim to have received it from others. A processor cannot verify the authenticity of a message in this case.

We will look for agreement solutions under two modes of computation (synchronous and asynchronous) and under various failure types.

Classification of Computation: 1. Synchronous mode: • Bounded (finite) message delay. • Processes in the system execute step by step, i.e. in a lock-step manner. • One step is known as a round. • In each round, a process receives messages (sent in the previous round), performs a computation, and sends messages to other processes (received in the next round). • A delay in any message slows down the whole computation and reduces the performance of the system.

2. Asynchronous mode: • Unbounded message delay: a message may take arbitrarily long to arrive. • Non-lock-step manner. • The computation at processes does not proceed in steps. • A process can send and receive messages and perform computation at any time, without any fixed sequence of steps.

Agreement Problem Types 1. The Byzantine Agreement Problem. The Byzantine Agreement problem is a primitive for the other two problems. The other two agreement problems are: 2. The Consensus Problem. 3. The Interactive Consistency Problem.

Problem                   Who Initiates Value   Final Agreement
Byzantine Agreement       One Processor         Single Value
Consensus                 All Processors        Single Value
Interactive Consistency   All Processors        A Vector of Values

Byzantine Generals Problem (Inspired by the Byzantine Empire) The problem is named after the Byzantine Empire of the Middle Ages. Several army generals must agree on a common plan of action to protect the city, communicating only by messages. Some of the generals may be traitors who try to prevent the loyal generals from reaching agreement, so the loyal generals need a way to agree on common terms despite the traitors. Only then can they protect the empire. The capital of the Byzantine Empire, Constantinople, is present-day Istanbul in Turkey.

1. Byzantine Problem in DS An arbitrarily chosen processor, called the source processor, broadcasts its initial value to all other processors. Agreement: All non-faulty processors should agree on the same value. Validity: If the source processor is non-faulty, then the common value agreed upon by all non-faulty processors must be the same as the initial value of the source.

Termination: Each non-faulty processor must eventually decide on a value. • Two important points to remember: 1. If the source is faulty, then the non-faulty processors can agree on any common value. 2. The value agreed upon by faulty processors is irrelevant.

Agreement algorithm for No-failure • Agreement can easily be achieved in a constant number of message exchanges. • Both synchronous and asynchronous modes always achieve agreement, because when all processes work correctly, every node cooperates towards the common goal and a constant number of message exchanges is enough.

Agreement Protocol in Crash Failure Process

Solution for Byzantine Agreement Problem Lamport et al. proposed an algorithm for the Byzantine agreement problem, known as the Lamport-Shostak-Pease Algorithm. • The source broadcasts its initial value to all other processors. • Processors send their values to other processors and also receive values from others. • During execution, faulty processors may confuse other processors by sending conflicting values. • If faulty processors dominate in number, they can prevent non-faulty processors from reaching an agreement. • So, the number of faulty processors should not exceed a certain limit.

Lamport-Shostak-Pease Algorithm This algorithm is also known as the Oral Message Algorithm (OM). • Consider ‘n’ processors, of which ‘m’ are faulty. • Pease et al. showed that in a fully connected network, it is impossible to reach an agreement if the number of faulty processors ‘m’ exceeds (n-1)/3. • i.e. agreement requires n >= (3m+1).
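The bound n >= 3m+1 can be checked with a few lines of Python. This is only an illustrative sketch; the function names `max_tolerable_faults` and `agreement_possible` are hypothetical and do not come from the slides.

```python
def max_tolerable_faults(n: int) -> int:
    """Largest m with n >= 3m + 1, i.e. floor((n - 1) / 3)."""
    if n < 1:
        raise ValueError("n must be a positive number of processors")
    return (n - 1) // 3


def agreement_possible(n: int, m: int) -> bool:
    """True when n processors can tolerate m Byzantine faults (oral messages)."""
    return n >= 3 * m + 1


if __name__ == "__main__":
    for n in (3, 4, 7, 10):
        print(n, "processors tolerate", max_tolerable_faults(n), "faulty")
```

For example, three processors tolerate no faulty processor at all, while four processors tolerate one.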

The algorithm is recursively defined as follows: • Algorithm OM(0), i.e. (m=0) Step 1: The source processor sends its value to every processor. Step 2: Each processor uses the value it receives from the source (if no value is received, the default value 0 is used).

• Algorithm OM(m), i.e. (m>0) Step 1: The source processor sends its value to every processor. Step 2: For each processor Pi, let vi be the value Pi receives from the source (if no value is received, the default value 0 is used). Pi then acts as the source in OM(m-1) and sends vi to each of the other n-2 processors. Step 3: For each Pi and each j != i, let vj be the value Pi received from Pj in Step 2 (default 0 if no value is received). Pi uses majority(v1, v2, ..., vn-1) as the agreement value.
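The recursive definition of OM(m) can be sketched as a small single-process simulation. This is a minimal sketch under stated assumptions, not a networked implementation: the helper names `om`, `send` and `majority` are hypothetical, values are 0 or 1, lost messages are not simulated, and a faulty processor is assumed to lie in one fixed way (it sends `(value + receiver) % 2`) so that the run is reproducible.

```python
from collections import Counter

DEFAULT = 0  # default value used when there is no strict majority


def majority(values):
    """Return the strict-majority value of a list, or DEFAULT if none exists."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else DEFAULT


def send(sender, receiver, value, faulty):
    """Assumed fault model: a faulty sender sends conflicting 0/1 values."""
    if sender in faulty:
        return (value + receiver) % 2
    return value


def om(m, source, value, receivers, faulty):
    """Run OM(m); return {receiver: value it decides the source sent}."""
    # Step 1: the source sends its value to every receiver.
    received = {p: send(source, p, value, faulty) for p in receivers}
    if m == 0:
        return received
    # Step 2: each receiver j acts as source in OM(m-1) for the others.
    relayed = {}
    for j in receivers:
        others = [p for p in receivers if p != j]
        relayed[j] = om(m - 1, j, received[j], others, faulty)
    # Step 3: each receiver takes the majority of its n-1 values.
    decided = {}
    for i in receivers:
        votes = [received[i]] + [relayed[j][i] for j in receivers if j != i]
        decided[i] = majority(votes)
    return decided


if __name__ == "__main__":
    # Four processors, P0 is the source, P3 is faulty, initial value 1.
    print(om(1, 0, 1, [1, 2, 3], faulty={3}))
```

With four processors and one faulty relay, the non-faulty processors P1 and P2 both decide the source's value. With a faulty source, the three receivers still agree on a common value. Running the same sketch with only three processors and one faulty relay makes P1 decide a value different from the source's, which matches the n >= 3m+1 bound.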

The Byzantine Agreement problem has no solution for three processors if one of them is faulty.

2. The Consensus Problem Here every process has an initial value, which it broadcasts to all other processes. The protocol must satisfy the following conditions: Agreement: All non-faulty processes must agree on the same single value. Validity: If all non-faulty processes have the same initial value, then the value agreed upon by all non-faulty processes must be that same value.

Termination: Each non-faulty process must eventually decide on a value. • Two important points to remember: 1. If the initial values of the non-faulty processors differ, then the non-faulty processors can agree on any common value. 2. The value agreed upon by faulty processors is irrelevant.

3. The Interactive Consistency Problem Every processor broadcasts its initial value to all other processors. The initial values of the processors may be different. A protocol for the interactive consistency problem should meet the following conditions: Agreement: All non-faulty processes must agree on the same array of values A[v1, ..., vn]. Validity: If processor Pi is non-faulty and its initial value is vi, then all non-faulty processes agree on vi as the ith element of the array A. If process j is faulty, then the non-faulty processes can agree on any value for A[j]. Termination: Each non-faulty process must eventually decide on the array A.

Metrics to measure performance of Agreement Protocol • Time: Time taken to reach an agreement under a protocol. The time is usually expressed as the number of rounds needed to reach an agreement. • Message Traffic: Number of messages exchanged to reach an agreement. • Storage Overhead: Amount of information that need to be stored at processors during execution of the protocol.

• For synchronous systems, there are solutions that satisfy all the conditions of the agreement problems. • In asynchronous systems, the agreement problems are not solvable in general once failures can occur. However, weaker versions of the agreement problem can be solved in an asynchronous system. Weaker versions of the agreement problem are: 1. k-set consensus 2. Approximate consensus 3. Renaming problem 4. Terminating reliable broadcast (a kind of problem which requires consensus).

Applications of Agreement Protocol 1. Clock Synchronization in Distributed Systems: • Distributed systems require physical clocks to be synchronized, but physical clocks drift. So, they must be resynchronized periodically. • Such periodic synchronization becomes extremely difficult if Byzantine failures are allowed. • This is because faulty processors can report different clock values to different processors. • Agreement protocols may help to reach a common clock value.

2. Atomic Commit in Distributed Database: • The sites of a distributed database system (DDBS) must agree whether to commit or abort a transaction. • In the first phase, sites execute their part of a distributed transaction and broadcast their decisions to all other sites. • In the second phase, each site decides whether to commit or abort, based on what it received from the other sites in the first phase.
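The two phases can be sketched in Python as follows. This is a minimal sketch assuming a reliable broadcast and no site failures; the names `decide` and `run_atomic_commit` are hypothetical, and the rule "commit only if every site voted to commit" is the usual atomic commit rule assumed here.

```python
def decide(received_votes: dict) -> str:
    """Second phase at one site: commit only if every site voted to commit."""
    return "commit" if all(received_votes.values()) else "abort"


def run_atomic_commit(votes: dict) -> dict:
    """votes maps a site name to True (ready to commit) or False (must abort)."""
    # First phase: every site broadcasts its vote. With a reliable medium,
    # each site ends up holding the same complete set of votes.
    inbox = {site: dict(votes) for site in votes}
    # Second phase: each site applies the same rule to the votes it received.
    return {site: decide(inbox[site]) for site in votes}


if __name__ == "__main__":
    print(run_atomic_commit({"A": True, "B": True, "C": True}))
    print(run_atomic_commit({"A": True, "B": False, "C": True}))
```

Because every site applies the same rule to the same set of votes, all sites reach the same decision.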

Two-phase commit in Distributed Systems Motivation: sending money

A correct atomic commit protocol

Distributed Resource Management Issues in distributed File Systems, Mechanism for building distributed file systems, Design issues in Distributed Shared Memory, Algorithm for Implementation of Distributed Shared Memory.

Distributed File Systems • The file system works as the resource management component, managing the availability of files in the distributed system. • It is a common file system that can be shared by all the autonomous computers in the system, i.e. files can be stored at any machine and the computation can be performed at any machine. Two important goals: 1. Network transparency: users access files distributed over a network and, ideally, do not have to be aware of the location of files to access them. 2. High availability: users should have the same easy access to files, irrespective of their physical location.

Figure: Architecture of a Distributed File System

Architecture • File servers and file clients are interconnected by a communication network. • Two most important components: 1. Name Server: maps logical names to the physical location of stored objects (files, directories). 2. Cache Manager: performs file caching. It can be present on both servers and clients. 1. The cache on the client deals with network latency. 2. The cache on the server deals with disk latency.

• Typical steps to access data: 1. check client cache, if present, return data. 2. Check local disk, if present, load into local cache, return data. 3. Send request to file server 4. ... 5. server checks cache, if present, load into client cache, return data 6. disk read 7. load into server cache 8. load into client cache 9. return data
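The lookup order in these steps can be sketched as a chain of caches. This is a minimal sketch in which plain dictionaries stand in for the client cache, local disk, server cache and server disk; the function name `read_file` is hypothetical, and the step left elided in the list above is not modelled.

```python
def read_file(name, client_cache, local_disk, server_cache, server_disk):
    """Return (data, where it was found), filling caches on the way back."""
    if name in client_cache:                  # check client cache
        return client_cache[name], "client cache"
    if name in local_disk:                    # check local disk
        client_cache[name] = local_disk[name]
        return client_cache[name], "local disk"
    # Request is sent to the file server.
    if name in server_cache:                  # server checks its cache
        client_cache[name] = server_cache[name]
        return client_cache[name], "server cache"
    data = server_disk[name]                  # disk read (KeyError if absent)
    server_cache[name] = data                 # load into server cache
    client_cache[name] = data                 # load into client cache
    return data, "server disk"


if __name__ == "__main__":
    client_cache, local_disk, server_cache = {}, {}, {}
    server_disk = {"report.txt": b"hello"}
    print(read_file("report.txt", client_cache, local_disk, server_cache, server_disk))
    print(read_file("report.txt", client_cache, local_disk, server_cache, server_disk))
```

The first read goes all the way to the server disk; the second read of the same file is served from the client cache.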

Issues in distributed File Systems 1. Naming and Transparency 2. Remote file access and Caching 3. Replication and Concurrent file updates 4. Availability 5. Scalability 6. Semantics

• Transparency should also be achieved at various levels: 1. Structure Transparency 2. Access Transparency 3. Naming Transparency 4. Replication Transparency 5. Location Transparency 6. Mobility Transparency 7. Performance Transparency 8. Scaling Transparency

2. Remote file access and Caching

3. Concurrent file updates • Server data is replicated across multiple machines. • Need to ensure consistency of files when a file is updated by multiple clients. • Changes to a file by one client should not interfere with the operations of other clients.

4. Availability • How to keep replicas consistent. • How to detect inconsistencies among replicas. • Consistency problems may decrease availability. • Replica Management: a voting mechanism is used to read and write replicas.
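One common form of voting for replica management is quorum-based reading and writing with version numbers. The sketch below assumes that form; the slides do not specify it. The class `Replica` and the functions `quorums_are_valid`, `write` and `read` are hypothetical names, and each replica is assumed to carry one vote.

```python
class Replica:
    """One copy of a file, tagged with a version number."""

    def __init__(self):
        self.version = 0
        self.value = None


def quorums_are_valid(n, r, w):
    """Read and write quorums must overlap, and so must any two write quorums."""
    return r + w > n and 2 * w > n


def write(write_quorum, value):
    """Install value on every replica in the quorum with a newer version."""
    new_version = max(rep.version for rep in write_quorum) + 1
    for rep in write_quorum:
        rep.version = new_version
        rep.value = value
    return new_version


def read(read_quorum):
    """Return the value held by the most recent replica in the quorum."""
    newest = max(read_quorum, key=lambda rep: rep.version)
    return newest.value


if __name__ == "__main__":
    replicas = [Replica() for _ in range(5)]
    assert quorums_are_valid(n=5, r=2, w=4)
    write(replicas[0:4], "a")
    print(read(replicas[3:5]))  # the quorums overlap at replicas[3]
```

Because any read quorum overlaps any write quorum, a read always sees at least one replica holding the latest version.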

5. Scalability • How to meet the demand of a growing system? • The biggest hurdle: consistency issue

6. Semantics • What does a user want? Ideally, strict consistency. • In practice, users can usually tolerate a certain degree of error in file handling, so there is no need to enforce strict consistency.

Mechanism for building Distributed File Systems 1. Mounting: A mount mechanism allows the binding together of different filename spaces to form a single hierarchically structured name space.
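Name resolution through a mount table can be sketched as a longest-prefix match. This is a minimal sketch: the table layout (mount point mapped to a server name and a remote root directory), the example paths and the function name `resolve` are all assumptions made for illustration.

```python
def resolve(path, mount_table):
    """Map a path in the single name space to (server, path on that server)."""
    candidates = [
        mp for mp in mount_table
        if path == mp or path.startswith(mp.rstrip("/") + "/")
    ]
    if not candidates:
        raise LookupError("no file system mounted for " + path)
    mount_point = max(candidates, key=len)       # most specific mount wins
    server, remote_root = mount_table[mount_point]
    rest = path[len(mount_point.rstrip("/")):]   # part below the mount point
    return server, (remote_root.rstrip("/") + rest) or "/"


if __name__ == "__main__":
    # Hypothetical table: the local root plus one remote file system.
    table = {"/": ("local", "/"), "/usr/shared": ("serverA", "/export")}
    print(resolve("/usr/shared/doc/a.txt", table))
    print(resolve("/etc/hosts", table))
```

The user sees one hierarchical name space, while each path is served by whichever file system is mounted at its longest matching prefix.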

2. Caching • Caching is commonly employed in distributed file systems to reduce delays in accessing data. • In file caching, a copy of data stored at a remote file server is brought to the client when referenced by the client. • Subsequent accesses to the data are performed locally at the client, thereby reducing access delays due to network latency. • Caching exploits the temporal locality of reference exhibited by programs. • Temporal locality of reference means that a file recently accessed is likely to be accessed again in the near future.

3. Hint • Hints are an alternative approach, used when cached data are not expected to be completely accurate. • Valid cache entries still improve performance substantially, without incurring the cost of maintaining cache consistency. • The class of applications that can use hints are those which can recover after discovering that the cached data are invalid. • For example, after the name of a file or directory is mapped to the physical object, the address of the object can be stored as a hint in the cache. • If the address fails to map to the object on a later attempt, the cached address is purged from the cache. • The file server then consults the name server to determine the actual location of the file or directory and updates the cache.
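The hint mechanism can be sketched as a cache whose entries are validated on use and purged when stale. This is a minimal sketch: the class name `HintedLocator` is hypothetical, `name_server` is assumed to be a dictionary from names to addresses, and `storage` a dictionary from addresses to the name of the object stored there.

```python
class HintedLocator:
    """Locate objects using cached addresses that may be out of date."""

    def __init__(self, name_server, storage):
        self.name_server = name_server   # name -> address (authoritative)
        self.storage = storage           # address -> name of object held there
        self.hints = {}                  # name -> address (may be stale)
        self.name_server_lookups = 0

    def locate(self, name):
        address = self.hints.get(name)
        if address is not None and self.storage.get(address) == name:
            return address               # the hint was valid
        self.hints.pop(name, None)       # purge the stale hint, if any
        self.name_server_lookups += 1
        address = self.name_server[name] # consult the name server
        self.hints[name] = address       # update the cache
        return address


if __name__ == "__main__":
    name_server = {"notes.txt": "serverA:17"}
    storage = {"serverA:17": "notes.txt"}
    locator = HintedLocator(name_server, storage)
    locator.locate("notes.txt")
    locator.locate("notes.txt")
    print(locator.name_server_lookups)   # only the first call used the name server
```

A wrong hint costs one failed attempt plus a name server lookup, but never returns a wrong address, which is why no consistency protocol is needed for the hint cache.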

4. Bulk Data Transfer • Transferring data in bulk reduces the protocol processing overhead at both servers and clients. • In bulk data transfer, multiple consecutive data blocks are transferred from servers to clients instead of just the block referenced by clients. • Bulk transfers reduce file access overhead through obtaining a multiple number of blocks with a single seek; by formatting and transmitting a multiple number of large packets in a single context switch; and by reducing the number of acknowledgements that need to be sent.

5. Encryption • Encryption is used for enforcing security in distributed systems. • In this scheme, two entities wishing to communicate with each other establish a key for the conversation with the help of an authentication server. • It is important to note that the conversation key is determined by the authentication server, but is never sent in plain (unencrypted) text to either of the entities.

Distributed Shared Memory • The idea of distributed shared memory (DSM) is to provide an environment where computers share a single address space built from physically dispersed memories. • It is the shared memory paradigm applied to loosely coupled distributed memory systems. It gives the systems the illusion of physically shared memory. • The memory mapping manager is responsible for mapping between local memories and the shared memory address space. • Any processor can access any memory location in the address space directly. • Its chief responsibility is to keep the address space coherent at all times.

Architecture • Each node of the system consists of one or more CPUs and a memory unit. • Nodes are connected by a high speed communication network. • A simple message passing system lets nodes exchange information. • The main memory of individual nodes is used to cache pieces of the shared memory space. • Shared memory exists only virtually. • The memory mapping manager routine maps local memory to the shared virtual memory.

• The shared memory model provides a virtual address space which is shared by all nodes in a distributed system. • The basic unit of caching is a memory block. • The shared memory space is partitioned into blocks. • Data caching is used to reduce network latency. • A missing block is migrated from the remote node to the client process’s node, and the operating system maps it into the application’s address space. • Data blocks keep migrating from one node to another on demand, but no communication is visible to the user processes.

Advantages of DSM (or DSVM) 1. Simpler Abstraction: shields application programmers from low level concerns. 2. Better portability of distributed application programs: The access protocol used in DSM is consistent with the way sequential applications access data. This allows a more natural transition from sequential to distributed applications.

3. Better performance: due to locality of data, on demand data movement, and a large memory space, since the total memory size is the sum of the memory sizes of all the nodes in the system. 4. Flexible communication environment. 5. On demand migration of data between processors.

Design issues in Distributed Shared Memory 1. Granularity 2. Structure of Shared memory 3. Memory coherence and access synchronization 4. Data location and access 5. Replacement strategy 6. Thrashing 7. Heterogeneity

1. Granularity: • Computation granularity refers to the size of the sharing unit. It can be a byte, a word, a page or other type of unit. • Choosing the right granularity is a major issue in distributed shared memory because it deals with the amount of computation done between synchronization or communication points. • Other issue is, moving around code and data in the networks involves latency and overhead from network protocols.

2. Structure of Shared memory • Structure refers to the layout of the shared data in memory. • Dependent on the type of applications that the distributed shared memory system is intended to support.

3. Memory coherence and access synchronization • In a DSM system that allows replication of shared data items, copies of a shared data item may simultaneously be available in the main memories of a number of nodes. • The memory coherence problem deals with keeping a piece of shared data consistent when it lies in the main memories of two or more nodes. • Consistency problems can arise when different processors access, cache and update the single shared memory space.

4. Data location and access • To share data in a DSM system, it should be possible to locate and retrieve the data accessed by a user process.

5. Replacement strategy • If the local memory of a node is full, a cache miss at that node implies not only a fetch of the accessed data block from a remote node but also a replacement. • An existing data block must be replaced by the new data block.

6. Thrashing • Thrashing occurs when a computer's virtual memory resources are overused, which causes the performance of the computer to degrade or collapse. • In DSM, thrashing may occur when data items in the same data block are being updated by multiple nodes at the same time. • The problem may occur with any block size, but it is more likely with larger block sizes.

7. Heterogeneity • A DSM system built for a homogeneous system need not address the heterogeneity issue.

• So the most important issues are: - How to keep track of the location of remote data - How to minimize communication overhead when accessing remote data - How to access remote data concurrently at several nodes

Algorithm for Implementation of Distributed Shared Memory 1. The Central Server Algorithm 2. The Migration Algorithm 3. The Read-Replication Algorithm 4. The Full–Replication Algorithm

1. The Central Server Algorithm • Central server maintains all shared data: – Read request: returns data item. – Write request: updates data and returns acknowledgement message.

• Implementation: – A timeout is used to resend a request if the acknowledgement does not arrive. – Associated sequence numbers can be used to detect duplicate write requests. – If an application’s request to access shared data fails repeatedly, a failure condition is sent to the application. – It is simple to implement, but the central server can become a bottleneck; to overcome this, shared data can be distributed among several servers. • Issues: performance and reliability. • Possible solutions: – Partition shared data between several servers – Use a mapping function to distribute/locate data
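The timeout, sequence number and failure condition described above can be sketched as follows. This is a minimal single-process sketch: the names `CentralServer` and `write_with_retry` are hypothetical, and a lost acknowledgement is simulated with the assumed parameter `lost_acks` instead of a real network timeout.

```python
class CentralServer:
    """Holds all shared data and ignores duplicate write requests."""

    def __init__(self):
        self.data = {}
        self.last_seq = {}   # client id -> highest sequence number applied
        self.applied = 0     # number of writes actually performed

    def read(self, key):
        return self.data.get(key)

    def write(self, client, seq, key, value):
        if seq > self.last_seq.get(client, -1):
            self.data[key] = value
            self.last_seq[client] = seq
            self.applied += 1
        return "ack"         # duplicates are acknowledged but not reapplied


def write_with_retry(server, client, seq, key, value, lost_acks=0, max_tries=3):
    """Resend the same request until an ack arrives; False means failure."""
    for attempt in range(max_tries):
        server.write(client, seq, key, value)
        if attempt >= lost_acks:     # this acknowledgement got through
            return True
        # Otherwise the timeout expires and the request is sent again.
    return False                     # failure condition for the application


if __name__ == "__main__":
    server = CentralServer()
    ok = write_with_retry(server, "c1", seq=0, key="x", value=1, lost_acks=2)
    print(ok, server.read("x"), server.applied)
```

Even though the request is sent three times in this run, the sequence number lets the server apply the write only once.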

2. The Migration Algorithm • In the central server algorithm, every data access request is forwarded to the location of the data. In the migration algorithm, the data is instead shipped to the location of the data access request, which allows subsequent accesses to be performed locally. • It allows only one node to access a shared data item at a time. • The whole block containing the data item migrates, instead of the individual item requested. • This algorithm provides an opportunity to integrate DSM with the virtual memory provided by the operating system at individual nodes.

• Advantages: Takes advantage of the locality of reference. • To locate a remote data object: – Use a location server – Maintain hints at each node – Broadcast query • Issues – Only one node can access a data object at a time – Thrashing can occur: to minimize it, set minimum time data object resides at a node
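Block migration with a single owner per block can be sketched as follows. This is a minimal sketch assuming a location server, modelled by the `owner` dictionary; the class name `MigrationDSM` is hypothetical, and the minimum residence time used to limit thrashing is not modelled.

```python
class MigrationDSM:
    """Each block lives at exactly one node and moves to whoever accesses it."""

    def __init__(self, nodes, initial_blocks):
        # initial_blocks: block id -> (owning node, list of data items)
        self.memory = {node: {} for node in nodes}
        self.owner = {}          # plays the role of the location server
        self.migrations = 0
        for block_id, (node, items) in initial_blocks.items():
            self.memory[node][block_id] = list(items)
            self.owner[block_id] = node

    def access(self, node, block_id):
        """Return the block at `node`, migrating the whole block if needed."""
        current = self.owner[block_id]
        if current != node:
            block = self.memory[current].pop(block_id)   # leaves the old node
            self.memory[node][block_id] = block
            self.owner[block_id] = node
            self.migrations += 1
        return self.memory[node][block_id]


if __name__ == "__main__":
    dsm = MigrationDSM(["A", "B"], {0: ("A", [0, 0])})
    dsm.access("B", 0)[0] = 5     # block 0 migrates to B, then a local write
    dsm.access("B", 0)            # second access is local, no migration
    print(dsm.access("A", 0), dsm.migrations)
```

If two nodes alternate accesses to the same block, every access triggers a migration, which is the thrashing behaviour listed under Issues.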

3. The Read-Replication Algorithm • This extends the migration algorithm by replicating data blocks to multiple nodes and allowing either multiple nodes to have read access or one node to have both read and write access. • After a write, all copies are invalidated or updated. • DSM has to keep track of the locations of all copies of data objects. • Advantage: – Read-replication can lead to substantial performance improvements if the ratio of reads to writes is large. – It improves system performance by allowing multiple nodes to access data concurrently.

The write operation is expensive, because all copies of a shared block at the various nodes have to be either invalidated or updated with the current value to maintain the consistency of the shared data block.
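Read replication for a single block can be sketched as follows. This sketch assumes the invalidation variant (copies are discarded on a write rather than updated); the class name `ReadReplicationDSM` is hypothetical.

```python
class ReadReplicationDSM:
    """One shared block: many readers may hold copies, one node may write."""

    def __init__(self, initial_owner, value):
        self.owner = initial_owner
        self.copies = {initial_owner: value}   # node -> its local copy

    def read(self, node):
        """Read locally, fetching a replica from the owner on a miss."""
        if node not in self.copies:
            self.copies[node] = self.copies[self.owner]
        return self.copies[node]

    def write(self, node, value):
        """Invalidate all other copies, then write; return invalidated nodes."""
        invalidated = [n for n in self.copies if n != node]
        self.copies = {node: value}
        self.owner = node
        return invalidated


if __name__ == "__main__":
    block = ReadReplicationDSM("A", 1)
    block.read("B")
    block.read("C")                   # A, B and C now hold read copies
    print(sorted(block.write("B", 2)))  # the copies at A and C are invalidated
    print(block.read("A"))              # A fetches a fresh replica from B
```

The cost of a write grows with the number of copies that must be invalidated, which is why the scheme pays off mainly when reads greatly outnumber writes.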

4. The Full–Replication Algorithm • It is an extension of the read-replication algorithm which allows multiple nodes to have both read and write access to shared data blocks. • Issue: Since many nodes can write shared data concurrently, access to shared data must be controlled to maintain its consistency.

• Solution: use of a gap-free sequencer – All writes are sent to the sequencer. – Every node wishing to modify shared data sends the modification to the sequencer, which assigns a sequence number and multicasts the modification with its sequence number to all nodes that have a copy of the shared data item. – Each node performs writes in sequence-number order. – A gap in sequence numbers indicates a missing write request: the node asks for retransmission of the missing write requests.
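The gap-free sequencer can be sketched as follows. This is a minimal single-process sketch: the class names `Sequencer` and `Node` are hypothetical, multicast is modelled by calling `deliver` on each node by hand, and the sequencer is assumed to keep a log so that it can retransmit missing writes.

```python
class Sequencer:
    """Assigns consecutive sequence numbers to all writes and logs them."""

    def __init__(self):
        self.log = []

    def submit(self, key, value):
        message = (len(self.log), key, value)   # (sequence number, key, value)
        self.log.append(message)
        return message                           # to be multicast to all nodes

    def retransmit(self, seq):
        return self.log[seq]


class Node:
    """Holds a full replica and applies writes strictly in sequence order."""

    def __init__(self, sequencer):
        self.sequencer = sequencer
        self.data = {}
        self.expected = 0    # next sequence number this node should apply

    def deliver(self, message):
        seq = message[0]
        while self.expected < seq:               # gap: a write was missed
            self._apply(self.sequencer.retransmit(self.expected))
        if seq == self.expected:
            self._apply(message)
        # seq < expected means a duplicate, which is ignored

    def _apply(self, message):
        seq, key, value = message
        self.data[key] = value
        self.expected = seq + 1


if __name__ == "__main__":
    sequencer = Sequencer()
    n1, n2 = Node(sequencer), Node(sequencer)
    m0 = sequencer.submit("x", 1)
    m1 = sequencer.submit("x", 2)
    m2 = sequencer.submit("y", 3)
    for m in (m0, m1, m2):
        n1.deliver(m)
    n2.deliver(m0)
    n2.deliver(m2)           # m1 was lost: n2 detects the gap and recovers it
    print(n1.data == n2.data)
```

Because every node applies the same writes in the same order, all replicas end up with the same contents even when a multicast message is lost.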
