COMM 527: Distributed Systems
INTRODUCTION TO DISTRIBUTED SYSTEMS AND CHARACTERIZATION
Presentation Outline
Introduction
Defining Distributed Systems
Characteristics of Distributed Systems
Example Distributed Systems
Challenges of Distributed Systems
Summary
Defining Distributed Systems
“A system in which hardware or software components located at
networked computers communicate and coordinate their actions
only by message passing.”
“A distributed system is a collection of independent computers
that appear to the users of the system as a single computer.”
Example Distributed Systems:
Cluster:
“A type of parallel or distributed processing system, which consists of
a collection of interconnected stand-alone computers cooperatively
working together as a single, integrated computing resource”.
Cloud:
“a type of parallel and distributed system consisting of a collection of
interconnected and virtualized computers that are dynamically
provisioned and presented as one or more unified computing
resources based on service-level agreements established through
negotiation between the service provider and consumers”.
Networks vs. Distributed Systems
Networks: A medium for interconnecting local and
wide area computers and exchanging messages
based on protocols. Network entities are visible
and are explicitly addressed (e.g., by IP
address).
Distributed System: The existence of multiple
autonomous computers is transparent to the user.
However, the two have many problems in common
(e.g., openness, reliability), but at different levels.
Networks focus on packets, routing, etc.,
whereas distributed systems focus on applications.
Every distributed system relies on services
provided by a computer network.
[Figure: Distributed Systems and Computer Networks]
Reasons for Distributed Systems
Functional Separation:
Existence of computers with different capabilities and purposes:
Clients and Servers
Data collection and data processing
Inherent distribution:
Information:
Different information is created and maintained by different people (e.g., Web
pages)
People:
Computer supported collaborative work (virtual teams, engineering, virtual
surgery)
Retail store and inventory systems for supermarket chains
Reliability:
Long term preservation and data backup (replication) at different locations.
Economies:
Sharing a printer among many users reduces the cost of ownership.
Building a supercomputer out of a network of computers.
Characteristics of Distributed Systems
Parallel activities
Autonomous components executing concurrent
tasks
Communication via message passing
No shared memory
Resource sharing
Printer, database, other services
No global state
No single process can have knowledge of the
current global state of the system
No global clock
Only limited precision for processes to synchronize
their clocks
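Because there is no global clock and no shared memory, processes can only order events through the messages they exchange. One common way to do this (not named on the slide, so treat it as an illustrative choice) is a Lamport logical clock. The sketch below is a single-program simulation: the Process class, its inbox list, and the message text are all hypothetical stand-ins for real networked processes.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """A simulated process with private state and a Lamport logical clock."""
    name: str
    clock: int = 0
    inbox: list = field(default_factory=list)  # messages delivered to this process

    def local_event(self) -> int:
        # Every local event advances the logical clock by one.
        self.clock += 1
        return self.clock

    def send(self, receiver: "Process", payload: str) -> None:
        # Sending is an event; the message carries the sender's timestamp.
        self.clock += 1
        receiver.inbox.append((self.clock, payload))

    def receive(self) -> str:
        # On receipt, move past the sender's timestamp so that
        # "send happens before receive" shows up in the clock values.
        timestamp, payload = self.inbox.pop(0)
        self.clock = max(self.clock, timestamp) + 1
        return payload


a = Process("A")
b = Process("B")
a.local_event()          # A's clock becomes 1
a.local_event()          # A's clock becomes 2
a.send(b, "order #1")    # A's clock becomes 3; the message is stamped 3
received = b.receive()   # B's clock becomes max(0, 3) + 1 = 4
```

Neither process ever reads the other's clock directly; the only shared information is the timestamp carried inside the message.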
Goals of Distributed Systems
Connecting Users and Resources
Transparency
Openness
Scalability
Enhanced Availability
Differentiation with Parallel Systems
Multiprocessor systems
Shared memory
Bus-based interconnection network
E.g. SMPs (Symmetric Multi-Processors) with two or more
CPUs
Multicomputer systems / Clusters
No shared memory
Homogeneous in hardware and software
Massively Parallel Processors (MPP)
Tightly coupled high-speed network
PC/Workstation clusters
High-speed networks/switches-based connection.
Differentiation with Parallel Systems is blurring
Extensibility of clusters leads to heterogeneity
Adding additional nodes as requirements grow
Extending clusters to include user desktops by
harnessing their idle resources
Leading to the rapid convergence of various
concepts of parallel and distributed systems
Examples of Distributed Systems
They (DS) are based on familiar and widely used
computer networks:
Internet
Intranets, and
Wireless networks
Example DS:
Web (and many of its applications like Facebook)
Data Centers and Clouds
Wide area storage systems
Banking Systems
The Internet as a Distributed System
[Figure: a typical portion of the Internet, showing intranets, ISPs, a backbone, and a satellite link. Legend: desktop computer, server, network link.]
The Internet is a vast collection of computer networks of many
different types and hosts various types of services.
Intranet as a Distributed System
[Figure: a typical intranet, showing desktop computers, email servers, a Web server, a file server, print and other servers, a local area network, a router/firewall, and the rest of the Internet.]
Business Example and Challenges
Online bookstore (e.g. amazon.com)
Customers can connect their computer to the
amazon.com servers (web server):
Browse their inventory
Place orders
…
Business Example – Challenges I
What if
Your customer uses completely different hardware? (PC,
Mac, …)
… a different operating system? (Windows, Unix, …)
… a different way of representing data? (ASCII, Unicode,
…)
Heterogeneity
Or
You want to move your business and computers to
Canada?
Your client moves to Australia?
Distribution Transparency
Business Example – Challenges II
What if
Two customers want to order the same item at the
same time?
Concurrency
Or
The database with your inventory information
crashes?
Your customer’s computer crashes in the middle
of an order?
Fault Tolerance
Business Example – Challenges III
What if
Someone tries to break into your system to steal
data?
… sniffs for information?
Security
Or
You are so successful that millions of people are
visiting your online store at the same time?
Scalability
Business Example – Challenges IV
When building the system…
Do you want to write the whole software on your
own (network, database,…)?
What about updates, new technologies?
Reuse and Openness (Standards)
Overview Challenges I
Heterogeneity
Heterogeneous components must be able to interoperate
Distribution transparency
Distribution should be hidden from the user as much as possible
Fault tolerance
Failure of a component (partial failure) should not result in failure
of the whole system
Scalability
System should work efficiently with an increasing number of
users
System performance should increase with inclusion of additional
resources
Overview Challenges II
Concurrency
Shared access to resources must be possible
Openness
Interfaces should be publicly available to ease
inclusion of new components
Security
The system should only be used in the way
intended
Heterogeneity
Heterogeneous components must be able to
interoperate across different:
Operating systems
Hardware architectures
Communication architectures
Programming languages
Software interfaces
Security measures
Information representation
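Differences in information representation are easy to see at the byte level. The sketch below uses only Python's standard library; the sample number and string are arbitrary choices for illustration. It shows the same integer serialized in two byte orders, and the same text in two encodings, which is why communicating components have to agree on a wire format.

```python
import struct

value = 1027  # 0x00000403, an arbitrary sample value

# The same 32-bit unsigned integer in two byte orders.
big = struct.pack(">I", value)      # big-endian ("network order")
little = struct.pack("<I", value)   # little-endian (e.g. x86)

# A receiver that assumes the wrong byte order decodes a different number.
misread = struct.unpack("<I", big)[0]
# A receiver that uses the agreed byte order recovers the original.
agreed = struct.unpack(">I", big)[0]

# The same text in two character encodings yields different byte sequences.
text = "café"
utf8 = text.encode("utf-8")        # 5 bytes: "é" takes two bytes
utf16 = text.encode("utf-16-le")   # 8 bytes: two bytes per character here
```

Here `agreed` equals the original value, while `misread` is 50593792 (0x03040000), because the four bytes were read in the opposite order.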
Distribution Transparency I
To hide from the user and the application programmer the
separation/distribution of components, so that the system is
perceived as a whole rather than a collection of independent
components.
The ISO Reference Model for Open Distributed Processing (ODP)
identifies the following forms of transparency:
Access transparency
Access to local or remote resources is identical
E.g. Network File System / Dropbox
Location transparency
Access without knowledge of location
E.g. separation of domain name from
machine address.
Failure transparency
Tasks can be completed despite failures
E.g. message retransmission, failure of a
Web server node should not bring down the website.
Distribution Transparency II
Replication transparency
Access to replicated resources as if there were just one,
providing enhanced reliability and performance without
users or application programmers needing to know about
the replicas.
Migration (mobility/relocation) transparency
Allow the movement of resources and clients within a
system without affecting the operation of users or
applications.
E.g. switching from one name server to another at runtime;
migration of an agent/process from one node to another.
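Replication transparency can be illustrated with a small client-side wrapper. This is a minimal sketch under stated assumptions: the Replica and ReplicatedStore classes, the replica names, and the use of ConnectionError to signal an unreachable node are all hypothetical, and a real system would also have to keep the replicas consistent.

```python
class Replica:
    """Hypothetical replica holding a full copy of the data."""

    def __init__(self, name: str, data: dict, up: bool = True) -> None:
        self.name = name
        self.data = dict(data)
        self.up = up

    def get(self, key: str) -> str:
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return self.data[key]


class ReplicatedStore:
    """Presents several replicas to the caller as a single store."""

    def __init__(self, replicas: list) -> None:
        self._replicas = list(replicas)

    def get(self, key: str) -> str:
        # Try each replica in turn; the caller never learns which one answered.
        for replica in self._replicas:
            try:
                return replica.get(key)
            except ConnectionError:
                continue
        raise ConnectionError("no replica available")


catalogue = {"isbn-1": "Distributed Systems"}
store = ReplicatedStore([
    Replica("replica-1", catalogue, up=False),  # simulated failed node
    Replica("replica-2", catalogue),
])
title = store.get("isbn-1")
```

The caller writes `store.get(...)` exactly as it would for a single copy, even though the first replica is down and the second one answers.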
Distribution Transparency III
Concurrency transparency
A process should not notice that there are other
processes sharing the same resources
Performance transparency:
Allows the system to be reconfigured to improve
performance as loads vary
E.g., dynamic addition/deletion of components,
switching from linear structures to hierarchical
structures when the number of users increases
Scaling transparency:
Allows the system and applications to expand in scale
without changes in the system structure or the
application algorithms.
Fault Tolerance
Failure: an offered service no longer complies
with its specification
Fault: cause of a failure (e.g. crash of a
component)
Fault tolerance: no failure despite faults
Fault Tolerance Mechanisms
Fault detection
Checksums, heartbeat, …
Fault masking
Retransmission of corrupted messages,
redundancy, …
Fault toleration
Exception handling, timeouts,…
Fault recovery
Rollback mechanisms,…
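Two of the mechanisms above, fault detection by checksum and fault masking by retransmission, can be combined in a few lines. The sketch below is illustrative only: FlakyChannel is a hypothetical stand-in for an unreliable link, the frame layout (payload followed by a 4-byte CRC32) is an assumption, and send_reliably is not taken from any particular protocol.

```python
import zlib


def make_frame(payload: bytes) -> bytes:
    # Sender appends a CRC32 checksum (4 bytes, big-endian) to the payload.
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check_frame(frame: bytes) -> bytes:
    # Receiver recomputes the checksum; a mismatch means the frame was corrupted.
    payload, received_crc = frame[:-4], int.from_bytes(frame[-4:], "big")
    if zlib.crc32(payload) != received_crc:
        raise ValueError("checksum mismatch")
    return payload


class FlakyChannel:
    """Hypothetical channel that corrupts the first `bad` frames it carries."""

    def __init__(self, bad: int) -> None:
        self.bad = bad
        self.sent = 0

    def transmit(self, frame: bytes) -> bytes:
        self.sent += 1
        if self.sent <= self.bad:
            # Flip one bit in the first byte to simulate corruption.
            return bytes([frame[0] ^ 0x01]) + frame[1:]
        return frame


def send_reliably(channel: FlakyChannel, payload: bytes, max_attempts: int = 5) -> bytes:
    for _ in range(max_attempts):
        try:
            return check_frame(channel.transmit(make_frame(payload)))
        except ValueError:
            continue  # fault detected; mask it by retransmitting
    raise RuntimeError("giving up after repeated corruption")


channel = FlakyChannel(bad=2)
delivered = send_reliably(channel, b"place order")
```

The first two transmissions are corrupted and rejected by the checksum; the third arrives intact, so the caller sees a successful delivery and never observes the fault. Bounding the number of attempts keeps a permanently broken channel from hanging the sender.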
Scalability
The system should work efficiently at many different
scales, ranging from a small intranet to the Internet
The system should remain effective when there is a significant
increase in the number of resources and the number
of users
Concurrency
Provide and manage concurrent access to
shared resources:
Fair scheduling
Preserve dependencies (e.g. distributed
transactions)
Avoid deadlocks
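The bookstore question of two customers ordering the same item at the same time can be sketched on a single machine with threads and a lock. The Inventory class and the item name are hypothetical, and a real distributed store would need a distributed transaction or a database-level guarantee rather than an in-process lock; the point here is only that the check and the update must form one atomic step.

```python
import threading


class Inventory:
    """Hypothetical in-memory inventory guarded by a lock."""

    def __init__(self, stock: dict) -> None:
        self._stock = dict(stock)
        self._lock = threading.Lock()

    def order(self, item: str) -> bool:
        # Check and decrement happen under one lock; otherwise two
        # customers could both see the last copy and both "buy" it.
        with self._lock:
            if self._stock.get(item, 0) > 0:
                self._stock[item] -= 1
                return True
            return False

    def remaining(self, item: str) -> int:
        with self._lock:
            return self._stock.get(item, 0)


inventory = Inventory({"book": 1})
results = []


def customer() -> None:
    results.append(inventory.order("book"))


threads = [threading.Thread(target=customer) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Exactly one of the two orders succeeds and the stock ends at zero, whichever thread happens to run first.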
Openness and Interoperability
Open system:
"... a system that implements sufficient open
specifications for interfaces, services, and supporting
formats to enable properly engineered applications
software to be ported across a wide range of systems
with minimal changes, to interoperate with other
applications on local and remote systems, and to interact
with users in a style which facilitates user portability"
(Guide to the POSIX Open Systems Environment, IEEE
POSIX 1003.0)
Open spec/standard developers - communities:
ANSI, IETF, W3C, ISO, IEEE, OMG ... etc.
Security I
Resources are accessible to authorized users and
used in the way they are intended
Confidentiality
Protection of information against disclosure to
unauthorized individuals
E.g. ACLs (access control lists) to provide authorized
access to information
Integrity
Protection against alteration or corruption
E.g. changing the account number or amount value in a
money order
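One way to detect the kind of alteration described above is a keyed message authentication code. The slide does not name a technique, so HMAC-SHA256 is an illustrative choice here, and the key and message layout are made up for the example.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical shared secret


def sign(message: bytes) -> bytes:
    # Compute an authentication tag over the message with the shared key.
    return hmac.new(SECRET_KEY, message, hashlib.sha256).digest()


def verify(message: bytes, tag: bytes) -> bool:
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(sign(message), tag)


order = b"pay account=12345 amount=100"
tag = sign(order)

# An attacker changes the amount in transit but cannot recompute the tag
# without the key, so verification of the altered message fails.
tampered = b"pay account=12345 amount=900"
original_ok = verify(order, tag)
tampered_ok = verify(tampered, tag)
```

Here `original_ok` is True and `tampered_ok` is False. This protects integrity only; the message itself is still readable, so confidentiality would need encryption on top.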
Security II
Availability
Protection against interference targeting access to
the resources.
E.g. denial of service (DoS, DDoS) attacks
Non-repudiation
Proof of sending / receiving
information
E.g. digital signature
Security Mechanisms
Encryption
E.g. DES, AES, RSA
Authentication
E.g. password, public key authentication
Authorization
E.g. access control lists
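Authorization with an access control list reduces to a lookup: for a given resource, which users may perform which actions. The sketch below is deliberately minimal; the ACL contents, user names, and the is_authorized helper are hypothetical, and it assumes the user has already been authenticated.

```python
# Hypothetical ACL: resource -> user -> set of permitted actions.
ACL = {
    "inventory.db": {
        "alice": {"read", "write"},
        "bob": {"read"},
    },
}


def is_authorized(user: str, resource: str, action: str) -> bool:
    # Deny by default: unknown resources, users, or actions are rejected.
    return action in ACL.get(resource, {}).get(user, set())


alice_can_write = is_authorized("alice", "inventory.db", "write")
bob_can_write = is_authorized("bob", "inventory.db", "write")
stranger_can_read = is_authorized("mallory", "inventory.db", "read")
```

Only `alice_can_write` is True. The deny-by-default lookup means that forgetting to list a user results in refused access rather than accidental access.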
Summary
Distributed Systems are everywhere
The Internet enables users throughout the world to
access its services wherever they are located
Resource sharing is the main motivating factor for
constructing distributed systems
Construction of DS produces many challenges:
Heterogeneity, Openness, Security, Scalability, Failure
handling, Concurrency, and Transparency
Distributed systems enable globalization:
Community (Virtual teams, organizations, social networks)
Science (e-Science)
Business (e-Business)