Computer Architecture and Organization
[ECEg - 4163]
Chapter One:
Overview of Computer Architecture and
Organization
Prepared by Amanuel Z. & Satenaw S.
Outline
Basic Concepts and Computer Evolution
Performance
Computer system
1.1 Basic Concepts and Computer Evolution
1.1.1 Organization and Architecture
Computer architecture
Refers to those attributes of a system visible to a programmer, i.e., those
attributes that have a direct impact on the logical execution of a program.
It defines:
➔ Instruction sets
➔ Data representation
➔ Techniques for addressing memory
➔ I/O mechanisms
Cont’d...
Computer organization refers to the operational units and their
interconnections that realize the architectural specifications.
➔ Control signals;
➔ Interfaces between the computer and peripherals; and
➔ The memory technology used.
Cont’d...
IBM System/370 Architecture
Was introduced in 1970
Included a number of models
Could upgrade to a more expensive, faster model without having to
abandon original software
New models are introduced with improved technology, but retain the same
architecture so that the customer’s software investment is protected
Architecture has survived to this day as the architecture of IBM’s
mainframe product line
1.1.2 Structure and Function
A computer is a complex system; contemporary computers contain
millions of elementary electronic components.
How can one clearly describe them?
The key to clearly describe them is to recognize the hierarchical nature of
most complex systems, including the computer [SIMO96].
A hierarchical system is a set of interrelated subsystems, each of the latter,
in turn, hierarchical in structure until we reach some lowest level of
elementary subsystem.
The hierarchical nature of complex systems is essential to both their design
and their description.
Cont’d…
The designer need only deal with a particular level of the system at a time.
At each level, the system consists of a set of components and their
interrelationships.
The behavior at each level depends only on a simplified, abstracted
characterization of the system at the next lower level.
At each level, the designer is concerned with structure and function:
➔ Structure: The way in which the components are interrelated.
➔ Function: The operation of each individual component as part of the structure.
Cont’d…
In terms of description, we have two choices:
➔ Starting at the bottom and building up to a complete description, or
➔ Beginning with a top view and decomposing the system into its subparts.
Evidence from a number of fields suggests that the top-down approach is the
clearest and most effective.
Cont’d...
Function
There are four basic functions that a computer can perform:
Data processing: Data may take a wide variety of forms and the range of processing
requirements is broad
Data storage: Short-term/Long-term
Data movement
➔ Input–output (I/O): when data are received from or delivered to a device (peripheral) that is directly connected to the computer
➔ Data communications: when data are moved over longer distances, to or from a remote device
Control
➔ A control unit manages the computer’s resources and orchestrates the performance of its functional parts in response to instructions
Cont’d…
Figure 1.1 depicts the basic functions that a computer can perform; the
operating environment serves as the source and destination of data.
Figure 1.1 A Functional View of the Computer
Cont’d…
Figure 1.2 depicts the four possible types of operations.
Figure 1.2 Possible Computer Operations
Cont’d…
Structure:
Figure 1.3 is the simplest possible depiction of a computer.
The computer interacts in some fashion with its external environment.
All of its linkages to the external environment can be classified as
peripheral devices or communication lines.
Figure 1.3 The Computer
Cont’d…
The greater concern in this course is the internal structure of the computer
itself, which is shown in Figure 1.4.
Figure 1.4 The Computer: Top-Level Structure
Cont’d…
There are four main structural components:
Central processing unit (CPU): Controls the operation of the computer and
performs its data processing functions; often simply referred to as processor.
Main memory: Stores data.
I/O: Moves data between the computer and its external environment.
System interconnection: Some mechanism that provides for communication
among CPU, main memory, and I/O.
➔ A common example of system interconnection is a system bus, consisting of a number of conducting wires to which all the other components attach.
Cont’d…
CPU
Its major structural components are as follows:
Control unit: Controls the operation of the CPU and hence the computer.
Arithmetic and logic unit (ALU): Performs the computer’s data
processing functions.
Registers: Provides storage internal to the CPU.
CPU interconnection: Some mechanism that provides for communication
among the control unit, ALU, and registers.
Cont’d…
Multicore Computer
Structure
Figure 1.5 Simplified View of Major Elements of a Multicore Computer
Cont’d…
Central processing unit (CPU)
➔ Portion of the computer that fetches and executes instructions
➔ Consists of an ALU, a control unit, and registers
➔ Referred to as a processor in a system with a single processing unit
Core
➔ An individual processing unit on a processor chip
➔ May be equivalent in functionality to a CPU on a single-CPU system
➔ Specialized processing units are also referred to as cores
Processor
➔ A physical piece of silicon containing one or more cores
➔ Is the computer component that interprets and executes instructions
➔ Referred to as a multicore processor if it contains multiple cores
Cont’d…
Cache Memory
Multiple layers of memory between the processor and main memory
Is smaller and faster than main memory
Used to speed up memory access by placing in the cache data from main
memory that is likely to be used in the near future
A greater performance improvement may be obtained by using multiple
levels of cache, with level 1 (L1) closest to the core and additional levels
(L2, L3, etc.) progressively farther from the core
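The payoff of caching described above can be sketched with a simple average-access-time model. This is an illustrative formula, not part of the slides; the function name and the timing values are made up.

```python
def avg_access_time(hit_ratio, t_cache, t_main):
    """Average time per memory access with a single cache level.

    Hits cost t_cache; misses pay the cache lookup plus a main-memory
    access (t_cache + t_main). Times can be in any consistent unit.
    """
    return hit_ratio * t_cache + (1 - hit_ratio) * (t_cache + t_main)

# Illustrative (made-up) numbers: a 1 ns cache in front of 100 ns main
# memory with a 95% hit ratio brings the average down to about 6 ns.
print(avg_access_time(0.95, 1.0, 100.0))
```

The same formula applies recursively at each level, which is why adding L2 and L3 caches between L1 and main memory pays off.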
1.1.3 A Brief History of Computers
The First Generation: Vacuum Tubes
Vacuum tubes were used for digital logic elements and memory
IAS computer
➔ Fundamental design approach was the stored-program concept
✔ Attributed to the mathematician John von Neumann
✔ First publication of the idea was in 1945, for the EDVAC
➔ In 1946 design began at the Princeton Institute for Advanced Studies
➔ Completed in 1952
➔ Prototype of all subsequent general-purpose computers
Cont’d… Figure 1.6 IAS structure
AC: Accumulator register
MQ: multiply-quotient register
MBR: memory buffer register
IBR: instruction buffer register
PC: program counter
MAR: memory address register
IR: instruction register
Cont’d…
Figure 1.7 IAS Memory Format
Cont’d…
Registers
Memory buffer register (MBR): Contains a word to be stored in memory or sent to the I/O unit, or is used to receive a word from memory or from the I/O unit
Memory address register (MAR): Specifies the address in memory of the word to be written from or read into the MBR
Instruction register (IR): Contains the 8-bit opcode of the instruction being executed
Instruction buffer register (IBR): Employed to temporarily hold the right-hand instruction from a word in memory
Program counter (PC): Contains the address of the next instruction pair to be fetched from memory
Accumulator (AC) and multiplier-quotient (MQ): Employed to temporarily hold operands and results of ALU operations
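Figure 1.7 gives the IAS memory format: a 40-bit word holding an instruction pair, i.e., two 20-bit instructions, each with an 8-bit opcode and a 12-bit address. The decoding that the IBR and IR support can be sketched in a few lines; the function name and the sample word below are made up for illustration.

```python
def decode_ias_word(word):
    """Split a 40-bit IAS word into its left and right 20-bit
    instructions, each returned as an (opcode, address) pair
    with an 8-bit opcode and a 12-bit address."""
    left = (word >> 20) & 0xFFFFF    # left-hand instruction
    right = word & 0xFFFFF           # right-hand instruction

    def split(instr):
        return instr >> 12, instr & 0xFFF   # (opcode, address)

    return split(left), split(right)

# Made-up word: opcode 0x01/address 0x0FA on the left,
# opcode 0x02/address 0x0FB on the right.
word = (0x01 << 32) | (0x0FA << 20) | (0x02 << 12) | 0x0FB
print(decode_ias_word(word))
```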
Cont’d…
M(X) = contents of memory
location whose address is X
(i:j) = bits i through j
Figure 1.8 The IAS Instruction Set
Cont’d…
Table 1.1 Partial Flowchart
of IAS Operation
Cont’d…
Second Generation: Transistors
Smaller
Cheaper
Dissipates less heat than a vacuum tube
Is a solid state device made from silicon
Was invented at Bell Labs in 1947
It was not until the late 1950s that fully transistorized computers were
commercially available
Cont’d…
Table 1.2 Computer Generations
Cont’d…
Second Generation
Introduced:
More complex arithmetic and logic units and control units
The use of high-level programming languages
Provision of system software which provided the ability to:
➔ Load programs
➔ Move data to peripherals
➔ Use libraries to perform common computations
Cont’d…
Figure 1.9 An IBM 7094 Configuration
Cont’d…
Discrete component
➔ Single, self-contained transistor
➔ Manufactured separately, packaged in their own containers, and soldered or wired together onto Masonite-like circuit boards
➔ Manufacturing process was expensive and cumbersome
Cont’d…
Third Generation: Integrated Circuits
1958 – the invention of the integrated circuit
Microelectronics
➔ “Small electronics”
The two most important members of the third generation were the IBM
System/360 and the DEC PDP-8
Cont’d…
(a) Gate
(b) Memory cell
Figure 1.10 Fundamental Computer Elements
Cont’d…
Integrated Circuits
Data storage – provided by memory cells
Data processing – provided by gates
Data movement – the paths among components are used to move data
from memory to memory and from memory through gates to memory
Control – the paths among components can carry control signals
Cont’d…
Integrated Circuits
A computer consists of gates, memory cells, and interconnections among these
elements
The gates and memory cells are constructed of simple digital electronic
components
Exploits the fact that such components as transistors, resistors, and conductors
can be fabricated from a semiconductor such as silicon
Many transistors can be produced at the same time on a single wafer of silicon
Transistors can be connected by a process of metallization to form circuits
Cont’d…
Figure 1.11 Relationship among Wafer, Chip, and Gate
Cont’d…
Figure 1.12 Growth in Transistor Count on Integrated Circuits
Cont’d…
Moore’s Law
1965; Gordon Moore, co-founder of Intel
Observed that the number of transistors that could be put on a single chip
was doubling every year
The pace slowed to a doubling every 18 months in the 1970s but has
sustained that rate ever since
Consequences of Moore’s law:
➔ The cost of computer logic and memory circuitry has fallen at a dramatic rate
➔ The electrical path length is shortened, increasing operating speed
➔ The computer becomes smaller and is more convenient to use in a variety of environments
➔ Reduction in power and cooling requirements
➔ Fewer interchip connections
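Moore's observation can be turned into a toy projection formula: a doubling every 18 months means multiplying by 2^(months/18). This is a sketch, not anything from the slides; the function name is made up, and the 18-month default is taken from the doubling rate quoted above.

```python
def projected_transistors(count_now, years, doubling_months=18):
    """Project a transistor count forward assuming one doubling
    every `doubling_months` months (Moore's post-1970s rate)."""
    return count_now * 2 ** (years * 12 / doubling_months)

# 3 years = 36 months = 2 doublings, so a count of 1000 becomes 4000.
print(projected_transistors(1000, 3))
```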
Cont’d…
IBM System/360
Announced in 1964
Product line was incompatible with older IBM machines
Was the success of the decade and cemented IBM as the overwhelmingly
dominant computer vendor
The architecture remains to this day the architecture of IBM’s mainframe
computers
Was the industry’s first planned family of computers
➔
Models were compatible in the sense that a program written for one model
should be capable of being executed by another model in the series
38
Cont’d…
Family Characteristics
Similar or identical instruction set
Similar or identical operating system
Increasing speed
Increasing number of I/O ports
Increasing memory size
Increasing cost
Figure 1.13 PDP-8 Bus Structure
Cont’d…
Later Generations
LSI: Large-Scale Integration
VLSI: Very-Large-Scale Integration
ULSI: Ultra-Large-Scale Integration
Two of the most important developments in later generations:
Semiconductor memory
Microprocessors
Cont’d…
Semiconductor Memory
In 1970 Fairchild produced the first relatively capacious semiconductor
memory
➔ Chip was about the size of a single magnetic core
➔ Could hold 256 bits of memory
➔ Non-destructive readout
➔ Much faster than core
Cont’d…
In 1974 the price per bit of semiconductor memory dropped below the price per bit of
core memory
➔ There has been a continuing and rapid decline in memory cost accompanied by a corresponding increase in physical memory density
➔ Developments in memory and processor technologies changed the nature of computers in less than a decade
Since 1970 semiconductor memory has been through 13 generations
➔ 1K, 4K, 16K, 64K, 256K, 1M, 4M, 16M, 64M, 256M, 1G, 4G, and, as of this writing, 8 Gb on a single chip (1K = 2^10, 1M = 2^20, 1G = 2^30).
➔ Each generation has provided four times the storage density of the previous generation, accompanied by declining cost per bit and declining access time.
Cont’d…
Microprocessors
The density of elements on processor chips continued to rise
➔ More and more elements were placed on each chip so that fewer and fewer chips were needed to construct a single computer processor
1971: Intel developed the 4004
➔ First chip to contain all of the components of a CPU on a single chip
➔ Birth of the microprocessor
Cont’d…
1972: Intel developed the 8008
➔ First 8-bit microprocessor
1974: Intel developed the 8080
➔ First general-purpose microprocessor
➔ Faster, with a richer instruction set and a larger addressing capability
Cont’d…
Table 1.3 Evolution of Intel Microprocessors: (a) 1970s Processors, (b) 1980s Processors
Cont’d…
Table 1.3 Evolution of Intel Microprocessors: (c) 1990s Processors, (d) Recent Processors
Cont’d…
The Evolution of the Intel x86 Architecture
Two processor families are the Intel x86 and the ARM architectures
Current x86 offerings represent the results of decades of design effort on
complex instruction set computers (CISCs)
An alternative approach to processor design is the reduced instruction set
computer (RISC)
ARM architecture is used in a wide variety of embedded systems and is one
of the most powerful and best-designed RISC-based systems on the market
Cont’d…
Highlights of the Evolution of the Intel Product Line:
8080
• World’s first general-purpose microprocessor
• 8-bit machine, 8-bit data path to memory
• Was used in the first personal computer (Altair)
8086
• A more powerful 16-bit machine
• Has an instruction cache, or queue, that prefetches a few instructions before they are executed
• The first appearance of the x86 architecture
• The 8088 was a variant of this processor and was used in IBM’s first personal computer (securing the success of Intel)
80286
• Extension of the 8086 enabling addressing of a 16-MB memory instead of just 1 MB
80386
• Intel’s first 32-bit machine
• First Intel processor to support multitasking
80486
• Introduced the use of much more sophisticated and powerful cache technology and sophisticated instruction pipelining
• Also offered a built-in math coprocessor
Cont’d…
Highlights of the Evolution of the Intel Product Line:
Pentium
Intel introduced the use of superscalar techniques, which allow multiple
instructions to execute in parallel
Pentium II
Incorporated Intel MMX technology, which is designed to process video,
audio, and graphics data efficiently
Pentium III
Incorporated additional floating-point instructions
Streaming SIMD Extensions (SSE)
Cont’d…
Highlights of the Evolution of the Intel Product Line:
Pentium 4
Includes additional floating-point and other enhancements for multimedia
Core
First Intel x86 micro-core
Core 2
Extends the Core architecture to 64 bits
Core 2 Quad provides four cores on a single chip
More recent Core offerings have up to 10 cores per chip
An important addition to the architecture was the Advanced Vector Extensions
instruction set
Cont’d…
ARM
Refers to a processor architecture that has evolved from RISC design principles
and is used in embedded systems
Family of RISC-based microprocessors and microcontrollers designed by ARM
Holdings, Cambridge, England.
Chips are high-speed processors that are known for their small die size and low
power requirements.
Probably the most widely used embedded processor architecture and indeed the
most widely used processor architecture of any kind in the world.
ARM originally stood for Acorn RISC Machine, and later Advanced RISC
Machine.
Cont’d…
ARM Products
Cortex-M
• Cortex-M0
• Cortex-M0+
• Cortex-M3
• Cortex-M4
Cortex-R
Cortex-A/Cortex-A50
1.2 Performance Issues
Designing for Performance
➔ Microprocessor Speed
➔ Performance Balance
➔ Improvements in Chip Organization and Architecture
Multicore, MICs, and GPGPUs
Two Laws that Provide Insight: Amdahl’s Law and Little’s Law
Basic Measures of Computer Performance
➔ Clock Speed, & Instruction Execution Rate
Calculating the Mean
➔ Arithmetic, Harmonic, and Geometric Mean
Benchmarks and SPEC
➔ Benchmark Principles, and SPEC Benchmarks
1.2.1 Designing for Performance
The cost of computer systems continues to drop dramatically, while the performance and capacity of
those systems continue to rise equally dramatically.
Today’s laptops have the computing power of an IBM mainframe from 10 or 15 years ago.
Processors are so inexpensive that we now have microprocessors in devices we throw away.
Desktop applications that require the great power of today’s microprocessor-based systems include:
➔ Image processing
➔ Three-dimensional rendering
➔ Speech recognition
➔ Video conferencing
➔ Multimedia authoring
➔ Voice and video annotation of files
➔ Simulation modeling
Businesses are relying on increasingly powerful servers to handle transaction and database processing
and to support massive client/server networks that have replaced the huge mainframe computer centers
of yesteryear.
Cloud service providers use massive high-performance banks of servers to satisfy
high-volume, high-transaction-rate applications for a broad spectrum of clients.
Microprocessor Speed
Techniques built into contemporary processors include:
Pipelining: Processor moves data or instructions into a conceptual pipe with all stages of
the pipe processing simultaneously
Branch prediction: Processor looks ahead in the instruction code fetched from memory
and predicts which branches, or groups of instructions, are likely to be processed next
Superscalar execution: This is the ability to issue more than one instruction in every
processor clock cycle. (In effect, multiple parallel pipelines are used.)
Data flow analysis: Processor analyzes which instructions are dependent on each other’s
results, or data, to create an optimized schedule of instructions
Speculative execution: Using branch prediction and data flow analysis, some processors
speculatively execute instructions ahead of their actual appearance in the program
execution, holding the results in temporary locations, keeping execution engines as busy as
possible
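The benefit of pipelining can be quantified with the usual ideal-pipeline cycle count: a k-stage pipe takes k cycles to fill, then completes one instruction per cycle. A minimal sketch (function names are made up; real pipelines also stall on hazards, which this ignores):

```python
def pipelined_cycles(n, k):
    """Cycles to finish n instructions on an ideal k-stage pipeline:
    k cycles for the first instruction, then one per cycle after."""
    return k + (n - 1)

def unpipelined_cycles(n, k):
    """Cycles if each instruction runs through all k stages alone."""
    return n * k

# 100 instructions on a 5-stage pipeline: 104 cycles vs 500 without,
# a speedup approaching k as n grows.
print(pipelined_cycles(100, 5), unpipelined_cycles(100, 5))
```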
Performance Balance
Adjust the organization and architecture to compensate for the mismatch among the
capabilities of the various components
Architectural examples include:
Increase the number of bits that are retrieved at one time by making DRAMs “wider” rather
than “deeper” and by using wide bus data paths
Change the DRAM interface to make it more efficient by including a cache or other
buffering scheme on the DRAM chip.
Reduce the frequency of memory access by incorporating increasingly complex and
efficient cache structures between the processor and main memory. This includes the
incorporation of one or more caches on the processor chip as well as on an off-chip cache
close to the processor chip.
Increase the interconnect bandwidth between processors and memory by using higher-speed
buses and a hierarchy of buses to buffer and structure data flow.
Cont’d…
Figure 1.14 Typical I/O Device Data Rates
Improvements in Chip Organization and Architecture
There are three approaches to achieving increased processor speed:
1. Increase hardware speed of the processor
✔ Fundamentally due to shrinking logic gate size: more gates, packed more tightly, increase the clock rate
✔ Propagation time for signals is reduced
2. Increase size and speed of caches
✔ Dedicating part of the processor chip to caches makes cache access times drop significantly
3. Change processor organization and architecture
✔ Increase effective speed of instruction execution
✔ Parallelism
Cont’d…
Problems with Clock Speed and Logic Density
Power
➔ Power density increases with density of logic and clock speed
➔ Dissipating heat becomes difficult
RC delay
➔ Speed at which electrons flow is limited by the resistance and capacitance of the metal wires connecting them
➔ Delay increases as the RC product increases
➔ As components on the chip decrease in size, the wire interconnects become thinner, increasing resistance
➔ Also, the wires are closer together, increasing capacitance
Memory latency and throughput
➔ Memory access speed (latency) and transfer speed (throughput) lag processor speeds
Cont’d…
Figure 1.15 Processor Trends
1.2.2 Multicore, MICs, and GPGPUs
Multicore
The use of multiple processors on the same chip provides the potential to
increase performance without increasing the clock rate
Strategy is to use two simpler processors on the chip rather than one more
complex processor
With two processors larger caches are justified
As caches became larger it made performance sense to create two and then
three levels of cache on a chip
Cont’d…
Many Integrated Core (MIC)
Offers a leap in performance as well as challenges in developing software to
exploit such a large number of cores
The multicore and MIC strategy involves a homogeneous collection of
general-purpose processors on a single chip
Graphics Processing Unit (GPU)
Core designed to perform parallel operations on graphics data
Traditionally found on a plug-in graphics card, it is used to encode and render
2D and 3D graphics as well as process video
Used as vector processors for a variety of applications that require repetitive
computations
Cont’d…
Since GPUs perform parallel operations on multiple sets of data, they are
increasingly being used as vector processors for a variety of applications
that require repetitive computations.
This blurs the line between the GPU and the CPU.
When a broad range of applications are supported by such a processor, the
term general-purpose computing on GPUs (GPGPU) is used.
1.2.3 Two Laws that Provide Insight: Amdahl’s Law and Little’s Law
Amdahl’s Law
Gene Amdahl
Deals with the potential speedup of a program using multiple processors
compared to a single processor
Illustrates the problems facing industry in the development of multicore
machines
➔ Software must be adapted to a highly parallel execution environment to exploit the power of parallel processing
Can be generalized to evaluate and design technical improvements in a
computer system
Cont’d…
Figure 1.16 Illustration of Amdahl’s Law
Cont’d…
Amdahl’s Law
Speedup = (time to execute program on a single processor) / (time to execute program on N parallel processors)
        = [T(1 - f) + Tf] / [T(1 - f) + Tf/N]
        = 1 / [(1 - f) + f/N]
where T is the total execution time on a single processor and f is the fraction
of the computation that can be parallelized.
Two important conclusions can be drawn:
1. When f is small, the use of parallel processors has little effect.
2. As N approaches infinity, speedup is bound by 1/(1 - f), so that there are
diminishing returns for using more processors.
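Both conclusions can be checked numerically from the speedup formula. A minimal sketch (the function name is made up):

```python
def amdahl_speedup(f, n):
    """Amdahl's Law: speedup of a program whose parallelizable
    fraction is f when run on n processors."""
    return 1.0 / ((1.0 - f) + f / n)

# Even with 90% of the work parallelizable, 10 processors give
# only about a 5.3x speedup, and no processor count exceeds 10x.
print(amdahl_speedup(0.9, 10))
print(amdahl_speedup(0.9, 10**9))
```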
Cont’d…
Figure 1.17 Amdahl’s Law for Multiprocessors
Cont’d…
Little’s Law
Fundamental and simple relation with broad applications
Can be applied to almost any system that is statistically in steady state, and in which there is
no leakage.
Queuing system
➔ If the server is idle, an arriving item is served immediately; otherwise the item joins a queue
➔ There can be a single queue for a single server or for multiple servers, or multiple queues with one for each of multiple servers
Average number of items in a queuing system equals the average rate at which
items arrive multiplied by the average time that an item spends in the system
➔ Relationship requires very few assumptions
➔ Because of its simplicity and generality it is extremely useful
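The statement above is the whole law: L = λW. A one-line sketch (function and parameter names are made up):

```python
def items_in_system(arrival_rate, time_in_system):
    """Little's Law, L = lambda * W: average number of items in a
    steady-state queuing system, given the average arrival rate and
    the average time an item spends in the system."""
    return arrival_rate * time_in_system

# Made-up example: requests arriving at 10 per second, each spending
# 0.5 s in the system, means 5 requests in the system on average.
print(items_in_system(10.0, 0.5))
```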
1.2.4 Basic Measures of Computer Performance
Clock Speed
Figure 1.18 System Clock
Cont’d…
Table 1.4 Performance Factors and System Attributes
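The performance factors in Table 1.4 combine into the standard execution-time equation T = Ic × CPI × τ (instruction count times average cycles per instruction times cycle time, where τ = 1/f). A minimal sketch with made-up function names and example numbers:

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """Execution time T = Ic * CPI * tau, with tau = 1 / clock rate."""
    return instruction_count * cpi / clock_rate_hz

def mips_rate(clock_rate_hz, cpi):
    """MIPS rate = f / (CPI * 10^6)."""
    return clock_rate_hz / (cpi * 1e6)

# 2 million instructions at an average CPI of 2 on a 1 GHz clock:
# 4 ms of execution time and a 500 MIPS rate.
print(cpu_time(2_000_000, 2, 1e9), mips_rate(1e9, 2))
```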
1.2.5 Calculating the Mean
The use of benchmarks to compare systems involves calculating the mean
value of a set of data points related to execution time.
The three common formulas used for calculating a mean are:
• Arithmetic
• Geometric
• Harmonic
Cont’d…
(a) Constant (11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11)
(b) Clustered around a central value (3, 5, 6, 6, 7, 7, 7, 8, 8, 9, 11)
(c) Uniform distribution (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
(d) Large-number bias (1, 4, 4, 7, 7, 9, 9, 10, 10, 11, 11)
(e) Small-number bias (1, 1, 2, 2, 3, 3, 5, 5, 8, 8, 11)
(f) Upper outlier (11, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
(g) Lower outlier (1, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11)
MD = median
AM = arithmetic mean
GM = geometric mean
HM = harmonic mean
Figure 1.19 Comparison of Means on Various Data Sets (each set has a maximum data point value of 11)
Cont’d…
An Arithmetic Mean (AM) is an appropriate measure if the sum of all the measurements
is a meaningful and interesting value
The AM is a good candidate for comparing the execution time performance of several
systems
For example, suppose we were interested in using a system for large-scale simulation studies and
wanted to evaluate several alternative products. On each system we could run the simulation
multiple times with different input values for each run, and then take the average execution time
across all runs. The use of multiple runs with different inputs should ensure that the results are not
heavily biased by some unusual feature of a given input set. The AM of all the runs is a good
measure of the system’s performance on simulations, and a good number to use for system
comparison.
The AM used for a time-based variable, such as program execution time, has the
important property that it is directly proportional to the total time
➔ If the total time doubles, the mean value doubles
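The three means compared in Figure 1.19 can be computed directly. A minimal sketch (function names are made up):

```python
import math

def arithmetic_mean(xs):
    """Sum of the values divided by their count; proportional to
    total time, so suited to execution-time data."""
    return sum(xs) / len(xs)

def geometric_mean(xs):
    """nth root of the product; suited to normalized ratios."""
    return math.prod(xs) ** (1 / len(xs))

def harmonic_mean(xs):
    """Reciprocal of the mean reciprocal; suited to rates."""
    return len(xs) / sum(1 / x for x in xs)

# For any non-constant data set, HM <= GM <= AM.
data = [1, 2, 4]
print(arithmetic_mean(data), geometric_mean(data), harmonic_mean(data))
```

On the data sets of Figure 1.19 the three formulas diverge most on the outlier sets (f) and (g), which is why the choice of mean matters when summarizing benchmarks.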
Cont’d…
Table 1.5 A Comparison of Arithmetic and Harmonic Means for Rates
Cont’d…
Table 1.6 A Comparison of Arithmetic and Geometric Means for Normalized Results
(a) Results normalized to Computer A
(b) Results normalized to Computer B
Cont’d…
Table 1.7 Another Comparison of Arithmetic and Geometric Means for Normalized Results
(a) Results normalized to Computer A
(b) Results normalized to Computer B
1.2.6 Benchmarks and SPEC
Benchmark Principles
Desirable characteristics of a benchmark program:
1. It is written in a high-level language, making it portable across different
machines
2. It is representative of a particular kind of programming domain or paradigm,
such as systems programming, numerical programming, or commercial
programming
3. It can be measured easily
4. It has wide distribution
System Performance Evaluation Corporation (SPEC)
Benchmark suite
➔ A collection of programs, defined in a high-level language
➔ Together attempt to provide a representative test of a computer in a particular application or system programming area
SPEC
➔ An industry consortium
➔ Defines and maintains the best-known collection of benchmark suites aimed at evaluating computer systems
➔ Performance measurements are widely used for comparison and research purposes
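SPEC results are formed by taking, for each benchmark, the ratio of the reference machine's time to the system under test's time, and then combining the per-benchmark ratios with a geometric mean. A sketch with made-up function names and numbers:

```python
import math

def spec_ratio(reference_time, measured_time):
    """Ratio for one benchmark: reference-machine execution time
    divided by the system under test's execution time (bigger is
    better)."""
    return reference_time / measured_time

def overall_metric(ratios):
    """Combine per-benchmark ratios with a geometric mean."""
    return math.prod(ratios) ** (1 / len(ratios))

# Made-up numbers: a system that is 2x the reference machine on one
# benchmark and 8x on another earns an overall metric of 4.
ratios = [spec_ratio(100.0, 50.0), spec_ratio(800.0, 100.0)]
print(overall_metric(ratios))
```

The geometric mean is used here rather than the arithmetic mean because, as Tables 1.6 and 1.7 illustrate, it gives consistent rankings regardless of which machine the results are normalized to.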
Cont’d…
SPEC CPU2006
Best known SPEC benchmark suite
Industry standard suite for processor intensive applications
Appropriate for measuring performance for applications that spend most of their
time doing computation rather than I/O
Consists of 17 floating point programs written in C, C++, and Fortran and 12
integer programs written in C and C++
Suite contains over 3 million lines of code
Fifth generation of processor intensive suites from SPEC
Cont’d…
Table 1.8 SPEC CPU2006
Integer Benchmarks
Cont’d…
Table 1.9 SPEC CPU2006
Floating-Point Benchmarks
Cont’d…
Terms Used in SPEC Documentation
Benchmark
➔ A program written in a high-level language that can be compiled and executed on any computer that implements the compiler
System under test
➔ This is the system to be evaluated
Reference machine
➔ This is a system used by SPEC to establish a baseline performance for all benchmarks
➔ Each benchmark is run and measured on this machine to establish a reference time for that benchmark
Base metric
➔ These are required for all reported results and have strict guidelines for compilation
Peak metric
➔ This enables users to attempt to optimize system performance by optimizing the compiler output
Speed metric
➔ This is simply a measurement of the time it takes to execute a compiled benchmark
➔ Used for comparing the ability of a computer to complete single tasks
Rate metric
➔ This is a measurement of how many tasks a computer can accomplish in a certain amount of time
➔ This is called a throughput, capacity, or rate measure
➔ Allows the system under test to execute simultaneous tasks to take advantage of multiple processors
Cont’d…
Figure 1.20 SPEC Evaluation Flowchart
Cont’d…
Table 1.10 Some SPEC CINT2006 Results
(a) Sun Blade 1000
Cont’d…
1.3 Computer System
A Top-Level View of Computer Function and Interconnection
Computer Components
Computer Function
➔ Instruction Fetch and Execute
➔ Interrupts
➔ I/O Function
Interconnection Structures
Bus Interconnection
Cont’d…
At a top level, a computer consists of CPU (central processing unit),
memory, and I/O components, with one or more modules of each type.
These components are interconnected in some fashion to achieve the basic
function of the computer, which is to execute programs.
Thus, at a top level, we can characterize a computer system by describing:
1. The external behavior of each component, that is, the data and control
signals that it exchanges with other components, and
2. The interconnection structure and the controls required to manage the
use of the interconnection structure.
1.3.1 Computer Components
Contemporary computer designs are based on concepts developed by John von
Neumann at the Institute for Advanced Studies, Princeton
Referred to as the von Neumann architecture and is based on three key concepts:
➔ Data and instructions are stored in a single read-write memory
➔ The contents of this memory are addressable by location, without regard to the type of data contained there
➔ Execution occurs in a sequential fashion (unless explicitly modified) from one instruction to the next
Hardwired program
➔ The result of the process of connecting the various components in the desired configuration
Cont’d…
(a) Programming in hardware
(b) Programming in software
Figure 1.21 Hardware and Software Approaches
Cont’d…
Software
A sequence of codes or instructions
Part of the hardware interprets each instruction and generates control
signals
Provide a new sequence of codes for each new program instead of rewiring
the hardware
Cont’d…
Major components:
CPU
➔ Instruction interpreter
➔ Module of general-purpose arithmetic and logic functions
I/O Components
➔ Input module
✔ Contains basic components for accepting data and instructions and converting them into an internal form of signals usable by the system
➔ Output module
✔ Means of reporting results
Memory and I/O interface registers:
Memory address register (MAR): Specifies the address in memory for the next read or write
Memory buffer register (MBR): Contains the data to be written into memory or receives the data read from memory
I/O address register (I/OAR): Specifies a particular I/O device
I/O buffer register (I/OBR): Used for the exchange of data between an I/O module and the CPU
Cont’d…
Figure 1.21 Hardware and Software Approaches
1.3.2 Computer Function
The basic function performed by a computer is execution of a program,
which consists of a set of instructions stored in memory.
The processor does the actual work by executing instructions specified in
the program.
Instruction processing consists of two steps:
➔ The processor reads (fetches) an instruction from memory, and then
➔ The processor executes the instruction.
Program execution consists of repeating this process of instruction fetch
and instruction execution, one instruction at a time.
Figure 1.22 Basic Instruction Cycle
Cont’d…
Instruction Fetch and Execute
At the beginning of each instruction cycle the processor fetches an
instruction from memory
The program counter (PC) holds the address of the instruction to be fetched
next
The processor increments the PC after each instruction fetch so that it will
fetch the next instruction in sequence
The fetched instruction is loaded into the instruction register (IR)
The processor interprets the instruction and performs the required action
95
Cont’d…
These actions fall into four categories:
Processor-memory: Data transferred from processor to memory or from
memory to processor.
Processor-I/O: Data transferred to or from a peripheral device by
transferring between the processor and an I/O module.
Data processing: The processor may perform some arithmetic or logic
operation on data.
Control: An instruction may specify that the sequence of execution be
altered.
96
Cont’d…
(a) Instruction format
(b) Integer format
(c) Internal CPU registers:
➔ Program counter (PC) = Address of instruction
➔ Instruction register (IR) = Instruction being executed
➔ Accumulator (AC) = Temporary storage
(d) Partial list of opcodes:
➔ 0001 = Load AC from memory
➔ 0010 = Store AC to memory
➔ 0101 = Add to AC from memory
Figure 1.23 Characteristics of a Hypothetical Machine 97
Cont’d…
Figure 1.24 Example of Program Execution (contents of memory and registers in hexadecimal) 98
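A minimal Python simulator makes the fetch-execute cycle of this hypothetical machine concrete. The bit layout (4-bit opcode and 12-bit address in a 16-bit word) and the sample program and data values below are assumptions in the spirit of Figures 1.23 and 1.24, not taken verbatim from the slides; halting on an unknown opcode is also an assumption of this sketch.

```python
# Minimal simulator of the hypothetical machine of Figure 1.23.
# Assumes 16-bit words: the top 4 bits are the opcode, the low
# 12 bits are a memory address.
LOAD, STORE, ADD = 0x1, 0x2, 0x5     # opcodes from Figure 1.23(d)

def run(memory, pc):
    ac = 0                            # accumulator (AC)
    while True:
        ir = memory[pc]               # fetch: instruction into IR
        pc += 1                       # increment PC after each fetch
        opcode, addr = ir >> 12, ir & 0xFFF
        if opcode == LOAD:            # 0001: Load AC from memory
            ac = memory[addr]
        elif opcode == STORE:         # 0010: Store AC to memory
            memory[addr] = ac
        elif opcode == ADD:           # 0101: Add to AC from memory
            ac += memory[addr]
        else:                         # unknown opcode: halt (assumption)
            break
    return memory

# Illustrative 3-instruction program:
#   AC <- [0x940]; AC <- AC + [0x941]; [0x941] <- AC
mem = {0x300: 0x1940, 0x301: 0x5941, 0x302: 0x2941,
       0x303: 0x0000,                 # unused opcode 0000 halts the loop
       0x940: 0x0003, 0x941: 0x0002}
run(mem, 0x300)
print(hex(mem[0x941]))  # 0x5  (3 + 2 stored back to memory)
```

Each loop iteration is one instruction cycle: the instruction at the PC is fetched into the IR, the PC is incremented, and the opcode field selects the action performed on the AC.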
Cont’d…
Figure 1.25 Instruction Cycle State Diagram
99
Interrupts
Table 1.11 Classes of Interrupts
100
Cont’d…
(a) No interrupts (b) Interrupts; short I/O wait (c) Interrupts; long I/O wait
* = interrupt occurs during course of execution of user program
Figure 1.25 Program Flow of Control without and with Interrupts 101
Cont’d…
Figure 1.26 Transfer of Control via Interrupts 102
Cont’d…
Figure 1.27 Instruction Cycle with Interrupts
103
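The interrupt cycle of Figure 1.27 can be sketched as follows (an illustration only; the function names and list-of-strings program representation are invented for this sketch). After each instruction the processor checks for a pending interrupt; if one is found, it saves the context of the current program, transfers control to the handler, and then resumes the interrupted program:

```python
# Sketch of program flow with interrupts (cf. Figures 1.26 and 1.27).
# Instructions are represented as strings; the returned trace shows
# the order in which they were executed.
def run_with_interrupts(program, handler, interrupt_after):
    trace = []
    pc = 0
    while pc < len(program):
        trace.append(program[pc])   # execute one user instruction
        pc += 1
        if pc == interrupt_after:   # interrupt signal pending?
            saved_pc = pc           # save context (return address)
            trace.extend(handler)   # transfer control to the handler
            pc = saved_pc           # restore context and resume
    return trace

user = ["i1", "i2", "i3", "i4"]
isr = ["save-state", "service-io", "restore-state"]
print(run_with_interrupts(user, isr, interrupt_after=2))
# ['i1', 'i2', 'save-state', 'service-io', 'restore-state', 'i3', 'i4']
```

The key point is that the user program contains no call to the handler: the processor itself suspends the program at an instruction boundary and later resumes it exactly where it left off.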
Cont’d…
Figure 1.28 Program Timing: Short I/O Wait
104
Cont’d…
Figure 1.29 Program Timing: Long I/O Wait
105
Cont’d…
Figure 1.30 Instruction Cycle State Diagram, with Interrupts
106
Cont’d…
Figure 1.31 Transfer of Control with Multiple Interrupts 107
Cont’d…
Figure 1.32 Example Time Sequence of Multiple Interrupts 108
I/O Function
An I/O module can exchange data directly with the processor
The processor can read data from or write data to an I/O module
➔ The processor identifies a specific device that is controlled by a particular I/O module
➔ Uses I/O instructions rather than memory-referencing instructions
In some cases it is desirable to allow I/O exchanges to occur directly with memory
➔ The processor grants to an I/O module the authority to read from or write to memory, so that the I/O-memory transfer can occur without tying up the processor
➔ The I/O module issues read or write commands to memory, relieving the processor of responsibility for the exchange
➔ This operation is known as direct memory access (DMA)
109
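The two I/O styles above can be contrasted in a short sketch (the function names and list-based memory model are illustrative, not from the slides). In programmed I/O every word passes through a processor register; with DMA the I/O module writes memory directly and the processor is free during the transfer:

```python
# Programmed I/O: the processor moves each word itself, so it is
# tied up for the duration of the transfer.
def programmed_io(device_buffer, memory, base):
    for i, word in enumerate(device_buffer):
        register = word               # processor reads from I/O module
        memory[base + i] = register   # processor writes to memory

# DMA: the I/O module issues write commands to memory directly;
# the processor only grants authority for the transfer.
def dma_transfer(device_buffer, memory, base):
    memory[base:base + len(device_buffer)] = device_buffer

mem = [0] * 8
dma_transfer([7, 8, 9], mem, base=2)
print(mem)  # [0, 0, 7, 8, 9, 0, 0, 0]
```

Both end with the same memory contents; the difference is who does the work, which is exactly what DMA is designed to change.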
1.3.3 Interconnection Structures
Figure 1.33 Computer Modules 110
Cont’d…
The interconnection structure must support the following types of transfers:
➔ Memory to processor: the processor reads an instruction or a unit of data from memory
➔ Processor to memory: the processor writes a unit of data to memory
➔ I/O to processor: the processor reads data from an I/O device via an I/O module
➔ Processor to I/O: the processor sends data to the I/O device
➔ I/O to or from memory: an I/O module is allowed to exchange data directly with memory, without going through the processor, using direct memory access
111
1.3.4 Bus Interconnection
A communication pathway connecting two or more devices
➔
Key characteristic is that it is a shared transmission medium
Signals transmitted by any one device are available for reception by all other devices
attached to the bus
➔
If two devices transmit during the same time period their signals will overlap and
become garbled
Typically consists of multiple communication lines
➔
Each line is capable of transmitting signals representing binary 1 and binary 0
Computer systems contain a number of different buses that provide pathways between
components at various levels of the computer system hierarchy
System bus
➔
A bus that connects major computer components (processor, memory, I/O)
The most common computer interconnection structures are based on the use of one or
more system buses 112
Cont’d…
Data Bus
Data lines that provide a path for moving data among system modules
May consist of 32, 64, 128, or more separate lines
The number of lines is referred to as the width of the data bus
The number of lines determines how many bits can be transferred at a time
The width of the data bus is a key factor in determining overall system
performance
113
Address Bus
Used to designate the source or destination of the data on the data bus
➔ If the processor wishes to read a word of data from memory, it puts the address of the desired word on the address lines
Width determines the maximum possible memory capacity of the system
Also used to address I/O ports
➔ The higher-order bits are used to select a particular module on the bus, and the lower-order bits select a memory location or I/O port within the module
Control Bus
Used to control the access and the use of the data and address lines
Because the data and address lines are shared by all components, there must be a means of controlling their use
Control signals transmit both command and timing information among system modules
➔ Timing signals indicate the validity of data and address information
➔ Command signals specify operations to be performed
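The effect of bus widths is simple arithmetic, shown here with illustrative values (16 address lines and 32 data lines are assumptions for the example, not figures from the slides):

```python
# Back-of-envelope consequences of bus widths.
address_lines = 16
data_lines = 32

# Each extra address line doubles the number of addressable locations,
# so n address lines give 2**n distinct addresses.
max_addressable = 2 ** address_lines

# The data bus width is the number of bits moved in one bus transfer.
bits_per_transfer = data_lines

print(max_addressable)    # 65536 addressable locations
print(bits_per_transfer)  # 32 bits per transfer
```

This is why address-bus width bounds the maximum memory capacity of the system, while data-bus width is a key factor in overall transfer performance.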
Cont’d…
Figure 1.34 Bus Interconnection Scheme 115
Thank You !