Computer System Architecture Lecture Notes
• A control unit, which interprets the instructions in memory and causes them to be
executed
• Input and output (I/O) equipment operated by the control unit
This structure was outlined in von Neumann’s earlier proposal, which is worth quoting at
this point:
First: Because the device is primarily a computer, it will have to perform the elementary
operations of arithmetic most frequently. At any rate a central arithmetical part of the
device will probably have to exist and this constitutes the first specific part: CA.
Second: The logical control of the device, that is, the proper sequencing of its operations,
can be most efficiently carried out by a central control organ. This central control and the
organs which perform it form the second specific part: CC.
Third: Any device which is to carry out long and complicated sequences of operations
(specifically of calculations) must have a considerable memory . . . At any rate, the total
memory constitutes the third specific part of the device: M.
Fourth: The device must have organs to transfer . . . information from R into its specific parts
C and M. These organs form its input, the fourth specific part: I.
Fifth: The device must have organs to transfer . . . from its specific parts C and M into R.
These organs form its output, the fifth specific part: O.
The control unit operates the IAS by fetching instructions from memory and executing them
one at a time. A more detailed structure diagram is shown in Figure 1.2. This figure reveals
that both the control unit and the ALU contain storage locations, called registers, defined as
follows:
• Memory buffer register (MBR): Contains a word to be stored in memory or sent to the I/O
unit, or is used to receive a word from memory or from the I/O unit.
• Memory address register (MAR): Specifies the address in memory of the word to be
written from or read into the MBR.
• Instruction register (IR): Contains the 8-bit opcode of the instruction being executed.
• Instruction buffer register (IBR): Holds temporarily the right-hand instruction from a
word in memory.
• Program counter (PC): Contains the address of the next instruction pair to be fetched
from memory.
• Accumulator (AC) and multiplier quotient (MQ): Hold temporarily operands and results
of ALU operations.
Year by year, the cost of computer systems continues to drop dramatically, while the
performance and capacity of those systems continue to rise equally dramatically. Desktop
applications that require the great power of today’s microprocessor-based systems include
• Image processing
• Speech recognition
• Videoconferencing
• Multimedia authoring
• Voice and video annotation of files
• Simulation modeling
Microprocessor Speed
The evolution of microprocessors continues to bear out Moore’s law. So long as this law
holds, chipmakers can unleash a new generation of chips every three years with four times
as many transistors (a quadrupling every three years is equivalent to a doubling of transistor
count roughly every 18 months). In microprocessors, the addition of new circuits, and the
speed boost that comes from reducing the distances between them, has improved
performance four- or fivefold every three years or so since Intel launched its x86 family in
1978. This raw speed is of little use unless the processor is fed a constant stream of work;
the more elaborate techniques built into contemporary processors for feeding the monster
include the following:
• Branch prediction: The processor looks ahead in the instruction code fetched from
memory and predicts which branches, or groups of instructions, are likely to be processed
next.
• Data flow analysis: The processor analyzes which instructions are dependent on
each other’s results, or data, to create an optimized schedule of instructions.
• Speculative execution: Using branch prediction and data flow analysis, some
processors speculatively execute instructions ahead of their actual appearance in the
program execution, holding the results in temporary locations.
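To give a concrete feel for the first of these techniques, the C sketch below simulates a 2-bit
saturating-counter branch predictor. It is a toy illustration only, not the mechanism of any
particular processor; the outcome trace and the initial "weakly taken" state are made up for
the example.

    #include <stdio.h>

    /* 2-bit saturating counter: states 0,1 predict "not taken"; 2,3 predict "taken".
       The starting state (weakly taken) is an arbitrary assumption for this sketch. */
    static int counter = 2;

    static int predict(void) { return counter >= 2; }

    static void update(int taken)        /* move toward the actual outcome, saturating at 0 and 3 */
    {
        if (taken && counter < 3)  counter++;
        if (!taken && counter > 0) counter--;
    }

    int main(void)
    {
        /* Hypothetical outcome history of one branch (1 = taken), e.g. a loop back-edge. */
        int trace[] = { 1, 1, 1, 1, 0, 1, 1, 1, 0, 1 };
        int n = (int)(sizeof trace / sizeof trace[0]);
        int correct = 0;

        for (int i = 0; i < n; i++) {
            int guess = predict();       /* guess before the branch actually resolves */
            if (guess == trace[i]) correct++;
            update(trace[i]);            /* then learn from the real outcome */
        }
        printf("correct predictions: %d of %d\n", correct, n);
        return 0;
    }

Because the branch is mostly taken, the counter settles in a "taken" state and most predictions
are correct; only the occasional not-taken outcome is mispredicted.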
Performance Balance:
While processor power has raced ahead at breakneck speed, other critical components of
the computer have not kept up. The result is a need to look for performance balance: an
adjusting of the organization and architecture to compensate for the mismatch among the
capabilities of the various components.
The interface between processor and main memory is the most crucial pathway in the entire
computer because it is responsible for carrying a constant flow of program instructions and
data between memory chips and the processor.
There are a number of ways that a system architect can attack this problem, all of which are
reflected in contemporary computer designs. Consider the following examples:
• Increase the number of bits that are retrieved at one time by making DRAMs “wider”
rather than “deeper” and by using wide bus data paths.
• Change the DRAM interface to make it more efficient by including a cache or other
buffering scheme on the DRAM chip.
• Reduce the frequency of memory access by incorporating increasingly complex and
efficient cache structures between the processor and main memory (a small illustration of
the effect follows this list).
• Increase the interconnect bandwidth between processors and memory by using
higher-speed buses and by using a hierarchy of buses to buffer and structure data flow.
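The payoff of caching comes from locality of reference. The C sketch below sums the same
large matrix twice, once along rows (consecutive addresses, cache-friendly) and once along
columns (large strides, cache-unfriendly). The matrix size is arbitrary and the measured gap is
entirely machine-dependent; the point is only that identical arithmetic can run at very different
speeds depending on how well the memory hierarchy is used.

    #include <stdio.h>
    #include <time.h>

    #define N 2048
    static double a[N][N];                 /* about 32 MB, far larger than typical caches */

    static double sum_rows(void)           /* row order: consecutive addresses */
    {
        double s = 0.0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                s += a[i][j];
        return s;
    }

    static double sum_cols(void)           /* column order: a stride of N doubles per access */
    {
        double s = 0.0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                s += a[i][j];
        return s;
    }

    int main(void)
    {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                a[i][j] = 1.0;

        clock_t t0 = clock();
        double s1 = sum_rows();
        clock_t t1 = clock();
        double s2 = sum_cols();
        clock_t t2 = clock();

        printf("row-wise    sum %.0f in %.3f s\n", s1, (double)(t1 - t0) / CLOCKS_PER_SEC);
        printf("column-wise sum %.0f in %.3f s\n", s2, (double)(t2 - t1) / CLOCKS_PER_SEC);
        return 0;
    }

On most machines the column-wise pass is noticeably slower, because nearly every access
misses in the cache and must wait on main memory.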
However, as clock speed and logic density increase, a number of obstacles become more
significant:
• Power: As the density of logic and the clock speed on a chip increase, so does the
power density.
• RC delay: The speed at which electrons can flow on a chip between transistors is
limited by the resistance and capacitance of the metal wires connecting them;
specifically, delay increases as the RC product increases. As components on the chip
decrease in size, the wire interconnects become thinner, increasing resistance. Also,
the wires are closer together, increasing capacitance (a rough worked example follows
this list).
• Memory latency: Memory speeds lag processor speeds.
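As a rough first-order sketch of the RC-delay point (the numbers are illustrative, not taken
from any real fabrication process): wire resistance is approximately R = ρL / (w · t), where ρ is
the resistivity, L the wire length, w its width, and t its thickness. If scaling halves the width
while length and thickness stay comparable, R roughly doubles; if the spacing to the
neighboring wire also halves, the coupling capacitance C roughly doubles as well. Since the
delay grows with the product RC, the wire delay grows roughly fourfold, even though the
transistors themselves have not slowed down.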
Beginning in the late 1980s, and continuing for about 15 years, two main strategies have
been used to increase performance beyond what can be achieved simply by increasing clock
speed. First, there has been an increase in cache capacity. Second, the instruction execution
logic within a processor has become increasingly complex to enable parallel execution of
instructions within the processor.
Two noteworthy design approaches have been pipelining and superscalar. A pipeline works
much as an assembly line in a manufacturing plant, enabling different stages of execution of
different instructions to occur at the same time along the pipeline. A superscalar approach
in essence allows multiple pipelines within a single processor so that instructions that do not
depend on one another can be executed in parallel.
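To make the assembly-line picture concrete, the short C sketch below prints a cycle-by-cycle
occupancy diagram for four instructions flowing through an idealized five-stage pipeline. The
stage names (IF, ID, EX, MEM, WB) are the classic textbook stages, used here only for
illustration; no hazards, stalls, or superscalar issue are modeled.

    #include <stdio.h>

    int main(void)
    {
        /* Classic five textbook stages; purely illustrative. */
        const char *stage[] = { "IF", "ID", "EX", "MEM", "WB" };
        const int n_stages = 5;
        const int n_instr  = 4;                     /* instructions i0..i3 */
        const int n_cycles = n_instr + n_stages - 1;

        for (int i = 0; i < n_instr; i++) {
            printf("i%d:", i);
            for (int c = 0; c < n_cycles; c++) {
                int s = c - i;                      /* stage held by instruction i in clock cycle c */
                printf(" %-4s", (s >= 0 && s < n_stages) ? stage[s] : "-");
            }
            printf("\n");
        }
        return 0;
    }

Each column of the output is one clock cycle; reading down a column shows different
instructions occupying different stages at the same time, which is exactly the overlap that
pipelining exploits.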
COMPUTER COMPONENTS
Virtually all contemporary computer designs are based on concepts developed by John von
Neumann at the Institute for Advanced Study, Princeton. Such a design is referred to as
the von Neumann architecture and is based on three key concepts:
• Data and instructions are stored in a single read-write memory.
• The contents of this memory are addressable by location, without regard to the type
of data contained there.
• Execution occurs in a sequential fashion (unless explicitly modified) from one
instruction to the next.
COMPUTER FUNCTIONS:
At the beginning of each instruction cycle, the processor fetches an instruction from
memory. The program counter (PC) holds the address of the instruction to be fetched next.
The processor always increments the PC after each instruction fetch so that it will fetch the
next instruction in sequence.
For example, consider a computer in which each instruction occupies one 16-bit word of
memory. If the program counter is set to location 300, the processor will next fetch the
instruction at location 300. On subsequent instruction cycles, it will fetch instructions from
locations 301, 302, 303, and so on.
The fetched instruction is loaded into a register in the processor known as the instruction
register (IR). The processor interprets the instruction and performs the required action. In
general, these actions fall into four categories:
• Processor-memory: Data may be transferred from the processor to memory or from
memory to the processor.
• Processor-I/O: Data may be transferred to or from a peripheral device by transferring
between the processor and an I/O module.
• Data processing: The processor may perform some arithmetic or logic operation on data.
• Control: An instruction may specify that the sequence of execution be altered. For
example, the processor may fetch an instruction from location 149, which specifies that
the next instruction be from location 182. The processor will remember this fact by
setting the program counter to 182. Thus, on the next fetch cycle, the instruction will be
fetched from location 182 rather than 150.
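The cycle just described can be captured in a few lines of code. The C sketch below is a
minimal, hypothetical machine invented for this illustration: each instruction is one 16-bit
word with a made-up format (4-bit opcode, 12-bit address), which is not the format of any
real processor discussed here. It fetches starting at location 300, increments the PC after
every fetch, and uses a control (jump) instruction to redirect execution to location 182,
mirroring the examples in the text.

    #include <stdio.h>
    #include <stdint.h>

    #define OP_HALT 0x0
    #define OP_LOAD 0x1   /* AC <- memory[addr]      */
    #define OP_ADD  0x2   /* AC <- AC + memory[addr] */
    #define OP_JUMP 0x3   /* PC <- addr              */

    static uint16_t memory[4096];

    int main(void)
    {
        /* Tiny hypothetical program starting at location 300. */
        memory[300] = (OP_LOAD << 12) | 500;   /* load the word at 500              */
        memory[301] = (OP_ADD  << 12) | 501;   /* add the word at 501               */
        memory[302] = (OP_JUMP << 12) | 182;   /* control: next instruction at 182  */
        memory[182] = (OP_HALT << 12);
        memory[500] = 3;
        memory[501] = 4;

        uint16_t pc = 300, ir = 0, ac = 0;
        int running = 1;

        while (running) {
            ir = memory[pc];                   /* fetch into the instruction register */
            pc = pc + 1;                       /* PC always advances after the fetch  */
            uint16_t op = ir >> 12, addr = ir & 0x0FFF;
            switch (op) {                      /* interpret and execute               */
            case OP_LOAD: ac = memory[addr];  break;
            case OP_ADD:  ac += memory[addr]; break;
            case OP_JUMP: pc = addr;          break;   /* overrides the increment     */
            default:      running = 0;        break;   /* halt                        */
            }
        }
        printf("AC = %u, halted after fetching from location %u\n",
               (unsigned)ac, (unsigned)(pc - 1));
        return 0;
    }

Running the sketch leaves 7 in the accumulator and shows the last fetch coming from
location 182, exactly the behavior the control category describes.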