Comp 255 - Computer Organization - Processors

(12/12/2003)

"A computer is an electronic (?) device operating under control of instructions stored in its own memory unit that
  1. accepts data (input)
  2. processes data arithmetically and logically
  3. displays information (output) from the processing and/or
  4. stores the results for future use"

Processors or Central Processing Unit

The Central Processing Unit is the brain of the computer. It fetches and executes instructions stored in Main Memory. It is made up of a number of sub-components.

Instruction Execution

Fetch-Decode-Execute Cycle
  1. Fetch next instruction from memory to instruction register
  2. Increment program counter
  3. Decode instruction
  4. Fetch operand(s) (optional)
  5. Execute Instruction
  6. Write Back Results (optional)
and repeat
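
The cycle above can be sketched as a short interpreter loop. This is only an illustration, not any real machine: the single accumulator, the four opcodes, and the encoding (opcode * 100 + address) are all assumptions invented for the example.

```python
# Hypothetical single-accumulator machine. The opcode numbers and the
# encoding (opcode * 100 + address) are invented for this sketch.
HALT, LOAD, ADD, STORE = 0, 1, 2, 3

def run(memory):
    pc = 0    # program counter
    acc = 0   # accumulator
    while True:
        ir = memory[pc]                  # 1. fetch into instruction register
        pc += 1                          # 2. increment program counter
        opcode, addr = divmod(ir, 100)   # 3. decode
        if opcode == HALT:
            return memory
        if opcode in (LOAD, ADD):
            operand = memory[addr]       # 4. fetch operand (optional step)
        if opcode == LOAD:               # 5. execute
            acc = operand
        elif opcode == ADD:
            acc += operand
        elif opcode == STORE:
            memory[addr] = acc           # 6. write back result (optional step)
        else:
            raise ValueError(f"unknown opcode {opcode}")

# Program in cells 0-3, data in cells 10-12: mem[12] = mem[10] + mem[11]
program = [LOAD * 100 + 10, ADD * 100 + 11, STORE * 100 + 12, HALT * 100]
memory = program + [0] * 6 + [7, 35, 0]
print(run(memory)[12])
```

Note that program and data share one memory, so nothing stops a STORE from overwriting an instruction.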


Von Neumann Architectures

G. Blaauw and F. Brooks in their book Computer Architecture list seven criteria for von Neumann architectures (p. 590).
  1. Single stream instructions sequenced by an instruction counter
  2. Instructions stored in addressable memory
  3. Instructions encoded as numbers - modifiable by arithmetic operations
  4. Radix 2
  5. Word length long enough for scientific calculation
  6. Single address - single operation instructions
  7. Single accumulator with Multiplier-Quotient register

CPU Organization - The Data Path

             registers    parallel
        bus  +-------+     buses
         +-> |       | -> |  -> |     Data flows clockwise
         |   +-------+    |     |     
         |   |       | -> |  -> |
         |   +-------+    |     |     Data is gated from
         |   |       | -> |  -> |     the registers thru
         |   +-------+    |     |     the parallel buses
         |     . . .      |     |     to the ALU. The
         |                |     |     result is stored back
         |   +-------+    |     |     to a register
         |-> | M A R |    |     | 
         |   +-------+    |     |     MAR and MBR registers
         |-> | M B R | -> |  -> |     provide access to  
         |   +-------+    |     |     Main Memory
         |               +-+   +-+ 
         |               | '---' |    Instructions control 
         |               \ A L U /    flow of data thru 
         |                +-----+     data path
         |                   |
         +<------------------+
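
One trip around the data path can be sketched as follows. The register names, the bus variables, and the set of ALU operations are hypothetical, chosen only to mirror the diagram; access to Main Memory through the MAR and MBR is left out.

```python
import operator

# Hypothetical ALU operations for this sketch.
ALU_OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "or": operator.or_,
}

def data_path_cycle(registers, src_a, src_b, op, dest):
    """One data path cycle: two registers -> buses -> ALU -> back to a register."""
    bus_a = registers[src_a]             # gate first register onto one bus
    bus_b = registers[src_b]             # gate second register onto the other bus
    result = ALU_OPS[op](bus_a, bus_b)   # ALU combines the two inputs
    registers[dest] = result             # result is stored back to a register
    return result

regs = {"R0": 5, "R1": 9, "R2": 0}
data_path_cycle(regs, "R0", "R1", "add", "R2")
print(regs["R2"])
```

The arguments `src_a`, `src_b`, `op` and `dest` stand in for the control signals that a decoded instruction would supply.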


RISC versus CISC

The "semantic gap" is the gap between machine code (i.e., what the instruction set of a computer can do) and high-level languages (what the programmer wants to express). During the 1960s and 1970s the approach to closing this gap was to add more and more complex instructions, using micro-code to implement them. Micro-coding kept the cost down (the hardware could be simpler) and provided flexibility (bugs were easy to correct and new instructions could be added). The cost of slower individual instruction execution (instructions were interpreted by the micro-code) was offset by faster execution of programs, which needed fewer instructions.

However, in the 1980s people experimented with another approach: close the gap by designing very fast, very simple machines which would promote the writing of very efficient compilers. In other words, instead of raising the hardware, lower the software.

The Instruction Set (of a level) is the set of all instructions available to the programmer at that level. Instruction Set Architectures (level 2) fall into one of two categories. CISC, or Complex Instruction Set Computers, typically have many complex instructions. They are usually implemented in micro-code. RISC, or Reduced Instruction Set Computers (sometimes read as Reduced Instruction Set Complexity), tend to have fewer instructions or instructions which are less complex. They are implemented directly in hardware.

RISC Design Principles (or Design Principles for Modern Computers)


Instruction-Level Parallelism

Pipelining

Diagram - 5 stage pipeline

    s1          s2           s3          s4          s5
 +------+    +------+    +-------+    +------+    +------+
 |instr |    |instr |    |operand|    |instr |    |write |
 |fetch |--->|decode|--->| fetch |--->| exe  |--->| back |
 |unit  |    | unit |    |  unit |    | unit |    | unit |
 +------+    +------+    +-------+    +------+    +------+
 

A five-stage pipeline showing how instructions [1] - [9] progress through the pipeline

s1: |[1]|[2]|[3]|[4]|[5]|[6]|[7]|[8]|[9]|
s2: |   |[1]|[2]|[3]|[4]|[5]|[6]|[7]|[8]|
s3: |   |   |[1]|[2]|[3]|[4]|[5]|[6]|[7]|
s4: |   |   |   |[1]|[2]|[3]|[4]|[5]|[6]|
s5: |   |   |   |   |[1]|[2]|[3]|[4]|[5]|

time  1   2   3   4   5   6   7   8   9

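Note the fivefold increase in instruction throughput once the pipeline is full: assuming each stage takes one cycle, one instruction completes every cycle instead of one every five cycles.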
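
The timing in the table can be checked with a little arithmetic. The sketch below assumes an ideal pipeline: every stage takes exactly one cycle and there are no stalls. The function names are made up for the example.

```python
def pipelined_cycles(n_instructions, n_stages=5):
    """Ideal pipeline: fill it once, then finish one instruction per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1

def unpipelined_cycles(n_instructions, n_stages=5):
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * n_stages

def instruction_in_stage(stage, time):
    """Instruction occupying `stage` at `time` (both 1-based), or None if empty."""
    instr = time - stage + 1
    return instr if instr >= 1 else None

for n in (9, 1000):
    ratio = unpipelined_cycles(n) / pipelined_cycles(n)
    print(f"{n} instructions: speedup {ratio:.2f}")
```

For the 9 instructions in the table the speedup is only about 3.5, because the first four cycles are spent filling the pipeline. For long instruction streams it approaches, but never reaches, 5.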
 

Superscalar Architecture: If one pipeline is good, two are better, and four are even better! But replicating the whole pipeline four times is too expensive, so the idea is to have one pipeline with multiple functional units in stage S4 of the above diagram. This makes sense because the instruction execution units are usually the slowest stage.
                                          s4
                                       +------+
                                       | ALU  |
                                       +------+
                                       +------+
                                       | ALU  |
                                       +------+
    s1          s2           s3        +------+        s5
 +------+    +------+    +-------+     | LOAD |     +------+
 |instr |    |instr |    |operand|     +------+     |write |
 |fetch |--->|decode|--->| fetch |-->  +------+  -->| back |
 |unit  |    | unit |    |  unit |     |STORE |     | unit |
 +------+    +------+    +-------+     +------+     +------+
                                       +------+
                                       | FPU  |
                                       +------+
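
To see why extra functional units help, here is a rough sketch of stage S4 only. It assumes in-order issue of at most one instruction per cycle and a fixed latency per instruction. The unit names and latencies are invented for the example and do not describe any real processor.

```python
def execute_cycles(instructions, units):
    """Cycle in which the last instruction leaves stage S4.

    instructions: list of (unit_kind, latency) pairs, issued in order,
                  at most one per cycle.
    units: dict mapping unit_kind to the number of copies of that unit.
    """
    # For every copy of every unit: the first cycle in which it is free.
    free_at = {kind: [1] * count for kind, count in units.items()}
    last_issue = 0
    finish = 0
    for kind, latency in instructions:
        copies = free_at[kind]
        i = min(range(len(copies)), key=copies.__getitem__)  # earliest-free copy
        start = max(last_issue + 1, copies[i])  # stall until that copy is free
        copies[i] = start + latency
        last_issue = start
        finish = max(finish, start + latency - 1)
    return finish

# Four ALU instructions that each occupy S4 for 2 cycles (assumed latency).
work = [("alu", 2)] * 4
print(execute_cycles(work, {"alu": 1}))  # one ALU: every second issue stalls
print(execute_cycles(work, {"alu": 2}))  # two ALUs: one issue per cycle
```

With a single ALU the four instructions need 8 cycles in S4. With two ALUs they need 5, because a new instruction can start every cycle while the previous one is still executing.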


Processor Level Parallelism

Flynn's Categories (1972)
