[CS:APP-ch4] Processor Architecture

ISA --> SEQ --> PIPE

4.1 The Y86-64 Instruction Set Architecture (ISA)

Programmer-Visible State

  • Program registers
  • Condition codes
  • Program counter (PC)
  • Memory
  • Status code
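
As a rough illustration, the five pieces of state above could be grouped into one record. This is only a sketch: the class name Y86State, its field names, and the sparse dict used for memory are assumptions made for illustration; the register order and the status code values follow the Y86-64 definition.

```python
from dataclasses import dataclass, field

# Register names in Y86-64 ID order (IDs 0x0-0xE); ID 0xF means "no register".
REG_NAMES = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
             "r8", "r9", "r10", "r11", "r12", "r13", "r14"]

# Status codes: normal operation, halt executed, invalid address, invalid instruction.
AOK, HLT, ADR, INS = 1, 2, 3, 4


@dataclass
class Y86State:
    """Hypothetical container for the programmer-visible state."""
    regs: list = field(default_factory=lambda: [0] * 15)  # 15 program registers
    zf: bool = False   # condition codes: zero, sign, overflow
    sf: bool = False
    of: bool = False
    pc: int = 0        # address of the instruction being executed
    mem: dict = field(default_factory=dict)  # sparse memory (assumption)
    stat: int = AOK    # overall status of program execution


state = Y86State()
state.regs[REG_NAMES.index("rsp")] = 0x200  # set the stack pointer
```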

Y86-64 Instructions

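  1. Instruction byte: code (high-order 4 bits) and function (low-order 4 bits)
  2. Register specifier byte (only some instructions): rA, rB
  3. 8-byte constant word (only some instructions): immediate data, displacement, or destination address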
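
The three-part encoding above can be sketched as a small decoder. The helper name decode and the returned dict layout are assumptions for illustration; the instruction lengths per code and the little-endian constant follow the Y86-64 encoding.

```python
# Instruction length in bytes, keyed by icode (high 4 bits of the first byte).
# 1: instruction byte only, 2: + register byte, 9: + constant, 10: + both.
LENGTH = {0x0: 1, 0x1: 1, 0x2: 2, 0x3: 10, 0x4: 10, 0x5: 10,
          0x6: 2, 0x7: 9, 0x8: 9, 0x9: 1, 0xA: 2, 0xB: 2}


def decode(code: bytes) -> dict:
    """Split one encoded instruction into its fields (hypothetical helper)."""
    icode, ifun = code[0] >> 4, code[0] & 0xF
    n = LENGTH[icode]
    fields = {"icode": icode, "ifun": ifun, "length": n}
    pos = 1
    if n in (2, 10):   # register specifier byte present
        fields["rA"], fields["rB"] = code[1] >> 4, code[1] & 0xF
        pos = 2
    if n in (9, 10):   # 8-byte little-endian constant present
        fields["valC"] = int.from_bytes(code[pos:pos + 8], "little")
    return fields


# rmmovq %rsp, 0x123456789abcd(%rdx)
insn = bytes.fromhex("4042cdab896745230100")
print(decode(insn))
```

Here the first byte 0x40 gives code 4 (rmmovq) and function 0, the second byte 0x42 names %rsp (4) and %rdx (2), and the remaining eight bytes hold the displacement in byte-reversed order.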

Aside RISC & CISC

  • Y86-64 includes attributes of both

  • CISC: condition codes, variable-length instructions, stack used to store return addresses

  • RISC: load/store architecture, regular instruction encoding, arguments passed through registers

4.2 Logical Design and the Hardware Control Language HCL

4.3 Sequential Y86-64 Implementations

Organizing processing into stages

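  • Fetch: read icode, ifun, and (when present) rA, rB, and the 8-byte constant valC; compute valP = address of the next instruction
  • Decode: read up to two registers, valA = R[rA], valB = R[rB]
  • Execute: valE = valB OP valA (or an effective address, or an adjusted stack pointer); set CC
  • Memory: write data to memory, or read valM from memory
  • Write back: write up to two results to the register file
  • PC update: PC = address of the next instruction (valP, or valC for call and taken jumps, or valM for ret)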
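
To make the stages concrete, here is a minimal sketch that walks a single subq rA, rB through them. The function name seq_subq and the list-based register file are assumptions for illustration, and the overflow flag is left out to keep the sketch short.

```python
MASK = (1 << 64) - 1  # keep results within 64 bits


def seq_subq(regs, pc, rA, rB):
    """Process one `subq rA, rB` stage by stage (hypothetical helper)."""
    # Fetch: OPq is two bytes long and has no constant word.
    valP = pc + 2
    # Decode: read both operand registers.
    valA, valB = regs[rA], regs[rB]
    # Execute: the ALU computes valB - valA and sets the condition codes.
    valE = (valB - valA) & MASK
    zf = valE == 0
    sf = (valE >> 63) == 1
    # Memory: nothing to do for OPq.
    # Write back: the result goes to rB.
    regs[rB] = valE
    # PC update: continue with the next sequential instruction.
    return valP, zf, sf


regs = [0] * 15
regs[2], regs[3] = 9, 21                      # %rdx = 9, %rbx = 21
new_pc, zf, sf = seq_subq(regs, 0x014, 2, 3)  # subq %rdx, %rbx
print(hex(new_pc), regs[3], zf, sf)
```

Note the operand order: subq %rdx, %rbx computes %rbx - %rdx, so the result written to %rbx is 21 - 9 = 12.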

SEQ Hardware Structure

SEQ Timing

SEQ consists of:

  • combinational logic
  • two kinds of memory devices
    1. clocked registers ( PC, CC register )
    2. random access memories ( register file, instruction memory, data memory )

Combinational logic does not require any sequencing or control.

Instruction memory is only read, so it behaves like combinational logic and needs no sequencing control either.

Four units require explicit control over their sequencing:

  • Program counter: loaded with a new instruction address every clock cycle
  • Condition code register: loaded only when an integer operation instruction is executed
  • Register file: two write ports allow two program registers to be updated on every cycle
  • Data memory: written only when an rmmovq, pushq, or call instruction is executed

PRINCIPLE

No reading back: the processor never needs to read back the state updated by an instruction in order to complete the processing of that instruction.

State updates are loaded at the clock edge that starts the next cycle.
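
The principle can be illustrated with pushq: the decremented stack pointer is computed once as valE and used both as the new %rsp and as the write address, so nothing has to be read back. The sketch below is an assumption-laden model, not the SEQ hardware: the helper names, the word-addressed dict used for memory, and the separate "pending writes" are illustrative choices.

```python
RSP = 4  # register ID of %rsp


def pushq_next_state(regs, rA):
    """Compute every update for `pushq rA` from the current state only."""
    valA, valB = regs[rA], regs[RSP]   # decode: read the old values
    valE = valB - 8                    # execute: decremented stack pointer
    reg_writes = {RSP: valE}           # pending register update
    mem_writes = {valE: valA}          # pending memory write, addressed by valE
    return reg_writes, mem_writes


def clock_edge(regs, mem, reg_writes, mem_writes):
    """Commit all pending updates together, as at the start of the next cycle."""
    for reg_id, value in reg_writes.items():
        regs[reg_id] = value
    mem.update(mem_writes)


regs, mem = [0] * 15, {}
regs[0], regs[RSP] = 7, 0x200          # %rax = 7, %rsp = 0x200
pending = pushq_next_state(regs, 0)    # pushq %rax
rsp_before_edge = regs[RSP]            # still the old value here
clock_edge(regs, mem, *pending)
print(hex(rsp_before_edge), hex(regs[RSP]), mem)
```

Until clock_edge runs, the register file and memory still hold their old contents, which mirrors how the combinational logic works only from the state of the current cycle.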

4.4 General Principle of Pipelining

Throughput: number of instructions completed per unit time

Latency: total time required to perform a single instruction from beginning to end

Limitations

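  • Nonuniform partitioning ( stages have different delays, and the clock must be slow enough for the slowest stage )
  • Diminishing returns of deep pipelining ( with too many stages, pipeline register delay dominates the clock period )
  • Feedback ( the next instruction may have to wait for a result from the previous one )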
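
Throughput, latency, and the first two limitations can be checked with a little arithmetic. The sketch below assumes illustrative numbers (300 ps of total logic delay, 20 ps per pipeline register) and one instruction completing per clock cycle; the function name pipeline_metrics is hypothetical.

```python
def pipeline_metrics(stage_delays_ps, reg_delay_ps=20):
    """Return (clock period in ps, throughput in GIPS, latency in ps)."""
    period = max(stage_delays_ps) + reg_delay_ps  # slowest stage sets the clock
    throughput_gips = 1000 / period               # one instruction per cycle
    latency = period * len(stage_delays_ps)       # one cycle per stage
    return period, throughput_gips, latency


configs = {
    "unpipelined": [300],
    "3 uniform stages": [100, 100, 100],
    "3 nonuniform stages": [50, 150, 100],
    "6 uniform stages": [50] * 6,
}
for name, delays in configs.items():
    period, gips, latency = pipeline_metrics(delays)
    print(f"{name}: period {period} ps, {gips:.2f} GIPS, latency {latency} ps")
```

With these assumed delays, the nonuniform split runs at a 170 ps clock instead of 120 ps, and doubling the stage count from 3 to 6 shortens the clock only from 120 ps to 70 ps, because the 20 ps register delay does not shrink.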

4.5 Pipelined Y86-64 Implementations

SEQ+

  • The PC update stage moves to the beginning of the clock cycle

PIPE-

  • Insert pipeline registers between the stages

Rearranging and Relabeling Signals

Next PC Prediction
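
A minimal sketch of the prediction rule, assuming the always-taken strategy used in the CS:APP pipeline design: call and jumps predict the constant valC, everything else predicts valP. The function name predict_pc is hypothetical, and ret is not handled here since its target is not known at fetch time.

```python
IJXX, ICALL = 0x7, 0x8  # Y86-64 instruction codes for jumps and call


def predict_pc(icode, valC, valP):
    """Guess the address of the next instruction to fetch."""
    if icode in (IJXX, ICALL):
        return valC   # assume the branch is taken / follow the call target
    return valP       # otherwise fall through to the next instruction


print(hex(predict_pc(IJXX, 0x040, 0x019)))   # jump: predicted target
print(hex(predict_pc(0x6, 0x000, 0x016)))    # OPq: next sequential address
```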

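Pipeline Hazards

  • Stalling: hold an instruction back until the value it needs is available
  • Forwarding: pass a result directly from a later pipeline stage to the decode stage
  • Load/use data hazard: a value read from memory arrives too late to be forwarded, so a stall is needed
  • Control hazard: the fetched instruction may be wrong after ret or a mispredicted jump
  • Exception handling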
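
Forwarding and the load/use check can be sketched as two small functions. The names load_use_hazard and forward, and the list of (destination, value) pairs, are assumptions for illustration; the hazard condition itself (a memory-reading instruction in execute whose destination is a source of the instruction in decode) follows the CS:APP pipeline control logic.

```python
IMRMOVQ, IPOPQ = 0x5, 0xB  # instructions that read a register value from memory
RNONE = 0xF                # register ID meaning "no register"


def load_use_hazard(E_icode, E_dstM, d_srcA, d_srcB):
    """True when the instruction in decode needs a value still being loaded."""
    return E_icode in (IMRMOVQ, IPOPQ) and E_dstM in (d_srcA, d_srcB)


def forward(src, reg_value, pending):
    """Pick the newest pending value for register `src`, else the register file value.

    `pending` lists (dst, value) pairs from the pipeline, earliest stage first,
    so the most recently issued instruction wins.
    """
    for dst, value in pending:
        if dst != RNONE and dst == src:
            return value
    return reg_value


# mrmovq 0(%rdx), %rax in execute, addq %rbx, %rax in decode: must stall.
print(load_use_hazard(IMRMOVQ, 0, 3, 0))
# %rbx has a stale value 10; execute is about to write 99, write-back holds 50.
print(forward(3, 10, [(3, 99), (3, 50)]))
```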