SlideShare une entreprise Scribd logo
1  sur  118
Télécharger pour lire hors ligne
Instruction Set Processor
Ajit Pal
Professor
Department of Computer Science and Engineering
Indian Institute of Technology Kharagpur
INDIA-721302
Computer Architecture and Organization
Introduction
 Starting point:
■ The specification of the MIPS instruction set drives the design of the
hardware.
■ Will restrict design to integer type instructions
 Identify common functions to all instructions, and within instruction
classes – easy to do in a RISC architecture
■ Instruction fetch
■ Access one or more registers
■ Use ALU
 Asserted signals – a high or low level of a signal which implies a logically
“true” condition … an “action” level. The text will only assert a logically
high level, ie., a “1”.
 Clocking
■ Assume “edge triggered” clocking (as opposed to level sensitive).
■ A storage circuit or flip-flop stores a value on the clock transition
edge.
■ Model is flip-flops with combinational logic between them
■ Propagation delay through combinations logic between storage
elements determines clock cycle length.
■ Single clock cycle vs. multi-clock cycle design approach
Single Versus Multi-clock Cycle Design
 Start out with a single “long” clock cycle for each instruction .
■ Entire instruction gets executed in a single clock pulse
■ Controller is pure combinational logic
■ Design is simple
■ You would think that a single clock cycle per instruction
execution would give us super high performance – but not so:
Slowest instruction determines speed of all instructions.
 Because various phases of the instructions need the same
hardware resource, and all is needed at the same time (clock
pulse)
■ Some hardware is redundant – another disadvantage of
single phase
Examples:
2 memories: instruction and data memory
2 adders and an ALU
Design Summary
 Has a performance bottleneck
■ The clock cycle time is determined by the longest path
in the machine
■ The simple jmp instruction will take as long as the load
word (lw)
■ The instruction which uses the longest data path
dictates the time for all others.
 What about a variable time clock design?
■ Still a single clock
■ Clock pulse interval is a function of the opcode
■ Average time for instruction theoretically improves
But
■ It difficult to implement - lots of overhead to overcome
 Let’s start simple with a single clock cycle design for
simplicity reasons and later convert to multi-clock cycle.
Basic Abstract View of the Data Path
Shows common functions for most instructions
Registers
Register #
Data
Register #
Data
memory
Address
Data
Register #
PC Instruction ALU
Instruction
memory
Address
Data Path for Instruction Fetching
PC
Instruction
memory
Read
address
Instruction
4
Add
Ajit Pal, IIT Kharagpur
Animating the Datapath
Instruction <- MEM[PC]
PC <- PC + 4
RD
Memory
ADDR
PC
Instruction
4
ADD
Basic Data Path for R-type Instruction
Red lines are for control signals
generated by the controller
Instruction
Registers
Write
register
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Write
data
ALU
result
ALU
Zero
RegWrite
ALU operation
3
Ajit Pal, IIT Kharagpur
Animating the Datapath
add rd, rs, rt
R[rd] <- R[rs] + R[rt];
5 5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
op rs rt rd functshamt
Operation
ALU Zero
Instruction
3
Adding the Data Path for lw & sw Instruction
Implements:
lw $t1, offset_value($t2)
sw $t1, offset_value($t2)
The offset value is a 16-bit signed immediate
field & must be sign extended to 32 bits
Immediate
offset data
→
Instruction
16 32
Registers
Write
register
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Data
memory
Write
data
Read
data
Write
data
Sign
extend
ALU
result
Zero
ALU
Address
MemRead
MemWrite
RegWrite
ALU operation3
Ajit Pal, IIT Kharagpur
Animating the Datapath
lw rt, offset(rs)
R[rt] <- MEM[R[rs] + s_extend(offset)];
Ajit Pal, IIT Kharagpur
Animating the Datapath
sw rt, offset(rs)
MEM[R[rs] + sign_extend(offset)] <- R[rt]
Adding the Data Path for beq Instruction
Implements beq $t1, $t2, offset
Offset is a signed 16 bit
immediate field, & thus must be
sign extended. In addition we
shift left by 2 (make low bits are 00)
to address to a word boundary
To PC
16 32
Sign
extend
ZeroALU
Sum
Shift
left 2
To branch
control logic
Branch target
PC + 4 from instruction datapath
Instruction
Add
Registers
Write
register
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Write
data
ALU operation
3
RegWrite
Ajit Pal, IIT Kharagpur
Animating the Datapath
beq rs, rt, offset
if (R[rs] == R[rt]) then
PC <- PC+4 + s_extend(offset<<2)
Putting It All Together
j instruction to be added later
Need controls circuits to drive control lines in red.
Two control units will be designd: ALU Control & “Main Control
Incremented PC
or beq branch
address
unsuccessful branch
Successful branch
PC
Instruction
memory
Read
address
Instruction
16 32
Add ALU
result
M
u
x
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Shift
left 2
4
M
u
x
ALU operation3
RegWrite
MemRead
MemWrite
PCSrc
ALUSrc
MemtoReg
ALU
result
Zero
ALU
Data
memory
Address
Write
data
Read
data M
u
x
Sign
extend
Add
Ajit Pal, IIT Kharagpur
MIPS Datapath I: Single-Cycle
Input is either register (R-type) or sign-extended
lower half of instruction (load/store)
Combining the datapaths for R-type instructions
and load/stores using two multiplexors
Data is either
from ALU (R-type)
or memory (load)
Ajit Pal, IIT Kharagpur
Animating the Datapath: R-type
Instruction
add rd,rs,rt
5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
32
M
U
X
M
U
XALUSrc
MemtoReg
Ajit Pal, IIT Kharagpur
Animating the Datapath:
Load Instruction
lw rt,offset(rs)
5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
32
M
U
X
M
U
XALUSrc
MemtoReg
Ajit Pal, IIT Kharagpur
Animating the Datapath:
Store Instruction
sw rt,offset(rs)
5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
32
M
U
X
M
U
XALUSrc
MemtoReg
Ajit Pal, IIT Kharagpur
MIPS Datapath II: Single-Cycle
PC
Instruction
memory
Read
address
Instruction
16 32
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Sign
extend
ALU
result
Zero
Data
memory
Address
Write
data
Read
data
M
u
x
4
Add
M
u
x
ALU
RegWrite
ALU operation
3
MemRead
MemWrite
ALUSrc
MemtoReg
Adding instruction fetch
Separate instruction memory
as instruction and data read
occur in the same clock cycle
Separate adder as ALU operations and PC
increment occur in the same clock cycle
Ajit Pal, IIT Kharagpur
MIPS Datapath III: Single-Cycle
PC
Instruction
memory
Read
address
Instruction
16 32
Add ALU
result
M
u
x
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Shift
left 2
4
M
u
x
ALU operation3
RegWrite
MemRead
MemWrite
PCSrc
ALUSrc
MemtoReg
ALU
result
Zero
ALU
Data
memory
Address

Write
data
Read
data M
u
x
Sign
extend
Add
Adding branch capability and another multiplexor
Instruction address is either
PC+4 or branch target address
Extra adder needed as both
adders operate in each cycle
New
multiplexor
Important note: in a single-cycle implementation data cannot be stored
during an instruction – it only moves through combinational logic
Question: is the MemRead signal really needed?! Think of RegWrite…!
Ajit Pal, IIT Kharagpur
Datapath Executing add
add rd, rs, rt
5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
Ajit Pal, IIT Kharagpur
Datapath Executing lw
lw rt,offset(rs)
5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
Ajit Pal, IIT Kharagpur
Datapath Executing sw
sw rt,offset(rs)
5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
Ajit Pal, IIT Kharagpur
Datapath Executing beq
beq r1,r2,offset
5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
Ajit Pal, IIT Kharagpur
Summary
 Techniques described here to design datapaths and control are
at the core of all modern computer architecture
 Multicycle datapaths offer two great advantages over single-cycle
■ functional units can be reused within a single instruction if
they are accessed in different cycles – reducing the need to
replicate expensive logic
■ instructions with shorter execution paths can complete
quicker by consuming fewer cycles
 Modern computers, in fact, take the multicycle paradigm to a
higher level to achieve greater instruction throughput:
■ pipelining (next topic) where multiple instructions execute
simultaneously by having cycles of different instructions
overlap in the datapath
■ the MIPS architecture was designed to be pipelined
Ajit Pal, IIT Kharagpur
Control
 Control unit takes input from
■ the instruction opcode bits
 Control unit generates
■ ALU control input
■ write enable (possibly, read enable also)
signals for each storage element
■ selector controls for each multiplexor
Ajit Pal, IIT Kharagpur
ALU Control
 Plan to control ALU: main control sends a 2-bit ALUOp control
field to the ALU control. Based on ALUOp and funct field of
instruction the ALU control generates the 3-bit ALU control field
■ ALU control Func-
field tion
000 and
001 or
010 add
110 sub
111 slt
 ALU must perform
■ add for load/stores (ALUOp 00)
■ sub for branches (ALUOp 01)
■ one of and, or, add, sub, slt for R-type instructions, depending on
the instruction’s 6-bit funct field (ALUOp 10)
Main
Control
ALU
Control
2
ALUOp
6
Instruction
funct field
3
ALU
control
input
To
ALU
Ajit Pal, IIT Kharagpur
Setting ALU Control Bits
Instruction AluOp Instruction Funct Desired ALU
opcode operation Field ALU action control input
LW 00 load word xxxxxx add 010
SW 00 store word xxxxxx add 010
Branch eq 01 branch eq xxxxxx subtract 110
R-type 10 add 100000 add 010
R-type 10 subtract 100010 subtract 110
R-type 10 AND 100100 and 000
R-type 10 OR 100101 or 001
R-type 10 set on less 101010 set on less 111
Truth table for ALU control bits*
ALUOp Funct field Operation
ALUOp1 ALUOp0 F5 F4 F3 F2 F1 F0
0 0 X X X X X X 010
0 1 X X X X X X 110
1 X X X 0 0 0 0 010
1 X X X 0 0 1 0 110
1 X X X 0 1 0 0 000
1 X X X 0 1 0 1 001
1 X X X 1 0 1 0 111
Setting ALU Control Bits
Ajit Pal, IIT Kharagpur
Designing the Main Control
 Observations about MIPS instruction format
■ opcode is always in bits 31-26
■ two registers to be read are always rs (bits 25-21) and rt
(bits 20-16)
■ base register for load/stores is always rs (bits 25-21)
■ 16-bit offset for branch equal and load/store is always
bits 15-0
■ destination register for loads is in bits 20-16 (rt) while for
R-type instructions it is in bits 15-11 (rd) (will require
multiplexor to select)
31-26 25-21 20-16 15-11 10-6 5-0
31-26 25-21 20-16 15-0
opcode
opcode
rs
rs
rt
rt address
rd shamt functR-type
Load/store
or branch
Ajit Pal, IIT Kharagpur
Datapath with Control I
MemtoReg
MemRead
MemWrite
ALUOp
ALUSrc
RegDst
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Instruction [20– 16]
Instruction [25– 21]
Add
Instruction [5– 0]
RegWrite
4
16 32Instruction [15– 0]
0
Registers
Write
register
Write
data
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Sign
extend
ALU
result
Zero
Data
memory
Address Read
data
M
u
x
1
0
M
u
x
1
0
M
u
x
1
0
M
u
x
1
Instruction [15– 11]
ALU
control
Shift
left 2
PCSrc
ALU
Add
ALU
result
Adding control to the MIPS Datapath III (and a new multiplexor to select field to
specify destination register): what are the functions of the 9 control signals?
New
multiplexor
Ajit Pal, IIT Kharagpur
Control Signals
Signal Name Effect when deasserted Effect when asserted
RegDst The register destination number for the The register destination number for the
Write register comes from the rt field (bits 20-16) Write register comes from the rd field (bits 15-11)
RegWrite None The register on the Write register input is written
with the value on the Write data input
AlLUSrc The second ALU operand comes from the The second ALU operand is the sign-extended,
second register file output (Read data 2) lower 16 bits of the instruction
PCSrc The PC is replaced by the output of the adder The PC is replaced by the output of the adder
that computes the value of PC + 4 that computes the branch target
MemRead None Data memory contents designated by the address
input are put on the first Read data output
MemWrite None Data memory contents designated by the address
input are replaced by the value of the Wrt data input
MemtoReg The value fed to the register Write data input The value fed to the register Write data input
comes from the ALU comes from the data memory
Effects of the seven control signals
Ajit Pal, IIT Kharagpur
Datapath with Control II
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Instruction [20 16]
Instruction [25 21]
Add
Instruction [5 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
RegDst
ALUSrc
Instruction [31 26]
4
16 32
Instruction [15 0]
0
0M
u
x
0
1
Control
Add
ALU
result
M
u
x
0
1
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Sign
extend
M
u
x
1
ALU
result
Zero
PCSrc
Data
memory
Write
data
Read
data
M
u
x
1
Instruction [15 11]
ALU
control
Shift
left 2
ALU
Address
MIPS datapath with the control unit: input to control is the 6-bit instruction
opcode field, output is seven 1-bit signals and the 2-bit ALUOp signal
Ajit Pal, IIT Kharagpur
Datapath with
Control II (cont.)
Instruction RegDst ALUSrc
Memto-
Reg
Reg
Write
Mem
Read
Mem
Write Branch ALUOp1 ALUp0
R-format 1 0 0 1 0 0 0 1 0
lw 0 1 1 1 1 0 0 0 0
sw X 1 X 0 0 1 0 0 0
beq X 0 X 0 0 0 1 0 1
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Instruction [20 16]
Instruction [25 21]
Add
Instruction [5 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
RegDst
ALUSrc
Instruction [31 26]
4
16 32
Instruction [15 0]
0
0M
u
x
0
1
Control
Add
ALU
result
M
u
x
0
1
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Sign
extend
M
u
x
1
ALU
result
Zero
PCSrc
Data
memory
Write
data
Read
data
M
u
x
1
Instruction [15 11]
ALU
control
Shift
left 2
ALU
Address
Determining control signals for the MIPS datapath based on instruction opcode
PCSrc cannot be
set directly from the
opcode: zero test
outcome is required
Ajit Pal, IIT Kharagpur
Control Signals: R-Type Instruction
Control signals
shown in blue
1
0
0
0
1
???
Value depends on
funct
0
0
5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
I32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
MUX RegDst
5
rd
I[15:11]
rt
I[20:16]
rs
I[25:21]
immediate/
offset
I[15:0]
0
1
0
1
1
0
10
Ajit Pal, IIT Kharagpur
5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
I32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
MUX RegDst
5
rd
I[15:11]
rt
I[20:16]
rs
I[25:21]
immediate/
offset
I[15:0]
0
1
0
1
1
0
10
Control Signals: lw Instruction
0
Control signals
shown in blue
0
010
1
1
1
0
1
Ajit Pal, IIT Kharagpur
5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
I32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
MUX RegDst
5
rd
I[15:11]
rt
I[20:16]
rs
I[25:21]
immediate/
offset
I[15:0]
0
1
0
1
1
0
10
Control Signals: sw Instruction
0
Control signals
shown in blue
X
010
1
X
0
1
0
Ajit Pal, IIT Kharagpur
5 516
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
I32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
MUX RegDst
5
rd
I[15:11]
rt
I[20:16]
rs
I[25:21]
immediate/
offset
I[15:0]
0
1
0
1
1
0
10
Control Signals: beq Instruction
Control signals
shown in blue
X
110
0
X
0
0
0
1 if Zero=1
Ajit Pal, IIT Kharagpur
Datapath with Control III
Shift
left 2
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Data
memory
Read
data
Write
data
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Instruction [15– 11]
Instruction [20– 16]
Instruction [25– 21]
Add
ALU
result
Zero
Instruction [5– 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
Jump
RegDst
ALUSrc
Instruction [31– 26]
4
M
u
x
Instruction [25– 0] Jump address [31– 0]
PC+4 [31– 28]
Sign
extend
16 32
Instruction [15– 0]
1
M
u
x
1
0
M
u
x
0
1
M
u
x
0
1
ALU
control
Control
Add ALU
result
M
u
x
0
1 0
ALU
Shift
left 2
26 28
Address
31-26 25-0
opcode addres
s
Jump
MIPS datapath extended to jumps:
control unit generates new Jump
control bit
New multiplexor with additional
control bit Jump
Composing jump
target address
Ajit Pal, IIT Kharagpur
Datapath Executing j
Ajit Pal, IIT Kharagpur
R-type Instruction: Step 1
add $t1, $t2, $t3 (active = bold)
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Instruction [20– 16]
Instruction [25– 21]
Add
Instruction [5– 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
RegDst
ALUSrc
Instruction [31– 26]
4
16 32
Instruction [15– 0]
0
0M
u
x
0
1
Control
Add ALU
result
M
u
x
0
1
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Sign
extend
Shift
left 2
M
u
x
1
ALU
result
Zero
Data
memory
Write
data
Read
data
M
u
x
1
Instruction [15– 11]
ALU
control
ALU
Address
Fetch instruction and increment PC count
Ajit Pal, IIT Kharagpur
R-type Instruction: Step 2
add $t1, $t2, $t3 (active = bold)
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Instruction [20– 16]
Instruction [25– 21]
Add
Instruction [5– 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
RegDst
ALUSrc
Instruction [31– 26]
4
16 32
Instruction [15– 0]
0
0M
u
x
0
1
Control
Add
ALU
result
M
u
x
0
1
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Sign
extend
Shift
left 2
M
u
x
1
ALU
result
Zero
Data
memory
Write
data
Read
data
M
u
x
1
Instruction [15– 11]
ALU
control
ALU
Address
Read two source registers from the register file
Ajit Pal, IIT Kharagpur
R-type Instruction: Step 3
add $t1, $t2, $t3 (active = bold)
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Instruction [20 16]
Instruction [25 21]
Add
Instruction [5 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
RegDst
ALUSrc
Instruction [31 26]
4
16 32
Instruction [15 0]
0
0M
u
x
0
1
ALU
control
Control
Add ALU
result
M
u
x
0
1
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Sign
extend
M
u
x
1
ALU
result
Zero
Data
memory
Read
dataAddress
Write
data
M
u
x
1
Instruction [15 11]
ALU
Shift
left 2
ALU operates on the two register operands
Ajit Pal, IIT Kharagpur
R-type Instruction: Step 4
add $t1, $t2, $t3 (active = bold)
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Instruction [20 16]
Instruction [25 21]
Add
Instruction [5 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
RegDst
ALUSrc
Instruction [31 26]
4
16 32
Instruction [15 0]
0
0M
u
x
0
1
ALU
control
Control
Shift
left 2
Add ALU
result
M
u
x
0
1
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Sign
extend
M
u
x
1
ALU
result
Zero
Data
memory
Write
data
Read
data
M
u
x
1
Instruction [15 11]
ALU
Address
Write result to register
Ajit Pal, IIT Kharagpur
Single-cycle Implementation Notes
 The steps are not really distinct as each instruction completes
in exactly one clock cycle – they simply indicate the sequence
of data flowing through the datapath
 The operation of the datapath during a cycle is purely
combinational – nothing is stored during a clock cycle
 Therefore, the machine is stable in a particular state at the
start of a cycle and reaches a new stable state only at the end
of the cycle
 Very important for understanding single-cycle computing
Ajit Pal, IIT Kharagpur
1. Fetch instruction and increment PC
2. Read base register from the register file: the base
register ($t2) is given by bits 25-21 of the instruction
3. ALU computes sum of value read from the register
file and the sign-extended lower 16 bits (offset) of
the instruction
4. The sum from the ALU is used as the address for the
data memory
5. The data from the memory unit is written into the
register file: the destination register ($t1) is given by
bits 20-16 of the instruction
Load Instruction Steps
lw $t1, offset ($t2)
Ajit Pal, IIT Kharagpur
Load Instruction
lw $t1, offset($t2)
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Instruction [15– 11]
Instruction [20– 16]
Instruction [25– 21]
Add
Instruction [5– 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
RegDst
ALUSrc
Instruction [31– 26]
4
16 32Instruction [15– 0]
0
0M
u
x
0
1
ALU
control
Control
Shift
left 2
Add ALU
result
M
u
x
0
1
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Sign
extend
M
u
x
1
ALU
result
Zero
Data
memory
Write
data
Read
data
M
u
x
1
ALU

Address
Ajit Pal, IIT Kharagpur
1. Fetch instruction and increment PC
2. Read two register ($t1 and $t2) from the register
file
3. ALU performs a subtract on the data values from
the register file; the value of PC+4 is added to the
sign-extended lower 16 bits (offset) of the
instruction shifted left by two to give the branch
target address
4. The Zero result from the ALU is used to decide
which adder result (from step 1 or 3) to store in
the PC
Branch Instruction Steps
beq $t1, $t2, offset
Ajit Pal, IIT Kharagpur
Branch Instruction
beq $t1, $t2, offset
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Instruction [15– 11]
Instruction [20– 16]
Instruction [25– 21]
Add
Instruction [5– 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
RegDst
ALUSrc
Instruction [31– 26]
4
16 32
Instruction [15– 0]
Shift
left 2
0
M
u
x
0
1
ALU
control
Control
Registers
Write
register
Write
data
Read
data 1
Read
register 1
Read
register 2
Sign
extend
1
ALU
result
Zero
Data
memory
Write
data
Read
dataM
u
x
Read
data 2
Add
ALU
result
M
u
x
0
1
M
u
x
1
0
ALU
Address
Ajit Pal, IIT Kharagpur
Implementation: ALU Control Block
Operation2
Operation1
Operation0
Operation
ALUOp1
F3
F2
F1
F0
F (5– 0)
ALUOp0
ALUOp
ALU control block
ALU control logic
Truth table for ALU control bits
ALUOp Funct field Operation
ALUOp1 ALUOp0 F5 F4 F3 F2 F1 F0
0 0 X X X X X X 010
0 1 X X X X X X 110
1 X X X 0 0 0 0 010
1 X X X 0 0 1 0 110
1 X X X 0 1 0 0 000
1 X X X 0 1 0 1 001
1 X X X 1 0 1 0 111
Ajit Pal, IIT Kharagpur
Implementation: Main Control Block
R-format Iw sw beq
Op0
Op1
Op2
Op3
Op4
Op5
Inputs
Outputs
RegDst
ALUSrc
MemtoReg
RegWrite
MemRead
MemWrite
Branch
ALUOp1
ALUOpO
Signal R-format lw sw beq
name
Op5 0 1 1 0
Op4 0 0 0 0
Op3 0 0 1 0
Op2 0 0 0 1
Op1 0 1 1 0
Op0 0 1 1 0
RegDst 1 0 x x
ALUSrc 0 1 1 0
MemtoReg 0 1 x x
RegWrite 1 1 0 0
MemRead 0 1 0 0
MemWrite 0 0 1 0
Branch 0 0 0 1
ALUOp1 1 0 0 0
ALUOP2 0 0 0 1
InputsOutputs
Truth table for main control signals
Main control PLA (programmable
logic array): principle underlying
PLAs is that any logical expression
can be written as a sum-of-products
Ajit Pal, IIT Kharagpur
 Assuming fixed-period clock every instruction
datapath uses one clock cycle implies:
■ CPI = 1
■ cycle time determined by length of the longest
instruction path (load)
●but several instructions could run in a
shorter clock cycle: waste of time
●consider if we have more complicated
instructions like floating point!
■ resources used more than once in the same
cycle need to be duplicated
●waste of hardware and chip area
Single-Cycle Design Problems
Ajit Pal, IIT Kharagpur
Example: Fixed-period clock vs. variable-period
clock in a single-cycle implementation
 Consider a machine with an additional floating point unit. Assume
functional unit delays as follows
■ memory: 2 ns., ALU and adders: 2 ns., FPU add: 8 ns., FPU multiply: 16
ns., register file access (read or write): 1 ns.
■ multiplexors, control unit, PC accesses, sign extension, wires: no delay
 Assume instruction mix as follows
■ all loads take same time and comprise 31%
■ all stores take same time and comprise 21%
■ R-format instructions comprise 27%
■ branches comprise 5%
■ jumps comprise 2%
■ FP adds and subtracts take the same time and totally comprise 7%
■ FP multiplys and divides take the same time and totally comprise 7%
 Compare the performance of (a) a single-cycle implementation using a fixed-
period clock with (b) one using a variable-period clock where each instruction
executes in one clock cycle that is only as long as it needs to be (not really
practical but pretend it’s possible!)
Ajit Pal, IIT Kharagpur
Solution
 Clock period for fixed-period clock = longest instruction time = 20
ns.
 Average clock period for variable-period clock = 8 × 31% +
7 × 21% + 6 × 27% + 5 × 5% + 2 × 2% + 20 × 7% + 12 × 7%
= 7.0 ns.
 Therefore, performancevar-period /performancefixed-period = 20/7 = 2.9
Instruction Instr. Register ALU Data Register FPU FPU Total
class mem. read oper. mem. write add/ mul/ time
sub div ns.
Load word 2 1 2 2 1 8
Store word 2 1 2 2 7
R-format 2 1 2 0 1 6
Branch 2 1 2 5
Jump 2 2
FP mul/div 2 1 1 16 20
FP add/sub 2 1 1 8 12
Ajit Pal, IIT Kharagpur
Fixing the problem with single-cycle
designs
 One solution: a variable-period clock with different cycle
times for each instruction class
■ unfeasible, as implementing a variable-speed clock is
technically difficult
 Another solution:
■ use a smaller cycle time…
■ …have different instructions take different numbers of
cycles
by breaking instructions into steps and fitting each step into
one cycle
■ feasible: multicyle approach!
Ajit Pal, IIT Kharagpur
 Break up the instructions into steps
■ each step takes one clock cycle
■ balance the amount of work to be done in each step/cycle
so that they are about equal
■ restrict each cycle to use at most once each major
functional unit so that such units do not have to be
replicated
■ functional units can be shared between different cycles
within one instruction
 Between steps/cycles
■ At the end of one cycle store data to be used in later cycles
of the same instruction
● need to introduce additional internal (programmer-
invisible) registers for this purpose
■ Data to be used in later instructions are stored in
programmer-visible state elements: the register file, PC,
memory
Multicycle Approach
Ajit Pal, IIT Kharagpur
 Our goal is to break up the instructions into steps
so that
■ each step takes one clock cycle
■ the amount of work to be done in each
step/cycle is about equal
■ each cycle uses at most once each major
functional unit so that such units do not have to
be replicated
■ functional units can be shared between
different cycles within one instruction
 Data at end of one cycle to be used in next must
be stored !!
Breaking Instructions into Steps
Ajit Pal, IIT Kharagpur
Breaking Instructions into Steps
 We break instructions into the following potential
execution steps – not all instructions require all the
steps – each step takes one clock cycle
1.Instruction fetch and PC increment (IF)
2.Instruction decode and register fetch (ID)
3.Execution, memory address computation, or branch
completion (EX)
4.Memory access or R-type instruction completion
(MEM)
5.Memory read completion (WB)
 Each MIPS instruction takes from 3 – 5 cycles (steps)
Ajit Pal, IIT Kharagpur
 Use PC to get instruction and put it in the instruction
register.
Increment the PC by 4 and put the result back in the
PC.
 Can be described succinctly using RTL (Register-
Transfer Language):
IR = Memory[PC];
PC = PC + 4;
Step 1: Instruction Fetch & PC
Increment (IF)
Ajit Pal, IIT Kharagpur
 Read registers rs and rt in case we need them.
Compute the branch address in case the
instruction is a branch.
 RTL:
A = Reg[IR[25-21]];
B = Reg[IR[20-16]];
ALUOut = PC + (sign-extend(IR[15-0]) << 2);
Step 2: Instruction Decode and
Register Fetch (ID)
Ajit Pal, IIT Kharagpur
 ALU performs one of four functions depending on
instruction type
■ memory reference:
ALUOut = A + sign-extend(IR[15-0]);
■ R-type:
ALUOut = A op B;
■ branch (instruction completes):
if (A==B) PC = ALUOut;
■ jump (instruction completes):
PC = PC[31-28] || (IR(25-0) << 2)
Step 3: Execution, Address Computation
or Branch Completion (EX)
Ajit Pal, IIT Kharagpur
 Again depending on instruction type:
 Loads and stores access memory
■ load
MDR = Memory[ALUOut];
■ store (instruction completes)
Memory[ALUOut] = B;
 R-type (instructions completes)
Reg[IR[15-11]] = ALUOut;
Step 4: Memory access or R-type
Instruction Completion (MEM)
Ajit Pal, IIT Kharagpur
 Again depending on instruction type:
 Load writes back (instruction completes)
Reg[IR[20-16]]= MDR;
 Important: There is no reason from a datapath (or
control) point of view that Step 5 cannot be
eliminated by performing
 Reg[IR[20-16]]= Memory[ALUOut];
 for loads in Step 4. This would eliminate the MDR
as well.
 The reason this is not done is that, to keep steps
balanced in length, the design restriction is to allow
each step to contain at most one ALU operation, or
one register access, or one memory access.
Step 5: Memory Read Completion (WB)
Ajit Pal, IIT Kharagpur
Summary of Instruction Execution
Step name
Action for R-type
instructions
Action for memory-reference
instructions
Action for
branches
Action for
jumps
Instruction fetch IR = Memory[PC]
PC = PC + 4
Instruction A = Reg [IR[25-21]]
decode/register fetch B = Reg [IR[20-16]]
ALUOut = PC + (sign-extend (IR[15-0]) << 2)
Execution, address ALUOut = A op B ALUOut = A + sign-extend if (A ==B) then PC = PC [31-28] II
computation, branch/ (IR[15-0]) PC = ALUOut (IR[25-0]<<2)
jump completion
Memory access or R-type Reg [IR[15-11]] = Load: MDR = Memory[ALUOut]
completion ALUOut or
Store: Memory [ALUOut] = B
Memory read completion Load: Reg[IR[20-16]] = MDR
1: IF
2: ID
3: EX
4: MEM
5: WB
Step
Ajit Pal, IIT Kharagpur
Multicycle Execution Step (1):
Instruction Fetch
IR = Memory[PC];
PC = PC + 4;
4PC + 4
Ajit Pal, IIT Kharagpur
Multicycle Execution Step (2):
Instruction Decode & Register Fetch
A = Reg[IR[25-21]]; (A = Reg[rs])
B = Reg[IR[20-15]]; (B = Reg[rt])
ALUOut = (PC + sign-extend(IR[15-0]) << 2)
Branch
Target
Address
Reg[rs]
Reg[rt]
PC + 4
Ajit Pal, IIT Kharagpur
Multicycle Execution Step (3):
Memory Reference Instructions
ALUOut = A + sign-extend(IR[15-0]);
Mem.
Address
Reg[rs]
Reg[rt]
PC + 4
Ajit Pal, IIT Kharagpur
Multicycle Execution Step (3):
ALU Instruction (R-Type)
ALUOut = A op B
R-Type
Result
Reg[rs]
Reg[rt]
PC + 4
Ajit Pal, IIT Kharagpur
Multicycle Execution Step (3):
Branch Instructions
if (A == B) PC = ALUOut;
Branch
Target
Address
Reg[rs]
Reg[rt]
Branch
Target
Address
Ajit Pal, IIT Kharagpur
Multicycle Execution Step (3):
Jump Instruction
PC = PC[31-28] concat (IR[25-0] << 2)
Jump
Address
Reg[rs]
Reg[rt]
Branch
Target
Address
Ajit Pal, IIT Kharagpur
Multicycle Execution Step (4):
Memory Access - Read (lw)
MDR = Memory[ALUOut];
Mem.
Data
PC + 4
Reg[rs]
Reg[rt]
Mem.
Address
Ajit Pal, IIT Kharagpur
Multicycle Execution Step (4):
Memory Access - Write (sw)
Memory[ALUOut] = B;
PC + 4
Reg[rs]
Reg[rt]
Ajit Pal, IIT Kharagpur
Multicycle Execution Step (4):
ALU Instruction (R-Type)
Reg[IR[15:11]] = ALUOUT
R-Type
Result
Reg[rs]
Reg[rt]
PC + 4
Ajit Pal, IIT Kharagpur
Multicycle Execution Step (5):
Memory Read Completion (lw)
Reg[IR[20-16]] = MDR;
PC + 4
Reg[rs]
Reg[rt]Mem.
Data
Mem.
Address
Ajit Pal, IIT Kharagpur
 Note particularities of
multicyle vs. single-
diagrams
■ single memory for
data
and instructions
■ single ALU, no extra
adders
■ extra registers to
hold data between
clock cycles
Datapath Comparison
PC
Instruction
memory
Read
address
Instruction
16 32
Add ALU
result
M
u
x
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Shift
left 2
4
M
u
x
ALU operation3
RegWrite
MemRead
MemWrite
PCSrc
ALUSrc
MemtoReg
ALU
result
Zero
ALU
Data
memory
Address

Write
data
Read
data M
u
x
Sign
extend
Add
PC
Memory
Address
Instruction
or data
Data
Instruction
register
Registers
Register #
Data
Register #
Register #
ALU
Memory
data 
register
A
B
ALUOut
Single-cycle datapath
Multicycle datapath (high-level view)
Ajit Pal, IIT Kharagpur
Multicycle Datapath
Shift
left 2
PC
Memory

MemData
Write
data
M
u
x
0
1
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
M
u
x
0
1
M
u
x
0
1
4
Instruction
[15– 0]
Sign
extend
3216
Instruction
[25– 21]
Instruction
[20– 16]
Instruction
[15– 0]
Instruction
register
1 M
u
x
0
3
2
M
u
x
ALU
result
ALU
Zero
Memory
data
register
Instruction
[15– 11]

A
B
ALUOut
0
1
Address
Basic multicycle MIPS datapath handles R-type instructions and load/stores:
new internal register in red ovals, new multiplexors in blue ovals
Ajit Pal, IIT Kharagpur
Multicycle Datapath with Control I
Shift
left 2
MemtoReg
IorD MemRead MemWrite
PC
Memory
MemData
Write
data
M
u
x
0
1
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Instruction
[15– 11]
M
u
x
0
1
M
u
x
0
1
4
ALUOpALUSrcB
RegDst RegWrite
Instruction
[15– 0]
Instruction [5– 0]
Sign
extend
3216
Instruction
[25– 21]
Instruction
[20– 16]
Instruction
[15– 0]
Instruction
register
1 M
u
x
0
3
2
ALU
control
M
u
x
0
1
ALU
result
ALU
ALUSrcA
ZeroA
B
ALUOut
IRWrite
Address
Memory
data
register
With control lines and the ALU control block added – not all control lines are shown
Ajit Pal, IIT Kharagpur
Multicycle Datapath with Control II
Complete multicycle MIPS datapath (with branch and jump capability)
and showing the main control block and all control lines
Shift
left 2
PC
M
u
x
0
1
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Instruction
[15– 11]
M
u
x
0
1
M
u
x
0
1
4
Instruction
[15– 0]
Sign
extend
3216
Instruction
[25– 21]
Instruction
[20– 16]
Instruction
[15– 0]
Instruction
register
ALU
control
ALU
result
ALU
Zero
Memory
data
register

A
B
IorD
MemRead
MemWrite
MemtoReg
PCWriteCond
PCWrite
IRWrite
ALUOp
ALUSrcB
ALUSrcA
RegDst
PCSource
RegWrite
Control
Outputs
Op
[5– 0]
Instruction
[31-26]
Instruction [5– 0]
M
u
x
0
2
Jump
address [31-0]Instruction [25– 0] 26 28
Shift
left 2
PC [31-28]
1
1 M
u
x
0
3
2
M
u
x
0
1
ALUOut
Memory
MemData
Write
data
Address
New multiplexorNew gates
For the jump address
Ajit Pal, IIT Kharagpur
Multicycle Control Step (1): Fetch
IR = Memory[PC];
PC = PC + 4;
1
0
1
0
1
0
X
0
X
0
010
1
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Memory
ADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
I
R
M
D
R
M
U
X
0
1
2
3
M
U
X
1
0
M
U
X
0
1A
B
ALU
OUT
0
1
2
M
U
X
<<2 CONCAT
28 32
M
U
X
0
1
ALUSrcA
jmpaddr
I[25:0]
rd
MUX
0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
Ajit Pal, IIT Kharagpur
Multicycle Control Step (2):
Instruction Decode & Register Fetch
A = Reg[IR[25-21]]; (A = Reg[rs])
B = Reg[IR[20-15]]; (B = Reg[rt])
ALUOut = (PC + sign-extend(IR[15-0]) << 2);
0
0
X
0
0
X
3
0
X
X
010
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Memory
ADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
I
R
M
D
R
M
U
X
0
1
2
3
M
U
X
1
0
M
U
X
0
1A
B
ALU
OUT
0
1
2
M
U
X
<<2 CONCAT
28 32
M
U
X
0
1
ALUSrcA
jmpaddr
I[25:0]
rd
MUX
0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
Ajit Pal, IIT Kharagpur
0
X
Multicycle Control Step (3):
Memory Reference Instructions
ALUOut = A + ssign-extend(IR[15-0]);
X
2
0
0
X
0 1
X
010
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Memory
ADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
I
R
M
D
R
M
U
X
0
1
2
3
M
U
X
1
0
M
U
X
0
1A
B
ALU
OUT
0
1
2
M
U
X
<<2 CONCAT
28 32
M
U
X
0
1
ALUSrcA
jmpaddr
I[25:0]
rd
MUX
0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
Ajit Pal, IIT Kharagpur
Multicycle Control Step (3):
ALU Instruction (R-Type)
ALUOut = A op B;
0
X
X
0
0
0
X
0 1
X
???
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Memory
ADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
I
R
M
D
R
M
U
X
0
1
2
3
M
U
X
1
0
M
U
X
0
1A
B
ALU
OUT
0
1
2
M
U
X
<<2 CONCAT
28 32
M
U
X
0
1
ALUSrcA
jmpaddr
I[25:0]
rd
MUX
0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
Ajit Pal, IIT Kharagpur
1 if
Zero=1
Multicycle Control Step (3):
Branch Instructions
if (A == B) PC = ALUOut;
0
X
X
0
0
X
0 1
1
011
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Memory
ADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
I
R
M
D
R
M
U
X
0
1
2
3
M
U
X
1
0
M
U
X
0
1A
B
ALU
OUT
0
1
2
M
U
X
<<2 CONCAT
28 32
M
U
X
0
1
ALUSrcA
jmpaddr
I[25:0]
rd
MUX
0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
Ajit Pal, IIT Kharagpur
Multicycle Execution Step (3):
Jump Instruction
PC = PC[21-28] concat (IR[25-0] << 2);
0
X
X
X
0
1
X
0 X
2
XXX
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Memory
ADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
I
R
M
D
R
M
U
X
0
1
2
3
M
U
X
1
0
M
U
X
0
1A
B
ALU
OUT
0
1
2
M
U
X
<<2 CONCAT
28 32
M
U
X
0
1
ALUSrcA
jmpaddr
I[25:0]
rd
MUX
0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
Ajit Pal, IIT Kharagpur
Multicycle Control Step (4):
Memory Access - Read (lw)
MDR = Memory[ALUOut];
0
X
X
X
1
0
1
0 X
X
XXX
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Memory
ADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
I
R
M
D
R
M
U
X
0
1
2
3
M
U
X
1
0
M
U
X
0
1A
B
ALU
OUT
0
1
2
M
U
X
<<2 CONCAT
28 32
M
U
X
0
1
ALUSrcA
jmpaddr
I[25:0]
rd
MUX
0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
Ajit Pal, IIT Kharagpur
Multicycle Execution Steps (4)
Memory Access - Write (sw)
Memory[ALUOut] = B;
0
X
X
X
0
0
1
1 X
X
XXX
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Memory
ADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
I
R
M
D
R
M
U
X
0
1
2
3
M
U
X
1
0
M
U
X
0
1A
B
ALU
OUT
0
1
2
M
U
X
<<2 CONCAT
28 32
M
U
X
0
1
ALUSrcA
jmpaddr
I[25:0]
rd
MUX
0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
Ajit Pal, IIT Kharagpur
10
0
X
0
X
0
XXX
X
X
1
1
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Memory
ADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
I
R
M
D
R
M
U
X
0
1
2
3
M
U
X
0
1
M
U
X
0
1A
B
ALU
OUT
0
1
2
M
U
X
<<2 CONCAT
28 32
M
U
X
0
1
ALUSrcA
jmpaddr
I[25:0]
rd
MUX
0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
Multicycle Control Step (4):
ALU Instruction (R-Type)
Reg[IR[15:11]] = ALUOut; (Reg[Rd] = ALUOut)
Ajit Pal, IIT Kharagpur
Multicycle Execution Steps (5)
Memory Read Completion (lw)
Reg[IR[20-16]] = MDR;
1
0
0
X
0
0
X
0 X
X
XXX
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Memory
ADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
I
R
M
D
R
M
U
X
0
1
2
3
M
U
X
0
1
M
U
X
0
1A
B
ALU
OUT
0
1
2
M
U
X
<<2 CONCAT
28 32
M
U
X
0
1
ALUSrcA
jmpaddr
I[25:0]
rd
MUX
0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
Ajit Pal, IIT Kharagpur
 Value of control signals is dependent upon:
■ what instruction is being executed
■ which step is being performed
 Use the information we have accumulated to
specify a finite state machine
■ specify the finite state machine graphically, or
■ use microprogramming
 Implementation is then derived from the
specification
Implementing Control
Ajit Pal, IIT Kharagpur
 Finite state machines (FSMs):
■ a set of states and
■ next state function, determined by current state and the input
■ output function, determined by current state and possibly
input
■ We’ll use a Moore machine – output based only on current
state
Review: Finite State Machines
Next-state
function
Current state
Clock
Output
function
Next
state
Outputs
Inputs
Ajit Pal, IIT Kharagpur
Example: Moore Machine
 The Moore machine below, given input a binary string
terminated by “#”, will output “even” if the string has an
even number of 0’s and “odd” if the string has an odd
number of 0’s
Even state Odd state
Output even state Output odd state
No
output
No
output
Output
“even”
Output
“odd”
0
0
1
1
# #
Start
Ajit Pal, IIT Kharagpur
FSM Control: High-level View
Memory access
instructions
(Figure 5.38)
R-type instructions
(Figure 5.39)
Branch instruction
(Figure 5.40)
Jump instruction
(Figure 5.41)
Instruction fetch/decode and register fetch
(Figure 5.37)
Start
ALUSrcA = 0
ALUSrcB = 11
ALUOp = 00

MemRead
ALUSrcA = 0
IorD = 0
IRWrite
ALUSrcB = 01
ALUOp = 00
PCWrite
PCSource = 00
Instruction fetch
Instruction decode/
Register fetch
(Op = 'LW') or (Op = 'SW') (Op = R-type)
(O
p
=
'BEQ
')
(Op='JMP')
0
1
Start
Memory reference FSM
(Figure 5.38)
R-type FSM
(Figure 5.39)
Branch FSM
(Figure 5.40)
Jump FSM
(Figure 5.41)

High-level view of FSM control
Instruction fetch and decode steps of every instruction is identical
Asserted signals
shown inside
state circles
Ajit Pal, IIT Kharagpur
FSM Control: Memory Reference
MemWrite
IorD = 1
MemRead
IorD = 1
ALUSrcA = 1
ALUSrcB = 10
ALUOp = 00
RegWrite
MemtoReg = 1
RegDst = 0
Memory address computation
(Op = 'LW') or (Op = 'SW')
Memory
access
Write-back step
(O
p
=
'SW
')
(Op='LW')
4
2
53
From state 1
To state 0
(Figure 5.37)
Memory
access
FSM control for
memory-reference
has 4 states
Ajit Pal, IIT Kharagpur
FSM Control: R-type Instruction
ALUSrcA = 1
ALUSrcB = 00
ALUOp = 10

RegDst = 1
RegWrite
MemtoReg = 0
Execution
R-type completion
6
7 
(Op = R-type)
From state 1
To state 0
(Figure 5.37)
FSM control to implement R-
type instructions has 2 states
Ajit Pal, IIT Kharagpur
FSM Control: Branch Instruction
Branch completion
8
(Op = 'BEQ')
From state 1
To state 0
(Figure 5.37)
ALUSrcA = 1
ALUSrcB = 00
ALUOp = 01
PCWriteCond
PCSource = 01
FSM control to implement branches has 1 state
Ajit Pal, IIT Kharagpur
FSM Control: Jump Instruction
Jump completion
9
(Op = 'J')
From state 1
To state 0
(Figure 5.37)
PCWrite
PCSource = 10
FSM control to implement jumps has 1 state
Ajit Pal, IIT Kharagpur
FSM Control: Complete View
PCWrite
PCSource = 10
ALUSrcA = 1
ALUSrcB = 00
ALUOp = 01
PCWriteCond
PCSource = 01
ALUSrcA =1
ALUSrcB = 00
ALUOp= 10
RegDst = 1
RegWrite
MemtoReg = 0
MemWrite
IorD = 1
MemRead
IorD = 1
ALUSrcA = 1
ALUSrcB = 10
ALUOp = 00
RegDst=0
RegWrite
MemtoReg=1

ALUSrcA = 0
ALUSrcB = 11
ALUOp = 00
MemRead
ALUSrcA = 0
IorD = 0
IRWrite
ALUSrcB = 01
ALUOp = 00
PCWrite
PCSource = 00
Instruction fetch
Instruction decode/
register fetch
Jump
completion
Branch
completionExecution
Memory address
computation
Memory
access
Memory
access R-type completion
Write-back step
(Op = 'LW') or (Op = 'SW') (Op = R-type)
(Op
=
'BEQ')
(Op='J')
(O
p
=
'SW
')
(Op='LW')
4
0
1
9862
753
Start
The complete FSM control for the multicycle MIPS datapath:
refer Multicycle Datapath with Control II
Labels on arcs
are conditions
that determine
next state
IF ID
EX
MEM
WB
Ajit Pal, IIT Kharagpur
Example: CPI in a multicycle CPU
 Assume
■ the control design of the previous slide
■ An instruction mix of 22% loads, 11% stores, 49% R-type
operations, 16% branches, and 2% jumps
 What is the CPI assuming each step requires 1 clock cycle?
 Solution:
■ Number of clock cycles from previous slide for each
instruction class:
● loads 5, stores 4, R-type instructions 4, branches 3,
jumps 3
■ CPI = CPU clock cycles / instruction count
= Σ (instruction countclass i × CPIclass i) / instruction
count
= Σ (instruction countclass I / instruction count) × CPIclass
I
= 0.22 × 5 + 0.11 × 4 + 0.49 × 4 + 0.16 × 3 + 0.02 × 3
= 4.04
Ajit Pal, IIT Kharagpur
FSM Control: Implementation
High-level view of FSM implementation: inputs to the combinational logic block are
the current state number and instruction opcode bits; outputs are the next state
number and control signals to be asserted for the current state
PCWrite
PCWriteCond
IorD
MemtoReg
PCSource
ALUOp
ALUSrcB
ALUSrcA
RegWrite
RegDst
NS3
NS2
NS1
NS0Op5
Op4
Op3
Op2
Op1
Op0
S3
S2
S1
S0
State register
IRWrite
MemRead
MemWrite
Instruction register
opcode field
Outputs
Control logic
Inputs
Four state bits are required for 10 states
Ajit Pal, IIT Kharagpur
FSM Control: PLA Implementation
Op5
Op4
Op3
Op2
Op1
Op0
S3
S2
S1
S0
IorD
IRWrite
MemRead
MemWrite
PCWrite
PCWriteCond
MemtoReg
PCSource1
ALUOp1
ALUSrcB0
ALUSrcA
RegWrite
RegDst
NS3
NS2
NS1
NS0
ALUSrcB1
ALUOp0
PCSource0
 Upper half is the AND plane that
computes all the products
 The products are carried to the lower
OR plane by the vertical lines
 The sum terms for each output is given
by the corresponding horizontal line
 E.g., IorD = S0.S1.S2.S3 + S0.S1.S2.S3
Ajit Pal, IIT Kharagpur
 ROM (Read Only Memory)
■ values of memory locations are fixed ahead of time
 A ROM can be used to implement a truth table
■ if the address is m-bits, we can address 2m entries in the ROM
■ outputs are the bits of the entry the address points to
FSM Control: ROM Implementation
m n
0 0 0 0 0 1 1
0 0 1 1 1 0 0
0 1 0 1 1 0 0
0 1 1 1 0 0 0
1 0 0 0 0 0 0
1 0 1 0 0 0 1
1 1 0 0 1 1 0
1 1 1 0 1 1 1
ROM
m = 3
n = 4
The size of an m-input n-output ROM is 2m x n bits – such a ROM can
be thought of as an array of size 2m with each entry in the array being
n bits
outputaddress
Ajit Pal, IIT Kharagpur
 First improve the ROM: break the table into two parts
■ 4 state bits give the 16 output signals – 24 x 16 bits of ROM
■ all 10 input bits give the 4 next state bits – 210 x 4 bits of ROM
■ Total – 4.3K bits of ROM
 PLA is much smaller
■ can share product terms
■ only need entries that produce an active output
■ can take into account don't cares
 PLA size = (#inputs #product-terms) + (#outputs
#product-terms)
■ FSM control PLA = (10x17)+(20x17) = 460 PLA cells
 PLA cells usually about the size of a ROM cell (slightly
bigger)
FSM Control: ROM vs. PLA
Ajit Pal, IIT Kharagpur
 Microprogramming is a method of specifying FSM control
that resembles a programming language – textual rather
graphic
■ this is appropriate when the FSM becomes very large, e.g., if
the instruction set is large and/or the number of cycles per
instruction is large
■ in such situations graphical representation becomes difficult
as there may be thousands of states and even more arcs
joining them
■ a microprogram is specification : implementation is by ROM
or PLA
 A microprogram is a sequence of microinstructions
■ each microinstruction has eight fields (label + 7 functional)
● Label: used to control microcode sequencing
● ALU control: specify operation to be done by ALU
● SRC1: specify source for first ALU operand
● SRC2: specify source for second ALU operand
● Register control: specify read/write for register file
● Memory: specify read/write for memory
● PCWrite control: specify the writing of the PC
● Sequencing: specify choice of next microinstruction
Microprogramming
Ajit Pal, IIT Kharagpur
Microprogramming
 The Sequencing field value determines the execution
order of the microprogram
■ value Seq : control passes to the sequentially next
microinstruction
■ value Fetch : branch to the first microinstruction to begin
the next MIPS instruction, i.e., the first microinstruction in
the microprogram
■ value Dispatch i : branch to a microinstruction based on
control input and a dispatch table entry (called
dispatching):
● Dispatching is implemented by means of creating a table,
called dispatch table, whose entries are microinstruction
labels and which is indexed by the control input. There may
be multiple dispatch tables – the value Dispatch i in the
sequencing field indicates that the i th dispatch table is to be
used
Ajit Pal, IIT Kharagpur
Control Microprogram
 The microprogram corresponding to the FSM control
shown graphically earlier:
Label
ALU
control SRC1 SRC2
Register
control Memory
PCWrite
control Sequencing
Fetch Add PC 4 Read PC ALU Seq
Add PC Extshft Read Dispatch 1
Mem1 Add A Extend Dispatch 2
LW2 Read ALU Seq
Write MDR Fetch
SW2 Write ALU Fetch
Rformat1 Func code A B Seq
Write ALU Fetch
BEQ1 Subt A B ALUOut-cond Fetch
JUMP1 Jump address Fetch
Dispatch ROM 1
Dispatch ROM 2
Op Opcode name Value
Op Opcode name Value
000000 R-format Rformat1
100011 lw LW2000010 jmp JUMP1
101011 sw SW2
000100 beq BEQ1
100011 lw Mem1
101011 sw Mem1
Microprogram containing 10 microinstructions
Dispatch Table 2
Dispatch Table 1
Ajit Pal, IIT Kharagpur
Microcode: Trade-offs
Specification advantages
■ easy to design and write
■ typically manufacturer designs architecture and microcode in
parallel
 Implementation advantages
■ easy to change since values are in memory (e.g., off-chip ROM)
■ can emulate other architectures
■ can make use of internal registers
 Implementation disadvantages
■ control is implemented nowadays on same chip as processor so
the advantage of an off-chip ROM does not exist
■ ROM is no longer faster than on-board cache
■ there is little need to change the microcode as general-purpose
computers are used far more nowadays than computers
designed for specific applications
Instruction RegDst RegWrite ALUSrc MemRea
d
MemWri
te
MemToReg PCSrc ALU
operation
R-format 1 1 0 0 0 0 0 0000
(and)
0001
(or)
0010
(add)
0110
(sub)
lw 0 1 1 1 0 1 0 0010
(add)
sw X 0 1 0 1 X 0 0010
(add)
beq x 0 0 0 0 X 1 or 0 0110
(sub)
Control
 We next add the control unit that generates
■ write signal for each state element
■ control signals for each multiplexer
■ ALU control signal
 Input to control unit: instruction opcode and function
code
Control Unit
 Divided into two parts
■ Main Control Unit
●Input: 6-bit opcode
●Output: all control signals for Muxes, RegWrite,
MemRead, MemWrite and a 2-bit ALUOp signal
■ ALU Control Unit
●Input: 2-bit ALUOp signal generated from Main
Control Unit and 6-bit instruction function code
●Output: 4-bit ALU control signal
Truth Table for Main Control Unit
Main Control Unit
ALU Control Unit
 Must describe hardware to compute 4-bit ALU
control input given
■ 2-bit ALUOp signal from Main Control Unit
■ function code for arithmetic
 Describe it using a truth table (can turn into gates):
ALU Control bits
0010
0010
0110
0010
0110
0000
0001
0111
Putting It All Together Again
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Instruction [20 16]
Instruction [25 21]
Add
Instruction [5 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
RegDst
ALUSrc
Instruction [31 26]
4
16 32
Instruction [15 0]
0
0M
u
x
0
1
Control
Add
ALU
result
M
u
x
0
1
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Sign
extend
M
u
x
1
ALU
result
Zero
PCSrc
Data
memory
Write
data
Read
data
M
u
x
1
Instruction [15 11]
ALU
control
Shift
left 2
ALU
Address
Use this for R-type, memory, & beq instructions scenarios.
Addition of the Unconditional Jump
 We now add one more op code to our single cycle design:
■ Op code 2: “j”
■ The format is op field 28-31 is a “2”
■ Remaining 26 low bits is the immediate target address
 The full 32 bit target address is computed by concatenating:
■ Upper 4 bits of PC+4
■ 26 bit immediate field of the jump instruction
■ Bits 00 in the lowest positions (word boundary)
■ See text chapter 3, p. 150
 An additional control line from the main controller will have to
be generated to select this “new” instruction
 A two bit shifter is also added to get the two low order zeros
Final Design with jump Instruction
Shift
left2
PC
Instruction
memory
Read
address
Instruction
[31–0]
Data
memory
Read
data
Write
data
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Instruction [15– 11]
Instruction [20– 16]
Instruction [25– 21]
Add
ALU
result
Zero
Instruction [5– 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
Jump
RegDst
ALUSrc
Instruction [31– 26]
4
M
u
x
Instruction [25–0] Jump address[31–0]
PC+4 [31– 28]
Sign
extend
16 32
Instruction [15– 0]
1
M
u
x
1
0
M
u
x
0
1
M
u
x
0
1
ALU
control
Control
Add ALU
result
M
u
x
0
1 0
ALU
Shift
left 2
26 28
Address
Ajit Pal, IIT Kharagpur
Thanks!

Contenu connexe

Tendances

embedded systems ppt 3
embedded systems ppt 3embedded systems ppt 3
embedded systems ppt 3pavan kumar
 
Microprocessor & Micro-controller
Microprocessor & Micro-controllerMicroprocessor & Micro-controller
Microprocessor & Micro-controllerOm Bheda
 
Unit II Arm 7 Introduction
Unit II Arm 7 IntroductionUnit II Arm 7 Introduction
Unit II Arm 7 IntroductionDr. Pankaj Zope
 
ATmega32-AVR microcontrollers-Part I
ATmega32-AVR microcontrollers-Part IATmega32-AVR microcontrollers-Part I
ATmega32-AVR microcontrollers-Part IVineethMP2
 
Instruction Set Architecture
Instruction Set ArchitectureInstruction Set Architecture
Instruction Set ArchitectureDilum Bandara
 
I2C Bus (Inter-Integrated Circuit)
I2C Bus (Inter-Integrated Circuit)I2C Bus (Inter-Integrated Circuit)
I2C Bus (Inter-Integrated Circuit)Varun Mahajan
 
ASIP (Application-specific instruction-set processor)
ASIP (Application-specific instruction-set processor)ASIP (Application-specific instruction-set processor)
ASIP (Application-specific instruction-set processor)Hamid Reza
 
Assembly Language Lecture 3
Assembly Language Lecture 3Assembly Language Lecture 3
Assembly Language Lecture 3Motaz Saad
 

Tendances (20)

ARM- Programmer's Model
ARM- Programmer's ModelARM- Programmer's Model
ARM- Programmer's Model
 
embedded systems ppt 3
embedded systems ppt 3embedded systems ppt 3
embedded systems ppt 3
 
06 mips-isa
06 mips-isa06 mips-isa
06 mips-isa
 
PPC 750
PPC 750PPC 750
PPC 750
 
Arm architecture
Arm architectureArm architecture
Arm architecture
 
Microprocessor & Micro-controller
Microprocessor & Micro-controllerMicroprocessor & Micro-controller
Microprocessor & Micro-controller
 
Unit II Arm 7 Introduction
Unit II Arm 7 IntroductionUnit II Arm 7 Introduction
Unit II Arm 7 Introduction
 
ARM Architecture
ARM ArchitectureARM Architecture
ARM Architecture
 
axi protocol
axi protocolaxi protocol
axi protocol
 
ATmega32-AVR microcontrollers-Part I
ATmega32-AVR microcontrollers-Part IATmega32-AVR microcontrollers-Part I
ATmega32-AVR microcontrollers-Part I
 
Instruction Set Architecture
Instruction Set ArchitectureInstruction Set Architecture
Instruction Set Architecture
 
EPROM, PROM & ROM
EPROM, PROM & ROMEPROM, PROM & ROM
EPROM, PROM & ROM
 
I2C Bus (Inter-Integrated Circuit)
I2C Bus (Inter-Integrated Circuit)I2C Bus (Inter-Integrated Circuit)
I2C Bus (Inter-Integrated Circuit)
 
ASIP (Application-specific instruction-set processor)
ASIP (Application-specific instruction-set processor)ASIP (Application-specific instruction-set processor)
ASIP (Application-specific instruction-set processor)
 
3 jump, loop and call instructions
3 jump, loop and call instructions3 jump, loop and call instructions
3 jump, loop and call instructions
 
Assembly Language Lecture 3
Assembly Language Lecture 3Assembly Language Lecture 3
Assembly Language Lecture 3
 
Interrupt 8085
Interrupt 8085Interrupt 8085
Interrupt 8085
 
ARM Processor
ARM ProcessorARM Processor
ARM Processor
 
LCD interfacing
LCD interfacingLCD interfacing
LCD interfacing
 
Multiplexers & Demultiplexers
Multiplexers & DemultiplexersMultiplexers & Demultiplexers
Multiplexers & Demultiplexers
 

En vedette

Chapter 4 the processor
Chapter 4 the processorChapter 4 the processor
Chapter 4 the processors9007912
 
05 instruction set design and architecture
05 instruction set design and architecture05 instruction set design and architecture
05 instruction set design and architectureWaqar Jamil
 
Mips implementation
Mips implementationMips implementation
Mips implementationhoang974
 
8 bit single cycle processor
8 bit single cycle processor8 bit single cycle processor
8 bit single cycle processorDhaval Kaneria
 
Designing of 8 BIT Arithmetic and Logical Unit and implementing on Xilinx Ver...
Designing of 8 BIT Arithmetic and Logical Unit and implementing on Xilinx Ver...Designing of 8 BIT Arithmetic and Logical Unit and implementing on Xilinx Ver...
Designing of 8 BIT Arithmetic and Logical Unit and implementing on Xilinx Ver...Rahul Borthakur
 
Computer Architecture – An Introduction
Computer Architecture – An IntroductionComputer Architecture – An Introduction
Computer Architecture – An IntroductionDilum Bandara
 

En vedette (8)

Chapter 4 the processor
Chapter 4 the processorChapter 4 the processor
Chapter 4 the processor
 
8 bit alu design
8 bit alu design8 bit alu design
8 bit alu design
 
05 instruction set design and architecture
05 instruction set design and architecture05 instruction set design and architecture
05 instruction set design and architecture
 
Case study of digital camera
Case study of digital cameraCase study of digital camera
Case study of digital camera
 
Mips implementation
Mips implementationMips implementation
Mips implementation
 
8 bit single cycle processor
8 bit single cycle processor8 bit single cycle processor
8 bit single cycle processor
 
Designing of 8 BIT Arithmetic and Logical Unit and implementing on Xilinx Ver...
Designing of 8 BIT Arithmetic and Logical Unit and implementing on Xilinx Ver...Designing of 8 BIT Arithmetic and Logical Unit and implementing on Xilinx Ver...
Designing of 8 BIT Arithmetic and Logical Unit and implementing on Xilinx Ver...
 
Computer Architecture – An Introduction
Computer Architecture – An IntroductionComputer Architecture – An Introduction
Computer Architecture – An Introduction
 

Similaire à Lec 12-15 mips instruction set processor

CMPN301-Pipelining_V2.pptx
CMPN301-Pipelining_V2.pptxCMPN301-Pipelining_V2.pptx
CMPN301-Pipelining_V2.pptxNadaAAmin
 
UNIT 3 - General Purpose Processors
UNIT 3 - General Purpose ProcessorsUNIT 3 - General Purpose Processors
UNIT 3 - General Purpose ProcessorsButtaRajasekhar2
 
Control unit-implementation
Control unit-implementationControl unit-implementation
Control unit-implementationWBUTTUTORIALS
 
multi cycle in microprocessor 8086 sy B-tech
multi cycle  in microprocessor 8086 sy B-techmulti cycle  in microprocessor 8086 sy B-tech
multi cycle in microprocessor 8086 sy B-techRushikeshThorat24
 
Basic structure of computers by aniket bhute
Basic structure of computers by aniket bhuteBasic structure of computers by aniket bhute
Basic structure of computers by aniket bhuteAniket Bhute
 
Lecture1 - Computer Architecture
Lecture1 - Computer ArchitectureLecture1 - Computer Architecture
Lecture1 - Computer ArchitectureVolodymyr Ushenko
 
POLITEKNIK MALAYSIA
POLITEKNIK MALAYSIAPOLITEKNIK MALAYSIA
POLITEKNIK MALAYSIAAiman Hud
 
Computer Organization and Architechuture basics
Computer Organization and Architechuture basicsComputer Organization and Architechuture basics
Computer Organization and Architechuture basicsLucky Sithole
 
L4 speeding-up-execution
L4 speeding-up-executionL4 speeding-up-execution
L4 speeding-up-executionrsamurti
 
Computer Organisation & Architecture (chapter 1)
Computer Organisation & Architecture (chapter 1) Computer Organisation & Architecture (chapter 1)
Computer Organisation & Architecture (chapter 1) Subhasis Dash
 
ICT-Lecture_12(VonNeumannArchitecture).pptx
ICT-Lecture_12(VonNeumannArchitecture).pptxICT-Lecture_12(VonNeumannArchitecture).pptx
ICT-Lecture_12(VonNeumannArchitecture).pptxSirRafiLectures
 
ARM - Advance RISC Machine
ARM - Advance RISC MachineARM - Advance RISC Machine
ARM - Advance RISC MachineEdutechLearners
 
Chapter1 basic structure of computers
Chapter1  basic structure of computersChapter1  basic structure of computers
Chapter1 basic structure of computersjismymathew
 
Pragmatic Optimization in Modern Programming - Ordering Optimization Approaches
Pragmatic Optimization in Modern Programming - Ordering Optimization ApproachesPragmatic Optimization in Modern Programming - Ordering Optimization Approaches
Pragmatic Optimization in Modern Programming - Ordering Optimization ApproachesMarina Kolpakova
 

Similaire à Lec 12-15 mips instruction set processor (20)

CMPN301-Pipelining_V2.pptx
CMPN301-Pipelining_V2.pptxCMPN301-Pipelining_V2.pptx
CMPN301-Pipelining_V2.pptx
 
UNIT 3 - General Purpose Processors
UNIT 3 - General Purpose ProcessorsUNIT 3 - General Purpose Processors
UNIT 3 - General Purpose Processors
 
Control unit-implementation
Control unit-implementationControl unit-implementation
Control unit-implementation
 
multi cycle in microprocessor 8086 sy B-tech
multi cycle  in microprocessor 8086 sy B-techmulti cycle  in microprocessor 8086 sy B-tech
multi cycle in microprocessor 8086 sy B-tech
 
Lec02
Lec02Lec02
Lec02
 
Basic structure of computers by aniket bhute
Basic structure of computers by aniket bhuteBasic structure of computers by aniket bhute
Basic structure of computers by aniket bhute
 
Lecture1 - Computer Architecture
Lecture1 - Computer ArchitectureLecture1 - Computer Architecture
Lecture1 - Computer Architecture
 
POLITEKNIK MALAYSIA
POLITEKNIK MALAYSIAPOLITEKNIK MALAYSIA
POLITEKNIK MALAYSIA
 
Pipelining1
Pipelining1Pipelining1
Pipelining1
 
CPU.ppd
CPU.ppdCPU.ppd
CPU.ppd
 
Instruction codes
Instruction codesInstruction codes
Instruction codes
 
Computer Organization and Architechuture basics
Computer Organization and Architechuture basicsComputer Organization and Architechuture basics
Computer Organization and Architechuture basics
 
L4 speeding-up-execution
L4 speeding-up-executionL4 speeding-up-execution
L4 speeding-up-execution
 
Computer Organisation & Architecture (chapter 1)
Computer Organisation & Architecture (chapter 1) Computer Organisation & Architecture (chapter 1)
Computer Organisation & Architecture (chapter 1)
 
ICT-Lecture_12(VonNeumannArchitecture).pptx
ICT-Lecture_12(VonNeumannArchitecture).pptxICT-Lecture_12(VonNeumannArchitecture).pptx
ICT-Lecture_12(VonNeumannArchitecture).pptx
 
ARM - Advance RISC Machine
ARM - Advance RISC MachineARM - Advance RISC Machine
ARM - Advance RISC Machine
 
arm
armarm
arm
 
CAO.pptx
CAO.pptxCAO.pptx
CAO.pptx
 
Chapter1 basic structure of computers
Chapter1  basic structure of computersChapter1  basic structure of computers
Chapter1 basic structure of computers
 
Pragmatic Optimization in Modern Programming - Ordering Optimization Approaches
Pragmatic Optimization in Modern Programming - Ordering Optimization ApproachesPragmatic Optimization in Modern Programming - Ordering Optimization Approaches
Pragmatic Optimization in Modern Programming - Ordering Optimization Approaches
 

Dernier

Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating SystemRashmi Bhat
 
Risk Management in Engineering Construction Project
Risk Management in Engineering Construction ProjectRisk Management in Engineering Construction Project
Risk Management in Engineering Construction ProjectErbil Polytechnic University
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptMadan Karki
 
Robotics Group 10 (Control Schemes) cse.pdf
Robotics Group 10  (Control Schemes) cse.pdfRobotics Group 10  (Control Schemes) cse.pdf
Robotics Group 10 (Control Schemes) cse.pdfsahilsajad201
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdfAkritiPradhan2
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfComprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfalene1
 
chpater16.pptxMMMMMMMMMMMMMMMMMMMMMMMMMMM
chpater16.pptxMMMMMMMMMMMMMMMMMMMMMMMMMMMchpater16.pptxMMMMMMMMMMMMMMMMMMMMMMMMMMM
chpater16.pptxMMMMMMMMMMMMMMMMMMMMMMMMMMMNanaAgyeman13
 
Energy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxEnergy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxsiddharthjain2303
 
Levelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument methodLevelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument methodManicka Mamallan Andavar
 
System Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingSystem Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingBootNeck1
 
Input Output Management in Operating System
Input Output Management in Operating SystemInput Output Management in Operating System
Input Output Management in Operating SystemRashmi Bhat
 
National Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdfNational Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdfRajuKanojiya4
 
Artificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewArtificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewsandhya757531
 
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptJohnWilliam111370
 
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfModule-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfManish Kumar
 
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Sumanth A
 
Crushers to screens in aggregate production
Crushers to screens in aggregate productionCrushers to screens in aggregate production
Crushers to screens in aggregate productionChinnuNinan
 
GSK & SEAMANSHIP-IV LIFE SAVING APPLIANCES .pptx
GSK & SEAMANSHIP-IV LIFE SAVING APPLIANCES .pptxGSK & SEAMANSHIP-IV LIFE SAVING APPLIANCES .pptx
GSK & SEAMANSHIP-IV LIFE SAVING APPLIANCES .pptxshuklamittt0077
 
OOP concepts -in-Python programming language
OOP concepts -in-Python programming languageOOP concepts -in-Python programming language
OOP concepts -in-Python programming languageSmritiSharma901052
 

Dernier (20)

Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating System
 
Risk Management in Engineering Construction Project
Risk Management in Engineering Construction ProjectRisk Management in Engineering Construction Project
Risk Management in Engineering Construction Project
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.ppt
 
Robotics Group 10 (Control Schemes) cse.pdf
Robotics Group 10  (Control Schemes) cse.pdfRobotics Group 10  (Control Schemes) cse.pdf
Robotics Group 10 (Control Schemes) cse.pdf
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfComprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
 
chpater16.pptxMMMMMMMMMMMMMMMMMMMMMMMMMMM
chpater16.pptxMMMMMMMMMMMMMMMMMMMMMMMMMMMchpater16.pptxMMMMMMMMMMMMMMMMMMMMMMMMMMM
chpater16.pptxMMMMMMMMMMMMMMMMMMMMMMMMMMM
 
Energy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxEnergy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptx
 
Levelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument methodLevelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument method
 
System Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingSystem Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event Scheduling
 
Input Output Management in Operating System
Input Output Management in Operating SystemInput Output Management in Operating System
Input Output Management in Operating System
 
National Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdfNational Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdf
 
Artificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewArtificial Intelligence in Power System overview
Artificial Intelligence in Power System overview
 
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
 
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfModule-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
 
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
 
Crushers to screens in aggregate production
Crushers to screens in aggregate productionCrushers to screens in aggregate production
Crushers to screens in aggregate production
 
GSK & SEAMANSHIP-IV LIFE SAVING APPLIANCES .pptx
GSK & SEAMANSHIP-IV LIFE SAVING APPLIANCES .pptxGSK & SEAMANSHIP-IV LIFE SAVING APPLIANCES .pptx
GSK & SEAMANSHIP-IV LIFE SAVING APPLIANCES .pptx
 
OOP concepts -in-Python programming language
OOP concepts -in-Python programming languageOOP concepts -in-Python programming language
OOP concepts -in-Python programming language
 

Lec 12-15 mips instruction set processor

  • 1. Instruction Set Processor Ajit Pal Professor Department of Computer Science and Engineering Indian Institute of Technology Kharagpur INDIA-721302 Computer Architecture and Organization
  • 2. Introduction  Starting point: ■ The specification of the MIPS instruction set drives the design of the hardware. ■ Will restrict design to integer type instructions  Identify common functions to all instructions, and within instruction classes – easy to do in a RISC architecture ■ Instruction fetch ■ Access one or more registers ■ Use ALU  Asserted signals – a high or low level of a signal which implies a logically “true” condition … an “action” level. The text will only assert a logically high level, ie., a “1”.  Clocking ■ Assume “edge triggered” clocking (as opposed to level sensitive). ■ A storage circuit or flip-flop stores a value on the clock transition edge. ■ Model is flip-flops with combinational logic between them ■ Propagation delay through combinations logic between storage elements determines clock cycle length. ■ Single clock cycle vs. multi-clock cycle design approach
  • 3. Single Versus Multi-clock Cycle Design  Start out with a single “long” clock cycle for each instruction . ■ Entire instruction gets executed in a single clock pulse ■ Controller is pure combinational logic ■ Design is simple ■ You would think that a single clock cycle per instruction execution would give us super high performance – but not so: Slowest instruction determines speed of all instructions.  Because various phases of the instructions need the same hardware resource, and all is needed at the same time (clock pulse) ■ Some hardware is redundant – another disadvantage of single phase Examples: 2 memories: instruction and data memory 2 adders and an ALU
  • 4. Design Summary  Has a performance bottleneck ■ The clock cycle time is determined by the longest path in the machine ■ The simple jmp instruction will take as long as the load word (lw) ■ The instruction which uses the longest data path dictates the time for all others.  What about a variable time clock design? ■ Still a single clock ■ Clock pulse interval is a function of the opcode ■ Average time for instruction theoretically improves But ■ It difficult to implement - lots of overhead to overcome  Let’s start simple with a single clock cycle design for simplicity reasons and later convert to multi-clock cycle.
  • 5. Basic Abstract View of the Data Path Shows common functions for most instructions Registers Register # Data Register # Data memory Address Data Register # PC Instruction ALU Instruction memory Address
  • 6. Data Path for Instruction Fetching PC Instruction memory Read address Instruction 4 Add
  • 7. Ajit Pal, IIT Kharagpur Animating the Datapath Instruction <- MEM[PC] PC <- PC + 4 RD Memory ADDR PC Instruction 4 ADD
  • 8. Basic Data Path for R-type Instruction Red lines are for control signals generated by the controller Instruction Registers Write register Read data 1 Read data 2 Read register 1 Read register 2 Write data ALU result ALU Zero RegWrite ALU operation 3
  • 9. Ajit Pal, IIT Kharagpur Animating the Datapath add rd, rs, rt R[rd] <- R[rs] + R[rt]; 5 5 5 RD1 RD2 RN1 RN2 WN WD RegWrite Register File op rs rt rd functshamt Operation ALU Zero Instruction 3
  • 10. Adding the Data Path for lw & sw Instruction Implements: lw $t1, offset_value($t2) sw $t1, offset_value($t2) The offset value is a 16-bit signed immediate field & must be sign extended to 32 bits Immediate offset data → Instruction 16 32 Registers Write register Read data 1 Read data 2 Read register 1 Read register 2 Data memory Write data Read data Write data Sign extend ALU result Zero ALU Address MemRead MemWrite RegWrite ALU operation3
  • 11. Ajit Pal, IIT Kharagpur Animating the Datapath lw rt, offset(rs) R[rt] <- MEM[R[rs] + s_extend(offset)];
  • 12. Ajit Pal, IIT Kharagpur Animating the Datapath sw rt, offset(rs) MEM[R[rs] + sign_extend(offset)] <- R[rt]
  • 13. Adding the Data Path for beq Instruction Implements beq $t1, $t2, offset Offset is a signed 16 bit immediate field, & thus must be sign extended. In addition we shift left by 2 (make low bits are 00) to address to a word boundary To PC 16 32 Sign extend ZeroALU Sum Shift left 2 To branch control logic Branch target PC + 4 from instruction datapath Instruction Add Registers Write register Read data 1 Read data 2 Read register 1 Read register 2 Write data ALU operation 3 RegWrite
  • 14. Ajit Pal, IIT Kharagpur Animating the Datapath beq rs, rt, offset if (R[rs] == R[rt]) then PC <- PC+4 + s_extend(offset<<2)
  • 15. Putting It All Together j instruction to be added later Need controls circuits to drive control lines in red. Two control units will be designd: ALU Control & “Main Control Incremented PC or beq branch address unsuccessful branch Successful branch PC Instruction memory Read address Instruction 16 32 Add ALU result M u x Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Shift left 2 4 M u x ALU operation3 RegWrite MemRead MemWrite PCSrc ALUSrc MemtoReg ALU result Zero ALU Data memory Address Write data Read data M u x Sign extend Add
  • 16. Ajit Pal, IIT Kharagpur MIPS Datapath I: Single-Cycle Input is either register (R-type) or sign-extended lower half of instruction (load/store) Combining the datapaths for R-type instructions and load/stores using two multiplexors Data is either from ALU (R-type) or memory (load)
  • 17. Ajit Pal, IIT Kharagpur Animating the Datapath: R-type Instruction add rd,rs,rt 5 516 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction 32 M U X M U XALUSrc MemtoReg
  • 18. Ajit Pal, IIT Kharagpur Animating the Datapath: Load Instruction lw rt,offset(rs) 5 516 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction 32 M U X M U XALUSrc MemtoReg
  • 19. Ajit Pal, IIT Kharagpur Animating the Datapath: Store Instruction sw rt,offset(rs) 5 516 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction 32 M U X M U XALUSrc MemtoReg
  • 20. Ajit Pal, IIT Kharagpur MIPS Datapath II: Single-Cycle PC Instruction memory Read address Instruction 16 32 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Sign extend ALU result Zero Data memory Address Write data Read data M u x 4 Add M u x ALU RegWrite ALU operation 3 MemRead MemWrite ALUSrc MemtoReg Adding instruction fetch Separate instruction memory as instruction and data read occur in the same clock cycle Separate adder as ALU operations and PC increment occur in the same clock cycle
  • 21. Ajit Pal, IIT Kharagpur MIPS Datapath III: Single-Cycle PC Instruction memory Read address Instruction 16 32 Add ALU result M u x Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Shift left 2 4 M u x ALU operation3 RegWrite MemRead MemWrite PCSrc ALUSrc MemtoReg ALU result Zero ALU Data memory Address Write data Read data M u x Sign extend Add Adding branch capability and another multiplexor Instruction address is either PC+4 or branch target address Extra adder needed as both adders operate in each cycle New multiplexor Important note: in a single-cycle implementation data cannot be stored during an instruction – it only moves through combinational logic Question: is the MemRead signal really needed?! Think of RegWrite…!
  • 22. Ajit Pal, IIT Kharagpur Datapath Executing add add rd, rs, rt 5 516 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction 32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc
  • 23. Ajit Pal, IIT Kharagpur Datapath Executing lw lw rt,offset(rs) 5 516 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction 32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc
  • 24. Ajit Pal, IIT Kharagpur Datapath Executing sw sw rt,offset(rs) 5 516 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction 32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc
  • 25. Ajit Pal, IIT Kharagpur Datapath Executing beq beq r1,r2,offset 5 516 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction 32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc
  • 26. Ajit Pal, IIT Kharagpur Summary  Techniques described here to design datapaths and control are at the core of all modern computer architecture  Multicycle datapaths offer two great advantages over single-cycle ■ functional units can be reused within a single instruction if they are accessed in different cycles – reducing the need to replicate expensive logic ■ instructions with shorter execution paths can complete quicker by consuming fewer cycles  Modern computers, in fact, take the multicycle paradigm to a higher level to achieve greater instruction throughput: ■ pipelining (next topic) where multiple instructions execute simultaneously by having cycles of different instructions overlap in the datapath ■ the MIPS architecture was designed to be pipelined
  • 27. Ajit Pal, IIT Kharagpur Control  Control unit takes input from ■ the instruction opcode bits  Control unit generates ■ ALU control input ■ write enable (possibly, read enable also) signals for each storage element ■ selector controls for each multiplexor
  • 28. Ajit Pal, IIT Kharagpur ALU Control  Plan to control ALU: main control sends a 2-bit ALUOp control field to the ALU control. Based on ALUOp and funct field of instruction the ALU control generates the 3-bit ALU control field ■ ALU control Func- field tion 000 and 001 or 010 add 110 sub 111 slt  ALU must perform ■ add for load/stores (ALUOp 00) ■ sub for branches (ALUOp 01) ■ one of and, or, add, sub, slt for R-type instructions, depending on the instruction’s 6-bit funct field (ALUOp 10) Main Control ALU Control 2 ALUOp 6 Instruction funct field 3 ALU control input To ALU
  • 29. Ajit Pal, IIT Kharagpur Setting ALU Control Bits Instruction AluOp Instruction Funct Desired ALU opcode operation Field ALU action control input LW 00 load word xxxxxx add 010 SW 00 store word xxxxxx add 010 Branch eq 01 branch eq xxxxxx subtract 110 R-type 10 add 100000 add 010 R-type 10 subtract 100010 subtract 110 R-type 10 AND 100100 and 000 R-type 10 OR 100101 or 001 R-type 10 set on less 101010 set on less 111 Truth table for ALU control bits*
  • 30. ALUOp Funct field Operation ALUOp1 ALUOp0 F5 F4 F3 F2 F1 F0 0 0 X X X X X X 010 0 1 X X X X X X 110 1 X X X 0 0 0 0 010 1 X X X 0 0 1 0 110 1 X X X 0 1 0 0 000 1 X X X 0 1 0 1 001 1 X X X 1 0 1 0 111 Setting ALU Control Bits
  • 31. Ajit Pal, IIT Kharagpur Designing the Main Control  Observations about MIPS instruction format ■ opcode is always in bits 31-26 ■ two registers to be read are always rs (bits 25-21) and rt (bits 20-16) ■ base register for load/stores is always rs (bits 25-21) ■ 16-bit offset for branch equal and load/store is always bits 15-0 ■ destination register for loads is in bits 20-16 (rt) while for R-type instructions it is in bits 15-11 (rd) (will require multiplexor to select) 31-26 25-21 20-16 15-11 10-6 5-0 31-26 25-21 20-16 15-0 opcode opcode rs rs rt rt address rd shamt functR-type Load/store or branch
  • 32. Ajit Pal, IIT Kharagpur Datapath with Control I MemtoReg MemRead MemWrite ALUOp ALUSrc RegDst PC Instruction memory Read address Instruction [31– 0] Instruction [20– 16] Instruction [25– 21] Add Instruction [5– 0] RegWrite 4 16 32Instruction [15– 0] 0 Registers Write register Write data Write data Read data 1 Read data 2 Read register 1 Read register 2 Sign extend ALU result Zero Data memory Address Read data M u x 1 0 M u x 1 0 M u x 1 0 M u x 1 Instruction [15– 11] ALU control Shift left 2 PCSrc ALU Add ALU result Adding control to the MIPS Datapath III (and a new multiplexor to select field to specify destination register): what are the functions of the 9 control signals? New multiplexor
  • 33. Ajit Pal, IIT Kharagpur Control Signals Signal Name Effect when deasserted Effect when asserted RegDst The register destination number for the The register destination number for the Write register comes from the rt field (bits 20-16) Write register comes from the rd field (bits 15-11) RegWrite None The register on the Write register input is written with the value on the Write data input AlLUSrc The second ALU operand comes from the The second ALU operand is the sign-extended, second register file output (Read data 2) lower 16 bits of the instruction PCSrc The PC is replaced by the output of the adder The PC is replaced by the output of the adder that computes the value of PC + 4 that computes the branch target MemRead None Data memory contents designated by the address input are put on the first Read data output MemWrite None Data memory contents designated by the address input are replaced by the value of the Wrt data input MemtoReg The value fed to the register Write data input The value fed to the register Write data input comes from the ALU comes from the data memory Effects of the seven control signals
  • 34. Ajit Pal, IIT Kharagpur Datapath with Control II PC Instruction memory Read address Instruction [31– 0] Instruction [20 16] Instruction [25 21] Add Instruction [5 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch RegDst ALUSrc Instruction [31 26] 4 16 32 Instruction [15 0] 0 0M u x 0 1 Control Add ALU result M u x 0 1 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Sign extend M u x 1 ALU result Zero PCSrc Data memory Write data Read data M u x 1 Instruction [15 11] ALU control Shift left 2 ALU Address MIPS datapath with the control unit: input to control is the 6-bit instruction opcode field, output is seven 1-bit signals and the 2-bit ALUOp signal
  • 35. Ajit Pal, IIT Kharagpur Datapath with Control II (cont.) Instruction RegDst ALUSrc Memto- Reg Reg Write Mem Read Mem Write Branch ALUOp1 ALUp0 R-format 1 0 0 1 0 0 0 1 0 lw 0 1 1 1 1 0 0 0 0 sw X 1 X 0 0 1 0 0 0 beq X 0 X 0 0 0 1 0 1 PC Instruction memory Read address Instruction [31– 0] Instruction [20 16] Instruction [25 21] Add Instruction [5 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch RegDst ALUSrc Instruction [31 26] 4 16 32 Instruction [15 0] 0 0M u x 0 1 Control Add ALU result M u x 0 1 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Sign extend M u x 1 ALU result Zero PCSrc Data memory Write data Read data M u x 1 Instruction [15 11] ALU control Shift left 2 ALU Address Determining control signals for the MIPS datapath based on instruction opcode PCSrc cannot be set directly from the opcode: zero test outcome is required
  • 36. Ajit Pal, IIT Kharagpur Control Signals: R-Type Instruction Control signals shown in blue 1 0 0 0 1 ??? Value depends on funct 0 0 5 516 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction I32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc MUX RegDst 5 rd I[15:11] rt I[20:16] rs I[25:21] immediate/ offset I[15:0] 0 1 0 1 1 0 10
  • 37. Ajit Pal, IIT Kharagpur 5 516 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction I32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc MUX RegDst 5 rd I[15:11] rt I[20:16] rs I[25:21] immediate/ offset I[15:0] 0 1 0 1 1 0 10 Control Signals: lw Instruction 0 Control signals shown in blue 0 010 1 1 1 0 1
  • 38. Ajit Pal, IIT Kharagpur 5 516 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction I32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc MUX RegDst 5 rd I[15:11] rt I[20:16] rs I[25:21] immediate/ offset I[15:0] 0 1 0 1 1 0 10 Control Signals: sw Instruction 0 Control signals shown in blue X 010 1 X 0 1 0
  • 39. Ajit Pal, IIT Kharagpur 5 516 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction I32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc MUX RegDst 5 rd I[15:11] rt I[20:16] rs I[25:21] immediate/ offset I[15:0] 0 1 0 1 1 0 10 Control Signals: beq Instruction Control signals shown in blue X 110 0 X 0 0 0 1 if Zero=1
  • 40. Ajit Pal, IIT Kharagpur Datapath with Control III Shift left 2 PC Instruction memory Read address Instruction [31– 0] Data memory Read data Write data Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Instruction [15– 11] Instruction [20– 16] Instruction [25– 21] Add ALU result Zero Instruction [5– 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch Jump RegDst ALUSrc Instruction [31– 26] 4 M u x Instruction [25– 0] Jump address [31– 0] PC+4 [31– 28] Sign extend 16 32 Instruction [15– 0] 1 M u x 1 0 M u x 0 1 M u x 0 1 ALU control Control Add ALU result M u x 0 1 0 ALU Shift left 2 26 28 Address 31-26 25-0 opcode addres s Jump MIPS datapath extended to jumps: control unit generates new Jump control bit New multiplexor with additional control bit Jump Composing jump target address
  • 41. Ajit Pal, IIT Kharagpur Datapath Executing j
  • 42. Ajit Pal, IIT Kharagpur R-type Instruction: Step 1 add $t1, $t2, $t3 (active = bold) PC Instruction memory Read address Instruction [31– 0] Instruction [20– 16] Instruction [25– 21] Add Instruction [5– 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch RegDst ALUSrc Instruction [31– 26] 4 16 32 Instruction [15– 0] 0 0M u x 0 1 Control Add ALU result M u x 0 1 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Sign extend Shift left 2 M u x 1 ALU result Zero Data memory Write data Read data M u x 1 Instruction [15– 11] ALU control ALU Address Fetch instruction and increment PC count
  • 43. Ajit Pal, IIT Kharagpur R-type Instruction: Step 2 add $t1, $t2, $t3 (active = bold) PC Instruction memory Read address Instruction [31– 0] Instruction [20– 16] Instruction [25– 21] Add Instruction [5– 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch RegDst ALUSrc Instruction [31– 26] 4 16 32 Instruction [15– 0] 0 0M u x 0 1 Control Add ALU result M u x 0 1 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Sign extend Shift left 2 M u x 1 ALU result Zero Data memory Write data Read data M u x 1 Instruction [15– 11] ALU control ALU Address Read two source registers from the register file
  • 44. Ajit Pal, IIT Kharagpur R-type Instruction: Step 3 add $t1, $t2, $t3 (active = bold) PC Instruction memory Read address Instruction [31– 0] Instruction [20 16] Instruction [25 21] Add Instruction [5 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch RegDst ALUSrc Instruction [31 26] 4 16 32 Instruction [15 0] 0 0M u x 0 1 ALU control Control Add ALU result M u x 0 1 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Sign extend M u x 1 ALU result Zero Data memory Read dataAddress Write data M u x 1 Instruction [15 11] ALU Shift left 2 ALU operates on the two register operands
  • 45. Ajit Pal, IIT Kharagpur R-type Instruction: Step 4 add $t1, $t2, $t3 (active = bold) PC Instruction memory Read address Instruction [31– 0] Instruction [20 16] Instruction [25 21] Add Instruction [5 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch RegDst ALUSrc Instruction [31 26] 4 16 32 Instruction [15 0] 0 0M u x 0 1 ALU control Control Shift left 2 Add ALU result M u x 0 1 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Sign extend M u x 1 ALU result Zero Data memory Write data Read data M u x 1 Instruction [15 11] ALU Address Write result to register
  • 46. Ajit Pal, IIT Kharagpur Single-cycle Implementation Notes  The steps are not really distinct as each instruction completes in exactly one clock cycle – they simply indicate the sequence of data flowing through the datapath  The operation of the datapath during a cycle is purely combinational – nothing is stored during a clock cycle  Therefore, the machine is stable in a particular state at the start of a cycle and reaches a new stable state only at the end of the cycle  Very important for understanding single-cycle computing
  • 47. Ajit Pal, IIT Kharagpur 1. Fetch instruction and increment PC 2. Read base register from the register file: the base register ($t2) is given by bits 25-21 of the instruction 3. ALU computes sum of value read from the register file and the sign-extended lower 16 bits (offset) of the instruction 4. The sum from the ALU is used as the address for the data memory 5. The data from the memory unit is written into the register file: the destination register ($t1) is given by bits 20-16 of the instruction Load Instruction Steps lw $t1, offset ($t2)
  • 48. Ajit Pal, IIT Kharagpur Load Instruction lw $t1, offset($t2) PC Instruction memory Read address Instruction [31– 0] Instruction [15– 11] Instruction [20– 16] Instruction [25– 21] Add Instruction [5– 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch RegDst ALUSrc Instruction [31– 26] 4 16 32Instruction [15– 0] 0 0M u x 0 1 ALU control Control Shift left 2 Add ALU result M u x 0 1 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Sign extend M u x 1 ALU result Zero Data memory Write data Read data M u x 1 ALU Address
  • 49. Ajit Pal, IIT Kharagpur 1. Fetch instruction and increment PC 2. Read two register ($t1 and $t2) from the register file 3. ALU performs a subtract on the data values from the register file; the value of PC+4 is added to the sign-extended lower 16 bits (offset) of the instruction shifted left by two to give the branch target address 4. The Zero result from the ALU is used to decide which adder result (from step 1 or 3) to store in the PC Branch Instruction Steps beq $t1, $t2, offset
  • 50. Ajit Pal, IIT Kharagpur Branch Instruction beq $t1, $t2, offset PC Instruction memory Read address Instruction [31– 0] Instruction [15– 11] Instruction [20– 16] Instruction [25– 21] Add Instruction [5– 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch RegDst ALUSrc Instruction [31– 26] 4 16 32 Instruction [15– 0] Shift left 2 0 M u x 0 1 ALU control Control Registers Write register Write data Read data 1 Read register 1 Read register 2 Sign extend 1 ALU result Zero Data memory Write data Read dataM u x Read data 2 Add ALU result M u x 0 1 M u x 1 0 ALU Address
  • 51. Ajit Pal, IIT Kharagpur Implementation: ALU Control Block Operation2 Operation1 Operation0 Operation ALUOp1 F3 F2 F1 F0 F (5– 0) ALUOp0 ALUOp ALU control block ALU control logic Truth table for ALU control bits ALUOp Funct field Operation ALUOp1 ALUOp0 F5 F4 F3 F2 F1 F0 0 0 X X X X X X 010 0 1 X X X X X X 110 1 X X X 0 0 0 0 010 1 X X X 0 0 1 0 110 1 X X X 0 1 0 0 000 1 X X X 0 1 0 1 001 1 X X X 1 0 1 0 111
  • 52. Ajit Pal, IIT Kharagpur Implementation: Main Control Block R-format Iw sw beq Op0 Op1 Op2 Op3 Op4 Op5 Inputs Outputs RegDst ALUSrc MemtoReg RegWrite MemRead MemWrite Branch ALUOp1 ALUOpO Signal R-format lw sw beq name Op5 0 1 1 0 Op4 0 0 0 0 Op3 0 0 1 0 Op2 0 0 0 1 Op1 0 1 1 0 Op0 0 1 1 0 RegDst 1 0 x x ALUSrc 0 1 1 0 MemtoReg 0 1 x x RegWrite 1 1 0 0 MemRead 0 1 0 0 MemWrite 0 0 1 0 Branch 0 0 0 1 ALUOp1 1 0 0 0 ALUOP2 0 0 0 1 InputsOutputs Truth table for main control signals Main control PLA (programmable logic array): principle underlying PLAs is that any logical expression can be written as a sum-of-products
  • 53. Ajit Pal, IIT Kharagpur  Assuming fixed-period clock every instruction datapath uses one clock cycle implies: ■ CPI = 1 ■ cycle time determined by length of the longest instruction path (load) ●but several instructions could run in a shorter clock cycle: waste of time ●consider if we have more complicated instructions like floating point! ■ resources used more than once in the same cycle need to be duplicated ●waste of hardware and chip area Single-Cycle Design Problems
  • 54. Ajit Pal, IIT Kharagpur Example: Fixed-period clock vs. variable-period clock in a single-cycle implementation  Consider a machine with an additional floating point unit. Assume functional unit delays as follows ■ memory: 2 ns., ALU and adders: 2 ns., FPU add: 8 ns., FPU multiply: 16 ns., register file access (read or write): 1 ns. ■ multiplexors, control unit, PC accesses, sign extension, wires: no delay  Assume instruction mix as follows ■ all loads take same time and comprise 31% ■ all stores take same time and comprise 21% ■ R-format instructions comprise 27% ■ branches comprise 5% ■ jumps comprise 2% ■ FP adds and subtracts take the same time and totally comprise 7% ■ FP multiplys and divides take the same time and totally comprise 7%  Compare the performance of (a) a single-cycle implementation using a fixed- period clock with (b) one using a variable-period clock where each instruction executes in one clock cycle that is only as long as it needs to be (not really practical but pretend it’s possible!)
  • 55. Ajit Pal, IIT Kharagpur Solution  Clock period for fixed-period clock = longest instruction time = 20 ns.  Average clock period for variable-period clock = 8 × 31% + 7 × 21% + 6 × 27% + 5 × 5% + 2 × 2% + 20 × 7% + 12 × 7% = 7.0 ns.  Therefore, performancevar-period /performancefixed-period = 20/7 = 2.9 Instruction Instr. Register ALU Data Register FPU FPU Total class mem. read oper. mem. write add/ mul/ time sub div ns. Load word 2 1 2 2 1 8 Store word 2 1 2 2 7 R-format 2 1 2 0 1 6 Branch 2 1 2 5 Jump 2 2 FP mul/div 2 1 1 16 20 FP add/sub 2 1 1 8 12
  • 56. Ajit Pal, IIT Kharagpur Fixing the problem with single-cycle designs  One solution: a variable-period clock with different cycle times for each instruction class ■ unfeasible, as implementing a variable-speed clock is technically difficult  Another solution: ■ use a smaller cycle time… ■ …have different instructions take different numbers of cycles by breaking instructions into steps and fitting each step into one cycle ■ feasible: multicyle approach!
  • 57. Ajit Pal, IIT Kharagpur  Break up the instructions into steps ■ each step takes one clock cycle ■ balance the amount of work to be done in each step/cycle so that they are about equal ■ restrict each cycle to use at most once each major functional unit so that such units do not have to be replicated ■ functional units can be shared between different cycles within one instruction  Between steps/cycles ■ At the end of one cycle store data to be used in later cycles of the same instruction ● need to introduce additional internal (programmer- invisible) registers for this purpose ■ Data to be used in later instructions are stored in programmer-visible state elements: the register file, PC, memory Multicycle Approach
  • 58. Ajit Pal, IIT Kharagpur  Our goal is to break up the instructions into steps so that ■ each step takes one clock cycle ■ the amount of work to be done in each step/cycle is about equal ■ each cycle uses at most once each major functional unit so that such units do not have to be replicated ■ functional units can be shared between different cycles within one instruction  Data at end of one cycle to be used in next must be stored !! Breaking Instructions into Steps
  • 59. Ajit Pal, IIT Kharagpur Breaking Instructions into Steps  We break instructions into the following potential execution steps – not all instructions require all the steps – each step takes one clock cycle 1.Instruction fetch and PC increment (IF) 2.Instruction decode and register fetch (ID) 3.Execution, memory address computation, or branch completion (EX) 4.Memory access or R-type instruction completion (MEM) 5.Memory read completion (WB)  Each MIPS instruction takes from 3 – 5 cycles (steps)
  • 60. Ajit Pal, IIT Kharagpur  Use PC to get instruction and put it in the instruction register. Increment the PC by 4 and put the result back in the PC.  Can be described succinctly using RTL (Register- Transfer Language): IR = Memory[PC]; PC = PC + 4; Step 1: Instruction Fetch & PC Increment (IF)
  • 61. Ajit Pal, IIT Kharagpur  Read registers rs and rt in case we need them. Compute the branch address in case the instruction is a branch.  RTL: A = Reg[IR[25-21]]; B = Reg[IR[20-16]]; ALUOut = PC + (sign-extend(IR[15-0]) << 2); Step 2: Instruction Decode and Register Fetch (ID)
  • 62. Ajit Pal, IIT Kharagpur  ALU performs one of four functions depending on instruction type ■ memory reference: ALUOut = A + sign-extend(IR[15-0]); ■ R-type: ALUOut = A op B; ■ branch (instruction completes): if (A==B) PC = ALUOut; ■ jump (instruction completes): PC = PC[31-28] || (IR(25-0) << 2) Step 3: Execution, Address Computation or Branch Completion (EX)
  • 63. Ajit Pal, IIT Kharagpur  Again depending on instruction type:  Loads and stores access memory ■ load MDR = Memory[ALUOut]; ■ store (instruction completes) Memory[ALUOut] = B;  R-type (instructions completes) Reg[IR[15-11]] = ALUOut; Step 4: Memory access or R-type Instruction Completion (MEM)
  • 64. Ajit Pal, IIT Kharagpur  Again depending on instruction type:  Load writes back (instruction completes) Reg[IR[20-16]]= MDR;  Important: There is no reason from a datapath (or control) point of view that Step 5 cannot be eliminated by performing  Reg[IR[20-16]]= Memory[ALUOut];  for loads in Step 4. This would eliminate the MDR as well.  The reason this is not done is that, to keep steps balanced in length, the design restriction is to allow each step to contain at most one ALU operation, or one register access, or one memory access. Step 5: Memory Read Completion (WB)
  • 65. Ajit Pal, IIT Kharagpur Summary of Instruction Execution Step name Action for R-type instructions Action for memory-reference instructions Action for branches Action for jumps Instruction fetch IR = Memory[PC] PC = PC + 4 Instruction A = Reg [IR[25-21]] decode/register fetch B = Reg [IR[20-16]] ALUOut = PC + (sign-extend (IR[15-0]) << 2) Execution, address ALUOut = A op B ALUOut = A + sign-extend if (A ==B) then PC = PC [31-28] II computation, branch/ (IR[15-0]) PC = ALUOut (IR[25-0]<<2) jump completion Memory access or R-type Reg [IR[15-11]] = Load: MDR = Memory[ALUOut] completion ALUOut or Store: Memory [ALUOut] = B Memory read completion Load: Reg[IR[20-16]] = MDR 1: IF 2: ID 3: EX 4: MEM 5: WB Step
  • 66. Ajit Pal, IIT Kharagpur Multicycle Execution Step (1): Instruction Fetch IR = Memory[PC]; PC = PC + 4; 4PC + 4
  • 67. Ajit Pal, IIT Kharagpur Multicycle Execution Step (2): Instruction Decode & Register Fetch A = Reg[IR[25-21]]; (A = Reg[rs]) B = Reg[IR[20-15]]; (B = Reg[rt]) ALUOut = (PC + sign-extend(IR[15-0]) << 2) Branch Target Address Reg[rs] Reg[rt] PC + 4
  • 68. Ajit Pal, IIT Kharagpur Multicycle Execution Step (3): Memory Reference Instructions ALUOut = A + sign-extend(IR[15-0]); Mem. Address Reg[rs] Reg[rt] PC + 4
  • 69. Ajit Pal, IIT Kharagpur Multicycle Execution Step (3): ALU Instruction (R-Type) ALUOut = A op B R-Type Result Reg[rs] Reg[rt] PC + 4
  • 70. Ajit Pal, IIT Kharagpur Multicycle Execution Step (3): Branch Instructions if (A == B) PC = ALUOut; Branch Target Address Reg[rs] Reg[rt] Branch Target Address
  • 71. Ajit Pal, IIT Kharagpur Multicycle Execution Step (3): Jump Instruction PC = PC[31-28] concat (IR[25-0] << 2) Jump Address Reg[rs] Reg[rt] Branch Target Address
  • 72. Ajit Pal, IIT Kharagpur Multicycle Execution Step (4): Memory Access - Read (lw) MDR = Memory[ALUOut]; Mem. Data PC + 4 Reg[rs] Reg[rt] Mem. Address
  • 73. Ajit Pal, IIT Kharagpur Multicycle Execution Step (4): Memory Access - Write (sw) Memory[ALUOut] = B; PC + 4 Reg[rs] Reg[rt]
  • 74. Ajit Pal, IIT Kharagpur Multicycle Execution Step (4): ALU Instruction (R-Type) Reg[IR[15:11]] = ALUOUT R-Type Result Reg[rs] Reg[rt] PC + 4
  • 75. Ajit Pal, IIT Kharagpur Multicycle Execution Step (5): Memory Read Completion (lw) Reg[IR[20-16]] = MDR; PC + 4 Reg[rs] Reg[rt]Mem. Data Mem. Address
  • 76. Ajit Pal, IIT Kharagpur  Note particularities of multicyle vs. single- diagrams ■ single memory for data and instructions ■ single ALU, no extra adders ■ extra registers to hold data between clock cycles Datapath Comparison PC Instruction memory Read address Instruction 16 32 Add ALU result M u x Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Shift left 2 4 M u x ALU operation3 RegWrite MemRead MemWrite PCSrc ALUSrc MemtoReg ALU result Zero ALU Data memory Address Write data Read data M u x Sign extend Add PC Memory Address Instruction or data Data Instruction register Registers Register # Data Register # Register # ALU Memory data register A B ALUOut Single-cycle datapath Multicycle datapath (high-level view)
  • 77. Ajit Pal, IIT Kharagpur Multicycle Datapath Shift left 2 PC Memory MemData Write data M u x 0 1 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 M u x 0 1 M u x 0 1 4 Instruction [15– 0] Sign extend 3216 Instruction [25– 21] Instruction [20– 16] Instruction [15– 0] Instruction register 1 M u x 0 3 2 M u x ALU result ALU Zero Memory data register Instruction [15– 11] A B ALUOut 0 1 Address Basic multicycle MIPS datapath handles R-type instructions and load/stores: new internal register in red ovals, new multiplexors in blue ovals
  • 78. Ajit Pal, IIT Kharagpur Multicycle Datapath with Control I Shift left 2 MemtoReg IorD MemRead MemWrite PC Memory MemData Write data M u x 0 1 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Instruction [15– 11] M u x 0 1 M u x 0 1 4 ALUOpALUSrcB RegDst RegWrite Instruction [15– 0] Instruction [5– 0] Sign extend 3216 Instruction [25– 21] Instruction [20– 16] Instruction [15– 0] Instruction register 1 M u x 0 3 2 ALU control M u x 0 1 ALU result ALU ALUSrcA ZeroA B ALUOut IRWrite Address Memory data register With control lines and the ALU control block added – not all control lines are shown
  • 79. Ajit Pal, IIT Kharagpur Multicycle Datapath with Control II Complete multicycle MIPS datapath (with branch and jump capability) and showing the main control block and all control lines Shift left 2 PC M u x 0 1 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Instruction [15– 11] M u x 0 1 M u x 0 1 4 Instruction [15– 0] Sign extend 3216 Instruction [25– 21] Instruction [20– 16] Instruction [15– 0] Instruction register ALU control ALU result ALU Zero Memory data register A B IorD MemRead MemWrite MemtoReg PCWriteCond PCWrite IRWrite ALUOp ALUSrcB ALUSrcA RegDst PCSource RegWrite Control Outputs Op [5– 0] Instruction [31-26] Instruction [5– 0] M u x 0 2 Jump address [31-0]Instruction [25– 0] 26 28 Shift left 2 PC [31-28] 1 1 M u x 0 3 2 M u x 0 1 ALUOut Memory MemData Write data Address New multiplexorNew gates For the jump address
  • 80. Ajit Pal, IIT Kharagpur Multicycle Control Step (1): Fetch IR = Memory[PC]; PC = PC + 4; 1 0 1 0 1 0 X 0 X 0 010 1 5 5 RD1 RD2 RN1 RN2 WN WD RegWrite Registers Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Memory ADDR MemWrite 5 Instruction I 32 ALUSrcB <<2 PC 4 RegDst 5 I R M D R M U X 0 1 2 3 M U X 1 0 M U X 0 1A B ALU OUT 0 1 2 M U X <<2 CONCAT 28 32 M U X 0 1 ALUSrcA jmpaddr I[25:0] rd MUX 0 1 rtrs immediate PCSource MemtoReg IorD PCWr* IRWrite
  • 81. Ajit Pal, IIT Kharagpur Multicycle Control Step (2): Instruction Decode & Register Fetch A = Reg[IR[25-21]]; (A = Reg[rs]) B = Reg[IR[20-15]]; (B = Reg[rt]) ALUOut = (PC + sign-extend(IR[15-0]) << 2); 0 0 X 0 0 X 3 0 X X 010 0 5 5 RD1 RD2 RN1 RN2 WN WD RegWrite Registers Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Memory ADDR MemWrite 5 Instruction I 32 ALUSrcB <<2 PC 4 RegDst 5 I R M D R M U X 0 1 2 3 M U X 1 0 M U X 0 1A B ALU OUT 0 1 2 M U X <<2 CONCAT 28 32 M U X 0 1 ALUSrcA jmpaddr I[25:0] rd MUX 0 1 rtrs immediate PCSource MemtoReg IorD PCWr* IRWrite
  • 82. Ajit Pal, IIT Kharagpur 0 X Multicycle Control Step (3): Memory Reference Instructions ALUOut = A + ssign-extend(IR[15-0]); X 2 0 0 X 0 1 X 010 0 5 5 RD1 RD2 RN1 RN2 WN WD RegWrite Registers Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Memory ADDR MemWrite 5 Instruction I 32 ALUSrcB <<2 PC 4 RegDst 5 I R M D R M U X 0 1 2 3 M U X 1 0 M U X 0 1A B ALU OUT 0 1 2 M U X <<2 CONCAT 28 32 M U X 0 1 ALUSrcA jmpaddr I[25:0] rd MUX 0 1 rtrs immediate PCSource MemtoReg IorD PCWr* IRWrite
  • 83. Ajit Pal, IIT Kharagpur Multicycle Control Step (3): ALU Instruction (R-Type) ALUOut = A op B; 0 X X 0 0 0 X 0 1 X ??? 0 5 5 RD1 RD2 RN1 RN2 WN WD RegWrite Registers Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Memory ADDR MemWrite 5 Instruction I 32 ALUSrcB <<2 PC 4 RegDst 5 I R M D R M U X 0 1 2 3 M U X 1 0 M U X 0 1A B ALU OUT 0 1 2 M U X <<2 CONCAT 28 32 M U X 0 1 ALUSrcA jmpaddr I[25:0] rd MUX 0 1 rtrs immediate PCSource MemtoReg IorD PCWr* IRWrite
  • 84. Ajit Pal, IIT Kharagpur 1 if Zero=1 Multicycle Control Step (3): Branch Instructions if (A == B) PC = ALUOut; 0 X X 0 0 X 0 1 1 011 0 5 5 RD1 RD2 RN1 RN2 WN WD RegWrite Registers Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Memory ADDR MemWrite 5 Instruction I 32 ALUSrcB <<2 PC 4 RegDst 5 I R M D R M U X 0 1 2 3 M U X 1 0 M U X 0 1A B ALU OUT 0 1 2 M U X <<2 CONCAT 28 32 M U X 0 1 ALUSrcA jmpaddr I[25:0] rd MUX 0 1 rtrs immediate PCSource MemtoReg IorD PCWr* IRWrite
  • 85. Ajit Pal, IIT Kharagpur Multicycle Execution Step (3): Jump Instruction PC = PC[21-28] concat (IR[25-0] << 2); 0 X X X 0 1 X 0 X 2 XXX 0 5 5 RD1 RD2 RN1 RN2 WN WD RegWrite Registers Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Memory ADDR MemWrite 5 Instruction I 32 ALUSrcB <<2 PC 4 RegDst 5 I R M D R M U X 0 1 2 3 M U X 1 0 M U X 0 1A B ALU OUT 0 1 2 M U X <<2 CONCAT 28 32 M U X 0 1 ALUSrcA jmpaddr I[25:0] rd MUX 0 1 rtrs immediate PCSource MemtoReg IorD PCWr* IRWrite
  • 86. Ajit Pal, IIT Kharagpur Multicycle Control Step (4): Memory Access - Read (lw) MDR = Memory[ALUOut]; 0 X X X 1 0 1 0 X X XXX 0 5 5 RD1 RD2 RN1 RN2 WN WD RegWrite Registers Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Memory ADDR MemWrite 5 Instruction I 32 ALUSrcB <<2 PC 4 RegDst 5 I R M D R M U X 0 1 2 3 M U X 1 0 M U X 0 1A B ALU OUT 0 1 2 M U X <<2 CONCAT 28 32 M U X 0 1 ALUSrcA jmpaddr I[25:0] rd MUX 0 1 rtrs immediate PCSource MemtoReg IorD PCWr* IRWrite
  • 87. Ajit Pal, IIT Kharagpur Multicycle Execution Steps (4) Memory Access - Write (sw) Memory[ALUOut] = B; 0 X X X 0 0 1 1 X X XXX 0 5 5 RD1 RD2 RN1 RN2 WN WD RegWrite Registers Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Memory ADDR MemWrite 5 Instruction I 32 ALUSrcB <<2 PC 4 RegDst 5 I R M D R M U X 0 1 2 3 M U X 1 0 M U X 0 1A B ALU OUT 0 1 2 M U X <<2 CONCAT 28 32 M U X 0 1 ALUSrcA jmpaddr I[25:0] rd MUX 0 1 rtrs immediate PCSource MemtoReg IorD PCWr* IRWrite
  • 88. Ajit Pal, IIT Kharagpur 10 0 X 0 X 0 XXX X X 1 1 5 5 RD1 RD2 RN1 RN2 WN WD RegWrite Registers Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Memory ADDR MemWrite 5 Instruction I 32 ALUSrcB <<2 PC 4 RegDst 5 I R M D R M U X 0 1 2 3 M U X 0 1 M U X 0 1A B ALU OUT 0 1 2 M U X <<2 CONCAT 28 32 M U X 0 1 ALUSrcA jmpaddr I[25:0] rd MUX 0 1 rtrs immediate PCSource MemtoReg IorD PCWr* IRWrite Multicycle Control Step (4): ALU Instruction (R-Type) Reg[IR[15:11]] = ALUOut; (Reg[Rd] = ALUOut)
  • 89. Ajit Pal, IIT Kharagpur Multicycle Execution Steps (5) Memory Read Completion (lw) Reg[IR[20-16]] = MDR; 1 0 0 X 0 0 X 0 X X XXX 0 5 5 RD1 RD2 RN1 RN2 WN WD RegWrite Registers Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Memory ADDR MemWrite 5 Instruction I 32 ALUSrcB <<2 PC 4 RegDst 5 I R M D R M U X 0 1 2 3 M U X 0 1 M U X 0 1A B ALU OUT 0 1 2 M U X <<2 CONCAT 28 32 M U X 0 1 ALUSrcA jmpaddr I[25:0] rd MUX 0 1 rtrs immediate PCSource MemtoReg IorD PCWr* IRWrite
  • 90. Ajit Pal, IIT Kharagpur  Value of control signals is dependent upon: ■ what instruction is being executed ■ which step is being performed  Use the information we have accumulated to specify a finite state machine ■ specify the finite state machine graphically, or ■ use microprogramming  Implementation is then derived from the specification Implementing Control
  • 91. Ajit Pal, IIT Kharagpur  Finite state machines (FSMs): ■ a set of states and ■ next state function, determined by current state and the input ■ output function, determined by current state and possibly input ■ We’ll use a Moore machine – output based only on current state Review: Finite State Machines Next-state function Current state Clock Output function Next state Outputs Inputs
  • 92. Ajit Pal, IIT Kharagpur Example: Moore Machine  The Moore machine below, given input a binary string terminated by “#”, will output “even” if the string has an even number of 0’s and “odd” if the string has an odd number of 0’s Even state Odd state Output even state Output odd state No output No output Output “even” Output “odd” 0 0 1 1 # # Start
  • 93. Ajit Pal, IIT Kharagpur FSM Control: High-level View Memory access instructions (Figure 5.38) R-type instructions (Figure 5.39) Branch instruction (Figure 5.40) Jump instruction (Figure 5.41) Instruction fetch/decode and register fetch (Figure 5.37) Start ALUSrcA = 0 ALUSrcB = 11 ALUOp = 00 MemRead ALUSrcA = 0 IorD = 0 IRWrite ALUSrcB = 01 ALUOp = 00 PCWrite PCSource = 00 Instruction fetch Instruction decode/ Register fetch (Op = 'LW') or (Op = 'SW') (Op = R-type) (O p = 'BEQ ') (Op='JMP') 0 1 Start Memory reference FSM (Figure 5.38) R-type FSM (Figure 5.39) Branch FSM (Figure 5.40) Jump FSM (Figure 5.41) High-level view of FSM control Instruction fetch and decode steps of every instruction is identical Asserted signals shown inside state circles
  • 94. Ajit Pal, IIT Kharagpur FSM Control: Memory Reference MemWrite IorD = 1 MemRead IorD = 1 ALUSrcA = 1 ALUSrcB = 10 ALUOp = 00 RegWrite MemtoReg = 1 RegDst = 0 Memory address computation (Op = 'LW') or (Op = 'SW') Memory access Write-back step (O p = 'SW ') (Op='LW') 4 2 53 From state 1 To state 0 (Figure 5.37) Memory access FSM control for memory-reference has 4 states
  • 95. Ajit Pal, IIT Kharagpur FSM Control: R-type Instruction ALUSrcA = 1 ALUSrcB = 00 ALUOp = 10 RegDst = 1 RegWrite MemtoReg = 0 Execution R-type completion 6 7 (Op = R-type) From state 1 To state 0 (Figure 5.37) FSM control to implement R- type instructions has 2 states
  • 96. Ajit Pal, IIT Kharagpur FSM Control: Branch Instruction Branch completion 8 (Op = 'BEQ') From state 1 To state 0 (Figure 5.37) ALUSrcA = 1 ALUSrcB = 00 ALUOp = 01 PCWriteCond PCSource = 01 FSM control to implement branches has 1 state
  • 97. Ajit Pal, IIT Kharagpur FSM Control: Jump Instruction Jump completion 9 (Op = 'J') From state 1 To state 0 (Figure 5.37) PCWrite PCSource = 10 FSM control to implement jumps has 1 state
  • 98. Ajit Pal, IIT Kharagpur FSM Control: Complete View PCWrite PCSource = 10 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 01 PCWriteCond PCSource = 01 ALUSrcA =1 ALUSrcB = 00 ALUOp= 10 RegDst = 1 RegWrite MemtoReg = 0 MemWrite IorD = 1 MemRead IorD = 1 ALUSrcA = 1 ALUSrcB = 10 ALUOp = 00 RegDst=0 RegWrite MemtoReg=1 ALUSrcA = 0 ALUSrcB = 11 ALUOp = 00 MemRead ALUSrcA = 0 IorD = 0 IRWrite ALUSrcB = 01 ALUOp = 00 PCWrite PCSource = 00 Instruction fetch Instruction decode/ register fetch Jump completion Branch completionExecution Memory address computation Memory access Memory access R-type completion Write-back step (Op = 'LW') or (Op = 'SW') (Op = R-type) (Op = 'BEQ') (Op='J') (O p = 'SW ') (Op='LW') 4 0 1 9862 753 Start The complete FSM control for the multicycle MIPS datapath: refer Multicycle Datapath with Control II Labels on arcs are conditions that determine next state IF ID EX MEM WB
  • 99. Ajit Pal, IIT Kharagpur Example: CPI in a multicycle CPU  Assume ■ the control design of the previous slide ■ An instruction mix of 22% loads, 11% stores, 49% R-type operations, 16% branches, and 2% jumps  What is the CPI assuming each step requires 1 clock cycle?  Solution: ■ Number of clock cycles from previous slide for each instruction class: ● loads 5, stores 4, R-type instructions 4, branches 3, jumps 3 ■ CPI = CPU clock cycles / instruction count = Σ (instruction countclass i × CPIclass i) / instruction count = Σ (instruction countclass I / instruction count) × CPIclass I = 0.22 × 5 + 0.11 × 4 + 0.49 × 4 + 0.16 × 3 + 0.02 × 3 = 4.04
  • 100. Ajit Pal, IIT Kharagpur FSM Control: Implementation High-level view of FSM implementation: inputs to the combinational logic block are the current state number and instruction opcode bits; outputs are the next state number and control signals to be asserted for the current state PCWrite PCWriteCond IorD MemtoReg PCSource ALUOp ALUSrcB ALUSrcA RegWrite RegDst NS3 NS2 NS1 NS0Op5 Op4 Op3 Op2 Op1 Op0 S3 S2 S1 S0 State register IRWrite MemRead MemWrite Instruction register opcode field Outputs Control logic Inputs Four state bits are required for 10 states
  • 101. Ajit Pal, IIT Kharagpur FSM Control: PLA Implementation Op5 Op4 Op3 Op2 Op1 Op0 S3 S2 S1 S0 IorD IRWrite MemRead MemWrite PCWrite PCWriteCond MemtoReg PCSource1 ALUOp1 ALUSrcB0 ALUSrcA RegWrite RegDst NS3 NS2 NS1 NS0 ALUSrcB1 ALUOp0 PCSource0  Upper half is the AND plane that computes all the products  The products are carried to the lower OR plane by the vertical lines  The sum terms for each output is given by the corresponding horizontal line  E.g., IorD = S0.S1.S2.S3 + S0.S1.S2.S3
  • 102. Ajit Pal, IIT Kharagpur  ROM (Read Only Memory) ■ values of memory locations are fixed ahead of time  A ROM can be used to implement a truth table ■ if the address is m-bits, we can address 2m entries in the ROM ■ outputs are the bits of the entry the address points to FSM Control: ROM Implementation m n 0 0 0 0 0 1 1 0 0 1 1 1 0 0 0 1 0 1 1 0 0 0 1 1 1 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 0 1 1 1 0 0 1 1 0 1 1 1 0 1 1 1 ROM m = 3 n = 4 The size of an m-input n-output ROM is 2m x n bits – such a ROM can be thought of as an array of size 2m with each entry in the array being n bits outputaddress
  • 103. Ajit Pal, IIT Kharagpur  First improve the ROM: break the table into two parts ■ 4 state bits give the 16 output signals – 24 x 16 bits of ROM ■ all 10 input bits give the 4 next state bits – 210 x 4 bits of ROM ■ Total – 4.3K bits of ROM  PLA is much smaller ■ can share product terms ■ only need entries that produce an active output ■ can take into account don't cares  PLA size = (#inputs #product-terms) + (#outputs #product-terms) ■ FSM control PLA = (10x17)+(20x17) = 460 PLA cells  PLA cells usually about the size of a ROM cell (slightly bigger) FSM Control: ROM vs. PLA
  • 104. Ajit Pal, IIT Kharagpur  Microprogramming is a method of specifying FSM control that resembles a programming language – textual rather graphic ■ this is appropriate when the FSM becomes very large, e.g., if the instruction set is large and/or the number of cycles per instruction is large ■ in such situations graphical representation becomes difficult as there may be thousands of states and even more arcs joining them ■ a microprogram is specification : implementation is by ROM or PLA  A microprogram is a sequence of microinstructions ■ each microinstruction has eight fields (label + 7 functional) ● Label: used to control microcode sequencing ● ALU control: specify operation to be done by ALU ● SRC1: specify source for first ALU operand ● SRC2: specify source for second ALU operand ● Register control: specify read/write for register file ● Memory: specify read/write for memory ● PCWrite control: specify the writing of the PC ● Sequencing: specify choice of next microinstruction Microprogramming
  • 105. Ajit Pal, IIT Kharagpur Microprogramming  The Sequencing field value determines the execution order of the microprogram ■ value Seq : control passes to the sequentially next microinstruction ■ value Fetch : branch to the first microinstruction to begin the next MIPS instruction, i.e., the first microinstruction in the microprogram ■ value Dispatch i : branch to a microinstruction based on control input and a dispatch table entry (called dispatching): ● Dispatching is implemented by means of creating a table, called dispatch table, whose entries are microinstruction labels and which is indexed by the control input. There may be multiple dispatch tables – the value Dispatch i in the sequencing field indicates that the i th dispatch table is to be used
  • 106. Ajit Pal, IIT Kharagpur Control Microprogram  The microprogram corresponding to the FSM control shown graphically earlier: Label ALU control SRC1 SRC2 Register control Memory PCWrite control Sequencing Fetch Add PC 4 Read PC ALU Seq Add PC Extshft Read Dispatch 1 Mem1 Add A Extend Dispatch 2 LW2 Read ALU Seq Write MDR Fetch SW2 Write ALU Fetch Rformat1 Func code A B Seq Write ALU Fetch BEQ1 Subt A B ALUOut-cond Fetch JUMP1 Jump address Fetch Dispatch ROM 1 Dispatch ROM 2 Op Opcode name Value Op Opcode name Value 000000 R-format Rformat1 100011 lw LW2000010 jmp JUMP1 101011 sw SW2 000100 beq BEQ1 100011 lw Mem1 101011 sw Mem1 Microprogram containing 10 microinstructions Dispatch Table 2 Dispatch Table 1
  • 107. Ajit Pal, IIT Kharagpur Microcode: Trade-offs Specification advantages ■ easy to design and write ■ typically manufacturer designs architecture and microcode in parallel  Implementation advantages ■ easy to change since values are in memory (e.g., off-chip ROM) ■ can emulate other architectures ■ can make use of internal registers  Implementation disadvantages ■ control is implemented nowadays on same chip as processor so the advantage of an off-chip ROM does not exist ■ ROM is no longer faster than on-board cache ■ there is little need to change the microcode as general-purpose computers are used far more nowadays than computers designed for specific applications
  • 108. Instruction RegDst RegWrite ALUSrc MemRea d MemWri te MemToReg PCSrc ALU operation R-format 1 1 0 0 0 0 0 0000 (and) 0001 (or) 0010 (add) 0110 (sub) lw 0 1 1 1 0 1 0 0010 (add) sw X 0 1 0 1 X 0 0010 (add) beq x 0 0 0 0 X 1 or 0 0110 (sub)
  • 109. Control  We next add the control unit that generates ■ write signal for each state element ■ control signals for each multiplexer ■ ALU control signal  Input to control unit: instruction opcode and function code
  • 110. Control Unit  Divided into two parts ■ Main Control Unit ●Input: 6-bit opcode ●Output: all control signals for Muxes, RegWrite, MemRead, MemWrite and a 2-bit ALUOp signal ■ ALU Control Unit ●Input: 2-bit ALUOp signal generated from Main Control Unit and 6-bit instruction function code ●Output: 4-bit ALU control signal
  • 111. Truth Table for Main Control Unit
  • 113. ALU Control Unit  Must describe hardware to compute 4-bit ALU control input given ■ 2-bit ALUOp signal from Main Control Unit ■ function code for arithmetic  Describe it using a truth table (can turn into gates):
  • 115. Putting It All Together Again PC Instruction memory Read address Instruction [31– 0] Instruction [20 16] Instruction [25 21] Add Instruction [5 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch RegDst ALUSrc Instruction [31 26] 4 16 32 Instruction [15 0] 0 0M u x 0 1 Control Add ALU result M u x 0 1 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Sign extend M u x 1 ALU result Zero PCSrc Data memory Write data Read data M u x 1 Instruction [15 11] ALU control Shift left 2 ALU Address Use this for R-type, memory, & beq instructions scenarios.
  • 116. Addition of the Unconditional Jump  We now add one more op code to our single cycle design: ■ Op code 2: “j” ■ The format is op field 28-31 is a “2” ■ Remaining 26 low bits is the immediate target address  The full 32 bit target address is computed by concatenating: ■ Upper 4 bits of PC+4 ■ 26 bit immediate field of the jump instruction ■ Bits 00 in the lowest positions (word boundary) ■ See text chapter 3, p. 150  An additional control line from the main controller will have to be generated to select this “new” instruction  A two bit shifter is also added to get the two low order zeros
  • 117. Final Design with jump Instruction Shift left2 PC Instruction memory Read address Instruction [31–0] Data memory Read data Write data Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Instruction [15– 11] Instruction [20– 16] Instruction [25– 21] Add ALU result Zero Instruction [5– 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch Jump RegDst ALUSrc Instruction [31– 26] 4 M u x Instruction [25–0] Jump address[31–0] PC+4 [31– 28] Sign extend 16 32 Instruction [15– 0] 1 M u x 1 0 M u x 0 1 M u x 0 1 ALU control Control Add ALU result M u x 0 1 0 ALU Shift left 2 26 28 Address
  • 118. Ajit Pal, IIT Kharagpur Thanks!