Issue Stage
Part of the Extended Instruction Pipeline.
- In-order issue
  - Instructions are issued in program order
- Out-of-order issue
  - Instructions can be issued in a different order
In-Order Execution #
- “Static instruction scheduling”
- Check whether the oldest instruction in the issue queue is ready to execute (its operands are ready and a functional unit is available)
- If so, issue it; otherwise all younger instructions wait behind it
Example #
div f0,f2,f4
add f10,f0,f8
load f12,f8,10
- The division blocks all following instructions
- The add is true-dependent (read after write) on the division through f0
- The load is independent and could start before the division finishes, but in-order execution prevents this
Out-Of-Order Execution #
- “Dynamic instruction scheduling”
- Early concept:
  - A scoreboard keeps track of the data dependences and the state of the functional units
  - Instructions are issued as soon as their data dependences are fulfilled
- Implementations:
  - Tomasulo Algorithm (Reservation Stations)
  - Unified Issue Queue
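The scoreboard idea can be sketched as a small bookkeeping structure. This is a simplified illustration, not a faithful model of any real scoreboard: the class name `Scoreboard`, its methods, and the unit names are hypothetical, and WAR hazards are not tracked.

```python
class Scoreboard:
    """Tracks pending register writes and busy functional units (simplified)."""

    def __init__(self, units):
        self.pending = set()                    # registers still being computed
        self.busy = {u: False for u in units}   # functional unit state

    def can_issue(self, unit, dst, srcs):
        no_raw = not any(s in self.pending for s in srcs)  # all sources available
        no_waw = dst not in self.pending                   # no outstanding write to dst
        return no_raw and no_waw and not self.busy[unit]

    def issue(self, unit, dst):
        self.pending.add(dst)
        self.busy[unit] = True

    def complete(self, unit, dst):
        self.pending.discard(dst)
        self.busy[unit] = False


sb = Scoreboard(["div", "add", "mem"])
sb.issue("div", "f0")
print(sb.can_issue("add", "f10", ["f0", "f8"]))  # False: f0 is still pending
print(sb.can_issue("mem", "f12", ["f8"]))        # True: the load is independent
```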
Read before Issue #
- This is how the Tomasulo Algorithm and the Unified Issue Queue work
- The issue queue stores the values of the operands
Instruction Wakeup #
- Entries of the issue queue have to be updated when their operands become ready; this is done via the bypass network
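A minimal sketch of wakeup for a read-before-issue queue, assuming each entry keeps a set of operand tags it is still waiting for. The names `Entry` and `wakeup` are hypothetical, and the bypass network is reduced to a plain function call.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    name: str
    waiting: set = field(default_factory=set)    # tags of operands not yet available
    values: dict = field(default_factory=dict)   # operand values captured so far

    @property
    def ready(self):
        return not self.waiting

def wakeup(queue, tag, value):
    """Broadcast a produced result to every entry waiting on `tag`."""
    for entry in queue:
        if tag in entry.waiting:
            entry.waiting.remove(tag)
            entry.values[tag] = value   # read before issue: the value is stored in the entry


queue = [Entry("add", {"f0"}, {"f8": 1.0}), Entry("load", set(), {"f8": 1.0})]
wakeup(queue, "f0", 2.5)     # the division result arrives
print(queue[0].ready)        # True
```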
Instruction Selection #
- Choose a subset of the ready instructions and issue them, limited by the issue width (how many instructions can be issued per cycle in a Superscalar Processor).
- Once an instruction has been selected, its issue queue slot can be reclaimed.
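Selection can be sketched as a filter over the queue. The oldest-first policy used here is an assumption (the notes do not name a selection policy), and `select` is a hypothetical helper.

```python
def select(queue, issue_width):
    """Pick up to `issue_width` ready entries, oldest first, and free their slots.

    `queue` is a list of (name, ready) pairs ordered from oldest to youngest.
    Returns the issued names and the queue contents that remain.
    """
    chosen = [i for i, (_, ready) in enumerate(queue) if ready][:issue_width]
    issued = [queue[i][0] for i in chosen]
    remaining = [e for i, e in enumerate(queue) if i not in chosen]
    return issued, remaining


queue = [("div", True), ("add", False), ("load", True), ("mul", True)]
issued, queue = select(queue, issue_width=2)
print(issued)  # ['div', 'load']
print(queue)   # [('add', False), ('mul', True)]
```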
Read after Issue #
- Issue queues do not contain the operand data (less space and copying needed)
- Operands are read when issuing
- Downside: the register file needs many read ports
  - In the case of Read before Issue, the data is already in the issue queue and can be forwarded to the FU directly
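The difference between the two schemes comes down to what an issue queue entry stores. The sketch below assumes all operands are already available; `REGFILE` and the three helper functions are hypothetical names used only for illustration.

```python
REGFILE = {"f0": 3.0, "f8": 4.0}   # hypothetical register contents

def make_entry_read_before_issue(op, srcs):
    # Operand values are copied into the entry when it is created.
    return {"op": op, "values": [REGFILE[s] for s in srcs]}

def make_entry_read_after_issue(op, srcs):
    # Only the register names are stored; no data is copied.
    return {"op": op, "srcs": list(srcs)}

def issue_read_after_issue(entry):
    # Operands are fetched from the register file at issue time,
    # which costs one read port per source operand.
    values = [REGFILE[s] for s in entry["srcs"]]
    return values, len(entry["srcs"])


entry = make_entry_read_after_issue("add", ["f0", "f8"])
values, ports_used = issue_read_after_issue(entry)
print(values, ports_used)  # [3.0, 4.0] 2
```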
Distributed Issue Queue #
- Divide issue queues into clusters (shorter issue queues)
- An arbiter decides into which cluster an instruction should go
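One possible arbiter policy is to steer each instruction to the least-loaded cluster. The notes do not specify a policy, so this choice, the `capacity` parameter, and the function name `steer` are assumptions.

```python
def steer(clusters, capacity):
    """Return the index of the cluster with the fewest queued instructions,
    or None if every cluster queue is full."""
    idx = min(range(len(clusters)), key=lambda i: len(clusters[i]))
    return idx if len(clusters[idx]) < capacity else None


clusters = [["a", "b"], ["c"], ["d", "e"]]
print(steer(clusters, capacity=2))  # 1
```

A real arbiter would likely also consider where an instruction's source operands are produced, to limit communication between clusters.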
Handling read port issues #
Two ways:
- Read after Issue:
  - Active scheme: Possibly cancel issuing if not enough read ports available
  - Reactive scheme: Possibly cancel already issued instructions if not enough read ports available
- Separate register files per cluster
  - Either replicated register files (need to be kept in sync)
  - Or distributed register files (mechanism to query remote registers necessary)
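The active scheme can be read as a check at selection time: an instruction only issues if enough read ports are still free in that cycle. The sketch below is one interpretation under that assumption; `issue_with_port_limit` and the per-instruction read counts are hypothetical.

```python
def issue_with_port_limit(selected, read_ports):
    """Issue selected instructions only while register read ports remain.

    `selected` is a list of (name, number_of_register_reads), oldest first.
    Returns the names that issue and the names whose issue is cancelled.
    """
    issued, cancelled, free = [], [], read_ports
    for name, reads in selected:
        if reads <= free:
            free -= reads
            issued.append(name)
        else:
            cancelled.append(name)   # stays in the issue queue and retries later
    return issued, cancelled


print(issue_with_port_limit([("add", 2), ("mul", 2), ("neg", 1)], read_ports=3))
# (['add', 'neg'], ['mul'])
```

The reactive scheme would instead let all three instructions issue and cancel one of them afterwards, once the shortage is detected at the register file.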