Very Long Instruction Word (VLIW) processing

Implemented, for example, in Explicitly Parallel Instruction Computing (EPIC) architectures.

Motivation #

In Superscalar Processors, hardware has to actively search for independent instructions for Out-Of-Order Execution. This has high hardware overhead and significant power consumption. The means of finding independent instructions are also limited, since only the issue queue (see Issue Stage) can be searched.

Solution #

The compiler is responsible for determining which instructions are independent and can therefore run in parallel, so the hardware does not have to search for them. The compiler also has a more global view of the program than the hardware does. This requires a different ISA.

One “instruction” in a VLIW architecture contains one operation for each functional unit (FU) defined in the ISA. The slot of an FU that has no work in that instruction is filled with a nop.
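The slot layout described above can be sketched as a small data structure. This is only an illustration: the functional-unit names, the operation strings, and the helper `make_bundle` are hypothetical and not taken from any real ISA.

```python
# Hypothetical machine: the functional-unit names below are illustrative only.
FUNCTIONAL_UNITS = ("alu0", "alu1", "mem", "branch")
NOP = "nop"


def make_bundle(ops):
    """Build one VLIW instruction with one slot per functional unit.

    `ops` maps a functional-unit name to the operation it should execute;
    units without an entry receive an explicit nop.
    """
    unknown = set(ops) - set(FUNCTIONAL_UNITS)
    if unknown:
        raise ValueError(f"unknown functional units: {sorted(unknown)}")
    return tuple(ops.get(fu, NOP) for fu in FUNCTIONAL_UNITS)


bundle = make_bundle({"alu0": "add r1, r2, r3", "mem": "load r4, 0(r5)"})
print(bundle)
```

Here two of the four slots carry real work and the other two are padded with nops, so the instruction always has the same width regardless of how much parallelism the compiler found.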

Compiler code generation for VLIW #

Trivial translation #

Loop unrolled #
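As a rough sketch of why unrolling helps a VLIW compiler, the following toy scheduler packs operations into bundles. Everything here is an assumption made for illustration: a machine with one memory unit and one ALU, the latencies in `LATENCY`, a loop body computing `a[i] = a[i] + c`, independent iterations, and no loop-control overhead. The greedy placement is a simple form of list scheduling, not necessarily what a production compiler does.

```python
# Hypothetical machine: one memory unit and one ALU, with assumed latencies in cycles.
LATENCY = {"mem": 2, "alu": 1}


def schedule(ops, latency):
    """Greedy list scheduling into VLIW bundles.

    Each op is (result_name, unit, source_names). An op is placed in the
    earliest bundle where all sources are ready and its unit is still free.
    Only true (read-after-write) dependencies through names are modelled.
    """
    ready = {}    # value name -> first cycle in which it can be read
    bundles = []  # bundles[cycle] maps unit -> op placed in that slot
    for name, unit, sources in ops:
        cycle = max((ready[s] for s in sources), default=0)
        while cycle < len(bundles) and unit in bundles[cycle]:
            cycle += 1
        while len(bundles) <= cycle:
            bundles.append({})  # empty slots in a bundle are implicit nops
        bundles[cycle][unit] = name
        ready[name] = cycle + latency[unit]
    return bundles


def loop_body(i):
    """One iteration of a[i] = a[i] + c: load, add, store."""
    return [
        (f"x{i}", "mem", []),
        (f"y{i}", "alu", [f"x{i}"]),
        (f"st{i}", "mem", [f"y{i}"]),
    ]


trivial = schedule(loop_body(0), LATENCY)
unrolled = schedule(loop_body(0) + loop_body(1), LATENCY)
print(len(trivial), len(unrolled))
```

Under these assumptions a single iteration occupies 4 bundles, so two iterations translated one after the other need 8. Unrolling by two gives the scheduler operations from both iterations at once and it fits them into 5 bundles, because the second load fills a slot that was a nop while the first load was in flight.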

With Software pipelining #
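Software pipelining overlaps successive loop iterations so that a new iteration starts every few cycles while earlier ones are still running. A minimal sketch of the resource side of this idea follows, assuming a hypothetical machine with one memory unit and one ALU and a loop body whose load, add, and store issue in cycles 0, 2, and 3 of their iteration. The helper names are illustrative, and loop-carried dependencies are assumed to be absent.

```python
def fits(iteration_schedule, ii):
    """True if starting a new iteration every `ii` cycles never double-books a unit."""
    used = set()
    for cycle, unit in iteration_schedule:
        slot = (cycle % ii, unit)  # slot this op occupies in the repeating kernel
        if slot in used:
            return False
        used.add(slot)
    return True


def min_initiation_interval(iteration_schedule):
    """Smallest interval between iteration starts that has no resource conflict."""
    if len(set(iteration_schedule)) != len(iteration_schedule):
        raise ValueError("schedule already uses one unit twice in the same cycle")
    ii = 1
    while not fits(iteration_schedule, ii):
        ii += 1
    return ii


# (issue cycle within the iteration, functional unit): load, add, store
ITERATION = [(0, "mem"), (2, "alu"), (3, "mem")]
print(min_initiation_interval(ITERATION))
```

With these assumed numbers the interval is 2: the load and the store both need the single memory unit, so at most one iteration can start every two cycles. In steady state the loop then costs 2 bundles per iteration, even though a single iteration on its own spans 4 cycles.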

Advantages #

- No register renaming in hardware is needed, since parallelism is explicit
  - Hazard detection between VLIW instructions is still required

Disadvantages #

- The ISA has to define the functional units
- Latencies of the functional units are visible in the ISA
- Scheduling is static and in-order
- nops in VLIW instructions lead to significant memory overhead
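To make the memory overhead of nops concrete, here is a small sketch that counts how many bytes of a program are padding. The two-slot bundle format, the 4-byte slot size in `OP_BYTES`, and the example program are assumptions chosen for illustration. Real encodings differ and may compress nops.

```python
NOP = "nop"
OP_BYTES = 4  # assumed fixed encoding size per slot; hypothetical value


def code_size(bundles):
    """Return (total_bytes, nop_bytes) for a list of fixed-width bundles."""
    total_slots = sum(len(b) for b in bundles)
    nop_slots = sum(op == NOP for b in bundles for op in b)
    return total_slots * OP_BYTES, nop_slots * OP_BYTES


# Four bundles with slots (memory unit, ALU); only three slots do real work.
program = [
    ("load x", NOP),
    (NOP, NOP),
    (NOP, "add y"),
    ("store y", NOP),
]
total, wasted = code_size(program)
print(total, wasted)
```

In this example 20 of the 32 bytes encode nops, which is the kind of overhead the last point refers to.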