Memory Disambiguation
Memory dependences are like Data Dependences, but for main memory. The order of loads and stores has to be considered even in Out-Of-Order Execution. The mechanism that handles memory dependences is called memory disambiguation.
1: store (r3), r4
2: load r1, (r2)
3: store (r6), r1
4: add r3, r1, r7
The addresses in the registers could be the same and lead to memory dependences. This can only be checked after the instructions have been issued and their addresses have been computed.
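To make the example concrete, the following minimal sketch lists which memory dependences appear among instructions 1 to 3 for a given set of register values. The names `MEM_OPS` and `memory_dependences` and the register values are made up for illustration; instruction 4 is left out because it does not access memory.

```python
from typing import Dict, List, Tuple

# Memory instructions of the example: (number, kind, address register).
MEM_OPS = [(1, "store", "r3"), (2, "load", "r2"), (3, "store", "r6")]

def memory_dependences(regs: Dict[str, int]) -> List[Tuple[int, int, str]]:
    """Return (older, younger, kind) for every pair that touches the same address."""
    deps = []
    for i, (n1, k1, a1) in enumerate(MEM_OPS):
        for n2, k2, a2 in MEM_OPS[i + 1:]:
            if regs[a1] != regs[a2]:
                continue                        # different addresses: independent
            if k1 == "store" and k2 == "load":
                deps.append((n1, n2, "true"))   # write ... read
            elif k1 == "load" and k2 == "store":
                deps.append((n1, n2, "anti"))   # read ... write
            elif k1 == "store" and k2 == "store":
                deps.append((n1, n2, "output")) # write ... write
    return deps

# r3 and r2 hold the same address, r6 a different one.
print(memory_dependences({"r3": 0x100, "r2": 0x100, "r6": 0x200}))
# [(1, 2, 'true')]
```

The register values are only known at run time, which is why the hardware cannot decide this when the instructions are decoded.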
- Stores have to be committed in program order
  -> They change the state of the system, and changing their order could result in different outcomes
- Checking True dependences (write … read) requires the availability of the addresses of all preceding stores.
- Output and Anti dependences are resolved because stores update memory only when they commit, and they always commit in program order, so all preceding loads (anti) and stores (output) have already finished.
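The true-dependence check can be sketched as below. This is an illustration under assumptions, not a hardware description: the function name `check_load` is hypothetical, and `None` stands for a preceding store whose address has not been computed yet.

```python
from typing import Optional, Sequence

def check_load(load_addr: int, older_store_addrs: Sequence[Optional[int]]) -> str:
    """Classify a load against all preceding stores (given in program order).

    None marks a store whose address is not yet known.
    """
    # Walk from the youngest preceding store to the oldest: the closest
    # matching store is the one the load truly depends on.
    for addr in reversed(older_store_addrs):
        if addr is None:
            return "unknown"       # a true dependence cannot be ruled out yet
        if addr == load_addr:
            return "dependent"     # write ... read on the same address
    return "independent"           # all addresses known, none matches

print(check_load(0x100, [0x200, 0x300]))  # independent
print(check_load(0x100, [0x100, None]))   # unknown
```

The second call shows why all preceding store addresses are needed: an older store matches, but a younger store with an unknown address could still overwrite the same location.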
Non-Speculative vs Speculative #
- Non-Speculative
  - Loads have to wait until the addresses of all previous stores have been computed
- Speculative
  - Loads are sent to the cache without checking all older stores for true dependences
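The difference between the two policies comes down to the issue condition for a load. A minimal sketch, assuming a hypothetical helper `may_issue_load` and `None` for a store address that is not yet computed; it only models the address-availability condition, not the later dependence check or the recovery after a wrong speculation:

```python
from typing import Optional, Sequence

def may_issue_load(older_store_addrs: Sequence[Optional[int]], speculative: bool) -> bool:
    """Decide whether a load may be sent to the cache now."""
    if speculative:
        # Speculative: issue without checking all older stores. A wrong guess
        # has to be detected and repaired later.
        return True
    # Non-speculative: every older store address must already be computed.
    return all(addr is not None for addr in older_store_addrs)

pending = [0x40, None]                             # second store address still unknown
print(may_issue_load(pending, speculative=False))  # False
print(may_issue_load(pending, speculative=True))   # True
```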
Ordering #
Total Ordering #
- All memory operations are executed (access the cache) in order.
- Not used in Out-Of-Order Processors
Store Ordering (partial ordering) #
- Stores are executed in order.
- Loads are executed out of order
- True dependences (write … read) have to be respected
- Loads are blocked if there is a true dependence
- Or continue but then it’s speculative
Load Ordering Store Ordering #
- Loads access the cache in order with respect to other loads, but may bypass stores -> except where there is a True dependence (write … read)
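The three ordering schemes can be compared with a small checker that tests whether a given cache-access order is legal. This is a sketch under assumptions: the scheme names `"total"`, `"store"` and `"load_store"` and both function names are made up for the example, and a store is never allowed to pass an older operation because stores update memory only at commit.

```python
from typing import List, Tuple

Op = Tuple[str, int]  # ("L" for load or "S" for store, address)

def must_keep_order(older: Op, younger: Op, scheme: str) -> bool:
    """True if `older` has to access the cache before `younger`."""
    if scheme == "total":
        return True
    if younger[0] == "S":
        # Assumption from the commit rule: a store updates memory only at
        # commit, i.e. after every older memory operation.
        return True
    if older[0] == "S":
        return older[1] == younger[1]   # true dependence (write ... read)
    return scheme == "load_store"       # load-load order only with load ordering

def allowed(program: List[Op], order: List[int], scheme: str) -> bool:
    """Check an access order (indices into `program`) against a scheme."""
    pos = {idx: t for t, idx in enumerate(order)}  # program index -> access time
    return all(
        pos[i] < pos[j]
        for i in range(len(program))
        for j in range(i + 1, len(program))
        if must_keep_order(program[i], program[j], scheme)
    )

program = [("S", 0x10), ("L", 0x20), ("L", 0x10), ("S", 0x30)]
print(allowed(program, [1, 0, 2, 3], "store"))       # True: load bypasses an unrelated store
print(allowed(program, [0, 2, 1, 3], "load_store"))  # False: the two loads are swapped
print(allowed(program, [2, 0, 1, 3], "store"))       # False: true dependence violated
```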
Example: AMD K6 #

Figure 1: AMD K6: Non speculative Load Ordering Store Ordering
- Load Queue (after renaming) -> in order
  - Loads proceed if their operands are available and they are at the head of the queue
- Store Queue (after renaming) -> in order
  - Stores wait for their address operands (not for the actual data to be stored)
- Address Generation
  - Computes the accessed address
  - Stores wait here for the data to store if it is not yet available -> the store pipeline stalls
- Store Buffer
  - Keeps stores in order until they become the oldest instruction; they can commit once all preceding instructions have committed.
  - Holds the addresses and the data to be stored
- Disambiguation stage
  - Loads check whether the store buffer holds a preceding store to the same address
  - They also check against the store currently in the address generation stage
  - They check the scheduler for preceding stores that have not yet entered the store queue
  -> If any of these checks hits, a True dependence (write … read) is detected and the load pipeline is stalled
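The three checks of the disambiguation stage can be summarised in a few lines. This is a simplified sketch, not the actual K6 logic: the function name and parameters are hypothetical, and full addresses are compared as plain integers.

```python
from typing import List, Optional

def load_must_stall(load_addr: int,
                    store_buffer_addrs: List[int],
                    agen_store_addr: Optional[int],
                    older_stores_in_scheduler: int) -> bool:
    """Return True if the load pipeline has to stall (non-speculative scheme).

    store_buffer_addrs: addresses of preceding stores held in the store buffer
    agen_store_addr: address of the store in address generation, if any
    older_stores_in_scheduler: preceding stores not yet in the store queue
    """
    if load_addr in store_buffer_addrs:
        return True     # preceding store to the same address
    if agen_store_addr is not None and agen_store_addr == load_addr:
        return True     # the store being processed right now matches
    if older_stores_in_scheduler > 0:
        return True     # address still unknown, so the load must wait
    return False

print(load_must_stall(0x80, [0x40, 0x80], None, 0))  # True
print(load_must_stall(0x80, [0x40], 0x44, 0))        # False
```

Because the load waits whenever an older store address is unknown, this scheme never needs to recover from a wrong guess.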
Example: Alpha 21264 #

Figure 2: Speculative Memory Disambiguation (Weak memory system)
- Unified Load / Store Queue -> Holds memory operations until their operands are available
- For loads: Holds the loads in program order together with the accessed memory address
  - Allocated in the Renaming / Allocation stage and freed on commit/retire
- For stores: Same as for loads, but also holds the data to be stored
- Wait table:
  - Table with 1024 single-bit entries, indexed by the address of the load operation
  - A set bit indicates that a load mapping to this entry was executed speculatively and violated a true dependence
  - Loads that map to a set bit are executed in order
  - The wait table is reset periodically, as it otherwise just fills up with 1s over time
- Issuing:
  - Loads and stores wait for their operands (stores also wait for data)
  - Loads that have the wait bit set wait until all previous stores have been issued
- Disambiguation:
  - Store-load memory violation: Stores entering the store queue check for subsequent loads that match their address. Since those loads have already executed, they violated a True dependence (write … read).
    - The load is marked in the wait table
    - The pipeline resumes from the load
  - Load-load memory violation: Loads entering the load queue check for subsequent loads to the same address. If any are found, a load-load violation is generated.
    - The processor resumes execution from the subsequent load that triggered the violation (the other load probably has to wait)
Cache access:
- Loads access the cache after disambiguation
- Stores access chache only in commit
- Store forwarding:
  - If a subsequent load to the same address is issued after a preceding store, the stored data is forwarded to the load
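How the wait table and the store-load violation check work together can be sketched as follows. This is a toy model under assumptions, not the 21264 implementation: the class and function names are hypothetical, the index is taken from the load instruction's address with a made-up hash (`>> 2`, then modulo 1024), and only the oldest violating load is marked.

```python
from dataclasses import dataclass, field
from typing import List, Optional

WAIT_TABLE_SIZE = 1024  # single-bit entries

@dataclass
class WaitTable:
    bits: List[bool] = field(default_factory=lambda: [False] * WAIT_TABLE_SIZE)

    def _index(self, load_pc: int) -> int:
        return (load_pc >> 2) % WAIT_TABLE_SIZE   # assumed hash, for illustration

    def must_wait(self, load_pc: int) -> bool:
        return self.bits[self._index(load_pc)]

    def mark(self, load_pc: int) -> None:
        self.bits[self._index(load_pc)] = True

    def reset(self) -> None:
        self.bits = [False] * WAIT_TABLE_SIZE     # periodic clear

@dataclass
class ExecutedLoad:
    seq: int    # position in program order
    pc: int     # address of the load instruction
    addr: int   # accessed memory address

def on_store_address(store_seq: int, store_addr: int,
                     executed_loads: List[ExecutedLoad],
                     wait_table: WaitTable) -> Optional[int]:
    """Check a store against younger loads that have already executed.

    Returns the sequence number to resume from, or None if nothing was violated.
    """
    violators = [ld for ld in executed_loads
                 if ld.seq > store_seq and ld.addr == store_addr]
    if not violators:
        return None
    oldest = min(violators, key=lambda ld: ld.seq)
    wait_table.mark(oldest.pc)   # next time this load waits for older stores
    return oldest.seq            # pipeline resumes from the load

wt = WaitTable()
loads = [ExecutedLoad(seq=5, pc=0x400, addr=0x80),
         ExecutedLoad(seq=2, pc=0x3F0, addr=0x80)]
print(on_store_address(3, 0x80, loads, wt))  # 5
print(wt.must_wait(0x400))                   # True
```

The load with sequence number 2 is older than the store and is therefore not a violation, even though it accessed the same address.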