Prefetching

A Cache Optimization.

Hardware prefetching #

  • predict which addresses will be needed next and load them into the cache in advance

  • Prefetch-On-Miss: fetch the missed block and also prefetch the next one

  • Tagged prefetch scheme: each cache line carries a tag bit; tag = 1 means "prefetched and not yet referenced"

    • on miss -> load the missed memory block and set its tag to 0
      • also prefetch the next block and set its tag to 1
    • on hit with tag 0: the line was demand-fetched or already referenced, do nothing
    • on hit with tag 1: first reference to a prefetched line -> set its tag to 0, prefetch the next memory block, and set that block's tag to 1
  • Stream buffers

    • Hardware tries to detect streaming memory accesses and prefetches lines into a separate buffer for each detected stream. On a cache miss, the stream buffers are checked; if one hits, the line is transferred into the cache

    • Stream buffers can also detect strided access patterns

    • If the hardware has stream buffers, optimize for them by distributing (fissioning) loops so that a single loop does not touch more streams than there are buffers

Evaluation #

pro:

  • reduces miss rate
  • reduces miss penalty

con:

  • wastes memory bandwidth
  • prefetched blocks might evict useful blocks too early (cache pollution)

Software prefetching #

  • Non-blocking prefetch instructions
  • Must ignore exceptions (non-faulting), e.g. when prefetching past the end of an array
  • Requires non-blocking (lockup-free) caches
  • The compiler can insert them automatically
  • Drawback:
    • Additional instructions -> reduces effective IPC
October 22, 2023