Intel Knights Landing (KNL)
- Many Integrated Core (MIC) architecture
- high thread parallelism
- high data parallelism
- high memory bandwidth
- binary compatible with the x86 ISA
- 38 compute tiles, of which up to 36 are active
- 2 cores per tile -> up to 72 cores
Core architecture
- Reorder buffer with 72 entries
- 72 physical registers
- Architectural register file
- 2 operations wide, i.e. a superscalar processor
- Reservation stations in the integer units, the FP units, and the memory execution unit
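The role of the reorder buffer can be sketched with a toy model. This is only an illustration under simplifying assumptions, not a description of the real pipeline: dependencies are ignored, execution is assumed to start at allocation, and `simulate_rob` and the latencies are made up. Only the 72 entries and the width of 2 come from the notes.

```python
from collections import deque

ROB_ENTRIES = 72   # reorder buffer capacity, from the notes
WIDTH = 2          # instructions allocated/retired per cycle, from the notes


def simulate_rob(latencies):
    """Return the cycle in which each instruction retires.

    latencies[i] is a made-up execution latency for instruction i.
    Instructions are allocated in program order, may finish in any
    order, and retire strictly in program order.
    """
    rob = deque()                        # entries: (index, cycle when result is ready)
    retire_cycle = [None] * len(latencies)
    next_to_allocate = 0
    cycle = 0
    while next_to_allocate < len(latencies) or rob:
        # Retire up to WIDTH finished instructions from the head, in order.
        for _ in range(WIDTH):
            if rob and rob[0][1] <= cycle:
                index, _ready = rob.popleft()
                retire_cycle[index] = cycle
            else:
                break
        # Allocate up to WIDTH new instructions while the ROB has room.
        for _ in range(WIDTH):
            if next_to_allocate < len(latencies) and len(rob) < ROB_ENTRIES:
                ready = cycle + latencies[next_to_allocate]
                rob.append((next_to_allocate, ready))
                next_to_allocate += 1
        cycle += 1
    return retire_cycle


# A slow first instruction holds back the retirement of the fast ones behind it.
print(simulate_rob([5, 1, 1, 1]))  # [5, 5, 6, 6]
```

Instructions 1 to 3 finish early, but they can only retire once instruction 0 at the head of the buffer is done.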
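- Integer execution unit (2 units):
  - 12-entry reservation station
  - Fully out-of-order execution
  - Read before issue: source operands are read before the instruction issues, so the reservation station entry holds the data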
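Out-of-order issue from a reservation station can be sketched as follows. This is a toy scheduler under stated assumptions, not the actual KNL selection logic: one instruction issues per cycle, the oldest ready entry wins, register renaming is not modeled, and `issue_order`, the instruction tuples, and the latencies are hypothetical. Only the 12 entries come from the notes.

```python
RS_ENTRIES = 12  # reservation station size, from the notes


def issue_order(program, initially_ready):
    """Return a list of (cycle, name) in the order instructions issue.

    program: list of (name, sources, dest, latency) in program order.
    An entry may issue once all of its source registers are ready.
    """
    ready_at = {reg: 0 for reg in initially_ready}
    station = list(program[:RS_ENTRIES])      # entries currently in the station
    waiting = list(program[RS_ENTRIES:])      # not yet allocated
    limit = sum(latency for _, _, _, latency in program) + len(program)
    order = []
    cycle = 0
    while station:
        if cycle > limit:
            raise ValueError("an operand is never produced")
        for entry in station:                 # scan oldest first
            name, sources, dest, latency = entry
            if all(ready_at.get(src, float("inf")) <= cycle for src in sources):
                station.remove(entry)
                order.append((cycle, name))
                ready_at[dest] = cycle + latency
                if waiting:
                    station.append(waiting.pop(0))
                break                         # at most one issue per cycle
        cycle += 1
    return order


program = [
    ("mul", ("r1",), "r2", 3),   # slow producer
    ("add", ("r2",), "r3", 1),   # depends on mul
    ("sub", ("r4",), "r5", 1),   # independent
]
print(issue_order(program, {"r1", "r4"}))
# [(0, 'mul'), (1, 'sub'), (3, 'add')]
```

The independent `sub` overtakes the older `add`, which has to wait for the result of `mul`.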
- Memory execution unit:
  - 12-entry reservation station
  - Memory operations are issued in order but can execute out of order
  - Stores are kept in store buffers, so later loads can find their data there
  - Stores are committed in program order
  - Implements scatter/gather operations
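The store buffer behaviour and the meaning of gather/scatter can be illustrated with a small software model. This is a sketch of the general idea, not of the KNL hardware: `StoreBuffer`, `gather`, and `scatter` are hypothetical names, memory is a plain dictionary, and unwritten addresses are assumed to read as 0.

```python
class StoreBuffer:
    """Toy store buffer: loads see pending stores, memory is updated in order."""

    def __init__(self, memory):
        self.memory = memory      # dict: address -> value
        self.pending = []         # uncommitted stores in program order

    def store(self, address, value):
        self.pending.append((address, value))

    def load(self, address):
        # The youngest pending store to this address supplies the data.
        for pending_address, value in reversed(self.pending):
            if pending_address == address:
                return value
        return self.memory.get(address, 0)

    def commit_oldest(self):
        # Stores reach memory strictly in program order.
        if self.pending:
            address, value = self.pending.pop(0)
            self.memory[address] = value


def gather(memory, base, indices):
    """Load one element per index into a vector (here: a list)."""
    return [memory[base + i] for i in indices]


def scatter(memory, base, indices, values):
    """Store each vector element to its own indexed address."""
    for i, value in zip(indices, values):
        memory[base + i] = value


memory = {0x10: 1}
buffer = StoreBuffer(memory)
buffer.store(0x10, 2)
print(buffer.load(0x10), memory[0x10])  # 2 1
```

The load already sees the value 2 although memory still holds 1 until `commit_oldest()` is called.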
- Floating point units (2 VPUs):
  - AVX-512
  - 8 double-precision elements per 512-bit vector operation
  - Throughput: 16 DP FLOPs per cycle per VPU (8 lanes x 2 FLOPs per fused multiply-add) -> 32 DP FLOPs per cycle per core from both VPUs
  - 20-entry reservation station
    - Entries hold no source data -> read after issue
    - Vector operands are large, so copying them into the reservation station would cost bandwidth and space
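The throughput figures of the VPUs follow from simple arithmetic, sketched below. The counts (512-bit vectors, 2 VPUs, 72 cores) are taken from the notes; counting a fused multiply-add as 2 FLOPs is the usual convention, and the clock frequency is a free parameter here, since the notes do not give one. The value 1.5 GHz in the example is only a placeholder.

```python
VECTOR_BITS = 512     # AVX-512 register width
DOUBLE_BITS = 64      # size of one double-precision value
FLOPS_PER_FMA = 2     # one multiply + one add per lane (assumed convention)
VPUS_PER_CORE = 2     # from the notes
CORES = 72            # from the notes (36 tiles x 2 cores)


def dp_flops_per_cycle():
    """Return (lanes, FLOPs per VPU, FLOPs per core) for one cycle."""
    lanes = VECTOR_BITS // DOUBLE_BITS      # 8 doubles per vector
    per_vpu = lanes * FLOPS_PER_FMA         # 16 DP FLOPs per cycle
    per_core = per_vpu * VPUS_PER_CORE      # 32 DP FLOPs per cycle
    return lanes, per_vpu, per_core


def peak_dp_gflops(clock_ghz, cores=CORES):
    """Theoretical peak in GFLOP/s for a given (assumed) clock frequency."""
    _, _, per_core = dp_flops_per_cycle()
    return per_core * cores * clock_ghz


print(dp_flops_per_cycle())   # (8, 16, 32)
print(peak_dp_gflops(1.5))    # 3456.0, with a placeholder clock of 1.5 GHz
```

This is a theoretical upper bound; it assumes every VPU issues one FMA on full vectors in every cycle.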
Cache architecture
- Core-private L1 data cache
- L1 instruction cache
- L2 cache shared between both cores of a tile