Trace cache

In computer architecture, a trace cache is a mechanism for increasing instruction fetch bandwidth by storing traces of instructions that have already been fetched, and possibly executed. The mechanism was first proposed by Eric Rotenberg, Steve Bennett, and Jim Smith in their 1996 paper "Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching."

Trace caches are essentially caches that store instructions either after they have been decoded or as they are retired. This allows the instruction fetch unit of a processor to fetch several basic blocks at once, without having to worry about branches in the execution flow. Trace lines are stored in the trace cache based on the program counter of the first instruction in the trace and a set of branch predictions. This allows different trace paths that start at the same address to be stored. In the instruction fetch stage of a pipeline, the current program counter, along with a set of branch predictions, is checked in the trace cache for a hit. If there is a hit, a trace line is supplied to the fetch unit, which does not have to go to a regular cache or to memory for these instructions. The trace cache continues to feed the fetch unit until the trace line ends or until there is a misprediction in the pipeline. If there is a miss, a new trace starts to be built. Because traces also contain different branch paths, a good multiple-branch predictor is essential to the success rate of trace caches.
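
The lookup and fill behaviour described above can be illustrated with a minimal software sketch. It is not taken from the Rotenberg et al. paper or from any real processor; the names TraceCache, lookup, and fill are hypothetical, instructions are represented as plain strings, and replacement policy and trace-length limits are ignored. Each trace is stored under its starting program counter together with the branch outcomes it embeds, so several paths beginning at the same address can coexist.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# One stored trace: the branch outcomes it embeds, plus its instructions.
TraceLine = Tuple[Tuple[bool, ...], List[str]]


class TraceCache:
    """Toy model (hypothetical): traces indexed by starting PC and branch outcomes."""

    def __init__(self) -> None:
        self.lines: Dict[int, List[TraceLine]] = {}

    def lookup(self, pc: int, predictions: Sequence[bool]) -> Optional[List[str]]:
        """Return a trace starting at pc whose embedded branch outcomes
        match the leading predictions, or None on a miss."""
        for flags, instructions in self.lines.get(pc, []):
            if tuple(predictions[:len(flags)]) == flags:
                return instructions
        return None

    def fill(self, pc: int, outcomes: Sequence[bool], instructions: Sequence[str]) -> None:
        """Store a newly built trace, replacing any trace with the same path."""
        flags = tuple(outcomes)
        kept = [line for line in self.lines.get(pc, []) if line[0] != flags]
        kept.append((flags, list(instructions)))
        self.lines[pc] = kept


cache = TraceCache()
# Two different paths that start at the same address 0x100.
cache.fill(0x100, [True, False], ["cmp", "jne L1", "add", "jz L2", "sub"])
cache.fill(0x100, [False], ["cmp", "jne L1", "mov"])
hit = cache.lookup(0x100, [True, False])   # matches the first path
miss = cache.lookup(0x100, [True, True])   # no stored path matches
```

In this sketch a miss simply returns None; in the scheme described above, the fetch unit would fall back to the regular instruction cache while a new trace is built and later passed to fill.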

Trace caches are also used in processors such as the Intel Pentium 4 to store already-decoded micro-operations, or translations of complex x86 instructions, so that the next time an instruction is needed, it does not have to be decoded again.
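
The benefit of caching decoded micro-operations can be sketched as simple memoization of the decode step. This is a loose illustration only, not a model of the Pentium 4: the class DecodedOpCache and the function toy_decode are hypothetical, the "micro-operations" are made-up strings, and the cache is keyed by a single instruction address rather than by whole traces.

```python
from typing import Callable, Dict, List


class DecodedOpCache:
    """Toy model (hypothetical): remembers decoded micro-operations by address."""

    def __init__(self, decode: Callable[[str], List[str]]) -> None:
        self._decode = decode
        self._cache: Dict[int, List[str]] = {}
        self.decode_count = 0  # how many times the decoder actually ran

    def fetch(self, address: int, instruction: str) -> List[str]:
        """Return micro-operations for the instruction, decoding only on a miss."""
        if address not in self._cache:
            self.decode_count += 1
            self._cache[address] = self._decode(instruction)
        return self._cache[address]


def toy_decode(instruction: str) -> List[str]:
    """Stand-in decoder: invents two placeholder micro-operations per instruction."""
    return [f"{instruction} / uop{i}" for i in range(2)]


uop_cache = DecodedOpCache(toy_decode)
first = uop_cache.fetch(0x40, "add [mem], eax")
second = uop_cache.fetch(0x40, "add [mem], eax")  # served without decoding again
```

After the two fetches, decode_count is 1, which is the saving the paragraph above describes: the second request for the same instruction bypasses the decoder.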

External links

Full text of Rotenberg et al.'s paper at CiteSeer