Channel: Intel Developer Zone Articles

Cache Miss Rates in Intel® VTune™ Amplifier XE


Intel® VTune™ Amplifier XE can use the Performance Monitoring Units (PMUs) on Intel CPUs to count hardware events and use those events to locate performance issues. The most common way to do this is through the General Exploration analysis type. One set of metrics within General Exploration relates to the memory subsystem and can be found in the Back-End Bound > Memory Bound section of the hierarchy. A common question we receive about memory metrics is "Can I calculate cache hit and miss rates?"

The General Exploration metrics deliberately do not include these rates. The Top-Down characterization in General Exploration attempts to find the hardware bottleneck that is causing REAL performance issues, so in VTune Amplifier we have abstracted away the actual cache miss counts and replaced them with the L1/L3 and DRAM Bound metrics. We did this because cache misses may or may not actually affect performance. The complex, pipelined, superscalar Intel processors may be able to schedule instructions in such a way that the time spent waiting on an L1 miss, for example, is not a performance issue, because other instructions were able to execute during the wait. The L1/L3 and DRAM Bound metrics in VTune Amplifier instead count cycles during which the CPU was STALLED waiting for cache misses. This represents a real performance impact.

Having said that, if you’re still interested in counting cache misses, you will need to create a custom VTune Amplifier analysis type to collect the events. The events may have slightly different names depending on your hardware, and not all may be available on all platforms. The events should have names similar to these:

MEM_LOAD_UOPS_RETIRED.L1_HIT

MEM_LOAD_UOPS_RETIRED.L2_HIT

MEM_LOAD_UOPS_RETIRED.LLC_HIT / MEM_LOAD_UOPS_RETIRED.L3_HIT

 

MEM_LOAD_UOPS_RETIRED.L1_MISS

MEM_LOAD_UOPS_RETIRED.L2_MISS

MEM_LOAD_UOPS_RETIRED.LLC_MISS / MEM_LOAD_UOPS_RETIRED.L3_MISS

 

Read the description of each event to determine what it counts and how you would like to use it. Be aware that cache misses and miss rates may provide a characteristic profile of your application; however, they do not always correlate with performance issues.
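Once the events have been collected, hit and miss counts can be turned into rates. The sketch below is one way to do that in Python. It is an illustration, not a VTune Amplifier feature. It assumes that, at each cache level, the HIT and MISS events together account for every retired load that reached that level, so the miss rate is misses / (hits + misses). Check the event descriptions for your platform before relying on that assumption. The function names and the counts are hypothetical placeholders, not measured data.

```python
from typing import Dict, Iterable, Optional

# Hypothetical event counts, NOT measured data. Replace them with the
# counts reported by your own custom analysis; event names vary by platform.
EXAMPLE_COUNTS: Dict[str, int] = {
    "MEM_LOAD_UOPS_RETIRED.L1_HIT": 9_000_000,
    "MEM_LOAD_UOPS_RETIRED.L1_MISS": 1_000_000,
    "MEM_LOAD_UOPS_RETIRED.L2_HIT": 800_000,
    "MEM_LOAD_UOPS_RETIRED.L2_MISS": 200_000,
    "MEM_LOAD_UOPS_RETIRED.LLC_HIT": 150_000,
    "MEM_LOAD_UOPS_RETIRED.LLC_MISS": 50_000,
}


def miss_rate(hits: int, misses: int) -> Optional[float]:
    """Return misses / (hits + misses), or None if no loads were counted."""
    total = hits + misses
    if total == 0:
        return None
    return misses / total


def cache_miss_rates(
    counts: Dict[str, int],
    prefix: str = "MEM_LOAD_UOPS_RETIRED",
    levels: Iterable[str] = ("L1", "L2", "LLC", "L3"),
) -> Dict[str, Optional[float]]:
    """Compute a per-level miss rate for every level that has both events.

    Levels whose HIT or MISS event is absent from `counts` are skipped,
    since not every event exists on every platform.
    """
    rates: Dict[str, Optional[float]] = {}
    for level in levels:
        hits = counts.get(f"{prefix}.{level}_HIT")
        misses = counts.get(f"{prefix}.{level}_MISS")
        if hits is None or misses is None:
            continue
        rates[level] = miss_rate(hits, misses)
    return rates


if __name__ == "__main__":
    for level, rate in cache_miss_rates(EXAMPLE_COUNTS).items():
        print(level, "n/a" if rate is None else f"{rate:.1%}")
```

These are local miss rates: each is relative to the loads that reached that cache level, not to all loads issued by the application. As noted above, a high rate here does not by itself indicate a performance problem.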

 

