Intel Advisor will soon offer a major step forward in memory performance optimization: a new visual “Roofline” analysis of performance bounds and bottlenecks.

This new feature provides insights beyond vectorization, such as memory usage and the quality of algorithm implementation.

If you want to try it out or influence the development of this new Roofline feature, sign up for the early access program by sending a request to vector_advisor@intel.com.

Accelerate your application: Tuning existing vectorization and adding new vectorization is easy with the visually intuitive Vectorization Advisor tool in Intel® Advisor. Try out the new vectorization capabilities available in the Intel® Parallel Studio Beta Update, such as expanded memory access patterns analysis, FLOPS information, and special features for the second generation of the Intel® Xeon Phi™ processor (code name Knights Landing), which uses the AVX-512 instruction set. Register for the Intel® Parallel Studio XE 2017 beta program at https://software.intel.com/en-us/articles/intel-parallel-studio-xe-2017-beta#howto, or download an evaluation copy from http://software.intel.com/en-us/articles/intel-software-evaluation-center/.

Roofline Modeling

Roofline modeling was first proposed by Berkeley researchers Samuel Williams, Andrew Waterman, and David Patterson in the 2009 paper "Roofline: An Insightful Visual Performance Model for Multicore Architectures".

A Roofline model provides insight into how your application works by helping you answer the following questions:

Does my application work optimally on the current hardware?
What limits performance? Is my application workload memory or compute bound?
What is the right strategy to improve application performance?

The model plots data to help you visualize application compute- and memory-bandwidth ceilings by measuring two parameters:

Operational intensity – the number of floating-point operations per byte transferred from memory
Floating-point performance – in GFLOPS (billions of floating-point operations per second)
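To make the two parameters concrete, here is a minimal Python sketch of the basic roofline bound from the original Roofline paper: attainable performance is the lower of the compute peak and the memory bandwidth multiplied by operational intensity. The function names, the example kernel, and the machine numbers (100 GFLOPS, 50 GB/s) are illustrative assumptions. They are not measurements and not part of Intel Advisor.

```python
def operational_intensity(flops: float, bytes_moved: float) -> float:
    """Floating-point operations per byte transferred from memory."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    return flops / bytes_moved


def attainable_gflops(intensity: float, peak_gflops: float, peak_bw_gbs: float) -> float:
    """Roofline ceiling: the lower of the compute peak and bandwidth * intensity."""
    return min(peak_gflops, peak_bw_gbs * intensity)


# Hypothetical kernel: a[i] = b[i] + s * c[i] on 8-byte doubles.
# Per iteration: 2 flops (one multiply, one add) and 24 bytes
# (load b, load c, store a); write-allocate traffic is ignored here.
N = 1_000_000
triad_oi = operational_intensity(flops=2 * N, bytes_moved=24 * N)

# Hypothetical machine: 100 GFLOPS compute peak, 50 GB/s memory bandwidth.
triad_ceiling = attainable_gflops(triad_oi, peak_gflops=100.0, peak_bw_gbs=50.0)

print(f"operational intensity: {triad_oi:.4f} flops/byte")
print(f"attainable performance: {triad_ceiling:.2f} GFLOPS")
```

With these assumed numbers the kernel has an intensity of about 0.083 flops/byte, so its ceiling is set by memory bandwidth (about 4.2 GFLOPS) rather than by the 100 GFLOPS compute peak.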

The proximity of the data points to the model lines (rooflines) shows the degree of optimization.

Consider the roofline plot in Fig. 1.

Kernels on the right-hand side of the plot are more compute bound, and as they move up the Y-axis they get closer to the floating-point (FP) peak. The performance of these kernels is bounded by the compute capabilities of the platform. To improve the performance of kernel 3, consider migrating it to a highly parallel platform, such as the Intel Xeon Phi processor, where the compute ceiling and memory throughput are higher. For kernel 2, vectorization can be considered as a performance improvement strategy, because it is far from the ceiling.

Kernels towards the left-hand side of the plot are memory bound, and as they move up the Y-axis they become more tightly bound by the DRAM and cache peak bandwidth of the platform. To increase the performance of these kernels (shifting their plot position to the right, where the performance ceiling is higher), consider improving the algorithm or its implementation to perform more computations per data item. These kernels may also run faster on an Intel Xeon Phi processor because of its greater memory bandwidth.
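The left/right reasoning above can be sketched in a few lines of Python. The ridge point (compute peak divided by memory bandwidth) separates memory-bound from compute-bound kernels, and the ratio of measured performance to the ceiling expresses how close a kernel sits to its roofline. The kernel names and all numbers below are hypothetical; they are not the kernels plotted in Fig. 1.

```python
def classify_kernel(intensity, measured_gflops, peak_gflops, peak_bw_gbs):
    """Return (bound, ceiling, fraction_of_ceiling) for one kernel."""
    ridge = peak_gflops / peak_bw_gbs              # intensity where the two roofs meet
    ceiling = min(peak_gflops, peak_bw_gbs * intensity)
    bound = "memory" if intensity < ridge else "compute"
    return bound, ceiling, measured_gflops / ceiling


# Hypothetical machine: 100 GFLOPS peak, 50 GB/s bandwidth (ridge at 2 flops/byte).
PEAK_GFLOPS, PEAK_BW_GBS = 100.0, 50.0

# Hypothetical kernels: (name, flops per byte, measured GFLOPS).
kernels = [
    ("stream-like", 0.1, 4.0),
    ("dense-math", 8.0, 30.0),
]

for name, intensity, measured in kernels:
    bound, ceiling, fraction = classify_kernel(intensity, measured, PEAK_GFLOPS, PEAK_BW_GBS)
    print(f"{name}: {bound} bound, ceiling {ceiling:.1f} GFLOPS, at {fraction:.0%} of ceiling")

# Doing more computation per byte moves a memory-bound kernel to the right,
# where the ceiling is higher.
_, ceiling_before, _ = classify_kernel(0.1, 4.0, PEAK_GFLOPS, PEAK_BW_GBS)
_, ceiling_after, _ = classify_kernel(0.4, 4.0, PEAK_GFLOPS, PEAK_BW_GBS)
print(f"ceiling rises from {ceiling_before:.1f} to {ceiling_after:.1f} GFLOPS")
```

In this sketch the memory-bound kernel is already close to its ceiling, so the remaining gain comes from raising its intensity, while the compute-bound kernel sits far below its ceiling and has room to improve in place.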

Fig. 1. Roofline model

Intel Advisor Roofline Feature

Intel Advisor implements the "Cache-aware roofline" model described in the paper "Cache-aware Roofline model: Upgrading the loft" by Aleksandar Ilic, Frederico Pratas, and Leonel Sousa. It provides additional insight by addressing all levels of the memory/cache hierarchy:

Sloped rooflines illustrate the performance levels achievable if all the data fits into the respective cache.
Horizontal lines show the peak achievable performance levels if vectorization and other CPU resources are used effectively.
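As a rough illustration of how several roofs can be read together, the Python sketch below evaluates one sloped roof per memory level and one horizontal roof per compute peak at a single intensity value, then lists the roofs a loop has not yet reached. All bandwidths, peaks, roof names, and the example loop are hypothetical placeholders. This is not Intel Advisor's API or output, only a sketch of the idea.

```python
# Hypothetical platform numbers, for illustration only.
BANDWIDTH_GBS = {"L1": 800.0, "L2": 300.0, "L3": 150.0, "DRAM": 40.0}
COMPUTE_GFLOPS = {"scalar add": 10.0, "vector FMA": 80.0}


def roofs_at(intensity):
    """Height (GFLOPS) of every roof at the given intensity (flops/byte)."""
    top = max(COMPUTE_GFLOPS.values())
    # Sloped roofs: bandwidth * intensity, capped by the highest compute peak.
    roofs = {f"{level} bandwidth": min(top, bw * intensity)
             for level, bw in BANDWIDTH_GBS.items()}
    # Horizontal roofs: compute peaks do not depend on intensity.
    roofs.update({f"{name} peak": peak for name, peak in COMPUTE_GFLOPS.items()})
    return roofs


def roofs_above(intensity, measured_gflops):
    """Names of the roofs the loop has not reached yet, lowest first."""
    above = [(height, name) for name, height in roofs_at(intensity).items()
             if height > measured_gflops]
    return [name for _, name in sorted(above)]


# Hypothetical loop: 0.0625 flops/byte, measured at 5 GFLOPS.
remaining = roofs_above(intensity=0.0625, measured_gflops=5.0)
print(remaining)
```

For the assumed loop, the DRAM roof (2.5 GFLOPS at this intensity) is already exceeded, which suggests its data is mostly served from cache. The lowest remaining roof is the L3 bandwidth roof, followed by the scalar compute peak, so cache-use optimization and then vectorization would be the natural candidates to examine.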

Intel Advisor places a dot for every loop in the Roofline plot. Consider the Intel Advisor roofline plot in Fig. 2. Most of the loops require further cache-use optimization. Loops to the right of the plotted blue data point fall below the scalar execution roofline and therefore require vectorization.

Fig. 2. Advisor Roofline

You can examine your application's performance opportunities by applying our experimental Roofline Vector Advisor and browsing through the most heavily loaded loops. The size of each circle denotes the loop's execution time.

