Using Intel MKL and Intel TBB in the same application

Intel MKL 11.3 Beta has introduced Intel TBB support.

Intel MKL 11.3 can increase performance of applications threaded using Intel TBB. Applications using Intel TBB can benefit from the following Intel MKL functions:

BLAS: dot, gemm, gemv
LAPACK: getrf, getrs, syev, gels, gelsy, gesv, pstrf, potrs
Sparse BLAS: csrmm, bsrmm
Intel MKL Poisson Solver
Intel MKL PARDISO

If such applications call functions not listed above, Intel MKL 11.3 executes sequential code. Depending on feedback from customers, future versions of Intel MKL may support Intel TBB in more functions.

Linking applications to Intel TBB and Intel MKL

The simplest way to link applications to Intel TBB and Intel MKL is to use the Intel C/C++ Compiler. Intel MKL supports both static and dynamic linking, but Intel TBB is available only as a dynamic library.

Under Linux, use the following commands to compile your application app.c and link it to Intel TBB and Intel MKL.

Dynamic Intel TBB, dynamic Intel MKL:
    icc app.c -mkl -tbb

Dynamic Intel TBB, static Intel MKL:
    icc app.c -static -mkl -tbb

Under Windows, use the following commands to compile your application app.c and link it to dynamic Intel TBB and Intel MKL.

Dynamic Intel TBB, dynamic Intel MKL:
    icl.exe app.c -mkl -tbb

Improving Intel MKL performance with Intel TBB

Intel MKL performance can be improved by having Intel TBB preserve the affinity of threads to processor cores. Use the tbb::affinity_partitioner class for this purpose.

To improve performance of Intel MKL for small input data, you may limit the number of threads allocated by Intel TBB for Intel MKL. Use the tbb::task_scheduler_init class to do so.

For more information on controlling behavior of Intel TBB, see the Intel TBB documentation at https://www.threadingbuildingblocks.org/documentation.

LAPACK performance in applications using Intel TBB and Intel MKL 11.3

* Each call is a single run at a single problem size; sizes range from 1000 to 10000 in steps of 1000. Performance (GFlops) is computed as the cumulative number of floating-point operations for all 10 calls divided by the wall-clock time from the start of the first call to the end of the last call.
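
The cumulative metric described in the note can be sketched in C++ as follows. The function names are hypothetical. The operation count of about (2/3)n^3 for an LU factorization (getrf) of an n-by-n matrix is an assumption used only for illustration, since the article does not state which counts were used.

```cpp
#include <cassert>
#include <vector>

// Cumulative GFlops for a batch of calls: the total number of floating-point
// operations divided by the wall-clock seconds for the whole batch, over 1e9.
double cumulative_gflops(const std::vector<double>& flops_per_call,
                         double wall_seconds) {
    assert(wall_seconds > 0.0);
    double total = 0.0;
    for (double flops : flops_per_call) {
        total += flops;
    }
    return total / wall_seconds / 1e9;
}

// Assumed operation count for an LU factorization of an n-by-n matrix.
double lu_flops(double n) {
    return 2.0 / 3.0 * n * n * n;
}

// Operation counts for the sizes 1000, 2000, ..., 10000 (10 calls).
std::vector<double> benchmark_flop_counts() {
    std::vector<double> counts;
    for (int n = 1000; n <= 10000; n += 1000) {
        counts.push_back(lu_flops(n));
    }
    return counts;
}
```

Under these assumptions the ten calls add up to roughly 2.017e12 operations, so a hypothetical total wall-clock time of 100 seconds would correspond to about 20.2 GFlops. Because only the total time enters the formula, the largest sizes dominate the result.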
