Overview
R is a popular programming language for statistical computing and machine learning. An earlier article, Using Intel® Math Kernel Library (Intel MKL) with R, showed how to integrate the Intel MKL BLAS and LAPACK libraries into R to improve R's math computing performance. However, many R developers still have trouble linking Intel MKL to R. This article describes a simpler way to link the Intel MKL BLAS and LAPACK libraries to an R environment.
Reference: http://cran.r-project.org/doc/manuals/r-release/R-admin.html#Shared-BLAS
Prerequisites:
- Intel® MKL
It contains highly optimized BLAS and LAPACK routines, as well as statistical functionality directly applicable to R. More information on Intel MKL can be found here: Intel® Math Kernel Library
- The R source package, available from http://www.r-project.org/
This article is based on Intel MKL 11.2.0 (or later) from Intel Parallel Studio XE 2015 Composer Edition for Linux* and on R-3.1.2.tar.gz.
System Platform: Red Hat Enterprise Linux Server release 6.3 on Intel® Xeon® CPU E5-2680 @ 2.70GHz, 8 Cores, AVX support.
Linking Intel MKL to R
R itself and many of its add-on packages use the BLAS library. R offers the option of compiling the BLAS into a dynamic library, libRblas, stored in R_HOME/lib, and linking both R itself and all add-on packages against that library. This is the default on all platforms except IBM AIX*. Because all references to the BLAS go through libRblas, and that library can be replaced, most developers can change the BLAS without reinstalling R or any add-on packages. The R project documentation shows a simple way to do this: symlink a dynamic BLAS library (such as ACML or Goto's) to R_HOME/lib/libRblas.so. See http://cran.r-project.org/doc/manuals/r-release/R-admin.html#Shared-BLAS
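Since the approach relies on R reaching the BLAS through libRblas, it can help to confirm that a given R binary actually references that library. The following is a minimal sketch, assuming the output of ldd is piped in on standard input; the helper name links_shared_blas is hypothetical and not part of R or Intel MKL.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldd` output on stdin and succeeds (exit 0)
# if the binary references the shared BLAS library libRblas.so.
# A "libRblas.so => not found" line still counts as a reference.
links_shared_blas() {
    grep -q 'libRblas\.so'
}

# Example use (path assumes an R build tree such as R-3.1.2):
#   ldd bin/exec/R | links_shared_blas && echo "R goes through libRblas.so"
```

If the check fails, R was probably built with the BLAS compiled in statically or against an external BLAS, and replacing libRblas.so would have no effect.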
In this article, we use the same approach to link the Intel MKL BLAS library to R. First, follow the instructions below to build R with its default BLAS and LAPACK using the GNU compiler toolchain.
$ tar -xzvf R-3.1.2.tar.gz
$ cd R-3.1.2
$ ./configure
(or $ ./configure --with-readline=no --with-x=no if the readline and X11 packages are not installed)
$ make
(not $ make install, so that we do not pollute the system directories)
$ ldd bin/exec/R
(This confirms that R links against libRblas.so, although the output may show libRblas.so => not found)
$ cd lib
$ mv libRblas.so libRblas.so.keep
$ ln -s /opt/intel/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_rt.so libRblas.so
You can replace the LAPACK library libRlapack.so in the same way:
$ mv libRlapack.so libRlapack.so.keep
$ ln -s /opt/intel/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_rt.so libRlapack.so
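The backup-and-symlink steps above could be wrapped in a small shell function so that the same logic serves both libRblas.so and libRlapack.so. This is only a sketch under stated assumptions: the function name use_blas_library and its argument order are hypothetical, and the .keep suffix simply follows the naming used in this article.

```shell
#!/bin/sh
# Hypothetical helper, not part of R or Intel MKL.
# use_blas_library LIB_DIR NAME TARGET
#   LIB_DIR: directory holding R's shared libraries (R_HOME/lib)
#   NAME:    library file to replace, e.g. libRblas.so
#   TARGET:  replacement library, e.g. the full path to libmkl_rt.so
use_blas_library() {
    lib_dir=$1
    name=$2
    target=$3

    # Refuse to overwrite an earlier backup of the original library.
    if [ -e "$lib_dir/$name.keep" ]; then
        echo "backup $lib_dir/$name.keep already exists" >&2
        return 1
    fi

    # Refuse to create a dangling symlink.
    if [ ! -e "$target" ]; then
        echo "replacement library $target not found" >&2
        return 1
    fi

    # Keep the original, then point NAME at the replacement.
    mv "$lib_dir/$name" "$lib_dir/$name.keep" &&
        ln -s "$target" "$lib_dir/$name"
}
```

With the paths used in this article, a call might look like use_blas_library lib libRblas.so /opt/intel/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_rt.so, run from the R-3.1.2 directory.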
Performance Results
To give some indication of the performance improvement this replacement can provide, I ran R-benchmark-25.R, found on the R benchmarks site, on the system described above.
$ cd ..
Set the Intel MKL environment by sourcing mklvars.sh for 64-bit platforms:
$ source /opt/intel/composer_xe_2015.0.090/mkl/bin/mklvars.sh intel64
R uses the GNU OpenMP threading library, libgomp.so, whereas Intel MKL uses the Intel OpenMP threading library by default. Starting with Intel MKL 11.1.3, the GNU threading layer can be selected by setting environment variables, as explained in the MKL reference manual (https://software.intel.com/en-us/node/528522/).
Set the MKL interface layer to GNU,LP64 and the threading layer to GNU:
$ export MKL_INTERFACE_LAYER=GNU,LP64
$ export MKL_THREADING_LAYER=GNU
$ ./bin/Rscript ../R-benchmark-25.R
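As an alternative to exporting the two variables for the whole shell session, they could be set for a single command only. The sketch below assumes mklvars.sh has already been sourced; the wrapper name run_with_mkl_gnu is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: run one command with the MKL interface layer set to
# GNU,LP64 and the threading layer set to GNU. The variables are passed to
# that command only and are not left behind in the calling shell.
run_with_mkl_gnu() {
    env MKL_INTERFACE_LAYER=GNU,LP64 MKL_THREADING_LAYER=GNU "$@"
}

# Example use, from the R-3.1.2 directory:
#   run_with_mkl_gnu ./bin/Rscript ../R-benchmark-25.R
```

Scoping the variables this way keeps other programs started from the same shell on their default MKL layers.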
With the Intel MKL BLAS, I got:
R Benchmark 2.5
…
I. Matrix calculation
2800x2800 cross-product matrix (b = a' * a)_________ (sec): 0.109999999999999
…
Total time for all 15 tests_________________________ (sec): 8.89966666666666
Overall mean (sum of I, II and III trimmed means/3)_ (sec): 0.494403941035161
With the default build, or after switching back to the default BLAS and LAPACK libraries (run these commands in the lib directory):
$ mv libRblas.so libRblas.so.mkl
$ mv libRlapack.so libRlapack.so.mkl
$ mv libRblas.so.keep libRblas.so
$ mv libRlapack.so.keep libRlapack.so
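The four mv commands above apply the same two-step pattern to each library, which could be captured in a function. This is a sketch only; the name restore_default_library is hypothetical, and the .keep and .mkl suffixes follow the naming used in this article.

```shell
#!/bin/sh
# Hypothetical helper, not part of R or Intel MKL.
# restore_default_library LIB_DIR NAME
#   Renames the current NAME (the MKL symlink) to NAME.mkl and moves the
#   saved original NAME.keep back into place.
restore_default_library() {
    lib_dir=$1
    name=$2

    # Without a saved original there is nothing to restore.
    if [ ! -e "$lib_dir/$name.keep" ]; then
        echo "no backup $lib_dir/$name.keep to restore" >&2
        return 1
    fi

    mv "$lib_dir/$name" "$lib_dir/$name.mkl" &&
        mv "$lib_dir/$name.keep" "$lib_dir/$name"
}

# Example use, from the R-3.1.2 directory:
#   restore_default_library lib libRblas.so
#   restore_default_library lib libRlapack.so
```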
With the default libraries, I get:
R Benchmark 2.5
…
2800x2800 cross-product matrix (b = a' * a)_________ (sec): 14.0946666666667
…
Total time for all 15 tests_________________________ (sec): 42.2893333333333
Overall mean (sum of I, II and III trimmed means/3)_ (sec): 1.42207437362512
As you can see, the total time for all 15 tests in this standard R benchmark drops from 42.29 seconds to 8.90 seconds, a speedup of about 4.75X, and the overall mean improves by about 2.9X. Simply replacing the default BLAS and LAPACK libraries with Intel MKL, using the steps described above, can give a significant performance boost to your R applications.
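The speedup figures can be reproduced from the benchmark output by dividing each default-build time by the corresponding Intel MKL time. A minimal sketch using awk follows; the function name speedup is hypothetical, and the numbers are the ones reported above.

```shell
#!/bin/sh
# speedup BASELINE OPTIMIZED: print BASELINE / OPTIMIZED to two decimals.
speedup() {
    awk -v base="$1" -v opt="$2" 'BEGIN { printf "%.2f\n", base / opt }'
}

speedup 42.2893333333333 8.89966666666666    # total time for all 15 tests
speedup 1.42207437362512 0.494403941035161   # overall mean
speedup 14.0946666666667 0.109999999999999   # 2800x2800 cross-product
```

This prints 4.75, 2.88 and 128.13, which shows that the gain is concentrated in BLAS-heavy tests such as the matrix cross-product.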