Introduction and Description of Product
Intel® Threading Building Blocks (Intel® TBB) is a portable, open-source parallel programming library from the parallelism experts at Intel. A Python module for Intel® TBB is included in the Intel® Distribution for Python and provides an out-of-the-box scheduling replacement that addresses common problems arising from nested parallelism. It coordinates both intra- and inter-process concurrency. This article shows how to launch Python programs with the Python module for Intel® TBB to parallelize math from popular Python packages like NumPy* and SciPy* by way of Intel® Math Kernel Library (Intel® MKL) thread scheduling. Note that Intel® MKL also comes bundled free with the Intel® Distribution for Python.

Intel® TBB is the native threading library for the Intel® Data Analytics Acceleration Library (Intel® DAAL), a high-performance analytics package with a fully functional Python API. Furthermore, if you are working with the full Intel® Distribution for Python package, Intel® TBB is also the native threading layer underneath Numba*, OpenCV*, and select scikit-learn* algorithms (those accelerated with Intel® DAAL).
How to Get Intel® TBB
To install the full Intel® Distribution for Python package, which includes Intel® TBB, use the installation guides linked below:
- Anaconda* Package
- YUM Repository
- APT Repository
- Docker* Images
To install from Anaconda cloud:
conda install -c intel tbb
(The package name will change to 'tbb4py' in Q1 2018; this article will be updated accordingly.)
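Once installed, a quick sanity check is to confirm the module is importable. The one-liner below is only an illustrative check, not an official installation step:

python -c "import tbb; print(tbb.__file__)"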
Drop-in Use with Interpreter Call (no other code changes)
Simply drop in Intel® TBB and see whether it is the right solution for your problem!

Performance degradation due to over-subscription can be caused by nested parallel calls, often without the user realizing it. This sort of mistake is easy to make in a scripting environment. Intel® TBB can be turned on for out-of-the-box thread scheduling with no code changes. In keeping with the scripting culture of the Python community, this allows a quick check of how much performance Intel® TBB recovers. If you already have math code written, simply launch it with the "-m tbb" interpreter flag, followed by the script name and any arguments your script requires. It's as easy as this:
python -m tbb script.py args*
NOTE: See the Interpreter Flag Reference section for the full list of available flags.
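To make the over-subscription scenario concrete, here is a minimal sketch of a script with nested parallelism. The file name, matrix sizes, and worker counts are illustrative, and it assumes NumPy* is linked against Intel® MKL (as it is in the Intel® Distribution for Python):

```python
# nested_math.py -- illustrative example, not from the original article.
# The outer ThreadPoolExecutor provides one level of parallelism; each call into
# NumPy's matrix multiply can spawn MKL threads, nesting a second level.
# Compare "python nested_math.py" with "python -m tbb nested_math.py".
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def work(seed):
    rng = np.random.RandomState(seed)
    a = rng.rand(1024, 1024)
    b = rng.rand(1024, 1024)
    return np.linalg.norm(a @ b)  # MKL-threaded math inside a worker thread

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=8) as pool:  # outer-level parallelism
        print(sum(pool.map(work, range(16))))
```

Launched plainly, the eight workers plus MKL's own thread pools can together request far more threads than the machine has cores; launched through -m tbb, the same script has its threading coordinated by the TBB scheduler.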
Interpreter Flag Reference
Command Line Usage
python -m tbb [-h] [--ipc] [-a] [--allocator-huge-pages] [-p P] [-b] [-v] [-m] script.py args*
Get Help from Command Line
python -m tbb --help
pydoc tbb
List of the currently available interpreter flags
| Interpreter Flag | Description |
|---|---|
| -h, --help | Show this help message and exit |
| -m | Execute the following argument as a module (default: False) |
| -a, --allocator | Enable the TBB scalable allocator as a replacement for the standard memory allocator (default: False) |
| --allocator-huge-pages | Enable huge pages for the TBB allocator (implies -a) (default: False) |
| -p P, --max-num-threads P | Initialize TBB with a maximum of P threads per process (default: the number of available logical processors on the system) |
| -b, --benchmark | Block TBB initialization until all threads are created before continuing the script. This is useful for performance benchmarks that need to exclude TBB initialization from the measurements (default: False) |
| -v, --verbose | Request verbose and version information (default: False) |
| --ipc | Enable inter-process (IPC) coordination between TBB schedulers (default: False) |
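These flags can be combined in a single invocation, following the command line usage shown above. For example, a hypothetical launch that caps TBB at four threads per process and enables the scalable allocator would look like:

python -m tbb -p 4 -a script.py args*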