Introduction
The Intel® Integrated Performance Primitives (Intel® IPP) is a cross-architecture software library that provides a broad range of library functions for image processing, signal processing, data compression, cryptography, and computer vision, as well as math support routines for such processing capabilities. Intel® IPP is optimized for the wide range of Intel microprocessors.
One of the key advantages of Intel® IPP is performance. This advantage comes from functions that are optimized for each processor architecture and compiled into a single library. Intel® IPP functions are “dispatched” at run time: when the application calls into the IPP library, the “dispatcher” chooses which of these processor-specific optimized libraries to use. This maximizes each function’s use of the underlying vector instructions and other architecture-specific features.
This article covers CPU dispatching in the Intel® IPP library in more detail. After reading it, you will understand how CPU dispatching works and which libraries are needed for which processor architecture. Further documentation on Intel® IPP can be found at Intel® Integrated Performance Primitives – Documentation.
Dispatcher
Dispatching refers to the process of detecting CPU features at run-time and then selecting the Intel® IPP optimized library set that corresponds to your CPU. For example, in the <ipp directory>\ia32\ipp directory, the ippip8.dll library file contains the 32-bit optimized image processing libraries for processors with Intel® SSE4.2; ‘ippi’ refers to the image processing library, ‘p8’ refers to 32-bit SSE4.2 architecture.
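The naming pattern in the ippip8.dll example can be sketched in C. This is only an illustration of how a domain prefix and an architecture code combine into a file name. The helper ipp_library_name is hypothetical and not part of Intel® IPP, and the ".dll" suffix is assumed from the Windows example above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an Intel IPP function): builds a dispatched
 * library file name from a domain prefix such as "ippi" and an
 * architecture code such as "p8", following the ippip8.dll example.
 * Returns 0 on success, -1 if the output buffer is too small. */
static int ipp_library_name(char *out, size_t out_size,
                            const char *domain, const char *arch_code)
{
    assert(out != NULL && domain != NULL && arch_code != NULL);
    int n = snprintf(out, out_size, "%s%s.dll", domain, arch_code);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Calling ipp_library_name(name, sizeof name, "ippi", "p8") fills name with "ippip8.dll".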
Note: You can build custom processor-specific libraries that do not require the dispatcher, but that is outside the scope of this article. Please read this IPP linkage models article for information on how to build custom versions of the Intel® IPP library.
In the general case, the “dispatcher” identifies the run-time processor only once, at library initialization time. It sets an internal table or variable that directs your calls to the internal functions that match your architecture. For example, ippsCopy_8u() may have multiple implementations stored in the library, each optimized for a specific Intel® processor architecture. Thus, when running on an Intel processor with Intel® SSE4.2 on IA-32, the dispatcher calls the p8_ippsCopy_8u() version of ippsCopy_8u(), because that version is optimized for this processor architecture.
Note: IPP architectures generally correspond to SIMD (MMX, SSE, AES, etc.) instruction sets.
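The one-time selection described above can be sketched with a function pointer in C. This is a minimal illustration under stated assumptions, not the actual Intel® IPP implementation. The feature bits, dispatcher_init, my_copy_8u, and the two copy variants are all hypothetical names, and the feature mask is passed in by the caller rather than read from the processor.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits; a real dispatcher would derive them from CPUID. */
enum { FEATURE_SSE = 1 << 0, FEATURE_SSE42 = 1 << 1 };

typedef void (*copy_8u_fn)(const unsigned char *src, unsigned char *dst,
                           size_t len);

/* Generic ("px") variant: plain byte loop. */
static void px_copy_8u(const unsigned char *src, unsigned char *dst, size_t len)
{
    for (size_t i = 0; i < len; ++i)
        dst[i] = src[i];
}

/* Stand-in for an SSE4.2 ("p8") variant. A real one would use SIMD
 * instructions; memcpy keeps this sketch portable. */
static void p8_copy_8u(const unsigned char *src, unsigned char *dst, size_t len)
{
    memcpy(dst, src, len);
}

static copy_8u_fn g_copy_8u = NULL;    /* set once by dispatcher_init */
static const char *g_arch_code = NULL; /* records which variant was chosen */

/* One-time initialization: pick the best variant for the given features. */
static void dispatcher_init(unsigned features)
{
    if (features & FEATURE_SSE42) {
        g_copy_8u = p8_copy_8u;
        g_arch_code = "p8";
    } else {
        g_copy_8u = px_copy_8u;
        g_arch_code = "px";
    }
}

/* Public entry point: every call goes through the pointer set at init time. */
static void my_copy_8u(const unsigned char *src, unsigned char *dst, size_t len)
{
    assert(g_copy_8u != NULL); /* dispatcher_init must have run first */
    g_copy_8u(src, dst, len);
}
```

After dispatcher_init(FEATURE_SSE | FEATURE_SSE42), calls to my_copy_8u() reach p8_copy_8u() without any further feature checks, so the detection cost is paid only once.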
Initializing the IPP Dispatcher
The specific processor must be identified, and the dispatcher initialized, before you make any calls into the IPP library. If you are using a dynamic link library, this happens automatically when the dynamic link library is initialized. If you are using a static library, however, you must perform this step manually. See this article on the ipp*Init*() functions for more information on how to do this.
The following table lists all the architecture codes defined by the Intel® IPP library through version 8.2 of the product. Note that some of these IPP architectures have been deprecated and are no longer supported in the current version of the product. Deprecated architectures are identified in the “Notes” column of the table.
IA-32 Intel® architecture | Intel® 64 architecture | Meaning |
px | mx | Generic code optimized for processors with Intel® Streaming SIMD Extensions (Intel® SSE) |
w7 | my | Optimized for processors with Intel SSE2 |
s8 | n8 | Optimized for processors with Supplemental Streaming SIMD Extensions 3 (SSSE3) |
- | m7 | Optimized for processors with Intel SSE3 |
p8 | y8 | Optimized for processors with Intel SSE4.2 |
g9 | e9 | Optimized for processors with Intel® Advanced Vector Extensions (Intel® AVX) and Intel® Advanced Encryption Standard New Instructions (Intel® AES-NI) |
h9 | l9 | Optimized for processors with Intel® Advanced Vector Extensions 2 (Intel® AVX2) |
- | k0 | Optimized for processors with Intel® Advanced Vector Extensions 512 (Intel® AVX-512) |
- | n0 | Optimized for processors with Intel® Advanced Vector Extensions 512 (Intel® AVX-512) for Intel® Many Integrated Core Architecture (Intel® MIC Architecture) |
Table 1: CPU Identification Codes Associated with Processor-Specific Libraries
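Table 1 can be read as a lookup from the best supported instruction set to an architecture code. The C sketch below encodes the table rows, leaving out the n0 row for Intel® MIC Architecture. The enum, the arrays, and arch_code_for are hypothetical names. The fallback rule (step down to the nearest lower level when a column has a "-" entry) is an assumption made for illustration, not documented dispatcher behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Instruction-set levels in ascending order of capability. */
typedef enum {
    LEVEL_SSE = 0, LEVEL_SSE2, LEVEL_SSE3, LEVEL_SSSE3,
    LEVEL_SSE42, LEVEL_AVX, LEVEL_AVX2, LEVEL_AVX512,
    LEVEL_COUNT
} simd_level;

/* Codes copied from Table 1; NULL marks a "-" entry. */
static const char *const k_ia32_codes[LEVEL_COUNT] = {
    "px", "w7", NULL, "s8", "p8", "g9", "h9", NULL
};
static const char *const k_intel64_codes[LEVEL_COUNT] = {
    "mx", "my", "m7", "n8", "y8", "e9", "l9", "k0"
};

/* Returns the code for the highest level at or below `best` that has an
 * entry in the chosen column. The step-down fallback is an assumption. */
static const char *arch_code_for(simd_level best, int is_64bit)
{
    assert((int)best >= 0 && (int)best < LEVEL_COUNT);
    const char *const *codes = is_64bit ? k_intel64_codes : k_ia32_codes;
    for (int level = (int)best; level >= 0; --level) {
        if (codes[level] != NULL)
            return codes[level];
    }
    return NULL; /* not reached: level 0 has a code in both columns */
}
```

For example, arch_code_for(LEVEL_SSE42, 1) returns "y8", while arch_code_for(LEVEL_AVX512, 0) steps down to "h9" because the IA-32 column has no AVX-512 entry.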
For support on non-Intel processors, please see the article titled Use Intel® IPP on Intel or Compatible AMD* Processors.
P8/Y8 Internal Run-Time Dispatcher
Within the 32-bit 'p8' and equivalent 64-bit 'y8' architectures there is an additional "run-time" dispatching mechanism, a kind of mini-dispatcher. The Nehalem (Intel® Core™ i7) and Westmere processor families add SIMD instructions beyond those defined by SSE4.1: the Nehalem family adds the SSE4.2 SIMD instructions and the Westmere family adds AES-NI.
Creating two additional internal versions of the IPP library for the SSE4.2 and AES-NI instructions would waste a great deal of space, so these optimizations are bundled into the SSE4.1 library. When you call a function that includes, for example, AES-NI optimizations, an additional jump directs your call to the AES-NI version within the p8/y8 library. Because the enhancements affect only a small number of IPP functions, this additional overhead occurs infrequently, and only when your application is executing on a p8/y8 architecture processor.
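The extra jump inside the p8/y8 library can be pictured as a second, per-function feature check. The C sketch below is a hypothetical illustration of that idea, not the library's actual code. The feature bit, set_cpu_features, p8_transform, and both variants are invented names, and a simple XOR stands in for the real work so the example stays portable.

```c
#include <assert.h>
#include <stddef.h>

enum { CPU_HAS_AESNI = 1 << 0 }; /* hypothetical feature bit */

typedef enum { VARIANT_BASE = 0, VARIANT_AESNI = 1 } variant;

static unsigned g_cpu_features = 0; /* set once at initialization */

static void set_cpu_features(unsigned features)
{
    g_cpu_features = features;
}

/* Baseline variant bundled in the "p8" library. */
static variant p8_transform_base(unsigned char *buf, size_t len,
                                 unsigned char key)
{
    for (size_t i = 0; i < len; ++i)
        buf[i] ^= key;
    return VARIANT_BASE;
}

/* Stand-in for the AES-NI variant; a real one would use AES-NI
 * instructions. It computes the same result here. */
static variant p8_transform_aesni(unsigned char *buf, size_t len,
                                  unsigned char key)
{
    for (size_t i = 0; i < len; ++i)
        buf[i] ^= key;
    return VARIANT_AESNI;
}

/* Entry point the main dispatcher has already routed to. Only functions
 * that have an AES-NI variant pay for this second check. The return
 * value reports which variant ran. */
static variant p8_transform(unsigned char *buf, size_t len, unsigned char key)
{
    assert(buf != NULL || len == 0);
    if (g_cpu_features & CPU_HAS_AESNI)
        return p8_transform_aesni(buf, len, key);
    return p8_transform_base(buf, len, key);
}
```

Functions without such a variant would skip the check entirely, which is consistent with the overhead being limited to a small number of functions.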
Processor Architecture Table
The following table was copied from the article Intel® Compiler Options for Intel® SSE and Intel® AVX generation (SSE2, SSE3, SSSE3, ATOM_SSSE3, SSE4.1, SSE4.2, ATOM_SSE4.2, AVX, AVX2, AVX-512) and processor-specific optimizations, which describes some compiler architecture options. It lists which Intel processors support which vector instructions. The original article is updated regularly, so please refer to it for the latest table. Please note that the behavior of the Intel Compiler SIMD dispatcher described in that article does not apply to the Intel® IPP library.
Note: The Intel® IPP library dispatching mechanism behaves differently from the one in the Intel Compiler products, and may also behave differently from other Intel library products.
Additional information regarding dispatching and how it relates to non-Intel processors can be found here. How to identify your specific processor is described here. To correlate a processor family name with an Intel CPU brand name, use the ark.intel.com web site.
Vector instruction set | Intel processors |
COMMON-AVX512 | A future Intel® Processor. |
MIC-AVX512 | The Intel® Xeon Phi™ processor x200 product family. |
CORE-AVX512 | A future Intel® Processor. |
CORE-AVX2 | 4th Generation Intel® Core™ Processors |
CORE-AVX-I | 3rd Generation Intel® Core™ i7 Processors; 3rd Generation Intel® Core™ i5 Processors; 3rd Generation Intel® Core™ i3 Processors; Intel® Xeon® Processor E7 v2 Family; Intel® Xeon® Processor E5 v2 Family; Intel® Xeon® Processor E3 v2 Family |
AVX | 2nd Generation Intel® Core™ i7 Processors; 2nd Generation Intel® Core™ i5 Processors; 2nd Generation Intel® Core™ i3 Processors; Intel® Xeon® Processor E5 Family; Intel® Xeon® Processor E3 Family |
SSE4.2 | Previous Generation Intel® Core™ i7 Processors; Previous Generation Intel® Core™ i5 Processors; Previous Generation Intel® Core™ i3 Processors; Intel® Xeon® 55XX series; Intel® Xeon® 56XX series; Intel® Xeon® 75XX series; Intel® Xeon® Processor E7 Family |
ATOM_SSE4.2 | Intel® Atom™ processors that support Intel® SSE4.2 instructions. |
SSE4.1 | Intel® Xeon® 74XX series; Quad-Core Intel® Xeon 54XX, 33XX series; Dual-Core Intel® Xeon 52XX, 31XX series; Intel® Core™ 2 Extreme 9XXX series; Intel® Core™ 2 Quad 9XXX series; Intel® Core™ 2 Duo 8XXX series; Intel® Core™ 2 Duo E7200 |
SSSE3 | Quad-Core Intel® Xeon® 73XX, 53XX, 32XX series; Dual-Core Intel® Xeon® 72XX, 53XX, 51XX, 30XX series; Intel® Core™ 2 Extreme 7XXX, 6XXX series; Intel® Core™ 2 Quad 6XXX series; Intel® Core™ 2 Duo 7XXX (except E7200), 6XXX, 5XXX, 4XXX series; Intel® Core™ 2 Solo 2XXX series; Intel® Pentium® dual-core processor E2XXX, T23XX series |
ATOM_SSSE3 | Intel® Atom™ processors |
SSE3 | Dual-Core Intel® Xeon® 70XX, 71XX, 50XX Series; Dual-Core Intel® Xeon® processor (ULV and LV) 1.66, 2.0, 2.16; Dual-Core Intel® Xeon® 2.8; Intel® Xeon® processors with SSE3 instruction set support; Intel® Core™ Duo; Intel® Core™ Solo; Intel® Pentium® dual-core processor T21XX, T20XX series; Intel® Pentium® processor Extreme Edition; Intel® Pentium® D; Intel® Pentium® 4 processors with SSE3 instruction set support |
SSE2 | Intel® Xeon® processors; Intel® Pentium® 4 processors; Intel® Pentium® M |
IA32 | Intel® Pentium® III Processor; Intel® Pentium® II Processor; Intel® Pentium® Processor |
Table 2: Intel Processors Associated with Specific CPU Vector Instructions
* Other names and brands may be claimed as the property of others.
Microsoft, Windows, and the Windows logo are trademarks, or registered trademarks of Microsoft Corporation in the United States and/or other countries.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804
Copyright © 2002-2016, Intel Corporation. All rights reserved.