The Intel Xeon Phi processor, code named Knights Landing, is part of the second generation of Intel Xeon Phi products. Knights Landing supports AVX-512 instructions, specifically AVX-512F (foundation), AVX-512CD (conflict detection), AVX-512ER (exponential and reciprocal) and AVX-512PF (prefetch).
If we want an application to run everywhere, in order to use these instructions in a program, we need to make sure that the operating system and the processor have support for them when the application is run.
The Intel compiler provides a single function _may_i_use_cpu_feature that does all this easily. This program shows how we can use it to test for the ability to use AVX-512F, AVX-512ER, AVX-512PF and AVX-512CD instructions.
#include <immintrin.h> #include <stdio.h> int main(int argc, char *argv[]) { const unsigned long knl_features = (_FEATURE_AVX512F | _FEATURE_AVX512ER | _FEATURE_AVX512PF | _FEATURE_AVX512CD ); if ( _may_i_use_cpu_feature( knl_features ) ) printf("This CPU supports AVX-512F+CD+ER+PF as introduced in Knights Landing\n"); else printf("This CPU does not support all Knights Landing AVX-512 features\n"); return 1; }
if we compile with the -xMIC_AVX512 flag, the Intel compiler will automatically protect the binary and such checking is not necessary. For instance, if we compile and run as follow we can see the result of running on a machine other than a Knights Landing.
icc -xMIC-AVX512 -o sample sample.c
./sample
Please verify that both the operating system and the processor support Intel(R) MOVBE, F16C, AVX, FMA, BMI, LZCNT, AVX2, AVX512F, ADX, RDSEED, AVX512ER, AVX512PF and AVX512CD instructions.
In order to run on all processors, we compile and run as follows:
icc -axMIC-AVX512 -o sample sample.c
./sample
When we run on a Knights Landing it prints:
This CPU supports AVX-512F+CD+ER+PF as introduced in Knights Landing
When we run on a processor without the AVX-512 support at least equivalent to Knights Landing it prints:
This CPU does not support all Knights Landing AVX-512 features
If we want to support compilers other than Intel, the code is slightly more complex because the function _may_i_use_cpu_feature is not standard (and neither are the __buildin functions in gcc and clang/LLVM). The following code works with at least the Intel compiler, gcc, clang/LLVM and Microsoft compilers.
#if defined(__INTEL_COMPILER) && (__INTEL_COMPILER >= 1300) #include <immintrin.h> int has_intel_knl_features() { const unsigned long knl_features = (_FEATURE_AVX512F | _FEATURE_AVX512ER | _FEATURE_AVX512PF | _FEATURE_AVX512CD ); return _may_i_use_cpu_feature( knl_features ); } #else /* non-Intel compiler */ #include <stdint.h> #if defined(_MSC_VER) #include <intrin.h> #endif void run_cpuid(uint32_t eax, uint32_t ecx, uint32_t* abcd) { #if defined(_MSC_VER) __cpuidex(abcd, eax, ecx); #else uint32_t ebx, edx; #if defined( __i386__ ) && defined ( __PIC__ ) /* in case of PIC under 32-bit EBX cannot be clobbered */ __asm__ ( "movl %%ebx, %%edi \n\t cpuid \n\t xchgl %%ebx, %%edi" : "=D" (ebx), # else __asm__ ( "cpuid" : "+b" (ebx), # endif"+a" (eax), "+c" (ecx), "=d" (edx) ); abcd[0] = eax; abcd[1] = ebx; abcd[2] = ecx; abcd[3] = edx; #endif } int check_xcr0_zmm() { uint32_t xcr0; uint32_t zmm_ymm_xmm = (7 << 5) | (1 << 2) | (1 << 1); #if defined(_MSC_VER) xcr0 = (uint32_t)_xgetbv(0); /* min VS2010 SP1 compiler is required */ #else __asm__ ("xgetbv" : "=a" (xcr0) : "c" (0) : "%edx" ); #endif return ((xcr0 & zmm_ymm_xmm) == zmm_ymm_xmm); /* check if xmm, zmm and zmm state are enabled in XCR0 */ } int has_intel_knl_features() { uint32_t abcd[4]; uint32_t osxsave_mask = (1 << 27); // OSX. uint32_t avx2_bmi12_mask = (1 << 16) | // AVX-512F (1 << 26) | // AVX-512PF (1 << 27) | // AVX-512ER (1 << 28); // AVX-512CD run_cpuid( 1, 0, abcd ); // step 1 - must ensure OS supports extended processor state management if ( (abcd[2] & osxsave_mask) != osxsave_mask ) return 0; // step 2 - must ensure OS supports ZMM registers (and YMM, and XMM) if ( ! check_xcr0_zmm() ) return 0; return 1; } #endif /* non-Intel compiler */ static int can_use_intel_knl_features() { static int knl_features_available = -1; /* test is performed once */ if (knl_features_available < 0 ) knl_features_available = has_intel_knl_features(); return knl_features_available; } #include <stdio.h> int main(int argc, char *argv[]) { if ( can_use_intel_knl_features() ) printf("This CPU supports AVX-512F+CD+ER+PF as introduced in Knights Landing\n"); else printf("This CPU does not support all Knights Landing AVX-512 features\n"); return 1; }
Acknowledgment: Thank you to Max Locktyukhin (Intel) for his article 'How to detect New Instruction support in the 4th generation Intel® Core™ processor family' which served as the model for my Knights Landing detection code.