- Introduction
- What is SPIR?
- How is SPIR binary different from Intermediate Binary?
- How to produce SPIR binary with an Intel command line compiler?
- How to produce SPIR binary with Intel® INDE's Kernel Builder?
- How to consume SPIR binary in your OpenCL™ program?
- Advantages of a SPIR Binary
- Disadvantages of a SPIR Binary
- Building and Running a SPIR Sample
- Conclusion
- Acknowledgements
- References
- About the Author
- Download the Sample
Introduction
In this short tutorial we give a brief introduction to Khronos SPIR, touch on the differences between a SPIR binary and an Intel proprietary Intermediate Binary, and then demonstrate a couple of ways of creating SPIR binaries using tools shipped with Intel® INDE and a way of consuming SPIR binaries in your OpenCL program.
What is SPIR?
SPIR stands for Standard Portable Intermediate Representation. It is a portable binary encoding of OpenCL C device programs based on LLVM IR. The main goal of SPIR is to let application developers avoid shipping their kernels in source form while maintaining portability between vendors and devices. Note that SPIR is a Khronos extension, so you should check whether the device you plan to deploy to supports the cl_khr_spir extension. All 4th Generation and later Intel® Processors running Windows and Android OSes with the latest drivers support SPIR. If you installed Intel® OpenCL™ Code Builder, which is now part of the Intel® INDE and Intel® Media Server Studio suites, you can query platform and device information from within Microsoft Visual Studio.
For example, the latest Intel® Windows graphics driver for 4th Generation Intel® Core Processors and above (10.18.14.4080) supports three OpenCL devices: OpenCL 1.2 CPU and GPU devices and an experimental OpenCL 2.0 CPU device.
All of these devices support the cl_khr_spir extension.
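The same check can be made programmatically. The sketch below assumes you have already obtained the extension string from clGetDeviceInfo with CL_DEVICE_EXTENSIONS, which is a space-separated list of extension names; the helper name hasExtension is hypothetical and not part of the sample.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns true when `name` appears as a whole,
// space-separated token in `extensions`. Comparing whole tokens avoids
// matching a longer extension name that merely starts with `name`.
bool hasExtension(const std::string& extensions, const std::string& name) {
    assert(!name.empty());
    std::istringstream tokens(extensions);
    std::string token;
    while (tokens >> token) {
        if (token == name) {
            return true;
        }
    }
    return false;
}
```

Calling hasExtension(extensions, "cl_khr_spir") for each device you intend to target tells you whether it is safe to hand that device a SPIR binary.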
How is SPIR binary different from Intermediate Binary?
A SPIR binary is portable between devices from a single vendor and between devices from different vendors. An Intermediate Binary, in contrast, is built for one specific device. For example, if you have a computer with an Intel processor with Intel® Processor Graphics and an NVIDIA or AMD graphics card, you should see at least three devices: an Intel CPU device, an Intel GPU device, and an NVIDIA or AMD device. Your SPIR binary should work on all three of them, provided each one reports the cl_khr_spir extension.
How to produce SPIR binary with an Intel command line compiler?
Intel® OpenCL™ Code Builder comes with command line compilers for OpenCL C. At the command line, type the following:
ioc64 -cmd=build -input=SobelKernels.cl -device=gpu -spir64=SobelKernels_x64.spir -bo="-cl-std=CL1.2"
or you can type the following shorter version:
ioc64 -cmd=build -input=SobelKernels.cl -spir64=SobelKernels_x64.spir -bo="-cl-std=CL1.2"
Note that specifying the device only lets you verify that your kernel compiles for a particular device; it does not affect the result. In the first case the kernel is built for the GPU device, in the second for the default CPU device, but the resulting SPIR file is the same in both cases! Also note that you should be able to produce SPIR files even if the platform you are developing on does not support SPIR. SPIR generation is also supported on Linux as part of the Intel® OpenCL Code Builder that comes with Intel® Media Server Studio: you can use either the standalone Kernel Builder tool or the Eclipse plug-in to generate SPIR binaries on supported Linux platforms.
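If you generate SPIR binaries from a build script or tool, the two command lines above can be assembled by a small helper. This is only a sketch: the function name makeSpirBuildCommand is hypothetical, it uses only the ioc64 options shown in this article, and it does no quoting of paths that contain spaces.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: assembles an ioc64 command line that emits a 64-bit
// SPIR binary. Pass an empty `device` to omit -device and let the compiler
// fall back to its default device.
std::string makeSpirBuildCommand(const std::string& input,
                                 const std::string& output,
                                 const std::string& device = "") {
    assert(!input.empty() && !output.empty());
    std::string cmd = "ioc64 -cmd=build -input=" + input;
    if (!device.empty()) {
        cmd += " -device=" + device;
    }
    cmd += " -spir64=" + output;
    cmd += " -bo=\"-cl-std=CL1.2\"";  // build options, as in the article
    return cmd;
}
```

For example, makeSpirBuildCommand("SobelKernels.cl", "SobelKernels_x64.spir", "gpu") yields the first command line shown above, and omitting the last argument yields the shorter one.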
How to produce SPIR binary with Intel® INDE’s Kernel Builder?
Open a solution with the OpenCL file in it. Right-click the OpenCL file and select Create Code Builder Session from the popup menu.
Click on the resulting Session in the Code Builder Session Explorer and select Build.
If the build is successful, you should see .spir files among the artifacts.
Please use the _x64.spir version for your purposes. Note that the corresponding SobelKernels_x86.ll and SobelKernels_x64.ll files are textual representations of the SPIR binaries SobelKernels_x86.spir and SobelKernels_x64.spir respectively. Advanced developers can examine these files for their analysis work. Note that when you build your application in Win32 mode, the CPU device requires the _x86.spir file, while the GPU device always consumes the _x64.spir file. When you build your application in x64 mode, both the CPU device and the GPU device consume the same _x64.spir file.
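The file selection rule above can be captured in a few lines. The following is a minimal sketch, assuming the host application knows the target device type and its own bitness; the names DeviceKind and spirFileFor are hypothetical and do not come from the sample.

```cpp
#include <cassert>
#include <string>

enum class DeviceKind { Cpu, Gpu };

// Hypothetical helper: picks the SPIR file for a device. `hostPointerBits`
// is 32 for a Win32 build and 64 for an x64 build of the host application
// (for example, 8 * sizeof(void*)).
std::string spirFileFor(const std::string& baseName, DeviceKind device,
                        unsigned hostPointerBits) {
    assert(hostPointerBits == 32 || hostPointerBits == 64);
    // Only the CPU device in a 32-bit host process consumes the x86 binary;
    // the GPU device always consumes the x64 binary.
    const bool useX86 = (device == DeviceKind::Cpu) && (hostPointerBits == 32);
    return baseName + (useX86 ? "_x86.spir" : "_x64.spir");
}
```

With this helper, spirFileFor("SobelKernels", DeviceKind::Cpu, 32) selects SobelKernels_x86.spir, and every other combination selects SobelKernels_x64.spir.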
How to consume SPIR binary in your OpenCL™ program?
Before trying to build a program from a SPIR binary, you should check whether the platform and the device(s) you are planning to target support the cl_khr_spir extension. Use the clGetPlatformInfo call with the CL_PLATFORM_EXTENSIONS parameter and search for the cl_khr_spir string. You should also use the clGetDeviceInfo call with the CL_DEVICE_EXTENSIONS parameter to check SPIR support of each device on your platform. You will need to read the binary file (in our case SobelKernels_x64.spir) into a character array with regular C or C++ APIs. Then create a program with the clCreateProgramWithBinary call. Next, build the program with clBuildProgram and provide "-x spir" in addition to regular optimization flags, like "-cl-mad-enable". You are now ready to create your kernels as you normally would.
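The file-reading and build-option steps might look like the sketch below. It deliberately leaves out the OpenCL calls so that it compiles without the OpenCL headers; the helper names readBinaryFile and spirBuildOptions are hypothetical and are not taken from the sample.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole file (for example
// SobelKernels_x64.spir) into memory. Throws std::runtime_error if the
// file cannot be opened.
std::vector<unsigned char> readBinaryFile(const std::string& path) {
    assert(!path.empty());
    std::ifstream file(path, std::ios::binary);
    if (!file) {
        throw std::runtime_error("cannot open " + path);
    }
    // Binary mode plus istreambuf_iterator keeps every byte unchanged.
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(file),
                                      std::istreambuf_iterator<char>());
}

// Hypothetical helper: prefixes the regular build flags with "-x spir",
// the option that marks the program binary as SPIR for clBuildProgram.
std::string spirBuildOptions(const std::string& extraFlags) {
    return extraFlags.empty() ? std::string("-x spir")
                              : "-x spir " + extraFlags;
}
```

The vector's data() pointer and size() would then be passed to clCreateProgramWithBinary through its binaries and lengths arguments, and the string returned by spirBuildOptions("-cl-mad-enable") would be passed as the options argument of clBuildProgram.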
Advantages of a SPIR Binary
A SPIR binary is portable between devices and between vendors.
A SPIR binary is smaller than the native Intermediate Binary.
Disadvantages of a SPIR Binary
Additional time is required to build a program from a SPIR binary as opposed to building a program from an Intermediate Binary, since additional translation and optimization steps are involved.
You need to provide build flags when building a program from a SPIR binary, since some of the optimizations specified by the flags are done by later compilation stages.
Building and Running a SPIR Sample
Make sure to install the Intel® OpenCL™ Code Builder that comes with either Intel® INDE or Intel® Media Server Studio (see the References section). Open the Sobel_OCL solution (the source code for the sample is available at the end of the article) and build it.
Now create a new Code Builder Kernel Development session.
In the New Session popup dialog, name the session SobelKernels, specify the directory where your solution is located as the location, select Add CL Files: in the Session Content section, and select the SobelKernels.cl file provided in the solution. Uncheck the Duplicate files to session folder check box and click the Done button.
Now you can select Session ‘SobelKernels’ and build it.
You should see the Build Artifacts and Kernels folders populated.
Now, you have everything ready to run your workload. Open the command window and change directory to where your solution file is located and type the following at the command prompt:
.\x64\Release\Sobel_OCL.exe 100 gpu intel 2048 2048 show_CL spir
You should see the following output:
Notice that four *_validation.ppm files appeared in your directory. When you examine them, they should all contain a picture of a nicely Sobelized dog.
Now, you can run the tutorial on the CPU OpenCL device by typing the following:
.\x64\Release\Sobel_OCL.exe 100 cpu intel 2048 2048 show_CL spir
You can also run the tutorial with openclc and ir options on the GPU device:
.\x64\Release\Sobel_OCL.exe 100 gpu intel 2048 2048 show_CL openclc
.\x64\Release\Sobel_OCL.exe 100 gpu intel 2048 2048 show_CL ir
Examine the OpenCL C Program Build Log printed at the beginning of each run and compare it with the SPIR run on the GPU device. Notice that there is no build log output with the ir option, since the Intermediate Binary is fully built and ready to go.
You can also run the tutorial with the openclc and ir options on the CPU device. Before running with the ir option, change the session options: specify the CPU device, provide the appropriate build options, select the right Session Architecture, and then rebuild the session. Notice that SobelKernels.asm appears under the Build Artifacts folder, and that SobelKernels.ir is overwritten with the CPU device Intermediate Representation binary. You can now safely run:
.\x64\Release\Sobel_OCL.exe 100 cpu intel 2048 2048 show_CL ir
Conclusion
In this short tutorial we gave a brief introduction to SPIR and its differences from the Intel proprietary Intermediate Binary format, showed how to produce SPIR binaries with the Intel command line compiler shipped with Intel® INDE and with the Intel® Kernel Builder add-on to Microsoft Visual Studio, and then showed how to consume the resulting SPIR binary in your OpenCL program.
Acknowledgements
Thank you to Uri Levy for reviewing this article and providing great feedback!
References
- Khronos SPIR website: https://www.khronos.org/spir
- Khronos SPIR FAQ: https://www.khronos.org/faq/spir
- Intel® INDE: https://software.intel.com/en-us/intel-inde
- Intel® Media Server Studio: https://software.intel.com/en-us/intel-media-server-studio
- Intel® Code Builder for OpenCL™ API for Linux*: https://software.intel.com/en-us/articles/intel-code-builder-for-opencl-api
- User Manual for OpenCL™ Code Builder: https://software.intel.com/en-us/code-builder-user-manual
About the Author
Robert Ioffe is a Technical Consulting Engineer at Intel’s Software and Solutions Group. He is an expert in OpenCL programming and OpenCL workload optimization on Intel Iris and Intel Iris Pro Graphics, with deep knowledge of Intel Graphics Hardware. He was heavily involved in Khronos standards work, focusing on prototyping the latest features and making sure they can run well on Intel architecture. Most recently he has been working on prototyping the Nested Parallelism (enqueue_kernel functions) feature of OpenCL 2.0 and wrote a number of samples that demonstrate Nested Parallelism functionality, including GPU-Quicksort for OpenCL 2.0. He also recorded and released two Optimizing Simple OpenCL Kernels videos, as well as the GPU-Quicksort and Sierpinski Carpet in OpenCL 2.0 videos.
You might also be interested in the following:
Optimizing Simple OpenCL Kernels: Modulate Kernel Optimization
Optimizing Simple OpenCL Kernels: Sobel Kernel Optimization
GPU-Quicksort in OpenCL 2.0: Nested Parallelism and Work-Group Scan Functions
Sierpiński Carpet in OpenCL 2.0