The Intel® Manycore Platform Software Stack (Intel® MPSS) version 3.5 was released on April 6th, 2015. This page lists the prominent features in this release.
- Added support for the Lustre client
- Coprocessor Offload Infrastructure (COI): new APIs to better support programming models
- Second preview of Heterogeneous Streams (hStreams)
- Updated errno handling for SCIF applications built as Microsoft® Windows* binaries
---
Addition of Lustre client for the Intel® Xeon Phi™ coprocessor x100 Product Family: This release includes a pre-built Lustre client for the coprocessor. A basic installation of the Lustre client with Intel® MPSS requires the following packages to be installed in the coprocessor OS:
lustre-client-2.5.3+mpss3.5-1.k1om.rpm
lustre-client-modules-2.5.3+mpss3.5-1.k1om.rpm
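For example, assuming the packages are in the current directory on the host, the coprocessor is reachable over ssh as mic0, and rpm is available in the coprocessor OS (all assumptions for illustration), they can be copied over and installed as follows:
# Copy the Lustre client packages to the coprocessor
scp lustre-client-modules-2.5.3+mpss3.5-1.k1om.rpm lustre-client-2.5.3+mpss3.5-1.k1om.rpm root@mic0:/tmp/
# Install them in the coprocessor OS
ssh root@mic0 'rpm -ivh /tmp/lustre-client-modules-2.5.3+mpss3.5-1.k1om.rpm /tmp/lustre-client-2.5.3+mpss3.5-1.k1om.rpm'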
Assuming that IPoIB is correctly configured and the ib0 interface is available on the coprocessor, configure the client by running the following on the coprocessor:
echo 'options lnet networks="o2ib0(ib0),tcp0(mic0)"' > /etc/modprobe.d/lustre.conf
modprobe lustre
To mount a Lustre file system shared on your network, run the following on the coprocessor:
/sbin/mount.lustre <server IP>@o2ib0:/lustre /mnt/lustre
After this step, all normal file operations can be performed on the Lustre file system.
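As a quick sanity check (assuming the mount point /mnt/lustre from the example above), the mount can be verified from the coprocessor with:
df -h /mnt/lustre
mount | grep lustre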
Coprocessor Offload Infrastructure (COI) enhancements:
New APIs to better support programming models: This release introduces two new COI APIs, COIEventRegisterCallback() and COIProcessConfigureDMA(), which enable additional programming models:
- COIEventRegisterCallback() supports applications that need to know when an asynchronous event occurs. Users can now register a callback that is invoked when an operation completes, for example to implement a state machine.
- COIProcessConfigureDMA() was introduced to better utilize the Intel® Xeon Phi™ coprocessor DMA engines. With this API, more than one operation can proceed concurrently on separate DMA engines, such as sending data to and receiving data from the coprocessor at the same time.
Starting with this release, it is possible to control the CPU affinity of the threads that the COI runtime creates on the host for handling user events, the pipeline, and DMA. The new environment variable COI_HOST_THREAD_AFFINITY serves this purpose by letting the user designate a set of host CPUs to which all internal COI utility threads are affinitized.
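For example, the variable can be set before launching a COI host application (the value format and application name shown here are illustrative assumptions; consult the Intel® MPSS documentation for the exact syntax accepted by this variable):
# Restrict COI utility threads to a set of host CPUs, then launch the application
export COI_HOST_THREAD_AFFINITY=0-3
./my_offload_app   # placeholder for any COI/offload host application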
This release also introduces the ability to configure the temporary directory used by the coi_daemon on the coprocessor by passing it as an argument when coi_daemon is started, --tempDIR=<directory_path>. This sets the directory that COI uses on the card for all temporary files created during COI process and run function execution. For more information on this argument, please refer to the 'coi_daemon --help' output.
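For example (the directory path is an illustrative assumption; coi_daemon is normally started automatically on the coprocessor, so in practice the argument would be added to its startup configuration), the daemon could be launched on the card as:
# Use a custom temporary directory for COI files on the coprocessor
coi_daemon --tempDIR=/var/tmp/coi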
For additional details on these APIs, please refer to the header files installed under /usr/include/intel-coi/source/ or the relevant man pages provided by Intel® MPSS.
Heterogeneous Streams – Second Preview
Intel® hStreams is a library that enables streaming for heterogeneous platforms; it was originally introduced with Intel® MPSS 3.4, and Intel has continued to improve its features based on user feedback. hStreams makes it relatively easy to manage task parallelism on Intel® Xeon Phi™ coprocessors, and it will be portable to more general heterogeneous clusters in the future. It supports concurrency between the host and cards, within cards, and between computation and communication. It does not rely on compiler support. It uses a C API and can thus be integrated with applications written in Fortran, Perl, and other languages. While it provides a few fixed functions, such as memcpy, memset, and four *gemm functions from BLAS, it allows users to provide their own source code to execute in streams, with a large number of scalar and heap arguments and even return values.
There are several improvements in this second preview (Alpha) release:
- Portability is enhanced by letting programmers refer to logical domains, which can be separately and flexibly bound to varying numbers and types of physical domains.
- More functions are available to query the state of physical domains, logical domains, and logical streams.
- Support for 2 MB pages is added, which offers a significant performance boost.
- Parameter types and API names have changed to improve consistency; see hStreams_Porting_Guide.pdf in /usr/share/doc/hStreams (Linux*) or C:\Program Files\Intel\MPSS\docs\hstreams (Microsoft® Windows*) for details.
- Sink-side buffers can now be allocated automatically with mutual 64-byte alignment, which sometimes offers a significant performance boost.
Updated the handling of errno in SCIF applications built as Microsoft® Windows* binaries: In releases prior to Intel® MPSS 3.5, the Windows SCIF library (uscif.dll) was statically linked with the C runtime, so applications using the SCIF library on Windows saw a different copy/version of errno than the library itself. Please refer to:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms680347(v=vs.85).aspx
http://msdn.microsoft.com/en-us/library/5814770t.aspx
To work around this issue and return the correct errors from the SCIF library, the library called SetLastError and exported a dll_errno method for applications. With dynamic linking support, this workaround has been removed from the Intel® MPSS Windows stack. Please note that applications using the SCIF library (uscif.dll) now need to be recompiled with Intel® MPSS 3.5.
Intel, Xeon, and Intel Xeon Phi are trademarks of Intel Corporation in the U.S. and/or other countries.
* Other names and brands may be claimed as the property of others.