Introduction
This document introduces the basic concept of the Intel® Xeon Phi™ coprocessor x200 product family, tells how to install the coprocessor software stack, discusses the build environment, and points to important documents so that you can write code and run applications.
The Intel Xeon Phi coprocessor x200 is the second generation of the Intel Xeon Phi product family. Unlike the first generation running on an embedded Linux* uOS, this second generation supports the standard Linux kernel. The Intel Xeon Phi coprocessor x200 is designed for installation in a third-generation PCI Express* (PCIe*) slot of an Intel® Xeon® processor host. The following figure shows a typical configuration:
Benefits of the Intel Xeon Phi coprocessor:
- System flexibility: Build a system that can support a wide range of applications, from serial to highly parallel, while leveraging code optimized for Intel Xeon processors or Intel Xeon Phi processors.
- Maximize density: Gain significant performance improvements with limited acquisition cost by maximizing system density.
- Upgrade path: Improve performance by adding to an Intel Xeon processor system or upgrading from the first generation of the Intel Xeon Phi product family with minimum code changes.
For workloads that fit within 16 GB coprocessor memory, adding a coprocessor to a host server allows customers to avoid costly networking. For workloads that have a significant portion of highly parallel phases, offload can offer significant performance with minimal code optimization investment.
Additional Documentation
- Intel® 64 and IA-32 Architectures Software Developer’s Manual: Documents the model-specific registers (MSRs) for the Intel Xeon Phi coprocessor x200 product family.
- Intel Xeon Phi Processor Software Optimization Guide: Documents important features of the Intel Xeon Phi processor x200 product family and how to take advantage of them.
- Intel Xeon Phi Processor Performance Monitoring Reference Manual: Documents the performance monitoring registers and events for the Intel Xeon Phi processor x200 product family.
- Intel Xeon Phi processor x200 product family Linux OS support: Included with the Intel MPSS release.
- Intel® Parallel Studio XE
Basic System Architecture
The Intel Xeon Phi coprocessor x200 is based on a modern Intel® Atom™ microarchitecture with considerable high performance computing (HPC)-focused performance improvements. It has up to 72 cores with four threads per core, giving a total of 288 CPUs as viewed by the operating system, and has up to 16 GB of high-bandwidth on-package MCDRAM memory that provides over 500 GB/s effective bandwidth. The coprocessor has an x16 PCI Express Gen3 interface (8 GT/s) to connect to the host system.
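The headline numbers in this paragraph can be reproduced with a little arithmetic. The following is a minimal sketch, not part of Intel MPSS: the helper name pcie_gbs is hypothetical, and the 128b/130b encoding factor is an assumption taken from the PCI Express Gen3 specification rather than from this article. The result is a theoretical peak per direction; achievable throughput is lower.

```shell
#!/bin/sh
# Figures taken from the paragraph above.
cores=72
threads_per_core=4
echo "logical CPUs seen by the OS: $((cores * threads_per_core))"

# Theoretical PCIe bandwidth per direction in GB/s.
# Assumption (PCIe Gen3 spec, not this article): 128b/130b line encoding.
# $1 = transfer rate per lane in GT/s, $2 = number of lanes
pcie_gbs() {
    awk -v gt="$1" -v lanes="$2" \
        'BEGIN { printf "%.2f\n", gt * lanes * 128 / 130 / 8 }'
}

echo "PCIe Gen3 x16 peak per direction: $(pcie_gbs 8 16) GB/s"
```

Comparing this figure with the MCDRAM bandwidth quoted above suggests why keeping the working set in coprocessor memory matters: the link to the host is more than an order of magnitude slower than on-package memory.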
The cores are laid out in units called tiles. Each tile contains a pair of cores, a shared 1 MB L2 cache, and a hub connecting the tile to a mesh interface. Each core contains two 512-bit wide vector processing units. The coprocessor supports Intel® AVX-512F (foundation), Intel AVX-512CD (conflict detection), Intel AVX-512PF (prefetching), and Intel AVX-512ER (exponential reciprocal) ISA.
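On a Linux system, the four AVX-512 subsets named above normally appear as the flags avx512f, avx512cd, avx512pf, and avx512er in /proc/cpuinfo. The sketch below checks a flags string for them. The function name avx512_subsets and the sample string are hypothetical and only illustrate the idea.

```shell
#!/bin/sh
# Report which of the four AVX-512 subsets appear in a space-separated
# CPU flags string (for example, the "flags" line of /proc/cpuinfo).
avx512_subsets() {
    flags=" $1 "    # pad with spaces so whole words can be matched
    for f in avx512f avx512cd avx512pf avx512er; do
        case "$flags" in
            *" $f "*) echo "$f: yes" ;;
            *)        echo "$f: no" ;;
        esac
    done
}

# Hypothetical sample input; a real flags line is much longer.
avx512_subsets "fpu sse2 avx2 avx512f avx512cd avx512pf avx512er"
```

To test a live system, the same function could be called as avx512_subsets "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)".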
Intel® Manycore Platform Software Stack
Intel® Manycore Platform Software Stack (Intel® MPSS) is the user and system software that allows programs to run on and communicate with the Intel Xeon Phi coprocessor. Intel MPSS version 4.x.x is used for the Intel Xeon Phi coprocessor x200 and can be downloaded from https://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-for-intel-xeon-phi-coprocessor-x200. (The older Intel MPSS version 3.x.x is used for the Intel Xeon Phi coprocessor x100.) With Intel MPSS 4.x.x, a standard Linux kernel runs on the coprocessor.
The following host operating systems are supported: Red Hat* Enterprise Linux Server, SUSE* Linux Enterprise Server, and Microsoft Windows*. For detailed information on requirements and installation, consult the README file for Intel MPSS. The figure below shows a high-level representation of Intel MPSS. The host software stack is on the left and the coprocessor software stack is on the right.
Install the Software Stack and Start the Coprocessor
Installation Guide for Linux* Host:
- From the “Intel Manycore Platform Software Stack for Intel Xeon Phi Coprocessor x200” page (https://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-for-intel-xeon-phi-coprocessor-x200), navigate to the latest version of the Intel MPSS release for Linux and download “Readme for Linux (English)” (README.txt). Also download the release notes (releasenotes-linux.txt) and the User’s Guide for Intel MPSS.
- Install one of the following supported operating systems in the host:
- Red Hat Enterprise Linux Server 7.2 64-bit kernel 3.10.0-327
- Red Hat Enterprise Linux Server 7.3 64-bit kernel 3.10.0-514
- SUSE Linux Enterprise Server SLES 12 kernel 3.12.28-4-default
- SUSE Linux Enterprise Server SLES 12 SP1 kernel 3.12.49-11-default
- SUSE Linux Enterprise Server SLES 12 SP2 kernel 4.4.21-69-default
Be sure to install ssh, which is used to log in to the card.
WARNING: When you install Red Hat Enterprise Linux, the installer may automatically update the system to a newer version of the Linux kernel. If this happens, you cannot use the prebuilt host driver and must rebuild it manually for the new kernel version. See Section 5 of readme.txt for instructions on building an Intel MPSS host driver for a specific Linux kernel.
- Log in as root.
- Download the release driver package appropriate for your operating system from the page in Step 1 (<mpss-version>-linux.tar), where <mpss-version> is mpss-4.3.3 at the time this document was written.
- Install the host driver RPMs as detailed in Section 6 of readme.txt. Don’t skip the creation of configuration files for your coprocessor.
- Update the flash on your coprocessor(s) as detailed in Section 8 of readme.txt.
- Reboot the system.
- Start the Intel Xeon Phi coprocessor (you can set up the card to start with the host system; it will not do so by default), and then run micinfo to verify that it is set up properly:
# systemctl start mpss
# micctrl -w
# /usr/bin/micinfo
micinfo Utility Log
Created On Mon Apr 10 12:14:08 2017

System Info:
    Host OS : Linux
    OS Version : 3.10.0-327.el7.x86_64
    MPSS Version : 4.3.2.5151
    Host Physical Memory : 128529 MB

Device No: 0, Device Name: mic0 [x200]

Version:
    SMC Firmware Version : 121.27.10198
    Coprocessor OS Version : 4.1.36-mpss_4.3.2.5151 GNU/Linux
    Device Serial Number : QSKL64000441
    BIOS Version : GVPRCRB8.86B.0012.R02.1701111545
    BIOS Build date : 01/11/2017
    ME Version : 3.2.2.4

Board:
    Vendor ID : 0x8086
    Device ID : 0x2260
    Subsystem ID : 0x7494
    Coprocessor Stepping ID : 0x01
    UUID : A03BAF9B-5690-E611-8D4F-001E67FC19A4
    PCIe Width : x16
    PCIe Speed : 8.00 GT/s
    PCIe Ext Tag Field : Disabled
    PCIe No Snoop : Enabled
    PCIe Relaxed Ordering : Enabled
    PCIe Max payload size : 256 bytes
    PCIe Max read request size : 128 bytes
    Coprocessor Model : 0x57
    Coprocessor Type : 0x00
    Coprocessor Family : 0x06
    Coprocessor Stepping : B0
    Board SKU : B0 SKU _NA_A
    ECC Mode : Enabled
    PCIe Bus Information : 0000:03:00.0
    Coprocessor SMBus Address : 0x00000030
    Coprocessor Brand : Intel(R) Corporation
    Coprocessor Board Type : 0x0a
    Coprocessor TDP : 300.00 W

Core:
    Total No. of Active Cores : 68
    Threads per Core : 4
    Voltage : 900.00 mV
    Frequency : 1.20 GHz

Thermal:
    Thermal Dissipation : Active
    Fan RPM : 6000
    Fan PWM : 100 %
    Die Temp : 38 C

Memory:
    Vendor : INTEL
    Size : 16384.00 MB
    Technology : MCDRAM
    Speed : 6.40 GT/s
    Frequency : 6.40 GHz
    Voltage : Not Available
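Two of the checks in the steps above lend themselves to a small script: confirming that the running kernel is one of those listed for the prebuilt host driver, and pulling individual values out of the micinfo report. The following is a minimal sketch under stated assumptions. The helper names kernel_supported and micinfo_field are hypothetical and not part of Intel MPSS, the kernel match is a simple prefix heuristic, and the parser assumes the "Name : value" layout seen in the sample output above.

```shell
#!/bin/sh
# Kernel versions listed in this guide for the prebuilt host driver.
SUPPORTED_KERNELS="3.10.0-327 3.10.0-514 3.12.28-4-default 3.12.49-11-default 4.4.21-69-default"

# Succeeds if the given `uname -r` string equals a listed kernel or extends
# it with a dot suffix (for example 3.10.0-327.el7.x86_64). Heuristic only:
# a patched kernel with the same prefix may still need a driver rebuild.
kernel_supported() {
    for k in $SUPPORTED_KERNELS; do
        case "$1" in
            "$k" | "$k".*) return 0 ;;
        esac
    done
    return 1
}

# Print the value of one "Name : value" field from micinfo-style text on stdin.
micinfo_field() {
    awk -F' : ' -v name="$1" '
        {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)    # trim surrounding whitespace
        }
        key == name {
            val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            print val
            exit
        }
    '
}

# Excerpt of the sample report shown above.
sample='Core:
    Total No. of Active Cores : 68
    Threads per Core : 4'

if kernel_supported "3.10.0-327.el7.x86_64"; then
    echo "kernel is on the supported list"
fi
cores=$(printf '%s\n' "$sample" | micinfo_field "Total No. of Active Cores")
threads=$(printf '%s\n' "$sample" | micinfo_field "Threads per Core")
echo "logical CPUs on mic0: $((cores * threads))"
```

On a real host, the same helpers could be used as kernel_supported "$(uname -r)" and /usr/bin/micinfo | micinfo_field "Die Temp", provided the installed micinfo prints fields in the layout assumed here.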
Installation Guide for Windows* Host:
- From the “Intel Manycore Platform Software Stack for Intel Xeon Phi Coprocessor x200” page (https://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-for-intel-xeon-phi-coprocessor-x200), navigate to the latest version of the Intel MPSS release for Microsoft Windows. Download “Readme file for Microsoft Windows” (readme-windows.pdf). Also download the “Release notes” (releaseNotes-windows.txt) and the “Intel MPSS User’s Guide” (MPSS_Users_Guide-windows.pdf).
- Install one of the following supported operating systems in the host:
- Microsoft Windows 8.1 (64-bit)
- Microsoft Windows® 10 (64-bit)
- Microsoft Windows Server 2012 R2 (64-bit)
- Microsoft Windows Server 2016 (64-bit)
- Log in as “administrator”.
- Install .NET Framework* 4.5 or higher on the system (http://www.microsoft.com/net/download), Python* 2.7.5 x86-64 or higher (Python 3.x is not supported), Pywin32 build or higher (https://sourceforge.net/projects/pywin32).
- Be sure to install PuTTY* and PuTTYgen*, which are used to log in to the card’s OS.
- Follow the preliminary steps as instructed in Section 2.2.1 of the Readme file.
- Restart the system.
- Download the drivers package mpss-4.*-windows.zip for your Windows operating system from the page described in Step 1.
- Unzip the zip file to get the Windows exec files (“mpss-4.*.exe” and “mpss-essentials-4*.exe”).
- Install the Windows Installer file “mpss-4.*.exe” as detailed in Section 3.2 of the User’s Guide. Note that if a previous version of the Intel Xeon Phi coprocessor stack is already installed, use Windows Control Panel to uninstall it prior to installing the current version. By default, Intel MPSS is installed in “c:\Program Files\Intel\MPSS”. Also, install “mpss-essentials-4*.exe”, the native binary utilities for the Intel Xeon Phi coprocessor. These are required when using offload programming or cross compilers.
- Confirm that the new Intel MPSS stack is successfully installed by looking at Control Panel > Programs > Programs and Features: Intel Xeon Phi (see the following illustrations).
- Update the flash according to Section 2.2.3 of the readme-windows.pdf file.
- Reboot the system.
- Log in to the host and verify that the Intel Xeon Phi x200 coprocessors are detected by the Device Manager (Control Panel > Hardware > Device Manager, and click “System devices”):
- Start the Intel Xeon Phi coprocessor (you can set up the card to start with the host system; it will not do so by default). Launch a command-prompt window and start the Intel MPSS stack:
prompt> micctrl --start
- Run the command “micinfo” to verify that it is set up properly:
prompt> micinfo.exe
Intel® Parallel Studio XE
After starting the Intel MPSS stack, you can use Intel Parallel Studio XE to write applications that run on the coprocessor.
Intel Parallel Studio XE is a software development suite that helps boost application performance by taking advantage of the ever-increasing processor core count and vector register width available in Intel Xeon processors, Intel Xeon Phi processors and coprocessors, and other compatible processors. Starting with the Intel Parallel Studio 2018 beta, the following Intel® products support program development on the Intel Xeon Phi coprocessor x200:
- Intel® C Compiler/Intel® C++ Compiler/Intel® Fortran Compiler
- Intel® Math Kernel Library (Intel® MKL)
- Intel® Data Analytics Acceleration Library (Intel® DAAL)
- Intel® Integrated Performance Primitives (Intel® IPP)
- Intel® Cilk™ Plus
- Intel® Threading Building Blocks (Intel® TBB)
- Intel® VTune™ Amplifier XE
- Intel® Advisor XE
- Intel® Inspector XE
- Intel® MPI Library
- Intel® Trace Analyzer and Collector
- Intel® Cluster Ready
- Intel® Cluster Checker
To get started writing programs running on the coprocessor, you can get the code samples at https://software.intel.com/en-us/product-code-samples. The packages “Intel Parallel Studio XE for Linux - Sample Bundle”, and “Intel Parallel Studio XE for Windows - Sample Bundle” contain code samples for Linux and Windows, respectively.
Programming Models on Coprocessor
There are three programming models that can be used for the Intel Xeon Phi coprocessor x200: the offload programming model, the symmetric programming model, and the native programming model.
- Offload programming: The main application runs on the host and offloads selected, highly parallel portions of the program to the coprocessor(s) to take advantage of the manycore architecture. The serial portion of the program still runs on the host to take advantage of its big-core architecture.
- Symmetric programming: The coprocessors and the host are treated as separate nodes. This model is suitable for distributed computing.
- Native programming: The coprocessors are used as independent nodes, just like a host. Users compile the binary for the coprocessor on the host, transfer the binary to the coprocessor, and log in to the coprocessor to run it.
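The transfer-and-run part of the native model can be sketched as a dry run that only prints the commands it would issue. Everything specific here is an assumption for illustration: the helper name native_plan is hypothetical, the card is assumed to be reachable over ssh under the host name mic0, /tmp is assumed to be writable on the card, and ./hello stands for a binary already built for the coprocessor. The compile step is left out because this guide does not specify compiler options.

```shell
#!/bin/sh
# Print (do not execute) the commands for running a prebuilt native binary.
# $1 = path to the binary on the host
# $2 = ssh host name of the card (assumed default: mic0)
native_plan() {
    binary=$1
    card=${2:-mic0}
    remote="/tmp/$(basename "$binary")"    # assumed scratch location on the card
    echo "scp $binary $card:$remote"
    echo "ssh $card $remote"
}

native_plan ./hello mic0
```

Printing the plan first makes it easy to review the host name and paths before running the two commands by hand.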
The figure below summarizes different programming models used for the Intel Xeon Phi coprocessor: