Purpose
This recipe describes how to get, build, and run the GROMACS* code on Intel® Xeon® and Intel® Xeon Phi™ processors for better performance on a single node.
Introduction
GROMACS is a versatile package for performing molecular dynamics, using Newtonian equations of motion, for systems with hundreds to millions of particles. GROMACS is primarily designed for biochemical molecules like proteins, lipids, and nucleic acids that have a multitude of complicated bonded interactions. However, because GROMACS is extremely fast at calculating the non-bonded interactions that typically dominate simulations, many researchers also use it for non-biological systems, such as polymers.
GROMACS supports all the usual algorithms expected from a modern molecular dynamics implementation.
The GROMACS code is maintained by developers around the world. The code is available under the GNU General Public License from www.gromacs.org.
Code Access
Download GROMACS:
- Get the GROMACS-2016.1 release. This code version includes optimization for better performance on the Intel® Xeon Phi™ processor: http://manual.gromacs.org/documentation/2016/download.html
Workloads Access
Download the workloads:
- water1.5M_pme and water1.5M_rf: ftp://ftp.gromacs.org/pub/benchmarks/water_GMX50_bare.tar.gz
- lignocellulose3M_rf: http://www.prace-i.eu/UEABS/GROMACS/1.2/GROMACS_TestCaseB.tar.gz
Generate Water Workloads Input Files:
To generate the .tpr input files (this requires the gmx_mpi binary produced in Build Directions):
- tar xf water_GMX50_bare.tar.gz
- cd water-cut1.0_GMX50_bare/1536
- gmx_mpi grompp -f pme.mdp -c conf.gro -p topol.top -o topol_pme.tpr
- gmx_mpi grompp -f rf.mdp -c conf.gro -p topol.top -o topol_rf.tpr
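The two grompp calls above differ only in the .mdp name, so they can be wrapped in a small loop. The following is a minimal sketch, not part of the original recipe: `make_tpr` and `DRY_RUN` are hypothetical names, and the sketch defaults to printing each command instead of running it so that it is safe to try before gmx_mpi is built.

```shell
#!/bin/sh
# Sketch: generate one .tpr file per water workload.
# make_tpr and DRY_RUN are hypothetical helpers, not part of GROMACS.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command; 0 = run it

make_tpr() {
    name="$1"   # "pme" or "rf", matching pme.mdp / rf.mdp in the tarball
    set -- gmx_mpi grompp -f "${name}.mdp" -c conf.gro -p topol.top -o "topol_${name}.tpr"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run from inside water-cut1.0_GMX50_bare/1536
for kind in pme rf; do
    make_tpr "$kind"
done
```

Setting `DRY_RUN=0` in the environment makes the same loop execute the commands.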
Build Directions
Build the GROMACS binary with CMake, using Intel® Compiler 2017.1.132, Intel® MKL, and Intel® MPI 2017.1.132.
Set the Intel Xeon Phi BIOS options to be:
- Quadrant Cluster mode
- MCDRAM Flat mode
- Turbo Enabled
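In MCDRAM Flat mode the on-package memory is normally exposed as a separate NUMA node that has memory but no CPUs, which is why the run commands in this recipe bind memory with `numactl -m 1`. One way to confirm which node that is on a given system is to look for CPU-less nodes in the `numactl -H` output. The sketch below assumes that output format; `cpuless_nodes` is a hypothetical helper, and the sample input is illustrative rather than captured from the test system.

```shell
#!/bin/sh
# Sketch: list NUMA nodes that have no CPUs, reading `numactl -H`
# output on stdin. cpuless_nodes is a hypothetical helper.
cpuless_nodes() {
    # A line such as "node 1 cpus:" with nothing after the colon has 3 fields.
    awk '$1 == "node" && $3 == "cpus:" && NF == 3 { print $2 }'
}

# Usage on a real system (requires numactl to be installed):
#   numactl -H | cpuless_nodes
# Example with abbreviated, illustrative input:
printf 'node 0 cpus: 0 1 2 3\nnode 0 size: 98200 MB\nnode 1 cpus:\nnode 1 size: 16384 MB\n' | cpuless_nodes
```

If the command reports a node other than 1, the `-m` argument in the run commands would need to match it.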
For Intel Xeon Phi, build the code as:
- BuildDir="${GromacsPath}/build" # Create the build directory
- installDir="${GromacsPath}/install"
- mkdir $BuildDir
- source /opt/intel/<version>/bin/compilervars.sh intel64 # Source the Intel compiler, MKL and IMPI
- source /opt/intel/impi/<version>/mpivars.sh
- source /opt/intel/mkl/<version>/mklvars.sh intel64
- cd $BuildDir
- FLAGS="-xMIC-AVX512 -g -static-intel"; CFLAGS=$FLAGS CXXFLAGS=$FLAGS CC=mpiicc CXX=mpiicpc cmake .. -DBUILD_SHARED_LIBS=OFF -DGMX_FFT_LIBRARY=mkl -DCMAKE_INSTALL_PREFIX=$installDir -DGMX_MPI=ON -DGMX_OPENMP=ON -DGMX_CYCLE_SUBCOUNTERS=ON -DGMX_GPU=OFF -DGMX_BUILD_HELP=OFF -DGMX_HWLOC=OFF -DGMX_SIMD=AVX_512_KNL -DGMX_OPENMP_MAX_THREADS=256 # Set the build environment for Intel Xeon Phi
For Intel Xeon, set the build environment and build the code as above, with these changes:
- FLAGS="-xCORE-AVX2 -g -static-intel"
- -DGMX_SIMD=AVX2_256
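Because the two builds differ only in the compiler flags and the GMX_SIMD value, a build script could select both from a single target name. This is a minimal sketch under that assumption; `select_target` and the target names `phi` and `xeon` are hypothetical, while the flag values are the ones listed above.

```shell
#!/bin/sh
# Sketch: choose compiler flags and the GMX_SIMD value per target.
# select_target and the names "phi"/"xeon" are hypothetical.
select_target() {
    case "$1" in
        phi)  FLAGS="-xMIC-AVX512 -g -static-intel"; GMX_SIMD="AVX_512_KNL" ;;
        xeon) FLAGS="-xCORE-AVX2 -g -static-intel";  GMX_SIMD="AVX2_256" ;;
        *)    echo "unknown target: $1" >&2; return 1 ;;
    esac
}

select_target xeon
echo "FLAGS=$FLAGS"
echo "-DGMX_SIMD=$GMX_SIMD"
```

The resulting `$FLAGS` and `-DGMX_SIMD=$GMX_SIMD` would then be substituted into the cmake command line in place of the hard-coded values.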
Build GROMACS:
- make -j 4
- sleep 5
- make check
- make install # Install into $installDir, where the run commands expect gmx_mpi
Run Directions
Run workloads on Intel Xeon Phi with the following environment settings and command lines (nodes.txt contains the single line localhost:272):
- export I_MPI_DEBUG=5
- export I_MPI_FABRICS=shm
- export I_MPI_PIN_MODE=lib
- export KMP_AFFINITY=verbose,compact,1
- gmxBin="${installDir}/bin/gmx_mpi"
- mpiexec.hydra -genvall -machinefile ./nodes.txt -np 66 numactl -m 1 $gmxBin mdrun -npme 0 -notunepme -ntomp 4 -dlb yes -v -nsteps 4000 -resethway -noconfout -pin on -s ${WorkloadPath}water-cut1.0_GMX50_bare/1536/topol_pme.tpr
- export KMP_BLOCKTIME=0
- mpiexec.hydra -genvall -machinefile ./nodes.txt -np 66 numactl -m 1 $gmxBin mdrun -ntomp 4 -dlb yes -v -nsteps 1000 -resethway -noconfout -pin on -s ${WorkloadPath}lignocellulose-rf.BGQ.tpr
- mpiexec.hydra -genvall -machinefile ./nodes.txt -np 64 numactl -m 1 $gmxBin mdrun -ntomp 4 -dlb yes -v -nsteps 5000 -resethway -noconfout -pin on -s ${WorkloadPath}water-cut1.0_GMX50_bare/1536/topol_rf.tpr
Run workloads on Intel Xeon with the following environment settings and command lines:
- export I_MPI_DEBUG=5
- export I_MPI_FABRICS=shm
- export I_MPI_PIN_MODE=lib
- export KMP_AFFINITY=verbose,compact,1
- gmxBin="${installDir}/bin/gmx_mpi"
- mpiexec.hydra -genvall -machinefile ./nodes.txt -np 72 $gmxBin mdrun -notunepme -ntomp 1 -dlb yes -v -nsteps 4000 -resethway -noconfout -s ${WorkloadPath}water-cut1.0_GMX50_bare/1536_bdw/topol_pme.tpr
- export KMP_BLOCKTIME=0
- mpiexec.hydra -genvall -machinefile ./nodes.txt -np 72 $gmxBin mdrun -ntomp 1 -dlb yes -v -nsteps 1000 -resethway -noconfout -s ${WorkloadPath}lignocellulose-rf.BGQ.tpr
- mpiexec.hydra -genvall -machinefile ./nodes.txt -np 72 $gmxBin mdrun -ntomp 1 -dlb yes -v -nsteps 5000 -resethway -noconfout -s ${WorkloadPath}water-cut1.0_GMX50_bare/1536_bdw/topol_rf.tpr
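Each run command combines a number of MPI ranks (`-np`) with a number of OpenMP threads per rank (`-ntomp`), and their product should not exceed the hardware threads on the node: 272 on the Intel Xeon Phi processor 7250 and 72 on the two-socket Intel Xeon system used here. The sketch below only illustrates that arithmetic; `fits_node` is a hypothetical helper, not a GROMACS or Intel MPI tool.

```shell
#!/bin/sh
# Sketch: check that MPI ranks x OpenMP threads fits in the hardware threads.
# fits_node is a hypothetical helper.
fits_node() {
    ranks="$1"; omp="$2"; hw="$3"
    used=$((ranks * omp))
    echo "$used of $hw hardware threads"
    [ "$used" -le "$hw" ]   # exit status 0 only if the layout fits
}

fits_node 66 4 272   # Xeon Phi: water PME and lignocellulose runs
fits_node 64 4 272   # Xeon Phi: water reaction-field run
fits_node 72 1 72    # Xeon: all three runs
```

The Xeon Phi layouts use 264 and 256 of the 272 hardware threads, so a few threads are left idle in each case.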
Performance Testing
Performance tests for GROMACS compare an Intel Xeon processor with an Intel Xeon Phi processor on three standard workloads: water1536k_pme, water1536k_rf, and lignocellulose3M_rf. In all cases, turbo mode is turned on.
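To compare runs, the throughput reported by mdrun can be read from its log file. The sketch below assumes that the log (md.log by default) ends with a `Performance:` line whose first number is the ns/day figure; `ns_per_day` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: pull the ns/day figure from an mdrun log.
# ns_per_day is a hypothetical helper; it assumes the log contains a
# "Performance:" line whose first number is ns/day.
ns_per_day() {
    # Keep the last matching line in case the log was appended to.
    awk '$1 == "Performance:" { value = $2 } END { if (value != "") print value }' "$1"
}

# Usage (md.log is mdrun's default log file name):
#   ns_per_day md.log
```

Because the run commands use `-resethway`, the reported figure covers only the second half of each run.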
Testing Platform Configurations
The following hardware was used for the above recipe and performance testing.
Processor | Intel® Xeon® Processor E5-2697 v4 | Intel® Xeon Phi™ Processor 7250 |
---|---|---|
Stepping | 1 (B0) | 1 (B0) Bin1 |
Sockets / TDP | 2S / 290W | 1S / 215W |
Frequency / Cores / Threads | 2.3 GHz / 36 / 72 | 1.4 GHz / 68 / 272 |
DDR4 | 8x16 GB 2400 MHz (128 GB) | 6x16 GB 2400 MHz |
MCDRAM | N/A | 16 GB Flat |
Cluster/Snoop Mode/Mem Mode | Home | Quadrant/flat |
Turbo | On | On |
BIOS | GRRFSDP1.86B.0271.R00.1510301446 | GVPRCRB1.86B.0011.R04.1610130403 |
Compiler | ICC-2017.1.132 | ICC-2017.1.132 |
Operating System | Red Hat Enterprise Linux* 7.2 | Red Hat Enterprise Linux 7.2 |
Kernel | 3.10.0-327.el7.x86_64 | 3.10.0-327.13.1.el7.xppsl_1.3.3.151.x86_64 |
GROMACS Build Configurations
The following configurations were used for the above recipe and performance testing.
- GROMACS Version: GROMACS-2016.1
- Intel® Compiler Version: 2017.1.132
- Intel® MPI Library Version: 2017.1.132
- Workloads used: water1536k_pme, water1536k_rf, and lignocellulose3M_rf