Channel: Intel Developer Zone Articles

Recipe: Building NAMD on Intel® Xeon® and Intel® Xeon Phi™ Processors


Purpose

This recipe describes, step by step, how to get, build, and run NAMD (Scalable Molecular Dynamics) on the Intel® Xeon Phi™ processor and Intel® Xeon® E5 processors for better performance.

Introduction

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Based on Charm++ parallel objects, NAMD scales to hundreds of cores for typical simulations and beyond 500,000 cores for the largest simulations. NAMD uses the popular molecular graphics program VMD for simulation setup and trajectory analysis, but is also file-compatible with AMBER, CHARMM, and X-PLOR.

NAMD is distributed free of charge with source code. You can build NAMD yourself or download binaries for a wide variety of platforms. The steps below show how to build it for the Intel® Xeon Phi™ processor and Intel® Xeon® E5 processors. Learn more about NAMD at http://www.ks.uiuc.edu/Research/namd/

Building NAMD on Intel® Xeon® Processor E5-2697 v4 (BDW) and Intel® Xeon Phi™ Processor 7250 (KNL)

  1. Download the latest NAMD source code (nightly build) from this site: http://www.ks.uiuc.edu/Development/Download/download.cgi?PackageName=NAMD
  2. Download fftw3 from this site: http://www.fftw.org/download.html
    • Version 3.3.4 is recommended
  3. Build fftw3:
    1. cd <path>/fftw-3.3.4
    2. ./configure --prefix=$base/fftw3 --enable-single --disable-fortran CC=icc
    3. make CFLAGS="-O3 -xMIC-AVX512 -fp-model fast=2 -no-prec-div -qoverride-limits" clean install
                        In CFLAGS, use -xMIC-AVX512 for KNL or -xCORE-AVX2 for BDW
  4. Download charm++* version 6.7.1
  5. Build multicore version of charm++:
    1. cd <path>/charm-6.7.1
    2. ./build charm++ multicore-linux64 iccstatic --with-production "-O3 -ip"
  6. Build NAMD for BDW:
    1. Modify arch/Linux-x86_64-icc.arch to look like the following:
      NAMD_ARCH = Linux-x86_64
      CHARMARCH = multicore-linux64-iccstatic
      FLOATOPTS = -ip -xCORE-AVX2 -O3 -g -fp-model fast=2 -no-prec-div -qoverride-limits -DNAMD_DISABLE_SSE
      CXX = icpc -std=c++11 -DNAMD_KNL
      CXXOPTS = -static-intel -O2 $(FLOATOPTS)
      CXXNOALIASOPTS = -O3 -fno-alias $(FLOATOPTS) -qopt-report-phase=loop,vec -qopt-report=4
      CXXCOLVAROPTS = -O2 -ip
      CC = icc
      COPTS = -static-intel -O2 $(FLOATOPTS)
    2. ./config Linux-x86_64-icc --charm-base <charm_path> --charm-arch multicore-linux64-iccstatic --with-fftw3 --fftw-prefix <fftw_path> --without-tcl --charm-opts -verbose
    3. cd Linux-x86_64-icc && gmake -j
  7. Build NAMD for KNL:
    1. Modify arch/Linux-KNL-icc.arch to look like the following:
      NAMD_ARCH = Linux-KNL
      CHARMARCH = multicore-linux64-iccstatic
      FLOATOPTS = -ip -xMIC-AVX512 -O3 -g -fp-model fast=2 -no-prec-div -qoverride-limits -DNAMD_DISABLE_SSE
      CXX = icpc -std=c++11 -DNAMD_KNL
      CXXOPTS = -static-intel -O2 $(FLOATOPTS)
      CXXNOALIASOPTS = -O3 -fno-alias $(FLOATOPTS) -qopt-report-phase=loop,vec -qopt-report=4
      CXXCOLVAROPTS = -O2 -ip
      CC = icc
      COPTS = -static-intel -O2 $(FLOATOPTS)
    2. ./config Linux-KNL-icc --charm-base <charm_path> --charm-arch multicore-linux64-iccstatic --with-fftw3 --fftw-prefix <fftw_path> --without-tcl --charm-opts -verbose
    3. cd Linux-KNL-icc && gmake -j
  8. Add the following kernel boot parameters on KNL: "nmi_watchdog=0 rcu_nocbs=2-271 nohz_full=2-271"
  9. Download the apoa1 and stmv workloads from here: http://www.ks.uiuc.edu/Research/namd/utilities/
  10. Change the following lines in the *.namd file for both workloads:
        numsteps         1000
        outputtiming     20
        outputenergies   600
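
The edit in step 10 can also be scripted. The following is a minimal sketch, not part of the original recipe: the helper name set_namd_param and the workload paths apoa1/apoa1.namd and stmv/stmv.namd are assumptions for illustration. It replaces an existing "key value" line (matching the key case-insensitively) or appends one if the key is missing.

```shell
#!/bin/sh
# Hypothetical helper: set "key value" in a NAMD config file.
# Replaces any existing line whose first word matches the key
# (case-insensitive) or appends the setting if the key is absent.
set_namd_param() {
    file=$1; key=$2; value=$3
    tmp=$(mktemp) || return 1
    if awk -v k="$key" -v v="$value" '
        tolower($1) == tolower(k) { printf "%-16s %s\n", k, v; found = 1; next }
        { print }
        END { if (!found) printf "%-16s %s\n", k, v }
    ' "$file" > "$tmp"; then
        # Write back through the original file to keep its permissions.
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}

# Apply the three settings from step 10 to each workload file that exists.
# These paths are assumptions about where the workloads were unpacked.
for f in apoa1/apoa1.namd stmv/stmv.namd; do
    [ -f "$f" ] || continue
    set_namd_param "$f" numsteps 1000
    set_namd_param "$f" outputtiming 20
    set_namd_param "$f" outputenergies 600
done
```

Files that are not present are skipped, so the script can be run before or after unpacking either workload.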

Run NAMD workloads on Intel® Xeon® Processor E5-2697 v4 and Intel® Xeon Phi™ Processor 7250

Run BDW (ppn = 72):

           $BIN +p $ppn apoa1/apoa1.namd +pemap 0-$(($ppn-1))

Run KNL (ppn = 136, MCDRAM in flat mode; performance is similar in cache mode):

           numactl -m 1 $BIN +p $ppn apoa1/apoa1.namd +pemap 0-$(($ppn-1))
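
The two launch lines differ only in the thread count and the numactl prefix, so they can be generated from a single helper. This is a sketch under stated assumptions: the function names pemap_range and namd_cmd are hypothetical, and BIN defaults to ./namd2, which may not match where your build placed the binary. The helper prints the command rather than running it, so it can be checked before launch.

```shell
#!/bin/sh
# Assumed location of the NAMD binary; override by exporting BIN.
BIN=${BIN:-./namd2}

# Print the +pemap core range for a given thread count, e.g. 72 -> 0-71.
pemap_range() {
    echo "0-$(( $1 - 1 ))"
}

# Print (do not execute) the launch command for a target and workload file.
# ppn values are the ones used in this recipe: 72 for BDW, 136 for KNL.
namd_cmd() {
    target=$1; workload=$2
    case "$target" in
        BDW) ppn=72;  prefix="" ;;
        KNL) ppn=136; prefix="numactl -m 1 " ;;   # allocate from NUMA node 1
        *)   echo "unknown target: $target" >&2; return 1 ;;
    esac
    echo "${prefix}${BIN} +p ${ppn} ${workload} +pemap $(pemap_range "$ppn")"
}

namd_cmd BDW apoa1/apoa1.namd
namd_cmd KNL apoa1/apoa1.namd
```

To launch a run instead of printing it, the output can be passed to the shell, for example with eval "$(namd_cmd KNL stmv/stmv.namd)".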

Performance results reported in Intel® Salesforce repository

(ns/day; higher is better):

| Workload | Intel® Xeon® Processor E5-2697 v4 (ns/day) | Intel® Xeon Phi™ Processor 7250 (ns/day) | KNL vs. 2S BDW (speedup) |
|---|---|---|---|
| stmv | 0.45 | 0.55 | 1.22x |
| apoa1 | 5.5 | 6.18 | 1.12x |
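
The speedup column is the KNL result divided by the 2S BDW result. As a quick check of that arithmetic, here is a small sketch; the function name speedup is hypothetical and the inputs are the ns/day figures reported above.

```shell
#!/bin/sh
# Print the ratio of two ns/day figures (KNL first, BDW second) to two decimals.
speedup() {
    awk -v knl="$1" -v bdw="$2" 'BEGIN { printf "%.2fx\n", knl / bdw }'
}

speedup 0.55 0.45   # stmv
speedup 6.18 5.5    # apoa1
```

Both ratios come out to the reported values, 1.22x for stmv and 1.12x for apoa1.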

Systems configuration:

| Processor | Intel® Xeon® Processor E5-2697 v4 (BDW) | Intel® Xeon Phi™ Processor 7250 (KNL) |
|---|---|---|
| Stepping | 1 (B0) | 1 (B0) Bin1 |
| Sockets / TDP | 2S / 290W | 1S / 215W |
| Frequency / Cores / Threads | 2.3 GHz / 36 / 72 | 1.4 GHz / 68 / 272 |
| DDR4 | 8x16 GB 2400 MHz (128 GB) | 7210: 6x16 GB 2400 MHz |
| MCDRAM | N/A | 16 GB Flat |
| Cluster/Snoop Mode/Mem Mode | Home | Quadrant/flat |
| Turbo | On | On |
| BIOS | GRRFSDP1.86B0271.R00.1510301446 | GVPRCRB1.86B.0010.R02.1608040407 |
| Compiler | ICC-2017.0.098 | ICC-2017.0.098 |
| Operating System | Red Hat* Enterprise Linux* 7.2 (3.10.0-327.el7.x86_64) | Red Hat Enterprise Linux 7.2 (3.10.0-327.22.2.el7.xppsl_1.4.1.3272._86_64) |

  
