Channel: Intel Developer Zone Articles

Recipe: Building NAMD on Intel® Xeon® and Intel® Xeon Phi™ Processors on a Single Node

For cluster runs, refer to the recipe: Building NAMD on Intel® Xeon® and Intel® Xeon Phi™ Processors on cluster

Purpose

This recipe describes a step-by-step process for getting, building, and running NAMD (scalable molecular dynamics code) on the Intel® Xeon Phi™ processor and Intel® Xeon® processor E5 family to achieve better performance.

Introduction

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Based on Charm++ parallel objects, NAMD scales to hundreds of cores for typical simulations and beyond 500,000 cores for the largest simulations. NAMD uses the popular molecular graphics program VMD for simulation setup and trajectory analysis, but is also file-compatible with AMBER, CHARMM, and X-PLOR.

NAMD is distributed free of charge with source code. You can build NAMD yourself or download binaries for a wide variety of platforms. Below are the details for how to build NAMD on the Intel Xeon Phi processor and Intel Xeon processor E5 family. You can learn more about NAMD at http://www.ks.uiuc.edu/Research/namd/.

Building and Running NAMD on the Intel® Xeon® Processor E5-2697 v4 (formerly Broadwell (BDW)), Intel® Xeon Phi™ Processor 7250 (formerly Knights Landing (KNL)), and Intel® Xeon® Gold 6148 Processor (formerly Skylake (SKX))

Download the code

  1. Download the latest NAMD source code from this site: http://www.ks.uiuc.edu/Development/Download/download.cgi?PackageName=NAMD
  2. Download Charm++ version 6.7.1.

    a. You can get Charm++ from the NAMD source code of the Version Nightly Build.

    b. Or download it separately: http://charmplusplus.org/download/

  3. Download fftw3: http://www.fftw.org/download.html

    Version 3.3.4 is used in this run.

  4. Download the apoa1 and stmv workloads: http://www.ks.uiuc.edu/Research/namd/utilities/

Build the binaries

  1. Set environment for compilation:
    CC=icc; CXX=icpc; F90=ifort; F77=ifort
    export CC CXX F90 F77
    source /opt/intel/compiler/<version>/compilervars.sh intel64
  2. Build fftw3:

    a. Change to the fftw3 source directory:

    cd <fftw_root_path>

    b. Configure the build:

    ./configure --prefix=<fftw_install_path> --enable-single --disable-fortran CC=icc

    c. Build and install. In CFLAGS, use -xCORE-AVX512 for SKX, -xMIC-AVX512 for KNL, and -xCORE-AVX2 for BDW (the KNL flag is shown here):

    make CFLAGS="-O3 -xMIC-AVX512 -fp-model fast=2 -no-prec-div -qoverride-limits" clean install
  3. Build a multicore version of Charm++:

    a. Change to the Charm++ source directory:

    cd <charm_root_path>

    b. Run the build script:

    ./build charm++ multicore-linux64 iccstatic --with-production "-O3 -ip"
  4. Build NAMD:

    a. Modify the arch/Linux-x86_64-icc to look like the following (select one of the FLOATOPTS options depending on the CPU type):

    NAMD_ARCH = Linux-x86_64
    CHARMARCH = multicore-linux64-iccstatic
    
    # For KNL
    FLOATOPTS = -ip -xMIC-AVX512  -O3 -g -fp-model fast=2 -no-prec-div -qoverride-limits -DNAMD_DISABLE_SSE
    
    # For SKX
    FLOATOPTS = -ip -xCORE-AVX512  -O3 -g -fp-model fast=2 -no-prec-div -qoverride-limits -DNAMD_DISABLE_SSE
    
    # For BDW
    FLOATOPTS = -ip -xCORE-AVX2  -O3 -g -fp-model fast=2 -no-prec-div -qoverride-limits -DNAMD_DISABLE_SSE
    
    CXX = icpc -std=c++11 -DNAMD_KNL
    CXXOPTS = -static-intel -O2 $(FLOATOPTS)
    CXXNOALIASOPTS = -O3 -fno-alias $(FLOATOPTS) -qopt-report-phase=loop,vec -qopt-report=4
    CXXCOLVAROPTS = -O2 -ip
    CC = icc
    COPTS = -static-intel -O2 $(FLOATOPTS)
    

    b. Compile NAMD:

    i. Configure the build:

    ./config Linux-x86_64-icc --charm-base <charm_root_path> --charm-arch multicore-linux64-iccstatic --with-fftw3 --fftw-prefix <fftw_install_path> --without-tcl --charm-opts -verbose

    ii. Change to the build directory that config creates, then build:

    cd Linux-x86_64-icc
    gmake -j
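The per-CPU flag choice that appears in both the fftw3 and NAMD steps can be captured in a small shell helper. This is a minimal sketch: the function names `arch_flag` and `fftw_cflags` and the short labels `knl`, `skx`, and `bdw` are illustrative and not part of NAMD or the Intel compiler; only the `-x` flags and the CFLAGS string come from this recipe.

```shell
#!/bin/sh
# Hypothetical helper: map a CPU label to the Intel compiler -x flag from this recipe.
arch_flag() {
    case "$1" in
        knl) echo "-xMIC-AVX512" ;;
        skx) echo "-xCORE-AVX512" ;;
        bdw) echo "-xCORE-AVX2" ;;
        *)   echo "unknown CPU type: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: compose the CFLAGS string for the fftw3 "make" step.
fftw_cflags() {
    flag=$(arch_flag "$1") || return 1
    echo "-O3 $flag -fp-model fast=2 -no-prec-div -qoverride-limits"
}

# Example: print the flags for a KNL build (nothing is compiled here).
fftw_cflags knl
```

With these helpers sourced, the fftw3 build step could be written as `make CFLAGS="$(fftw_cflags skx)" clean install`, which keeps the CPU choice in one place.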

Other system setup

  1. Change the kernel settings for KNL to "nmi_watchdog=0 rcu_nocbs=2-271 nohz_full=2-271". Here is one way to change the settings (this could be different for every system):

    a. To be safe, first save your original grub.cfg:

    cp /boot/grub2/grub.cfg /boot/grub2/grub.cfg.ORIG

    b. In "/etc/default/grub", append the following to "GRUB_CMDLINE_LINUX":

    nmi_watchdog=0 rcu_nocbs=2-271 nohz_full=2-271

    c. Save your new configuration:

    grub2-mkconfig -o /boot/grub2/grub.cfg

    d. Reboot the system. After logging in, verify the settings with "cat /proc/cmdline".

  2. Change the following lines in the *.namd file for both workloads:

    numsteps 1000

    outputtiming 20

    outputenergies 600
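Both setup steps above can be checked or applied from a script. The following is a minimal sketch, not part of the original recipe: `check_knl_cmdline` and `set_namd_param` are hypothetical helper names, and the parameter replacement assumes that each setting sits on its own `key value` line and that NAMD treats keywords case-insensitively.

```shell
#!/bin/sh
# Sketch: check that a kernel command line contains the three KNL settings.
# Pass the command line as a string, e.g. "$(cat /proc/cmdline)".
check_knl_cmdline() {
    for opt in nmi_watchdog=0 rcu_nocbs=2-271 nohz_full=2-271; do
        case " $1 " in
            *" $opt "*) ;;                          # setting present
            *) echo "missing: $opt" >&2; return 1 ;;
        esac
    done
}

# Sketch: set "key value" in a NAMD config file. An existing line for the key
# (matched case-insensitively) is replaced; otherwise the line is appended.
set_namd_param() {
    file=$1 key=$2 value=$3
    awk -v k="$key" -v v="$value" '
        tolower($1) == tolower(k) { print k, v; found = 1; next }
        { print }
        END { if (!found) print k, v }
    ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Example usage (commented out so that sourcing this file changes nothing):
#   check_knl_cmdline "$(cat /proc/cmdline)" && echo "kernel settings OK"
#   set_namd_param apoa1/apoa1.namd numsteps 1000
#   set_namd_param apoa1/apoa1.namd outputtiming 20
#   set_namd_param apoa1/apoa1.namd outputenergies 600
```

Writing to a temporary file and then moving it into place avoids truncating the workload file if awk fails partway through.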

Run NAMD

  • On SKX/BDW (ppn = 40 / ppn = 72, respectively):
    ./namd2 +p $ppn apoa1/apoa1.namd +pemap 0-$((ppn-1))
  • On KNL (ppn = 136, 2 hyper-threads per core; MCDRAM in flat mode, with similar performance in cache mode):
    numactl -p 1 ./namd2 +p $ppn apoa1/apoa1.namd +pemap 0-$((ppn-1))

KNL example:

numactl -p 1 <namd_root_path>/Linux-KNL-icc/namd2 +p 136 apoa1/apoa1.namd +pemap 0-135
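The run commands differ only in the ppn value, the derived +pemap range, and the numactl prefix on KNL, so they can be generated from one helper. This is a sketch under stated assumptions: `pemap_range` and `namd_cmd` are hypothetical names, and the helper only prints the command line instead of running NAMD.

```shell
#!/bin/sh
# Sketch: build the +pemap range "0-(ppn-1)" used in the run commands.
pemap_range() {
    echo "0-$(( $1 - 1 ))"
}

# Print (rather than execute) the run command for a given ppn and workload.
# An optional third argument "knl" adds the numactl prefix used with
# MCDRAM in flat mode.
namd_cmd() {
    ppn=$1 workload=$2 cpu=${3:-}
    prefix=""
    if [ "$cpu" = "knl" ]; then
        prefix="numactl -p 1 "
    fi
    echo "${prefix}./namd2 +p $ppn $workload +pemap $(pemap_range "$ppn")"
}

# Example: the KNL run with 136 threads.
namd_cmd 136 apoa1/apoa1.namd knl
```

Printing the command first makes it easy to confirm the core range before launching a long run; piping the output to `sh` would execute it.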

Performance results reported in the Intel Salesforce repository (ns/day; higher is better):

| Workload | 2S Intel® Xeon® Processor E5-2697 v4 18c 2.3 GHz (ns/day) | Intel® Xeon Phi™ Processor 7250 bin1 (ns/day) | Intel® Xeon Phi™ Processor 7250 versus 2S Intel® Xeon® Processor E5-2697 v4 (speedup) |
|---|---|---|---|
| stmv | 0.45 | 0.55 | 1.22x |
| apoa1 | 5.5 | 6.18 | 1.12x |
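The speedup values for the Intel Xeon Phi processor 7250 are the ratio of its ns/day figure to the 2S Intel Xeon processor E5-2697 v4 figure. The arithmetic can be reproduced with a short sketch; the function name `speedup` is illustrative only, and the inputs are the ns/day values reported above.

```shell
#!/bin/sh
# Sketch: speedup = candidate ns/day divided by baseline ns/day, two decimals.
speedup() {
    awk -v base="$1" -v cand="$2" 'BEGIN { printf "%.2fx\n", cand / base }'
}

speedup 0.45 0.55   # stmv:  KNL versus 2S BDW, prints 1.22x
speedup 5.5 6.18    # apoa1: KNL versus 2S BDW, prints 1.12x
```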

| Workload | 2S Intel® Xeon® Gold 6148 Processor 20c 2.4 GHz (ns/day) | Intel® Xeon Phi™ Processor 7250 versus 2S Intel® Xeon® Processor E5-2697 v4 (speedup) |
|---|---|---|
| stmv | 0.73 | 1.44x |
| apoa1 original | 7.68 | 1.43x |
| apoa1 | 8.70 | 1.44x |

Systems configuration

| Processor | Intel® Xeon® Processor E5-2697 v4 | Intel® Xeon® Gold 6148 Processor | Intel® Xeon Phi™ Processor 7250 |
|---|---|---|---|
| Stepping | 1 (B0) | 1 (B0) | 1 (B0) Bin1 |
| Sockets / TDP | 2S / 290W | 2S / 300W | 1S / 215W |
| Frequency / Cores / Threads | 2.3 GHz / 36 / 72 | 2.4 GHz / 40 / 80 | 1.4 GHz / 68 / 272 |
| DDR4 | 8x16 GB 2400 MHz (128 GB) | 12x16 GB 2666 MHz (192 GB) | 6x16 GB 2400 MHz |
| MCDRAM | N/A | N/A | 16 GB Flat |
| Cluster/Snoop Mode/Mem Mode | Home | Home | Quadrant/flat |
| Turbo | On | On | On |
| BIOS | GRRFSDP1.86B0271.R00.1510301446 | | GVPRCRB1.86B.0010.R02.1608040407 |
| Compiler | ICC-2017.0.098 | ICC-2016.4.298 | ICC-2017.0.098 |
| Operating System | Red Hat Enterprise Linux* 7.2 (3.10.0-327.el7.x86_64) | Red Hat Enterprise Linux 7.3 (3.10.0-514.6.2.0.1.el7.x86_64.knl1) | Red Hat Enterprise Linux 7.2 (3.10.0-327.22.2.el7.xppsl_1.4.1.3272._86_64) |

