Intel® Parallel Computing Center at Princeton University's Institute

Principal Investigators:

Prof. William Tang, the PI of this project, began the Princeton Gyrokinetic Toroidal Code (GTC-P) project in 2008 with the goal of producing a modern HPC application code capable of delivering discovery science at increasing problem sizes through effective use of the most advanced supercomputing platforms. He was also the U.S. PI for the National Science Foundation-supported G8 Exascale Computing for Global Scale Projects Program in Fusion Energy, which successfully ported GTC-P to leading HPC systems in Europe and Japan as well as in the U.S. This activity has since been extended to top supercomputing systems worldwide to carry out comparative performance studies with “time to solution” and “energy to solution” as the relevant metrics.

Dr. Bei Wang is the current lead developer of the GTC-P code and has extensive experience in porting and optimizing the code on a variety of multi-core and many-core systems worldwide. Most recently, she has ported the code to the Intel® Xeon Phi™ coprocessor system on Stampede at NSF’s Texas Advanced Computing Center (TACC) and to the world-leading Tianhe-2 system in China. Significant results have been obtained in symmetric mode, and a more efficient offload mode implementation is under active development. She has also collaborated on GTC-P performance studies with the Intel® PCC at ETH Zurich to advance progress in this key area.

Dr. Khaled Ibrahim, a computer science expert in performance modeling and simulation acceleration in the Computer Science Division of the University of California Lawrence Berkeley National Lab (LBNL), has been the lead member of the CS team there collaborating with Princeton on modernizing the GTC-P code. In particular, he has led the R&D efforts that have enabled GTC-P to optimize its “scatter” and “gather” operations on modern multi-core and many-core systems. He will also explore how best to use the cache and memory hierarchy in the Xeon Phi architectures.

Dr. Carlos Rosales is Co-Director of the Advanced Computing Evaluation Laboratory at TACC, where his main responsibility is evaluating new computer architectures relevant to high-performance computing. His areas of expertise are benchmarking, code optimization, and computational fluid dynamics. Dr. Rosales has worked on code optimization for the Intel® Xeon Phi™ coprocessor since its pre-production days and works closely with Intel engineers in several areas related to the performance and stability of codes deployed on Intel® Architectures.

Description:

The Intel® PCC at Princeton University’s Institute for Computational Science & Engineering, in partnership with TACC and LBNL, will conduct a systematic collaborative case study on the Intel® Xeon Phi™ coprocessor using a discovery-science-capable particle-in-cell (PIC) production code, the Gyrokinetic Toroidal Code - Princeton (GTC-P). This work will involve exploiting vectorization and determining the best strategy for dealing with the last level of the cache used in Intel® Xeon Phi™ coprocessors. In particular, the associated R&D will explore the best ways to use the memory hierarchy in the Knights Landing (KNL) architecture. Improving the efficiency of the offload programming model on the Knights Corner (KNC) architecture will also be addressed. Overall, the aim is to produce a successful case study that demonstrates the performance of advanced PIC algorithms on Intel® Architectures.

To use the full power of Intel® Xeon Phi™ coprocessors, applications must use all cores and vector units effectively. This work will therefore investigate data-parallel optimization opportunities in two key GTC-P kernels that feature algorithmic-level “scatter” and “gather” operations. Specifically, the optimizations will include careful examination of data layouts (Array of Structures and Structure of Arrays), data alignment, data prefetching, intrinsics, and auto-vectorization. In addition, the R&D will explore the best strategy for dealing with the last level of the cache hierarchy used in the Intel® Xeon Phi™ coprocessor series. Since the KNL architecture, soon to be accessible on “Cori” at NERSC/LBNL, on “Theta” at ALCF/ANL, and on “Stampede II” at TACC, will feature a hierarchy of dynamic memory capabilities, this Intel® PCC has a special interest in analyzing the access patterns of different data structures to guide their allocation to the various dynamic memories. For the current-generation KNC architecture featured on “Stampede,” we plan to add an “offload pragma” to improve offloading of the loops in these key kernels while keeping nearly the same performance as the native version. Deploying an efficient offload programming model is necessary for properly performing application production runs on leadership-class computing facilities (such as Stampede and TH-2), where supporting direct MPI communication involving Intel® Xeon Phi™ coprocessors is quite challenging.
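To make the data-layout and scatter/gather terms above concrete, here is a minimal sketch in C. It is not taken from GTC-P: the 1D periodic grid, the linear weighting, and every name (`ParticleAoS`, `ParticlesSoA`, `aos_to_soa`, `scatter_charge`, `gather_field`) are assumptions made for illustration only. It contrasts the two layouts and shows one scatter and one gather loop over the Structure of Arrays form.

```c
#include <assert.h>
#include <stddef.h>

/* Array of Structures (AoS): all fields of one particle sit together. */
typedef struct {
    double x;  /* position in grid units, assumed 0 <= x < ng */
    double w;  /* particle weight (charge) */
} ParticleAoS;

/* Structure of Arrays (SoA): each field is its own contiguous array,
 * so a loop over particles reads x[] and w[] with unit stride. */
typedef struct {
    double *x;
    double *w;
    size_t  n;
} ParticlesSoA;

/* Copy AoS particles into caller-allocated SoA arrays of length n. */
void aos_to_soa(const ParticleAoS *in, size_t n, ParticlesSoA *out)
{
    for (size_t i = 0; i < n; ++i) {
        out->x[i] = in[i].x;
        out->w[i] = in[i].w;
    }
    out->n = n;
}

/* Scatter: deposit each particle's weight onto its two neighbouring
 * grid points with linear weighting. Different particles may update
 * the same rho[] entry, which is what makes this loop hard to
 * vectorize or thread without extra care. */
void scatter_charge(const ParticlesSoA *p, double *rho, size_t ng)
{
    assert(ng >= 2);
    for (size_t i = 0; i < p->n; ++i) {
        size_t j  = (size_t)p->x[i];        /* left grid point */
        double f  = p->x[i] - (double)j;    /* fractional offset */
        size_t jr = (j + 1) % ng;           /* right point, periodic wrap */
        rho[j]  += p->w[i] * (1.0 - f);
        rho[jr] += p->w[i] * f;
    }
}

/* Gather: interpolate a grid field to each particle position.
 * Reads are indirect, but each particle writes only its own output. */
void gather_field(const ParticlesSoA *p, const double *field,
                  size_t ng, double *out)
{
    assert(ng >= 2);
    for (size_t i = 0; i < p->n; ++i) {
        size_t j  = (size_t)p->x[i];
        double f  = p->x[i] - (double)j;
        size_t jr = (j + 1) % ng;
        out[i] = field[j] * (1.0 - f) + field[jr] * f;
    }
}
```

In this sketch the gather loop has independent iterations, while the scatter loop has potential write conflicts on `rho`. That difference is one reason the two kernels are usually optimized differently.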

Related websites:

http://extremescaleglobalpic.princeton.edu
