Principal Investigators:
Anouar Benali obtained a Ph.D. in Theoretical Physical Chemistry from the University of Toulouse (France) in 2010. He is an Assistant Computational Scientist at the Argonne Leadership Computing Facility and a fellow of the Computation Institute at the University of Chicago. His work focuses on implementing and accelerating QMC algorithms on high-performance computers.
Luke Shulenburger is a staff scientist at Sandia National Laboratories working on electronic structure calculations of materials, with a particular focus on extremes of temperature and pressure. He received his Ph.D. from the University of Illinois at Urbana-Champaign in 2008 and was a postdoctoral researcher at the Carnegie Institution of Washington until moving to Sandia in 2010.
Description:
Quantum Monte Carlo (QMC) has emerged as an important tool for extreme-scale calculations of complex material properties. QMCPACK is a code for calculating the electronic structure of materials with unprecedented accuracy. It works by stochastically solving the many-body Schrödinger equation. This method is uniquely suited for calculations of technologically important materials and has been shown to be predictive for a wide range of materials and molecules. Over the past decade, the sizes of the physical problems and of the computational facilities have remained firmly in a regime where the method scales nearly linearly with the number of computational elements available. The arrival of the exascale era allows consideration of larger problems, involving thousands of electrons, that will need to use millions of threads, further straining this near-linear scaling. Additionally, the constant memory necessary for evaluating single-particle wavefunctions will grow beyond the fast device memory expected in heterogeneous architectures. Through the Intel® Parallel Computing Center, we aim to improve the vectorization of the code, parallelize the work for each "walker" using nested threading to achieve good parallel efficiency, and finally develop a caching scheme that allows the use of slower main memory on heterogeneous platforms with only a minor performance penalty. This project will pilot extreme-scale threading and vectorization in a popular QMC code and will disseminate the experience gained to other QMC codes, allowing the study of larger and more realistic systems with predictive accuracy.