Intel® Parallel Computing Center at The University of Texas at Austin

University of Texas at Austin

Principal Investigators:

Robert van de Geijn is a professor of computer science and a core member of the Institute for Computational Engineering and Sciences, where he heads the Science of High-Performance Computing Group, which pursues foundational research in the field of linear algebra. The group’s focus on formal derivation of algorithms has led to the development of projects such as the libflame library, a modern, high-performance dense linear algebra library that targets both sequential and parallel architectures, and the BLIS framework, which enables the rapid creation of high-performance matrix operations on a variety of architectures. Prof. van de Geijn has published several books and more than 100 refereed publications.

Devin Matthews is the Arnold O. Beckman Postdoctoral Fellow in the Institute for Computational Engineering and Sciences at the University of Texas at Austin. His interests include high-accuracy quantum chemistry and tensor algorithms. He received his Ph.D. in Chemistry at UT Austin as a DOE Computational Science Graduate Fellow and received the Howes Scholar award for his work on massively parallel quantum chemistry algorithms in the newly developed Aquarius program.

Description:

The Science of High-Performance Computing Group at UT Austin works on fundamental aspects of computer science, software, and algorithm development, as applied to high-performance computing in general and computational quantum chemistry in particular. Our group involves faculty members from computer science, chemistry, engineering, and statistics as well as researchers at the Texas Advanced Computing Center (TACC).

The modeling of chemical systems using quantum mechanics is essential to understanding the behavior and properties of these systems, such as reactivity, structure, catalytic and enzymatic activity, spectroscopic signatures, and bulk physical/mechanical properties. Detailed high-accuracy calculations of fundamental molecular systems (small molecules and clusters in the gas and solution phases) provide critical calibration for more approximate calculations as well as a quantitative and predictive tool for analyzing and explaining experimental data. These types of calculations often require techniques that go beyond standard methods such as the popular coupled cluster singles and doubles with perturbative triples model, CCSD(T), and for these more advanced techniques there are generally no high-performance parallel implementations available. Optimized, scalable implementations of these methods that can take advantage of advanced technologies such as many-core Intel® Xeon Phi™ processors would drastically increase the applicability of very high-accuracy calculations.

Our group has recently developed high-performance software related to this goal in several areas. First, we developed the NCC module, which plugs into the CFOUR quantum chemistry suite (www.cfour.de), to perform high-accuracy calculations using the CCSDTQ method and various related approximations such as CCSDT(Q) using a novel spin-adapted algorithm. CFOUR is an actively developed code with a broad user base that applies it to diverse problems in spectroscopy, kinetics, and thermodynamics. NCC achieves a significant performance improvement over existing implementations, but considerable room for improvement remains, both in sequential performance and especially in multi-threaded performance and scalability. The performance issues in NCC stem mostly from two major sources: (1) the lack of “native” tensor and extended matrix algorithms beyond the traditional BLAS interface, and (2) limited opportunities for parallelism due to small matrix/tensor sizes, relatively short loops for coarse-grained parallelism with poor load balancing, and a lack of hierarchical parallelism.

Second, in the domain of dense linear algebra (DLA), meaning matrices as opposed to tensors in this context, our group has developed the BLIS framework, which uses a structured methodology to implement high-performance matrix operations using only straightforward C99 code and a single assembly-coded “micro-kernel”. Using an appropriate micro-kernel and cache blocking parameters, this implementation achieves very high performance on multi-core CPU architectures and on the Xeon Phi many-core architecture.
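The idea of layering cache-blocked loops around a single small kernel can be illustrated with a minimal sketch. It is written in plain Python for readability rather than in C99 with an assembly-coded kernel, it omits the packing of operands into contiguous buffers that a production implementation would perform, and the function names (`micro_kernel`, `blocked_gemm`) and the block-size names and values (`MC`, `KC`, `NC`, `MR`, `NR`) are assumptions chosen for illustration, not the framework's actual interface.

```python
def micro_kernel(A, B, C, i0, j0, p0, mr, nr, kc):
    """Update the mr x nr tile of C starting at (i0, j0).

    Accumulates the product of an mr x kc sliver of A and a kc x nr
    sliver of B. This is the only routine that touches the data; in an
    optimized library it would be the hand-tuned part.
    """
    for p in range(p0, p0 + kc):
        for i in range(i0, i0 + mr):
            a_ip = A[i][p]
            for j in range(j0, j0 + nr):
                C[i][j] += a_ip * B[p][j]


def blocked_gemm(A, B, C, MC=4, KC=4, NC=4, MR=2, NR=2):
    """Compute C += A * B for list-of-lists matrices with blocked loops.

    The three outer loops partition the problem into blocks sized by the
    (hypothetical) cache blocking parameters; the two inner loops walk
    over small tiles that are handed to the micro-kernel.
    """
    m, k, n = len(A), len(B), len(B[0])
    for jc in range(0, n, NC):              # column panels of B and C
        nc = min(NC, n - jc)
        for pc in range(0, k, KC):          # slices of the inner dimension
            kc = min(KC, k - pc)
            for ic in range(0, m, MC):      # row blocks of A and C
                mc = min(MC, m - ic)
                for jr in range(jc, jc + nc, NR):
                    nr = min(NR, jc + nc - jr)
                    for ir in range(ic, ic + mc, MR):
                        mr = min(MR, ic + mc - ir)
                        micro_kernel(A, B, C, ir, jr, pc, mr, nr, kc)
    return C
```

In this arrangement only the block sizes and the body of the kernel would need to change from one architecture to another, while the surrounding loops stay the same.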

We have proposed to optimize the NCC module for highly parallel architectures such as Intel® Xeon™ and Intel® Xeon Phi™ processors. These optimizations fall into two broad categories that directly address the performance issues encountered in NCC: (1) novel tensor algorithms which leverage the BLIS framework to increase scalability, consolidate computation, increase data reuse, reduce latency and memory movement, and enable both large and small tensor operations to be computed efficiently, and (2) a runtime system for extracting parallel execution of tensor operations from sequential program semantics, which will allow for natural hierarchical parallelization and reduce synchronization.
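The proposal does not spell out the tensor algorithms themselves, so the following is only a generic sketch of why matrix-multiplication machinery is relevant to tensor contractions. It assumes a small contraction C[a][i] = sum over e, f of A[a][e][f] * B[e][f][i] and treats the contracted index pair (e, f) as one fused matrix dimension, reading elements through index arithmetic instead of copying the tensors into matrix layout. The function name `contract_as_matmul` and the nested-list storage are hypothetical choices for illustration.

```python
def contract_as_matmul(A, B):
    """Compute C[a][i] = sum over e, f of A[a][e][f] * B[e][f][i].

    A is indexed as A[a][e][f] and B as B[e][f][i] (nested lists).
    The pair (e, f) is fused into a single index p = e * nf + f, so the
    contraction becomes an (na x k) by (k x ni) matrix product.
    """
    na, ne, nf = len(A), len(A[0]), len(A[0][0])
    ni = len(B[0][0])
    k = ne * nf  # size of the fused contracted dimension

    def a_mat(row, col):
        # Matrix view of A: row = a, col = fused (e, f); no data is copied.
        e, f = divmod(col, nf)
        return A[row][e][f]

    def b_mat(row, col):
        # Matrix view of B: row = fused (e, f), col = i; no data is copied.
        e, f = divmod(row, nf)
        return B[e][f][col]

    C = [[0] * ni for _ in range(na)]
    for r in range(na):
        for c in range(ni):
            C[r][c] = sum(a_mat(r, p) * b_mat(p, c) for p in range(k))
    return C
```

Because the tensors are only read through the two view functions, the same index mapping could in principle be applied wherever a matrix library gathers its operands, which is one way to avoid a separate transposition pass over memory.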

Related Websites:

http://shpc.ices.utexas.edu
