Principal Investigators:
Costin Iancu is a scientist at Lawrence Berkeley National Laboratory (LBNL), where he has spent 15 years performing research on programming models and code optimization for large-scale parallel systems.
Khaled Ibrahim is a computer scientist at Lawrence Berkeley National Laboratory. His research interests span runtime systems, virtualized computing, high performance computing, and computer architecture. He has numerous publications in these areas in top-tier journals and conferences, and he received an HPC Challenge award at Supercomputing 2014.
Description:
To extend their mission and to open new science frontiers, operators of large-scale supercomputers have a vested interest in deploying existing data analytics frameworks such as Spark or Hadoop. So far, this deployment has been hampered by differences in system architecture, which are reflected in the design and optimization approaches of the data analytics stacks. HPC systems and data centers have opposing design constraints and optimization goals. In a data center, the file system is optimized for latency (with local disks) and the network is optimized for bandwidth. In a supercomputer, the file system is optimized for bandwidth (with no local disk) while the network is optimized for latency. Tightly coupling the compute nodes makes it possible to leverage high-performance, centralized storage with a globally visible file directory structure.
We propose research and development efforts to ensure the successful adoption of data analytics frameworks in the HPC ecosystem, with an emphasis on Spark. Our initial evaluation on a Cray XC30 and on an InfiniBand cluster indicates that performance is hampered by the interaction with Lustre: we observe a 2-4X slowdown that can be directly attributed to it.
Scalability is the main target of our project. Currently, data analytics frameworks scale up to O(100) cores; we plan to demonstrate scalability up to O(10,000) cores. As a reference, we plan to use data center installations such as Amazon EC2. Initial R&D efforts will be performed on Cray XC30 and InfiniBand clusters with server-class processors. In the later stages of the project we plan to experiment with the Intel® Xeon Phi (Knights Landing) based Cori system at NERSC.
In the first stage of the project we plan to systematically re-design the Spark internals to accommodate the different performance characteristics of Lustre-based systems. Examples include hiding the latency of expensive metadata operations associated with I/O and exploiting the global name space. The goal at this stage is to at least match Amazon EC2 performance, which will provide a solid starting point from which to initiate collaborations with external application groups.
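As an illustration of the first example (hiding metadata latency), the sketch below shows one possible direction; it is a hypothetical fragment of our own rather than existing Spark code, and the MetadataPrefetch object and its prefetchStatuses helper are placeholders. The idea is simply to overlap Lustre metadata-server round trips by issuing Hadoop FileSystem stat calls concurrently instead of one at a time during job planning.

    import scala.concurrent.{Await, Future}
    import scala.concurrent.duration._
    import scala.concurrent.ExecutionContext.Implicits.global

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileStatus, FileSystem, Path}

    // Hypothetical helper: issue the stat requests for a set of input files
    // concurrently, so that Lustre metadata-server round trips overlap rather
    // than accumulate serially on the critical path of job setup.
    object MetadataPrefetch {
      def prefetchStatuses(paths: Seq[Path], conf: Configuration): Seq[FileStatus] = {
        val fs = FileSystem.get(conf)
        val statRequests = paths.map(p => Future(fs.getFileStatus(p)))
        Await.result(Future.sequence(statRequests), 10.minutes)
      }
    }

On a Lustre global namespace, such a listing could also be performed once by the driver and broadcast, rather than repeated independently on every node.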
In the second stage of the project, we plan to re-examine memory management in Spark to accommodate the complex vertical memory hierarchies present in HPC systems. Data center nodes provide two logical levels of storage (disk and memory), and data management in Spark reflects this hierarchy. In a sense, Lustre already provides a third level of storage in HPC systems, and newer systems, such as the expected NERSC Cori system, incorporate an additional level of NVRAM memory. We plan to explore Spark-specific modifications: 1) using compositions of existing layers (e.g. Tachyon) as a memory manager for vertical hierarchies; 2) exploring an HDFS emulation layer that targets NVRAM storage. We also plan to explore low-level memory management policies at the OS level. We see an exciting research opportunity to re-examine the data coherency models in both Lustre and Spark and to propose parallel file system coherency modes for data analytics.
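To make the first of these options concrete, the fragment below is a minimal, illustrative sketch under the assumption of a Spark 1.x deployment, where persisting an RDD with StorageLevel.OFF_HEAP delegates cached blocks to Tachyon; the application name, input path, and transformation are placeholders of our own.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    object TieredCacheSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("tiered-cache-sketch"))

        // Placeholder input on the globally visible Lustre namespace.
        val records = sc.textFile("/lustre/scratch/sample-dataset/*.txt")

        // In Spark 1.x, OFF_HEAP persistence hands cached blocks to Tachyon;
        // Tachyon's tiered storage could in turn be mapped onto NVRAM and
        // Lustre rather than local disk, adding levels below DRAM.
        val cached = records.map(_.toUpperCase).persist(StorageLevel.OFF_HEAP)
        println(s"lines cached off-heap: ${cached.count()}")

        sc.stop()
      }
    }

The second option, an HDFS emulation layer backed by NVRAM, would aim to leave application code like this unchanged and instead redirect the storage layer underneath it.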
Related websites:
http://crd.lbl.gov/departments/computer-science/CLaSS/staff/costin-iancu/
http://www.intel.com/content/www/us/en/software/intel-solutions-for-lustre-software.html