Background
The point of simulation is to model how a design, and potential changes to it, behave under various conditions and to determine whether the response is as expected. Simulation in software is far cheaper than building hardware, testing it physically, and modifying the hardware each time the design changes.
Dassault Systèmes [1], through its SIMULIA* brand, is creating a new paradigm to establish Finite Element Analysis and multiphysics simulation software as an integral business process in the engineering value chain. More information about SIMULIA can be found at [2].
The Abaqus* Unified Finite Elements Analysis product suite, from Dassault Systèmes* SIMULIA, offers powerful and complete solutions for both routine and sophisticated engineering problems covering a vast spectrum of industrial applications in Automotive, Aerospace, Consumer Packaged Goods, Energy, High Tech, Industrial Equipment and Life Sciences. As an example, automotive industry engineering work groups are able to consider full vehicle loads, dynamic vibration, multibody systems, impact/crash, nonlinear static, thermal coupling, and acoustic-structural coupling using a common model data structure and integrated solver technology.
What is Finite Element Analysis (FEA)?
FEA is a computerized method of simulating the behavior of engineering structures and components under a variety of conditions. It is the application of the finite element method (FEM) [3] [8]. It works by breaking an object down into a large number of finite elements, each of which is represented by an equation. By combining the equations of all the elements, the whole object can be modeled mathematically.
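As a minimal sketch of this idea (not Abaqus code), the following pure-Python example models a one-dimensional elastic bar that is fixed at one end and pulled at the other. The bar is split into equal two-node elements, each element contributes a 2x2 stiffness matrix, and these are assembled into a global system K u = f that is then solved. The function names and the material values are illustrative assumptions.

```python
def assemble_bar_stiffness(n_elements, length, youngs_modulus, area):
    """Assemble the global stiffness matrix of a 1D bar from 2-node elements."""
    n_nodes = n_elements + 1
    k = youngs_modulus * area / (length / n_elements)  # stiffness of one element
    K = [[0.0] * n_nodes for _ in range(n_nodes)]
    for e in range(n_elements):
        # Element e connects node e and node e+1; add k * [[1, -1], [-1, 1]].
        K[e][e] += k
        K[e][e + 1] -= k
        K[e + 1][e] -= k
        K[e + 1][e + 1] += k
    return K


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def bar_displacements(n_elements, length, youngs_modulus, area, tip_force):
    """Nodal displacements for a bar fixed at node 0 and loaded at the last node."""
    K = assemble_bar_stiffness(n_elements, length, youngs_modulus, area)
    # Apply the boundary condition u[0] = 0 by dropping the first row and column.
    K_reduced = [row[1:] for row in K[1:]]
    f = [0.0] * (n_elements - 1) + [tip_force]
    return [0.0] + solve_linear(K_reduced, f)


# Illustrative values: 1 m bar, E = 200 GPa, 1 cm^2 cross-section, 1 kN tip load.
u = bar_displacements(n_elements=4, length=1.0, youngs_modulus=200e9,
                      area=1e-4, tip_force=1000.0)
print(u[-1])  # tip displacement; the analytical value is F*L/(E*A)
```

Production FEA models differ mainly in scale and element types: the benchmarks discussed later have hundreds of thousands to millions of degrees of freedom rather than five nodes.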
How Abaqus/Standard takes advantage of Intel® AVX2
Abaqus/Standard is a general-purpose FEA product that includes many analysis capabilities. According to the Dassault Systèmes website, it “employs solution technology ideal for static and low-speed dynamic events where highly accurate stress solutions are critically important. Examples include sealing pressure in a gasket joint, steady-state rolling of a tire, or crack propagation in a composite airplane fuselage. Within a single simulation, it is possible to analyze a model both in the time and frequency domain. For example, one may start by performing a nonlinear engine cover mounting analysis including sophisticated gasket mechanics. Following the mounting analysis, the pre-stressed natural frequencies of the cover can be extracted, or the frequency domain mechanical and acoustic response of the pre-stressed cover to engine induced vibrations can be examined.” More information about Abaqus/Standard can be found at [9].
According to the Dassault Systèmes website, Abaqus/Standard uses Hilber-Hughes-Taylor [12] time integration by default. The time integration is implicit, meaning that the operator matrix must be inverted and a set of simultaneous nonlinear dynamic equilibrium equations must be solved at each time increment. This solution is done iteratively using Newton’s method [13]. The solution uses a function called DGEMM [5] (Double-precision GEneral Matrix Multiply) in the Intel® Math Kernel Library (Intel® MKL [4]) to handle matrix multiplication involving double-precision values.
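The Newton iteration mentioned above can be illustrated on a much smaller problem. The sketch below is neither the Hilber-Hughes-Taylor scheme nor Abaqus code; it only shows the evaluate-residual, linearize, update loop for a single degree of freedom. The hardening spring with internal force k*u + c*u^3, its constants, and the function name are hypothetical choices made for illustration.

```python
def newton_equilibrium(external_force, u_start=0.0, k=1000.0, c=5.0e5,
                       tol=1e-10, max_iter=50):
    """Find u such that k*u + c*u**3 == external_force using Newton's method.

    Returns the converged displacement and the number of Newton updates used.
    """
    u = u_start
    for updates in range(max_iter + 1):
        residual = k * u + c * u**3 - external_force  # out-of-balance force
        if abs(residual) < tol:
            return u, updates
        tangent = k + 3.0 * c * u**2                  # d(residual)/du
        u -= residual / tangent                       # Newton update
    raise RuntimeError("Newton iteration did not converge")


# Apply the load in increments, starting each solve from the previous solution.
u = 0.0
for load in (50.0, 100.0, 150.0):
    u, updates = newton_equilibrium(load, u_start=u)
    print(f"load={load:6.1f}  u={u:.6f}  newton_updates={updates}")
```

In an implicit solver with many degrees of freedom, the scalar division by the tangent stiffness becomes the solution of a large linear system at every iteration, which is where dense matrix kernels do much of the work.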
Analysis of Abaqus workloads using performance monitoring tools, such as Intel® VTune™, showed that a significant number of them spend 40% to 50% of their runtime in DGEMM. Further analysis of the DGEMM function showed that it makes extensive use of the multiply-add operation, since DGEMM is, in essence, matrix multiplication.
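To make the role of multiply-add concrete, here is a plain-Python reference of what DGEMM computes, C = alpha*A*B + beta*C, for matrices stored as lists of rows. It is a readability sketch only: the name `dgemm_reference` is hypothetical, the transposition options of the real routine are omitted, and an optimized library implementation is blocked, vectorized, and threaded rather than written as three simple loops.

```python
def dgemm_reference(alpha, A, B, beta, C):
    """Return alpha * (A @ B) + beta * C for dense matrices given as lists of rows."""
    m, k, n = len(A), len(B), len(B[0])
    if (any(len(row) != k for row in A) or len(C) != m
            or any(len(row) != n for row in B)
            or any(len(row) != n for row in C)):
        raise ValueError("incompatible matrix shapes")
    result = [[beta * C[i][j] for j in range(n)] for i in range(m)]
    for i in range(m):
        for p in range(k):
            a_ip = alpha * A[i][p]
            for j in range(n):
                # One multiply-add per inner step: m * n * k of them in total.
                result[i][j] += a_ip * B[p][j]
    return result


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[1.0, 1.0], [1.0, 1.0]]
print(dgemm_reference(2.0, A, B, 0.5, C))
```

The innermost statement runs m*n*k times, so for large matrices nearly all of the arithmetic is multiply-add.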
One of the new features of the Intel® Xeon® E5-2600 v3 Product Family is support for a new instruction set extension called Intel AVX2 [7]. One of the new instructions in Intel AVX2 is the three-operand fused multiply-add (FMA3 [6]). Because the combined multiply-add operation is implemented in hardware, its speed is considerably improved.
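Besides executing as one instruction, a fused multiply-add rounds only once, after the addition, instead of once after the multiply and again after the add. The sketch below emulates that single rounding in Python using exact rational arithmetic. The helper `fma_emulated` is a hypothetical stand-in for illustration and does not execute the hardware instruction.

```python
from fractions import Fraction


def fma_emulated(a, b, c):
    """Compute a*b + c exactly, then round once to the nearest double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

unfused = a * b + c            # the product 1 - 2**-60 is rounded to 1.0 before the add
fused = fma_emulated(a, b, c)  # the exact product is kept until the single final rounding

print(unfused, fused)
```

The two results differ because the unfused version loses the low-order bits of the product before the addition takes place.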
Abaqus/Standard uses Intel® MKL’s DGEMM implementation. It should also be noted that in Intel MKL version 11 update 5, and later versions, DGEMM was optimized to use Intel AVX2 extensions, thus allowing DGEMM to run optimally on the Intel® Xeon® E5-2600 v3 Product Family.
Performance test procedure
To measure the performance improvement from a newer DGEMM implementation that takes advantage of Intel AVX2, we ran tests on two platforms. One system was equipped with the Intel Xeon E5-2697 v3 and the other with the Intel Xeon E5-2697 v2. The duration of each test was measured in seconds.
Performance test benchmarks
The following four benchmarks from Abaqus/Standard were used: s2a, s3a, s3b and s4b.
Figure 1. S2a is a nonlinear static analysis of a flywheel with centrifugal loading.
Figure 2. S3 extracts the natural frequencies and mode shapes of a turbine impeller.
S3 has three versions.
S3a is the version with 360,000 degrees of freedom (DOF), using the Lanczos eigensolver [11].
S3b is the version with 1,100,000 degrees of freedom (DOF), using the Lanczos eigensolver.
Figure 3. S4 is a benchmark that simulates the bolting of a cylinder head onto an engine block.
S4b is the S4 version with 5,000,000 degrees of freedom (DOF), using the direct solver.
Note that these pictures are the property of Dassault Systèmes*. They are reprinted with permission from Dassault Systèmes.
Test configurations
System equipped with Intel Xeon E5-2697 v3
- System: Pre-production
- Processors: Xeon E5-2697 v3 @2.6GHz
- Memory: 128GB DDR4-2133MHz
System equipped with Intel Xeon E5-2697 v2
- System: Pre-production
- Processors: Xeon E5-2697 v2 @2.7GHz
- Memory: 64GB DDR3-1866MHz
Operating System: Red Hat* Enterprise Linux Server release 6.4
Application: Abaqus/Standard benchmarks version 6.13-1
Note:
1) Although the system equipped with the Intel® Xeon® E5-2697 v3 processor has more memory, the memory capacity does not affect the test results, as the largest workload used only 43GB of memory.
2) The duration was measured by wall-clock time in seconds.
Test results
Figure 4. Comparison between Intel Xeon E5-2697 v3 and E5-2697 v2
Figure 4 shows the benchmarks running on a system equipped with the Intel Xeon E5-2697 v3 and on a system equipped with the Intel Xeon E5-2697 v2. The performance improvement due to Intel AVX2 and other hardware advantages ranges from 1.11X to 1.39X.
Figure 5. Comparison between benchmarks with Intel AVX2 enabled and disabled
Figure 5 shows the results of benchmarks with Intel AVX2 enabled and disabled on a system equipped with the Intel Xeon E5-2697 v3. With Intel AVX2 enabled, the benchmarks finish faster than with it disabled. The performance increase due to Intel AVX2 ranges from 1.03X to 1.11X.
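The speedup figures above are ratios of wall-clock times, and the share of runtime spent in DGEMM limits how much a faster DGEMM can help a whole run. The sketch below shows both calculations. The timings and the DGEMM kernel speedup passed to the functions are hypothetical placeholders, not measurements from these tests; only the 40% to 50% DGEMM share comes from the profiling described earlier.

```python
def speedup(baseline_seconds, improved_seconds):
    """Ratio reported as 'N.NNX': how many times faster the improved run is."""
    return baseline_seconds / improved_seconds


def amdahl_overall_speedup(accelerated_fraction, kernel_speedup):
    """Overall speedup when only part of the runtime gets faster (Amdahl's law)."""
    return 1.0 / ((1.0 - accelerated_fraction)
                  + accelerated_fraction / kernel_speedup)


# Hypothetical timings: a 1000 s run that drops to 900 s.
print(f"{speedup(1000.0, 900.0):.2f}X")

# Hypothetical kernel gain: suppose DGEMM alone became 1.3x faster.
for dgemm_share in (0.40, 0.50):
    overall = amdahl_overall_speedup(dgemm_share, kernel_speedup=1.3)
    print(f"DGEMM share {dgemm_share:.0%}: overall {overall:.2f}X")
```

Because the rest of the run is unchanged, the overall gain is always smaller than the gain in the kernel itself.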
Note: Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance
Conclusion
Simulation software performance is critical because faster simulation can significantly reduce model development and analysis time. Abaqus/Standard is a well-known FEA product whose solvers rely on DGEMM. Because the Intel® Xeon® E5-2600 v3 Product Family introduced Intel® AVX2, and Intel MKL was updated to take advantage of it, a simple change to Abaqus/Standard to use the latest libraries yielded a performance improvement of 1.03X to 1.11X from Intel AVX2 alone.
References
[1] http://www.3ds.com
[2] http://www.3ds.com/products-services/simulia/
[3] http://en.wikipedia.org/wiki/Finite_element_method
[4] http://en.wikipedia.org/wiki/Math_Kernel_Library
[5] https://software.intel.com/en-us/node/429920
[6] http://en.wikipedia.org/wiki/FMA_instruction_set
[7] http://en.wikipedia.org/wiki/Advanced_Vector_Extensions
[8] http://people.maths.ox.ac.uk/suli/fem.pdf
[9] http://www.3ds.com/products-services/simulia/products/abaqus/abaqusstandard/
[10] http://www.simulia.com/support/v66/v66_performance.html#s2
[11] http://en.wikipedia.org/wiki/Lanczos_algorithm
[12] http://sbel.wisc.edu/People/schafer/mdexperiments/node13.html
[13] http://en.wikipedia.org/wiki/Newton%27s_method
Notices
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm Any software source code reprinted in this document is furnished under a software license and may only be used or copied in accordance with the terms of that license. Intel, the Intel logo, Intel Core, and Intel Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. Copyright © 2015 Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.