This article guides you through creating new benchmarks and benchmark suites within the Intel® MPI Benchmarks 2019 infrastructure.
A benchmark suite is a logically connected group of benchmarks. For each suite, you can declare command line arguments and share data structures.
Initial Setup
To create a new benchmark suite:
- Choose a name for your new benchmark suite and create a subdirectory with this name in the src directory of the Intel® MPI Benchmarks directory. For example, if the benchmark suite name is new_bench, the source code sub-tree will be the following:
    src/new_bench/
    src/new_bench/new_bench_1.cpp
- Create a Makefile. A simple Makefile may look as follows:
    BECHMARK_SUITE_SRC += new_bench/new_bench_1.cpp
    CPPFLAGS += -Inew_bench
  In this example, the Makefile rules add the new benchmark source code to the full list of files to build and add the new_bench subdirectory to the include search path via the -I compiler flag.
- Save the Makefile with the .mk extension in the benchmark suite subdirectory:
    src/new_bench/Makefile.new_bench.mk
Examples
You can find a benchmark suite example in the example subdirectory of your Intel® MPI Benchmarks distribution.
example_benchmark1.cpp
This file contains the bare minimum required to introduce a new benchmark suite and a new benchmark to the benchmarking infrastructure. Two main entities must be correctly specified: a benchmark suite class and a benchmark class.
Custom benchmark suite class
In this example, the new benchmark suite class is specified by the DECLARE_BENCHMARK_SUITE_STUFF macro, which specializes the BenchmarkSuite<> template class with the BS_GENERIC enum value. Using the macro is recommended for simple cases like this.
Note that using the BenchmarkSuite<BS_GENERIC> template has a side effect: multiple instantiations of this class in different parts of the source code tree cause linker errors. To avoid this, use a unique namespace for all custom benchmark suites and for custom benchmark data structures and functions. In the example, the example_suite1 namespace is used for exactly this purpose.
Custom benchmark class
A new benchmark class must inherit from the Benchmark base class and must override at least one virtual function: void run(const scope_item &item). This is the core of any benchmark. Two helper macros, DEFINE_INHERITED and DECLARE_INHERITED, define all static variables needed for the automatic runtime registration of any benchmark in the source tree.
At runtime, you can check that the example1 benchmark appears in the benchmark list of the Intel® MPI Benchmarks by using the -list option. The option output also shows that the benchmark belongs to the example_suite1 suite.
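The automatic registration described above can be illustrated with a self-contained sketch. It does not use the Intel® MPI Benchmarks headers: BenchmarkBase, ScopeItem, Registrar, registry() and list_benchmarks() are hypothetical stand-ins, and the real DECLARE_INHERITED and DEFINE_INHERITED macros may work differently. The sketch only shows the general idea that a static object constructed at namespace scope can add a benchmark to a global list before main() runs.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are NOT the Intel MPI Benchmarks classes.
struct ScopeItem { int len; };

struct BenchmarkBase {
    virtual ~BenchmarkBase() = default;
    virtual void run(const ScopeItem &item) = 0;  // core of a benchmark
};

using Factory = std::function<std::unique_ptr<BenchmarkBase>()>;

// The registry maps a benchmark name to a factory. A function-local static
// avoids initialization-order problems between translation units.
inline std::map<std::string, Factory> &registry() {
    static std::map<std::string, Factory> r;
    return r;
}

// Constructing a Registrar at namespace scope registers a benchmark
// before main() starts.
struct Registrar {
    Registrar(const std::string &name, Factory f) {
        assert(f && "factory must be callable");
        registry()[name] = std::move(f);
    }
};

// A unique namespace keeps the suite's names from clashing with other suites.
namespace example_suite_sketch {
struct Example1 : BenchmarkBase {
    int calls = 0;
    void run(const ScopeItem &) override { ++calls; }
};
static Registrar reg_example1("example1",
                              [] { return std::make_unique<Example1>(); });
}  // namespace example_suite_sketch

// Rough analogue of a "list benchmarks" option: returns all registered names.
std::vector<std::string> list_benchmarks() {
    std::vector<std::string> names;
    for (const auto &kv : registry()) names.push_back(kv.first);
    return names;
}
```

Calling list_benchmarks() from main() in this sketch returns a list that contains example1, although no code in main() mentions the Example1 class.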
example_benchmark2.cpp
The example_suite1 suite can be successfully integrated into the Intel® MPI Benchmarks infrastructure, but to actually run the benchmark, you need to define another main entity of the infrastructure for it.
Custom benchmarking scope
The Intel® MPI Benchmarks infrastructure automatically registers new benchmarks and benchmark suites in the source tree, but the infrastructure must also know how many times each benchmark should be run and which parameters should be passed to it on each run. This is done by creating an object of the abstract class Scope. Each benchmark object holds a smart pointer to this object as a member, and the object is supposed to be created in the benchmark's definition of the init() virtual function.
The example_benchmark2.cpp example introduces the void init() member function, which initializes the scope member of the base class Benchmark. The VarLenScope class, which is derived from the abstract base class Scope, creates a benchmarking scope of all message or problem lengths from the set 2^0, 2^1, ..., 2^22. The Intel® MPI Benchmarks infrastructure uses the scope initialized this way to run the benchmark by calling the void run(const scope_item &item) virtual function for each scope item. In this example, each scope item represents a single message length.
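The relationship between a scope and the run() calls can be sketched in a few lines. This is a simplified model under stated assumptions, not the VarLenScope implementation: ScopeItem, make_pow2_scope(), CountingBenchmark and run_over_scope() are hypothetical names, and the real scope_item type may carry more than a length.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical scope item: one message length per benchmark run.
struct ScopeItem { std::uint64_t len; };

// Builds the lengths 2^min_exp, 2^(min_exp+1), ..., 2^max_exp.
std::vector<ScopeItem> make_pow2_scope(unsigned min_exp, unsigned max_exp) {
    assert(min_exp <= max_exp && max_exp < 64);
    std::vector<ScopeItem> items;
    for (unsigned e = min_exp; e <= max_exp; ++e)
        items.push_back(ScopeItem{std::uint64_t{1} << e});
    return items;
}

// Toy benchmark that only records how it was called.
struct CountingBenchmark {
    std::uint64_t runs = 0;
    std::uint64_t total_bytes = 0;
    void run(const ScopeItem &item) {
        ++runs;
        total_bytes += item.len;
    }
};

// What the infrastructure conceptually does: one run() call per scope item.
template <class Bench>
void run_over_scope(Bench &bench, const std::vector<ScopeItem> &scope) {
    for (const auto &item : scope) bench.run(item);
}
```

With make_pow2_scope(0, 22), the scope holds 23 items, from 1 byte up to 4194304 bytes, so the benchmark's run() is called 23 times.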
example_benchmark3.cpp
The third example extends example_benchmark2.cpp with a simple, close to real-world MPI benchmark that implements the well-known ping-pong pattern. The void init() virtual function additionally allocates the receive and send buffers. The void finalize() virtual function prints the summary results. The virtual destructor deallocates the buffers.
example_benchmark4.cpp
The fourth example adds command line parameter handling to the previous ping-pong example. There are three command line parameters:
- -len takes a comma-separated list of message lengths to run the benchmark with
- -datatype allows you to select the datatype used in MPI messages: MPI_CHAR or MPI_INT
- -ncycles defines the number of benchmark iterations to execute during each run() call
To describe the expected command line arguments, the example specializes the bool declare_args() function of the BenchmarkSuite<BS_GENERIC> template class. The function uses the args_parser class API to declare the names of the options to be parsed and the types of their arguments. For example, the following API call:
parser.add<int>("ncycles", 1000);
instructs the command line parser to expect the -ncycles option with an integer argument, the default argument value being 1000. The call:
parser.add_vector<int>("len", "1,2,4,8").set_mode(args_parser::option::APPLY_DEFAULTS_ONLY_WHEN_MISSING);
instructs the command line parser to expect the -len option with a comma-separated list of integers as an argument. The number of integers in the list is arbitrary. The default list consists of four integers: 1, 2, 4 and 8, and the chained set_mode() call makes the parser apply the defaults only when the option is missing from the launch command line.
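The behavior described for the -len option, a comma-separated integer list whose defaults apply only when the option is absent, can be modeled with a short standalone sketch. It does not reproduce the args_parser implementation; parse_int_list() and resolve_len() are hypothetical helpers, and the error handling shown is an assumption.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: splits "1,2,4,8" into {1, 2, 4, 8}.
// Throws std::invalid_argument if a token is empty or not a whole integer.
std::vector<int> parse_int_list(const std::string &text) {
    std::vector<int> values;
    std::stringstream ss(text);
    std::string token;
    while (std::getline(ss, token, ',')) {
        std::size_t pos = 0;
        int v = std::stoi(token, &pos);  // throws on non-numeric input
        if (pos != token.size())
            throw std::invalid_argument("bad integer: " + token);
        values.push_back(v);
    }
    return values;
}

// Hypothetical model of "apply defaults only when missing":
// a null pointer means the option was not given on the command line.
std::vector<int> resolve_len(const std::string *option_value,
                             const std::string &defaults) {
    return parse_int_list(option_value ? *option_value : defaults);
}
```

In this model, resolve_len(nullptr, "1,2,4,8") yields the four default lengths, while passing a pointer to the string "16,32" yields only 16 and 32.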
In this example, the bool prepare() function handles the options and transfers the data given by the user on the command line to internal data structures with the corresponding parser.get<>() calls. In particular, the vector<int> len variable stores the list of desired message lengths received from the command line parser, MPI_Datatype datatype stores the chosen data type, and int ncycles stores the given number of iterations.
The get_parameter() function specialization implements an interface to pass pointers to data structures from the benchmark suite class to the benchmark class. Any benchmark in this suite may call the get_parameter() function to get a smart pointer to a particular variable. The benchmark suite passes the pointer to the variable via the type erasure template class any. In this example, both the run() and init() virtual functions of the benchmark class use this interface to get pointers to the len, datatype and ncycles values. The HANDLE_PARAMETER and GET_PARAMETER macros make passing the pointers more convenient.
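The type erasure idea behind get_parameter() can be sketched with standard C++17 facilities. This is an illustration only: std::any stands in for the infrastructure's own any class, SuiteSketch and fetch() are hypothetical names, and the HANDLE_PARAMETER and GET_PARAMETER macros are not modeled here.

```cpp
#include <any>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical suite that owns its parameters behind shared pointers.
class SuiteSketch {
public:
    SuiteSketch()
        : len(std::make_shared<std::vector<int>>(std::vector<int>{1, 2, 4, 8})),
          ncycles(std::make_shared<int>(1000)) {}

    // Returns the smart pointer wrapped in a type-erased container,
    // or an empty std::any for an unknown key.
    std::any get_parameter(const std::string &key) const {
        if (key == "len") return len;
        if (key == "ncycles") return ncycles;
        return {};
    }

private:
    std::shared_ptr<std::vector<int>> len;
    std::shared_ptr<int> ncycles;
};

// Benchmark-side helper: recovers the typed smart pointer, or returns
// nullptr when the key is unknown or the requested type does not match.
template <class T>
std::shared_ptr<T> fetch(const SuiteSketch &suite, const std::string &key) {
    std::any a = suite.get_parameter(key);
    if (auto p = std::any_cast<std::shared_ptr<T>>(&a)) return *p;
    return nullptr;
}
```

A benchmark in this sketch would call fetch<int>(suite, "ncycles") inside run() or init() and dereference the result. Because the suite and the benchmark share ownership of the same object, the benchmark sees the value stored by the suite without copying it.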
Now the benchmark parameters can be controlled at runtime on the command line. When this example is compiled into the benchmark infrastructure, the command line option parser recognizes the -len, -datatype and -ncycles options. The help output automatically includes information on these options.
example_benchmark5.cpp
This example implements the same functionality as example_benchmark1.cpp but with minimal use of predefined macros and template classes.