Configure Virtual Fabrics in Intel® Omni-Path Architecture

Introduction

Virtual Fabrics (vFabrics) allow multiple network applications to run on the same fabric at the same time with limited interference. Using vFabrics, a physical fabric is divided into many overlapping virtual fabrics, which keep network applications separate even though they connect to the same physical fabric. Virtual fabrics are a feature of the Intel® Omni-Path Architecture (Intel® OPA) Fabric Manager (FM). For a complete overview of the FM, refer to the Intel® Omni-Path Fabric (Intel® OP Fabric) Suite Fabric Manager User Guide.

Typical usage models for vFabrics include:

  • Separating a cluster into multiple virtual fabrics so independent applications running in different virtual fabrics can run with minimal or no effect on each other.
  • Separating classes of traffic. Each class of traffic runs in a different virtual fabric so that they don’t interfere with each other.

Each vFabric can be assigned quality of service (QoS) and security policies to control how common physical resources in the fabric are shared among the virtual fabrics. A virtual fabric is defined by a list of applications, a list of device groups, and a set of security and QoS policies, along with other identifiers for the virtual fabric. Refer to Section 2.1 of the Intel® Omni-Path Fabric (Intel® OP Fabric) Suite Fabric Manager User Guide for more information on virtual fabrics.

This document shows how to configure vFabrics and how to use various Intel® OPA command-line tools to display virtual fabrics, display port counters, and clear them. The Intel® MPI Benchmarks, part of Intel® Parallel Studio XE 2018 Cluster Edition, are used to generate traffic in these virtual fabrics. Finally, you can verify the packets sent and received in the virtual fabrics with the Intel® OPA command-line tools.

Preparation

The following setup and tests are run on two systems, one equipped with the Intel® Xeon® processor E5-2698 v3 @ 2.30 GHz and the other with the Intel® Xeon Phi™ processor 7250 @ 1.40 GHz. The first system has the IP address 10.23.3.28 and hostname lb0; the second has the IP address 10.23.3.182 and hostname knl-sb2. Both systems run 64-bit Red Hat Enterprise Linux* 7.2. On each system, an Intel® Omni-Path Host Fabric Interface (Intel® OP HFI) Peripheral Component Interconnect Express (PCIe) x16 adapter is installed and connected directly to the HFI in the other system. The Intel® Omni-Path Fabric Software version 10.4.2.0.7 is installed on each system, and Intel® Parallel Studio XE 2018 Cluster Edition is installed on both systems to run the Intel® MPI Benchmarks.

In this test, IP over Fabric (IPoFabric) is also configured. The corresponding IP addresses on the Intel® Xeon® processor host and the Intel® Xeon Phi™ processor are 192.168.100.101 and 192.168.100.102, respectively.
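
For reference, the following is a minimal sketch of assigning the IPoFabric address on knl-sb2; it assumes the IPoFabric interface appears as ib0 (the interface name may differ on your system), and a persistent configuration would normally use a network configuration file instead:

# ip addr add 192.168.100.102/24 dev ib0
# ip link set ib0 up
# ping -c 3 192.168.100.101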

Creating Virtual Fabrics

The Intel® OPA fabric supports redundant FMs. Redundancy ensures that management coverage of the fabric continues if the master FM fails. When there are redundant FMs, one is arbitrarily selected as the master FM and the others become standby FMs. Only the master FM can create vFabrics.

Using the above systems, you will create the vFabrics and generate traffic in them. First, you need to determine the master FM. Both the opareport and opafabricinfo commands can show which FM is the master. For example, issue the opafabricinfo command on either machine, lb0 or knl-sb2:

# opafabricinfo
Fabric 0:0 Information:
SM: knl-sb2 Guid: 0x00117501017444e0 State: Master
SM: lb0 Guid: 0x0011750101790311 State: Inactive
Number of HFIs: 2
Number of Switches: 0
Number of Links: 1
Number of HFI Links: 1              (Internal: 0   External: 1)
Number of ISLs: 0                   (Internal: 0   External: 0)
Number of Degraded Links: 0         (HFI Links: 0   ISLs: 0)
Number of Omitted Links: 0          (HFI Links: 0   ISLs: 0)
-------------------------------------------------------------------------------

This indicates that the master FM resides in the machine knl-sb2.

Show the current virtual fabrics with the following command. You will see that two virtual fabrics are configured by default: one named Default, the other Admin:

# opapaquery -o vfList
Getting VF List...
 Number of VFs: 2
 VF 1: Default
 VF 2: Admin
opapaquery completed: OK

In the following exercise, you will add two virtual fabrics named “VirtualFabric1” and “VirtualFabric2”. To create virtual fabrics, you modify the /etc/opa-fm/opafm.xml configuration file on the host where the master FM resides. Note that existing tools such as opaxmlextract, opaxmlindent, opaxmlfilter, and opaxmlgenerate can help to manipulate the opafm.xml file.
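
Before editing the file, it is a good idea to make a backup copy; you can also use opaxmlindent to produce an indented copy that is easier to read (a sketch, assuming opaxmlindent accepts the file name as an argument and writes the indented XML to standard output):

# cp /etc/opa-fm/opafm.xml /etc/opa-fm/opafm.xml.bak
# opaxmlindent /etc/opa-fm/opafm.xml > /tmp/opafm-indented.xml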

The /etc/opa-fm/opafm.xml configuration file stores information about how the fabric is managed by the master FM. This file uses standard XML syntax. On the node where the master FM resides, knl-sb2 in this case, edit the FM configuration file /etc/opa-fm/opafm.xml and add two virtual fabrics by inserting the following lines in the virtual fabrics configuration section:

# vi /etc/opa-fm/opafm.xml

........................

<VirtualFabric>
   <Name>VirtualFabric1</Name>
   <Application>AllOthers</Application>
   <BaseSL>1</BaseSL>
   <Enable>1</Enable>
   <MaxMTU>Unlimited</MaxMTU>
   <MaxRate>Unlimited</MaxRate>
   <Member>All</Member>
   <QOS>1</QOS>
</VirtualFabric>
<VirtualFabric>
   <Name>VirtualFabric2</Name>
   <Application>AllOthers</Application>
   <BaseSL>2</BaseSL>
   <Enable>1</Enable>
   <MaxMTU>Unlimited</MaxMTU>
   <MaxRate>Unlimited</MaxRate>
   <Member>All</Member>
   <QOS>1</QOS>
</VirtualFabric>

........................

A virtual fabric is created by adding an XML element <VirtualFabric>. Inside the XML element, you can define many parameters for this virtual fabric. The following parameters are defined for the virtual fabrics used in this example:

  • Name: a unique name for this virtual fabric.
  • Application: the applications allowed to use this vFabric. AllOthers is a catchall that matches applications not explicitly assigned to other vFabrics.
  • BaseSL (base service level): assigns a specific service level (0–15) to the vFabric. In this example, VirtualFabric1 uses service level 1 and VirtualFabric2 uses service level 2.
  • Enable: the vFabric is enabled if this field is set to 1. If set to 0, the vFabric is ignored, which lets you disable a vFabric without deleting it.
  • MaxMTU: the maximum transmission unit (MTU) allowed for packets in this vFabric. The actual MTU may be further limited by hardware capability or by request. It can be set to Unlimited.
  • MaxRate: the maximum static rate. It can be set to Unlimited.
  • Member: a device group whose devices are members of this vFabric and can communicate with the other members. In this example, the All group includes every device in the fabric.
  • QOS: set to 1 to enable QoS for this vFabric.

A complete list of parameters is shown in Section 6.5.12 of the document Intel® OPA Fabric Suite Fabric Manager User Guide.
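
Before restarting the FM, it can be useful to confirm that the edited file is still well-formed XML. One way to do this is with the standard xmllint utility from libxml2 (a sketch; xmllint is not part of the Intel® OPA tools and may need to be installed separately):

# xmllint --noout /etc/opa-fm/opafm.xml && echo "opafm.xml is well-formed"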

Log in as root or a user with administrative privileges, and restart the master FM on knl-sb2 so that it reads the new /etc/opa-fm/opafm.xml configuration file:

# systemctl restart opafm

Verify the master FM is up and running:

# systemctl status opafm
● opafm.service - OPA Fabric Manager
   Loaded: loaded (/usr/lib/systemd/system/opafm.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2017-11-14 08:39:58 EST; 18s ago
  Process: 36995 ExecStop=/usr/lib/opa-fm/bin/opafmd halt (code=exited, status=0/SUCCESS)
 …………

Now, display all current virtual fabrics:

# opapaquery -o vfList
Getting VF List...
 Number of VFs: 4
 VF 1: Default
 VF 2: Admin
 VF 3: VirtualFabric1
 VF 4: VirtualFabric2
opapaquery completed: OK

The output shows that the two new virtual fabrics “VirtualFabric1” and “VirtualFabric2” have been added. To show the configuration of the recently added virtual fabrics, issue the command “opareport -o vfinfo”:

# opareport -o vfinfo
Getting All Node Records...
Done Getting All Node Records
Done Getting All Link Records
Done Getting All Cable Info Records
Done Getting All SM Info Records
Done Getting vFabric Records

vFabrics:
Index:0 Name:Default
PKey:0x8001   SL:0  Select:0x0   PktLifeTimeMult:2
MaxMtu:unlimited  MaxRate:unlimited   Options:0x00
QOS: Disabled  PreemptionRank: 0  HoQLife:    8 ms

Index:1 Name:Admin
PKey:0x7fff   SL:0  Select:0x1: PKEY   PktLifeTimeMult:2
MaxMtu:unlimited  MaxRate:unlimited   Options:0x01: Security
QOS: Disabled  PreemptionRank: 0  HoQLife:    8 ms

Index:2 Name:VirtualFabric1
PKey:0x2   SL:1  Select:0x3: PKEY SL   PktLifeTimeMult:2
MaxMtu:unlimited  MaxRate:unlimited   Options:0x03: Security QoS
QOS: Bandwidth:  33%  PreemptionRank: 0  HoQLife:    8 ms

Index:3 Name:VirtualFabric2
PKey:0x3   SL:2  Select:0x3: PKEY SL   PktLifeTimeMult:2
MaxMtu:unlimited  MaxRate:unlimited   Options:0x03: Security QoS
QOS: Bandwidth:  33%  PreemptionRank: 0  HoQLife:    8 ms

4 VFs

This confirms that VirtualFabric1 and VirtualFabric2 use service level (SL) 1 and SL 2, respectively. Next, get membership information for all vFabrics on every node in the fabric:

# opareport -o vfmember
Getting All Node Records...
Done Getting All Node Records
Done Getting All Link Records
Done Getting All Cable Info Records
Done Getting All SM Info Records
Done Getting vFabric Records
Getting All Port VL Tables...
Done Getting All Port VL Tables
VF Membership Report

knl-sb2
LID 1   FI      NodeGUID 0x00117501017444e0
Port 1
    Neighbor Node: lb0
    LID 2       FI      NodeGUID 0x0011750101790311
    Port 1
VF Membership:
    VF Name        VF Index     Base SL Base SC VL
    Default        0            0       0       0
    Admin          1            0       0       0
    VirtualFabric1 2            1       1       1
    VirtualFabric2 3            2       2       2


lb0
LID 2   FI      NodeGUID 0x0011750101790311
Port 1
    Neighbor Node: knl-sb2
    LID 1       FI      NodeGUID 0x00117501017444e0
    Port 1
VF Membership:
    VF Name        VF Index     Base SL Base SC VL
    Default        0            0       0       0
    Admin          1            0       0       0
    VirtualFabric1 2            1       1       1
    VirtualFabric2 3            2       2       2


    2 Reported Port(s)
-------------------------------------------------------------------------------

Virtual lanes (VLs) allow multiple logical flows over a single physical link. Up to eight data VLs can be configured, plus one management VL. The above output shows that the two new virtual fabrics use VL 1 and VL 2, respectively. Note that a local identifier (LID) is assigned to every port in the fabric.

Get the buffer control tables for all vFabrics on every node in the fabric:

# opareport -o bfrctrl
Getting All Node Records...
Done Getting All Node Records
Done Getting All Link Records
Done Getting All Cable Info Records
Done Getting All SM Info Records
Done Getting vFabric Records
Done Getting Buffer Control Tables
BufferControlTable Report
    Port 0x00117501017444e0 1 FI knl-sb2 (LID 1)
        Remote Port 0x0011750101790311 1 FI lb0 (LID 2)
        BufferControlTable
            OverallBufferSpace   (AU/B):    2176/  139264
            Tx Buffer Depth     (LTP/B):     128/   16384
            Wire Depth          (LTP/B):      13/    1664
            TxOverallSharedLimit (AU/B):    1374/   87936
                VL | Dedicated  (   Bytes) |  Shared  (   Bytes) |  MTU
                 0 |       224  (   14336) |    1374  (   87936) |  10240
                 1 |       224  (   14336) |    1374  (   87936) |  10240
                 2 |       224  (   14336) |    1374  (   87936) |  10240
                15 |       130  (    8320) |    1374  (   87936) |   2048
    Port 0x0011750101790311 1 FI lb0 (LID 2)
        Remote Port 0x00117501017444e0 1 FI knl-sb2 (LID 1)
        BufferControlTable
            OverallBufferSpace   (AU/B):    2176/  139264
            Tx Buffer Depth     (LTP/B):     128/   16384
            Wire Depth          (LTP/B):      11/    1408
            TxOverallSharedLimit (AU/B):    1374/   87936
                VL | Dedicated  (   Bytes) |  Shared  (   Bytes) |  MTU
                 0 |       224  (   14336) |    1374  (   87936) |  10240
                 1 |       224  (   14336) |    1374  (   87936) |  10240
                 2 |       224  (   14336) |    1374  (   87936) |  10240
                15 |       130  (    8320) |    1374  (   87936) |   2048
2 Reported Port(s)
-------------------------------------------------------------------------------

For a detailed summary of all systems, users can issue the following command (a long report):

# opareport -V -o comps -d 10

Before running traffic in the added virtual fabrics, show the port status for VirtualFabric1 and VirtualFabric2 (the -n mask 0x2 selects port 1, and the -w mask 0x6 selects VL 1 and VL 2):

# opapmaquery -o getportstatus -n 0x2 -w 0x6 | grep -v 0$
…………………………………
        VL Number    1
            Performance: Transmit
                 Xmit Data                                0 MB (0 Flits)
            Performance: Receive
                 Rcv Data                                 0 MB (0 Flits)
            Performance: Congestion
            Performance: Bubbles
            Errors: Other

        VL Number    2
            Performance: Transmit
                 Xmit Data                                0 MB (0 Flits)
            Performance: Receive
                 Rcv Data                                 0 MB (0 Flits)
            Performance: Congestion
            Performance: Bubbles
            Errors: Other

Observe that all port counters in VirtualFabric1 and VirtualFabric2 are zero. Many job schedulers provide an integrated mechanism to launch jobs in the proper virtual fabric. When launching jobs manually, you can use the service level (SL) to identify the virtual fabric.
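
For example, a PSM2-based MPI job can be directed to VirtualFabric1 by setting the HFI_SL environment variable to the vFabric's base service level, as in the following sketch (my_mpi_app is a hypothetical application; the actual Intel® MPI Benchmarks invocations used in this test are shown in the next section):

# mpirun -genv HFI_SL=1 -PSM2 -host localhost -n 1 ./my_mpi_app : -host 192.168.100.101 -n 1 ./my_mpi_app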

Find the LIDs assigned to the ports. Two LIDs are found, 0x0001 and 0x0002:

# opareport -o lids
Getting All Node Records...
Done Getting All Node Records
Done Getting All Link Records
Done Getting All Cable Info Records
Done Getting All SM Info Records
Done Getting vFabric Records
LID Summary

2 LID(s) in Fabric:
   LID(Range) NodeGUID          Port Type Name
0x0001        0x00117501017444e0   1 FI knl-sb2
0x0002        0x0011750101790311   1 FI lb0
2 Reported LID(s)
-------------------------------------------------------------------------------

Every HFI and switch maintains per-port error and performance counters. Show the port counters for LID 0x0001, port 1:

# opapaquery -o portcounters -l 1 -P 1 | grep -v 0$

Because the output of the above command is a long list and most counters are 0, the grep filter removes the lines that end in 0 so that only non-zero (potentially abnormal) counters are displayed.
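
Note that grep -v 0$ also hides legitimate non-zero counters whose values happen to end in 0 (for example, 100). If that is a concern, a slightly more precise filter is to drop only the lines whose last field is exactly 0, for instance with awk (a sketch):

# opapaquery -o portcounters -l 1 -P 1 | awk '$NF != "0"'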

Similarly, show the port counters for LID 0x0002, port 1:

# opapaquery -o portcounters -l 2 -P 1 | grep -v 0$
Getting Port Counters...
PM controlled Port Counters (total) for LID 0x0002, port number 1:
Performance: Transmit
    Xmit Data                             3113 MB (389186568 Flits)
    Xmit Pkts                           944353
    MC Xmit Pkts                            15
Performance: Receive
    Rcv Data                              3054 MB (381812586 Flits)
    Rcv Pkts                           1002110

To clear counters on port 1 (-n 0x2) of VL 1 and VL 2 (-w 0x6):

# opapmaquery -o clearportstatus -n 0x2 -w 0x6
 Port Select Mask   0x0000000000000002
 Counter Sel Mask   0xffffffff

Testing and Verifying

In this section, after setting up the virtual fabrics, you will run MPI traffic on both VirtualFabric1 and VirtualFabric2 and verify it using the Intel® OPA command-line tools. The Intel® MPI Benchmarks are used to generate traffic between the two nodes. First, set up password-less secure shell (SSH) between the nodes:

# ssh-keygen -t rsa
# ssh-copy-id root@192.168.100.101
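
A quick way to confirm that password-less login works is to run a remote command over SSH; if the key was installed correctly, the following should print the remote hostname (lb0 in this setup) without prompting for a password:

# ssh root@192.168.100.101 hostname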

And disable the firewall for the MPI traffic test:

# systemctl status firewalld
# systemctl stop firewalld
# systemctl status firewalld

Set up proper environment variables before using Intel® MPI Benchmarks:

# source /opt/intel/parallel_studio_xe_2018.0.033/psxevars.sh intel64
Intel(R) Parallel Studio XE 2018 for Linux*
Copyright (C) 2009-2017 Intel Corporation. All rights reserved.

Run the Intel® MPI Benchmarks IMB-MPI1 (benchmarks for MPI-1 functions) on VirtualFabric1 by setting the environment variable HFI_SL=1 (note that the -PSM2 flag selects Intel® Performance Scaled Messaging 2, which is used with the Intel® OPA family of products):

# mpirun -genv HFI_SL=1 -PSM2 -host localhost -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv : -host 192.168.100.101 -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1
#------------------------------------------------------------
#    Intel (R) MPI Benchmarks 2018, MPI-1 part
#------------------------------------------------------------
# Date                  : Tue Nov 28 15:05:11 2017
# Machine               : x86_64
# System                : Linux
# Release               : 3.10.0-327.22.2.el7.x86_64
# Version               : #1 SMP Thu Jun 9 10:09:10 EDT 2016
# MPI Version           : 3.1
# MPI Thread Environment:


# Calling sequence was:

# /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM
#
#

# List of Benchmarks to run:

# Sendrecv

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000         2.77         2.77         2.77         0.00
            1         1000         2.68         2.68         2.68         0.75
            2         1000         2.62         2.62         2.62         1.53
            4         1000         2.61         2.61         2.61         3.06
            8         1000         2.64         2.64         2.64         6.06
           16         1000         4.47         4.47         4.47         7.16
           32         1000         4.36         4.36         4.36        14.67
           64         1000         4.37         4.37         4.37        29.31
          128         1000         4.41         4.41         4.41        58.07
          256         1000         4.42         4.42         4.42       115.94
          512         1000         4.59         4.59         4.59       223.34
         1024         1000         4.76         4.76         4.76       430.62
         2048         1000         5.02         5.02         5.02       815.80
         4096         1000         5.79         5.79         5.79      1414.85
         8192         1000         7.57         7.57         7.57      2164.05
        16384         1000        12.25        12.25        12.25      2674.27
        32768         1000        16.41        16.41        16.41      3993.64
        65536          640        36.22        36.22        36.22      3618.56
       131072          320        68.82        68.83        68.82      3808.70
       262144          160        80.24        80.31        80.27      6528.68
       524288           80       119.05       119.17       119.11      8798.73
      1048576           40       202.05       202.25       202.15     10369.08
      2097152           20       370.85       371.46       371.15     11291.52
      4194304           10       673.70       674.99       674.34     12427.81


# All processes entering MPI_Finalize

Note that the performance numbers presented in this document are for reference purposes only.

In a separate shell, run the Intel® MPI Benchmarks on VirtualFabric2 by setting the environment variable HFI_SL=2:

# mpirun -genv HFI_SL=2 -PSM2 -host localhost -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv : -host 192.168.100.101 -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1
#------------------------------------------------------------
#    Intel (R) MPI Benchmarks 2018, MPI-1 part
#------------------------------------------------------------
# Date                  : Tue Nov 28 15:06:47 2017
# Machine               : x86_64
# System                : Linux
# Release               : 3.10.0-327.22.2.el7.x86_64
# Version               : #1 SMP Thu Jun 9 10:09:10 EDT 2016
# MPI Version           : 3.1
# MPI Thread Environment:


# Calling sequence was:

# /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM
#
#

# List of Benchmarks to run:

# Sendrecv

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000         2.70         2.70         2.70         0.00
            1         1000         2.66         2.66         2.66         0.75
            2         1000         2.66         2.66         2.66         1.50
            4         1000         3.55         3.55         3.55         2.26
            8         1000         2.67         2.67         2.67         6.00
           16         1000         4.40         4.40         4.40         7.27
           32         1000         4.37         4.37         4.37        14.63
           64         1000         4.27         4.27         4.27        29.95
          128         1000         4.29         4.29         4.29        59.66
          256         1000         4.39         4.39         4.39       116.63
          512         1000         4.48         4.48         4.48       228.42
         1024         1000         4.63         4.64         4.63       441.85
         2048         1000         5.04         5.04         5.04       812.86
         4096         1000         5.81         5.81         5.81      1409.28
         8192         1000         7.54         7.54         7.54      2172.12
        16384         1000        11.93        11.93        11.93      2747.13
        32768         1000        16.24        16.25        16.24      4034.19
        65536          640        35.99        35.99        35.99      3641.52
       131072          320        67.15        67.16        67.15      3903.48
       262144          160        77.56        77.62        77.59      6754.15
       524288           80       118.23       118.37       118.30      8858.10
      1048576           40       206.30       206.63       206.46     10149.53
      2097152           20       367.50       368.05       367.77     11396.12
      4194304           10       676.80       677.80       677.30     12376.23


# All processes entering MPI_Finalize

Display the port status for port 1, VL 1, and VL 2 (only non-zero results are shown):

# opapmaquery -o getportstatus -n 0x2 -w 0x6 | grep -v 0$
Port Number             1
    VL Select Mask      0x00000006
    Performance: Transmit
        Xmit Data                             2289 MB (286211521 Flits)
        Xmit Pkts                           746057
    Performance: Receive
        Rcv Data                              2334 MB (291751815 Flits)
    Errors: Signal Integrity
        Link Qual Indicator                      5 (Excellent)
    Errors: Security
    Errors: Routing and Other Errors
    Performance: Congestion
        Xmit Wait                            10899
    Performance: Bubbles

        VL Number    1
            Performance: Transmit
                 Xmit Data                              736 MB (92044480 Flits)
                 Xmit Pkts                           148719
            Performance: Receive
                 Rcv Data                               736 MB (92037226 Flits)
                 Rcv Pkts                            148583
            Performance: Congestion
                 Xmit Wait                             6057
            Performance: Bubbles
            Errors: Other

        VL Number    2
            Performance: Transmit
                 Xmit Data                              736 MB (92039568 Flits)
                 Xmit Pkts                           147831
            Performance: Receive
                 Rcv Data                               736 MB (92038857 Flits)
                 Rcv Pkts                            148803
            Performance: Congestion
                 Xmit Wait                             4833
            Performance: Bubbles
            Errors: Other

Verify that the port counters for VirtualFabric1 and VirtualFabric2 are incrementing by displaying the port counters of the fabric port at LID 1, port 1, for VirtualFabric1:

# opapaquery -o vfPortCounters -l 1 -P 1 -V VirtualFabric1 | grep -v 0$
Getting Port Counters...
PM Controlled VF Port Counters (total) for node LID 0x0001, port number 1:
VF name: VirtualFabric1
Performance: Transmit
    Xmit Data                              736 MB (92044480 Flits)
    Xmit Pkts                           148719
Performance: Receive
    Rcv Data                               736 MB (92037226 Flits)
    Rcv Pkts                            148583
Routing and Other Errors:
Congestion:
    Xmit Wait                             6057
Bubbles:
ImageTime: Tue Nov 21 21:45:53 2017
opapaquery completed: OK

Similarly, for VirtualFabric2:

# opapaquery -o vfPortCounters -l 1 -P 1 -V VirtualFabric2 | grep -v 0$
Getting Port Counters...
PM Controlled VF Port Counters (total) for node LID 0x0001, port number 1:
VF name: VirtualFabric2
Performance: Transmit
    Xmit Data                              736 MB (92039568 Flits)
    Xmit Pkts                           147831
Performance: Receive
    Rcv Data                               736 MB (92038857 Flits)
    Rcv Pkts                            148803
Routing and Other Errors:
Congestion:
    Xmit Wait                             4833
Bubbles:
ImageTime: Tue Nov 21 21:50:23 2017
opapaquery completed: OK

View the data counters per virtual lane (only non-zero counters are shown):

# opapmaquery -o getdatacounters -n 0x2 -w 0x6 | grep -v 0$
 Port Select Mask   0x0000000000000002
 VL Select Mask     0x00000006
    Port Number     1
        Xmit Data                             2226 MB (278332959 Flits)
        Rcv Data                              2236 MB (279538592 Flits)
        Rcv Pkts                            506372
    Signal Integrity Errors:
        Link Qual. Indicator                     5 (Excellent)
    Congestion:
        Xmit Wait                            10899
    Bubbles:
        VL Number     1
             Xmit Data                              736 MB (92044480 Flits)
             Rcv Data                               736 MB (92037226 Flits)
             Xmit Pkts                           148719
             Rcv Pkts                            148583
             Xmit Wait                             6057
        VL Number     2
             Xmit Data                              736 MB (92039568 Flits)
             Rcv Data                               736 MB (92038857 Flits)
             Xmit Pkts                           147831
             Rcv Pkts                            148803
             Xmit Wait                             4833

Finally, view the error counters for all 16 virtual lanes; only non-zero errors are shown:

# opapmaquery -o geterrorcounters -n 0x2 -w 0xffff | grep -v 0$
 Port Select Mask   0x0000000000000002
 VL Select Mask     0x0000ffff
    Port Number     1
    Signal Integrity Errors:
    Security Errors:
    Routing and Other Errors:
         VL Number     1
         VL Number     2
         VL Number     3
         VL Number     4
         VL Number     5
         VL Number     6
         VL Number     7
         VL Number     8
         VL Number     9
         VL Number     11
         VL Number     12
         VL Number     13
         VL Number     14
         VL Number     15

Conclusion

Configuring vFabrics in Intel® OPA requires users to manually edit the configuration file. This document shows how to set up two virtual fabrics by editing the opafm.xml file, along with the command lines needed to set up and verify the newly created virtual fabrics and to clear counters. The Intel® MPI Benchmarks are used to run traffic in each virtual fabric, and the traffic and errors in each virtual fabric can be verified by examining the counters.
