This article describes three methods for dealing with a firewall that blocks Message Passing Interface (MPI) communication between machines. For example, when you run an MPI program across two machines, you might see a communication error like this:
[proxy:0:1@knl-sb0] HYDU_sock_connect (../../utils/sock/sock.c:268): unable to connect from "knl-sb0" to "knc4" (No route to host)
[proxy:0:1@knl-sb0] main (../../pm/pmiserv/pmip.c:461): unable to connect to server knc4 at port 39652 (check for firewalls!)
This symptom suggests the MPI ranks cannot communicate with each other, because the firewall blocks the MPI communication.
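When triaging a log like the one above, it can help to pull out the host and port that the proxy failed to reach. The following is a minimal sketch, assuming the message keeps the "unable to connect to server HOST at port PORT" wording shown above. The function name mpi_blocked_endpoint is hypothetical and is not part of any MPI tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Intel MPI): read launcher output on stdin
# and print "HOST PORT" for every line that reports a failed connection
# in the form "unable to connect to server HOST at port PORT".
mpi_blocked_endpoint() {
    sed -n -E 's/.*unable to connect to server ([^ ]+) at port ([0-9]+).*/\1 \2/p'
}

# Example usage (commented out so that sourcing this file runs nothing):
#   <your mpirun command> 2>&1 | mpi_blocked_endpoint
```

Lines that do not match the pattern are dropped, so an empty result means that no such error appeared in the input.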
Below are three methods to help you solve this problem.
First Method: Stop the firewalld Daemon
The first and simplest method is to stop the firewall on the machine where you run the MPI program. First, check the status of the firewalld daemon on a Red Hat Enterprise Linux* (RHEL*) or CentOS* system.
$ systemctl status firewalld
firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2017-12-05 21:36:10 PST; 12min ago
 Main PID: 47030 (firewalld)
   CGroup: /system.slice/firewalld.service
           47030 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid
The output shows that firewalld is running. You can stop it and verify its status with the following commands:
$ sudo systemctl stop firewalld
$ systemctl status firewalld
firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Tue 2017-12-05 21:51:19 PST; 4s ago
  Process: 48062 ExecStart=/usr/sbin/firewalld --nofork --nopid $FIREWALLD_ARGS (code=exited, status=0/SUCCESS)
 Main PID: 48062 (code=exited, status=0/SUCCESS)
With firewalld now stopped, you should be able to run your MPI program between the two machines (in this example, I use the Intel® MPI Benchmarks IMB-MPI1 as the MPI program).
$ mpirun -host localhost -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv : -host 10.23.3.61 -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1
#------------------------------------------------------------
# Intel (R) MPI Benchmarks 2018, MPI-1 part
#------------------------------------------------------------
# Date : Tue Dec 5 21:51:45 2017
# Machine : x86_64
# System : Linux
# Release : 3.10.0-327.el7.x86_64
# Version : #1 SMP Thu Nov 19 22:10:57 UTC 2015
# MPI Version : 3.1
# MPI Thread Environment:
# Calling sequence was:
# /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv
# Minimum message length in bytes: 0
# Maximum message length in bytes: 4194304
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#
# List of Benchmarks to run:
# Sendrecv
#---------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
#---------------------------------------------------------------------------
   #bytes  #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]  Mbytes/sec
        0          1000        16.57        16.57        16.57        0.00
        1          1000        16.57        16.57        16.57        0.12
        2          1000        16.52        16.53        16.53        0.24
        4          1000        16.58        16.58        16.58        0.48
        8          1000        16.51        16.51        16.51        0.97
       16          1000        16.20        16.20        16.20        1.98
       32          1000        16.32        16.32        16.32        3.92
       64          1000        16.55        16.55        16.55        7.73
      128          1000        16.65        16.65        16.65       15.37
      256          1000        29.07        29.09        29.08       17.60
      512          1000        30.75        30.76        30.76       33.29
     1024          1000        31.13        31.15        31.14       65.75
     2048          1000        33.58        33.58        33.58      121.98
     4096          1000        34.79        34.80        34.80      235.38
However, stopping the firewall leaves the machine exposed to all incoming network traffic, so this method may not be suitable in some scenarios. In that case, start the firewalld daemon again, and then try the second method.
$ sudo systemctl start firewalld
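Because the first method depends on remembering to restart the firewall, one option is to wrap the stop, run, and start steps in a small function so that firewalld is started again even when the MPI run fails. This is a rough sketch only: the function name with_firewalld_stopped and the SYSTEMCTL override variable are assumptions made for illustration, and only the systemctl stop and start commands come from the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: stop firewalld, run the given command, then start
# firewalld again regardless of the command's exit status.
# SYSTEMCTL is an assumed override so the logic can be dry-run,
# for example with SYSTEMCTL=echo.
with_firewalld_stopped() {
    local ctl="${SYSTEMCTL:-sudo systemctl}"
    local rc

    # $ctl is left unquoted on purpose so "sudo systemctl" splits into two words.
    $ctl stop firewalld || return 1

    # Run the wrapped command and remember its exit status.
    "$@"
    rc=$?

    # Always bring the firewall back up before returning.
    $ctl start firewalld
    return "$rc"
}

# Example usage (commented out so that sourcing this file changes nothing):
#   with_firewalld_stopped mpirun -host localhost -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv : -host 10.23.3.61 -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1
```

An interrupt such as Ctrl-C may end the function before the restart step runs, so it is still worth checking the result with systemctl status firewalld afterwards.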
Second Method: Use Rich Rule in firewalld
This method uses the Rich Rule feature in firewalld to accept IPv4 packets from the other machine, whose IP address is 10.23.3.61, while the firewall continues to filter all other traffic.
$ sudo firewall-cmd --add-rich-rule='rule family="ipv4" source address="10.23.3.61" accept'
success
Verify the rule you just added.
$ firewall-cmd --list-rich-rules
rule family="ipv4" source address="10.23.3.61" accept
Run the MPI program.
$ mpirun -host localhost -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv : -host 10.23.3.61 -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1
#------------------------------------------------------------
# Intel (R) MPI Benchmarks 2018, MPI-1 part
#------------------------------------------------------------
# Date : Tue Dec 5 22:01:17 2017
# Machine : x86_64
# System : Linux
# Release : 3.10.0-327.el7.x86_64
# Version : #1 SMP Thu Nov 19 22:10:57 UTC 2015
# MPI Version : 3.1
# MPI Thread Environment:
# Calling sequence was:
# /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv
# Minimum message length in bytes: 0
# Maximum message length in bytes: 4194304
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#
# List of Benchmarks to run:
# Sendrecv
#---------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
#---------------------------------------------------------------------------
   #bytes  #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]  Mbytes/sec
        0          1000        16.88        16.88        16.88        0.00
        1          1000        16.86        16.86        16.86        0.12
        2          1000        16.57        16.57        16.57        0.24
        4          1000        16.55        16.55        16.55        0.48
        8          1000        16.40        16.40        16.40        0.98
       16          1000        16.29        16.29        16.29        1.96
       32          1000        16.63        16.63        16.63        3.85
       64          1000        16.87        16.87        16.87        7.59
      128          1000        17.03        17.04        17.03       15.03
      256          1000        27.58        27.60        27.59       18.55
      512          1000        27.52        27.54        27.53       37.18
     1024          1000        26.87        26.89        26.88       76.16
     2048          1000        28.62        28.64        28.63      143.02
     4096          1000        30.27        30.27        30.27      270.62
^C[mpiexec@knc4] Sending Ctrl-C to processes as requested
[mpiexec@knc4] Press Ctrl-C again to force abort
You can remove a Rich Rule that you defined by entering the following command:
$ sudo firewall-cmd --remove-rich-rule='rule family="ipv4" source address="10.23.3.61" accept'
success
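With more than two machines, the same rich rule has to be added once per peer address. As an illustration, the sketch below builds the rule text from this method for a given address. The helper name rich_rule_for is hypothetical, and the loop in the comments assumes a PEERS list that you would define yourself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the rich rule used in this method for one
# IPv4 source address passed as the first argument.
rich_rule_for() {
    printf 'rule family="ipv4" source address="%s" accept\n' "$1"
}

# Example usage (commented out so that sourcing this file changes nothing).
# PEERS is an assumed, user-defined list of the other MPI hosts' addresses.
#   PEERS="10.23.3.61"
#   for ip in $PEERS; do
#       sudo firewall-cmd --add-rich-rule="$(rich_rule_for "$ip")"
#   done
```

Rules added this way apply to the running configuration only. firewall-cmd also accepts the --permanent option to store a rule, followed by firewall-cmd --reload to activate it.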
Third Method: Add a Rule in iptables-services to Accept Packets from Other Machines
In addition to firewalld, iptables-services can also be used to manage the firewall on a RHEL or CentOS system. In this method, you add a rule that allows traffic from the other machine.
First, download and install the iptables-services package.
$ sudo yum install iptables-services
Next, start the iptables service.
$ sudo systemctl start iptables
$ systemctl status iptables
iptables.service - IPv4 firewall with iptables
   Loaded: loaded (/usr/lib/systemd/system/iptables.service; disabled; vendor preset: disabled)
   Active: active (exited) since Tue 2017-12-05 21:53:41 PST; 55s ago
  Process: 49042 ExecStart=/usr/libexec/iptables/iptables.init start (code=exited, status=0/SUCCESS)
 Main PID: 49042 (code=exited, status=0/SUCCESS)
Dec 05 21:53:41 knc4-jf-intel-com systemd[1]: Starting IPv4 firewall with iptables...
Dec 05 21:53:41 knc4-jf-intel-com iptables.init[49042]: iptables: Applying firewall rules: [ OK ]
Dec 05 21:53:41 knc4-jf-intel-com systemd[1]: Started IPv4 firewall with iptables.
The firewall rules are defined in the /etc/sysconfig/iptables file.
$ sudo cat /etc/sysconfig/iptables
# sample configuration for iptables service
# you can edit this manually or use system-config-firewall
# please do not ask us to add additional ports/services to this default configuration
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
Display the currently defined direct rules; there should not be any yet. To add a rule that accepts packets from the other machine, specify its IP address.
$ firewall-cmd --direct --get-all-rules
$ sudo firewall-cmd --direct --add-rule ipv4 filter INPUT 0 -s 10.23.3.61 -j ACCEPT
success
$ firewall-cmd --direct --get-all-rules
ipv4 filter INPUT 0 -s 10.23.3.61 -j ACCEPT
The second --get-all-rules command confirms that the new rule has been added. Run the MPI program again to verify that it works.
$ mpirun -host localhost -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv : -host 10.23.3.61 -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1
#------------------------------------------------------------
# Intel (R) MPI Benchmarks 2018, MPI-1 part
#------------------------------------------------------------
# Date : Tue Dec 5 21:58:20 2017
# Machine : x86_64
# System : Linux
# Release : 3.10.0-327.el7.x86_64
# Version : #1 SMP Thu Nov 19 22:10:57 UTC 2015
# MPI Version : 3.1
# MPI Thread Environment:
# Calling sequence was:
# /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv
# Minimum message length in bytes: 0
# Maximum message length in bytes: 4194304
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#
# List of Benchmarks to run:
# Sendrecv
#---------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
#---------------------------------------------------------------------------
   #bytes  #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]  Mbytes/sec
        0          1000        16.49        16.49        16.49        0.00
        1          1000        16.40        16.40        16.40        0.12
        2          1000        16.40        16.40        16.40        0.24
        4          1000        16.86        16.86        16.86        0.47
        8          1000        16.43        16.43        16.43        0.97
       16          1000        16.32        16.32        16.32        1.96
       32          1000        16.64        16.64        16.64        3.85
       64          1000        16.90        16.90        16.90        7.57
      128          1000        16.86        16.86        16.86       15.18
      256          1000        29.58        29.60        29.59       17.30
      512          1000        27.73        27.74        27.74       36.91
     1024          1000        28.07        28.09        28.08       72.91
     2048          1000        34.95        34.97        34.96      117.15
     4096          1000        36.22        36.23        36.22      226.12
^C[mpiexec@knc4] Sending Ctrl-C to processes as requested
[mpiexec@knc4] Press Ctrl-C again to force abort
To remove this rule, use the following command.
$ sudo firewall-cmd --direct --remove-rule ipv4 filter INPUT 0 -s 10.23.3.61 -j ACCEPT
success
$ firewall-cmd --direct --get-all-rules
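To check for the rule from a script instead of reading the listing by eye, the output of firewall-cmd --direct --get-all-rules can be searched for the exact rule line. The following is a minimal sketch, assuming the listing format shown in this method; has_direct_accept_rule is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a direct-rule listing on stdin and succeed only
# if it contains the exact ACCEPT rule for the given source address.
# -x matches whole lines, -F treats the pattern as a fixed string, and
# -q suppresses output so that only the exit status is used.
has_direct_accept_rule() {
    grep -qxF -e "ipv4 filter INPUT 0 -s $1 -j ACCEPT"
}

# Example usage (commented out so that sourcing this file runs nothing):
#   firewall-cmd --direct --get-all-rules | has_direct_accept_rule 10.23.3.61 && echo "rule present"
```

Matching the whole line keeps an address such as 10.23.3.610 from being mistaken for 10.23.3.61.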
Conclusion
Firewalls can block MPI communication among the nodes. This article shared three methods you can use to allow communication among MPI ranks.