Mellanox: Difference between revisions
Revision as of 11:08, 29 October 2024
RHEL9.4
For 'all' cards:
yum install mstflint
lspci | grep Mellanox
-> a1:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]
-> a1:00.1 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]
mstconfig -d a1:00.0 set LINK_TYPE_P1=ETH
mstconfig -d a1:00.0 set LINK_TYPE_P2=ETH
Reboot machine
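The lookup and conversion steps above can be sketched as a small dry-run helper. This is only an illustration: the function names mellanox_pci_addrs and link_type_cmds are made up here, the default of two ports is an assumption for a dual-port card, and the commands are printed rather than executed so they can be reviewed first.

```shell
#!/bin/sh
# Print the PCI addresses of Mellanox devices found in `lspci` output (read from stdin).
mellanox_pci_addrs() {
    awk '/Mellanox/ { print $1 }'
}

# Print (do not run) the mstconfig commands for one PCI address.
# The second argument is the number of ports; it defaults to 2 (assumed dual-port card).
link_type_cmds() {
    addr=$1
    ports=${2:-2}
    p=1
    while [ "$p" -le "$ports" ]; do
        echo "mstconfig -d $addr set LINK_TYPE_P$p=ETH"
        p=$((p + 1))
    done
}

# Example on the lspci lines quoted above; on a real host use: lspci | mellanox_pci_addrs
sample='a1:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]
a1:00.1 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]'
first=$(printf '%s\n' "$sample" | mellanox_pci_addrs | head -n 1)
link_type_cmds "$first"
```

Only the first PCI function is used, matching the commands above, which set both ports through a1:00.0.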
NOTE: Network Manager overwrites /etc/resolv.conf according to the port settings, so ALL settings have to be specified. Run nmtui and configure the port, specifying ALL of the following settings:
IPv4 CONFIGURATION <Manual>
Addresses (like 129.57.167.109/24)
Gateway (like 129.57.167.99)
DNS Servers (129.57.90.255, 129.57.32.101)
Search domains (jlab.org, acc.jlab.org)
NOTE??? If more than one port is configured (for example 1G DHCP and 40G manual) and all network connections are off/unplugged, Network Manager may keep overwriting /etc/resolv.conf, swinging between the settings of those two ports. When the 40G port is plugged in, /etc/resolv.conf stays at that port's settings. Probably need to specify DNS servers and search domains for ALL ports?
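The same fields can probably also be set without the nmtui form, using nmcli. The sketch below only prints the command. The helper name nm_manual_cmd and the connection name mlx40g are hypothetical, and it has not been checked on these hosts that nmcli and nmtui produce identical profiles.

```shell
#!/bin/sh
# Print (do not run) an nmcli command that sets the same fields as the nmtui form.
# Arguments: connection name, address/prefix, gateway, DNS servers, search domains
# (the last two as comma-separated lists).
nm_manual_cmd() {
    printf 'nmcli connection modify "%s" ipv4.method manual ipv4.addresses "%s" ipv4.gateway "%s" ipv4.dns "%s" ipv4.dns-search "%s"\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# "mlx40g" is a hypothetical connection name; list the real ones with: nmcli connection show
nm_manual_cmd mlx40g 129.57.167.109/24 129.57.167.99 \
    "129.57.90.255,129.57.32.101" "jlab.org,acc.jlab.org"
```

If DNS servers and search domains do have to be set on ALL ports, the helper could be called once per connection with the same last two arguments.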
RHEL9.2
For a ConnectX-5 card, the default driver seems to work. For ConnectX-6, do the following:
yum install perl-sigtrap kernel-rpm-macros
cd /root
cp /usr/downloads/MLNX_OFED_LINUX-23.10-1.1.9.0-rhel9.2-x86_64.tgz .
tar xvf MLNX_OFED_LINUX-23.10-1.1.9.0-rhel9.2-x86_64.tgz
rm MLNX_OFED_LINUX-23.10-1.1.9.0-rhel9.2-x86_64.tgz
cd MLNX_OFED_LINUX-23.10-1.1.9.0-rhel9.2-x86_64
./mlnxofedinstall --add-kernel-support --skip-repo
DO NOT DO IT: Follow the installer's instructions, if any (for example 'You may need to update your initramfs before next boot. To do that, run 'dracut -f' ')
/etc/init.d/openibd restart
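The copy, unpack and install sequence above repeats for other releases with only the tarball name and installer flags changing, so it can be sketched as a helper that prints the steps for a given tarball. The function name ofed_steps is made up for this example, and it assumes that the directory inside the archive has the same name as the tarball without '.tgz', as in the commands above.

```shell
#!/bin/sh
# Print the install steps for one MLNX_OFED tarball.
# Usage: ofed_steps TARBALL [mlnxofedinstall flags...]
ofed_steps() {
    tarball=$1
    shift
    dir=${tarball%.tgz}    # assumed: archive unpacks into a directory named like the tarball
    echo "cd /root"
    echo "cp /usr/downloads/$tarball ."
    echo "tar xvf $tarball"
    echo "rm $tarball"
    echo "cd $dir"
    echo "./mlnxofedinstall $*"
}

ofed_steps MLNX_OFED_LINUX-23.10-1.1.9.0-rhel9.2-x86_64.tgz --add-kernel-support --skip-repo
```

The steps are printed rather than run, so the output can be checked against the release notes before pasting it into a root shell.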
To see installed cards, run
mst status
For every device shown, the settings can be retrieved with 'mlxconfig -d <device> query', for example:
mlxconfig -d /dev/mst/mt4123_pciconf0 query
To convert a device from InfiniBand to Ethernet, do the following:
mlxconfig -d /dev/mst/mt4123_pciconf0 set LINK_TYPE_P1=2
where _P1 is the port number (a dual-port card has _P1 and _P2), and '=2' means Ethernet. After changing all settings, reboot the machine.
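To confirm the change after the reboot, the query output can be filtered for the LINK_TYPE rows. This is a minimal sketch: the function name link_types is made up, and the sample rows assume that the query prints the parameter name followed by a value such as ETH(2) or IB(1). Check the layout on the real host before relying on it.

```shell
#!/bin/sh
# Read `mlxconfig -d <device> query` output on stdin and print each LINK_TYPE_Pn setting.
link_types() {
    awk '$1 ~ /^LINK_TYPE_P[0-9]+$/ { print $1 "=" $2 }'
}

# Sample rows in the assumed layout (name, then value); real output has many more rows.
printf '%s\n' \
    '         LINK_TYPE_P1                        ETH(2)' \
    '         LINK_TYPE_P2                        IB(1)' | link_types

# On a real host (device path as reported by mst status):
#   mlxconfig -d /dev/mst/mt4123_pciconf0 query | link_types
```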
RHEL7.9
Depending on card version, use instructions below or above.
RHEL7.8
cd /root
cp /usr/downloads/MLNX_OFED_LINUX-5.0-2.1.8.0-rhel7.8-x86_64.tgz .
tar xvf MLNX_OFED_LINUX-5.0-2.1.8.0-rhel7.8-x86_64.tgz
rm MLNX_OFED_LINUX-5.0-2.1.8.0-rhel7.8-x86_64.tgz
cd MLNX_OFED_LINUX-5.0-2.1.8.0-rhel7.8-x86_64
./mlnxofedinstall --add-kernel-support
/etc/init.d/openibd restart
If the last command complains about some modules, unload them or just reboot the machine.
Run command
/sbin/connectx_port_config --show
to display port settings, change it with
/sbin/connectx_port_config
setting all ports to 'eth'.
Proceed to the following section, RHEL7.5.
RHEL7.5
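The driver is already included in the OS, so the Mellanox card will be identified and configured by default. Still, install numactl as noted here to have the nmtui command, which will be needed below (on a stock RHEL 7 system nmtui is normally provided by the NetworkManager-tui package, so check that one too if nmtui is missing).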
yum install numactl
Check (and modify if necessary) following files:
/etc/hostname: contains something like
clonfarm3.jlab.org
/etc/sysconfig/network:
NISDOMAIN=CCCHP
NETWORKDELAY=30
GATEWAY=129.57.167.99
/etc/resolv.conf:
search jlab.org acc.jlab.org
nameserver 129.57.90.255
nameserver 129.57.32.101
Run nmtui and set the port name, like p3p1. Set IPv4 to 'Manual' and fill in only the IP address with mask (like 129.57.167.103/24) and the gateway (like 129.57.167.99); ignore the rest. The file /etc/sysconfig/network-scripts/ifcfg-p3p1 should look like the following:
HWADDR=F4:52:14:41:07:71
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
IPADDR=129.57.167.103
PREFIX=24
GATEWAY=129.57.167.99
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=p3p1
UUID=c9f979fb-c7ca-3237-bacc-4118389d03af
ONBOOT=yes
AUTOCONNECT_PRIORITY=-999
NOTE: make sure 'ONBOOT=yes'.
DO NOT DO IT: In file 'ifcfg-p6p1' set 'ONBOOT=yes', in file 'ifcfg-em1' set 'ONBOOT=no'.
Reboot the machine. When it starts to come back up, unplug the copper cable and plug the Mellanox in.
After the machine has booted, check /etc/resolv.conf; it sometimes loses its contents. Restore it if needed.
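The post-boot check of /etc/resolv.conf can be sketched as a function that reports which expected nameservers are missing. The function name missing_nameservers is made up for this example, and the demo runs against a temporary file rather than the real /etc/resolv.conf.

```shell
#!/bin/sh
# Print every expected nameserver that is missing from a resolv.conf-style file.
# Usage: missing_nameservers FILE SERVER...
missing_nameservers() {
    file=$1
    shift
    for ns in "$@"; do
        # Exact match on the second field of "nameserver <address>" lines.
        awk -v ns="$ns" '$1 == "nameserver" && $2 == ns { found = 1 } END { exit !found }' \
            "$file" 2>/dev/null || echo "$ns"
    done
}

# Demo on a temporary file that lost its second nameserver line.
tmp=$(mktemp)
printf 'search jlab.org acc.jlab.org\nnameserver 129.57.90.255\n' > "$tmp"
missing_nameservers "$tmp" 129.57.90.255 129.57.32.101
rm -f "$tmp"

# On a real host:
#   missing_nameservers /etc/resolv.conf 129.57.90.255 129.57.32.101
```

An empty result means both servers are present. A missing or unreadable file reports every server as missing.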
Useful command: netstat -rn shows routing.
NOTE: you may want to install the Mellanox driver anyway to have its tools, for example to switch ports from ib to eth:
cd /root
cp /usr/downloads/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.5-x86_64.tgz .
tar xvf MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.5-x86_64.tgz
rm MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.5-x86_64.tgz
cd MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.5-x86_64
Install driver using following command (may need option '--force' sometimes):
./mlnxofedinstall --add-kernel-support
Restart it (if complains about some modules, unload those or just reboot machine):
/etc/init.d/openibd restart
Run command
/sbin/connectx_port_config --show
to display port settings, change it with
/sbin/connectx_port_config
setting all ports to 'eth'.
RHEL7.4
Run
yum install numactl
cd /root
cp /usr/downloads/MLNX_OFED_LINUX-4.1-1.0.2.0-rhel7.4-x86_64.tgz .
gunzip MLNX_OFED_LINUX-4.1-1.0.2.0-rhel7.4-x86_64.tgz
tar xvf MLNX_OFED_LINUX-4.1-1.0.2.0-rhel7.4-x86_64.tar
rm MLNX_OFED_LINUX-4.1-1.0.2.0-rhel7.4-x86_64.tar
cd MLNX_OFED_LINUX-4.1-1.0.2.0-rhel7.4-x86_64
./mlnxofedinstall --force
It will install everything and print following:
Please reboot your system for the changes to take effect.
To load the new driver, run:
/etc/init.d/openibd restart
Change following files:
/etc/hostname: contains something like
clondaq3.jlab.org
/etc/sysconfig/network:
NISDOMAIN=CCCHP
GATEWAY=129.57.167.99
/etc/resolv.conf:
search jlab.org acc.jlab.org
nameserver 129.57.90.255
nameserver 129.57.32.101
Run nmtui and select p6p1. Set 'Manual' and fill in only the IP address with mask (like 129.57.167.226/24) and the gateway (like 129.57.167.99); ignore the rest. The file /etc/sysconfig/network-scripts/ifcfg-p6p1 should look like the following:
TYPE=Ethernet
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=no
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=p6p1
UUID=ec3cd5dc-8f9e-4e13-aa0f-a9c1ec1996c2
DEVICE=p6p1
ONBOOT=no
PROXY_METHOD=none
BROWSER_ONLY=no
IPADDR=129.57.167.226
PREFIX=24
GATEWAY=129.57.167.99
In file 'ifcfg-p6p1' set 'ONBOOT=yes', in file 'ifcfg-em1' set 'ONBOOT=no'.
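The two ONBOOT edits can also be done from the command line instead of an editor. A minimal sketch, assuming GNU sed (as shipped with RHEL) and using a made-up function name set_onboot. The demo edits a temporary file, not the real ifcfg files.

```shell
#!/bin/sh
# Set ONBOOT to the given value in an ifcfg file, appending the line if the key is absent.
# Usage: set_onboot FILE yes|no
set_onboot() {
    file=$1
    value=$2
    if grep -q '^ONBOOT=' "$file"; then
        sed -i "s/^ONBOOT=.*/ONBOOT=$value/" "$file"   # GNU sed in-place edit
    else
        echo "ONBOOT=$value" >> "$file"
    fi
}

# Demo on a temporary copy of a minimal ifcfg file.
tmp=$(mktemp)
printf 'NAME=p6p1\nDEVICE=p6p1\nONBOOT=no\n' > "$tmp"
set_onboot "$tmp" yes
grep '^ONBOOT=' "$tmp"
rm -f "$tmp"

# On a real host:
#   set_onboot /etc/sysconfig/network-scripts/ifcfg-p6p1 yes
#   set_onboot /etc/sysconfig/network-scripts/ifcfg-em1 no
```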
Reboot the machine. When it starts to come back up, unplug the copper cable and plug the Mellanox in.
After machine is booted, check /etc/resolv.conf and restore it if needed.
Useful command: netstat -rn shows routing.
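To check quickly that the default route now goes through the Mellanox port, the routing table can be filtered for the default entry. A sketch with a made-up function name default_route. The sample table uses the usual Linux 'netstat -rn' column layout with the addresses from this page.

```shell
#!/bin/sh
# Read `netstat -rn` output on stdin and print "gateway interface" for the default route.
default_route() {
    awk '$1 == "0.0.0.0" && $4 ~ /G/ { print $2, $NF }'
}

# Sample routing table; on a real host use: netstat -rn | default_route
default_route <<'EOF'
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         129.57.167.99   0.0.0.0         UG        0 0          0 p6p1
129.57.167.0    0.0.0.0         255.255.255.0   U         0 0          0 p6p1
EOF
```

If the interface printed is still the copper port (for example em1), the ONBOOT settings or the cabling were probably not changed as described.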
RHEL7
Run
yum install numactl
cd /root
cp /usr/downloads/mlnx-en-3.1-1.0.4.tgz .
gunzip mlnx-en-3.1-1.0.4.tgz
tar xvf mlnx-en-3.1-1.0.4.tar
rm mlnx-en-3.1-1.0.4.tar
cd mlnx-en-3.1-1.0.4
./install.sh
#Run command
# /sbin/connectx_port_config --show
#to display port settings, change it with
# /sbin/connectx_port_config
#setting all ports to 'eth'.
Configure the new 'eth' port as desired (the left port now seems to be p6p2 ...). Use, for example, 'nmtui', the text interface to NetworkManager. Set 'Manual' and fill in only the IP address with mask (like 129.57.167.41/24) and the gateway (like 129.57.167.99); ignore the rest. Reboot the machine and unplug the copper Ethernet cable when it starts booting; it should come back up using fiber.
RHEL6
Run
yum install numactl
cd /root
cp /usr/downloads/MLNX_OFED_LINUX-3.1-1.0.3-rhel6.7-x86_64.iso .
mkdir tmp
mount -o loop MLNX_OFED_LINUX-3.1-1.0.3-rhel6.7-x86_64.iso tmp
cd tmp
./mlnxofedinstall
Run command (as instructed by installation process):
/etc/init.d/openibd restart
Run command
/sbin/connectx_port_config --show
to display port settings, change it with
/sbin/connectx_port_config
setting all ports to 'eth'.
After it is installed, run /usr/bin/system-config-network and add a new device eth0 as a manual network with appropriate settings (do NOT specify DNS servers!). Shut down the computer, unplug the copper cable and start the computer; the 10G fiber link should become the default one. OR, if the fiber is already plugged in, set ONBOOT=no for the copper port currently in use, set ONBOOT=yes for the fiber port, and reboot.