This document describes how to set up the NVIDIA GPU fabric for multi-node, multi-GPU workloads.
Author: Liang Yan
Prerequisites
1. Install OFED Driver
mlnxofedinstall --without-dkms --add-kernel-support --kernel `uname -r` --without-fw-update --force --basic
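After the installer finishes, a quick sanity check (both tools ship with MLNX_OFED):
ofed_info -s      # installed OFED version
ibdev2netdev      # map RDMA devices (mlx5_*) to network interfaces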
2. Install the NVIDIA software stack
nvidia-driver
nvidia-fabricmanager
cuda-driver
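On Ubuntu, assuming NVIDIA's CUDA apt repository is already configured, the stack can be installed roughly like this (the 535 branch is only an example; the driver and fabric manager versions must match):
apt-get install -y cuda-drivers-535 nvidia-fabricmanager-535   # cuda-drivers pulls in the matching nvidia-driver packages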
3. Load the kernel module nvidia-peermem
This module ships with the NVIDIA driver but depends on the OFED driver, so install OFED first. Make sure the module gets built and loaded.
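A minimal check that the module is available and loaded (note the lsmod name uses an underscore):
modprobe nvidia-peermem
lsmod | grep nvidia_peermem
echo nvidia-peermem > /etc/modules-load.d/nvidia-peermem.conf   # load on every boot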
4. Make sure the nvidia-fabricmanager system service is enabled and running
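With systemd:
systemctl enable --now nvidia-fabricmanager
systemctl is-active nvidia-fabricmanager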
GPU Fabric setup
1. Make sure the GPU fabric NICs (ConnectX-7) are in Ethernet mode
The ConnectX-7 is a VPI adapter and supports both InfiniBand and Ethernet; we run the fabric as Ethernet, so ensure every port is in Ethernet mode.
Mode Switch:
- BIOS change during provisioning (preferred)
- Command-line change (requires a reboot):
# PCI address of the first Mellanox adapter; repeat for every adapter in the node
mlx_pci=$(sudo ibdev2netdev -v | awk '{print $1}' | head -n1)
sudo mlxconfig -d $mlx_pci set LINK_TYPE_P1=ETH LINK_TYPE_P2=ETH
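The new link type only takes effect after a reboot. To confirm the setting was written:
sudo mlxconfig -d $mlx_pci query | grep LINK_TYPE
# expect LINK_TYPE_P1 ETH(2) and LINK_TYPE_P2 ETH(2)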
2. Bring each fabric interface up and configure its IP address. MTU 4200 is just an example; once validated, it can be raised (e.g., to 9000). The ${i}, ${subnet}, and ${octet} variables are defined in the full script in step 4.
ip link set dev ${i} up
ip link set dev ${i} mtu 4200
ADDR="192.168.${subnet}.${octet}/24"
ip addr add dev ${i} ${ADDR}
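A quick check that the address and MTU took effect:
ip -br addr show dev ${i}
ip link show dev ${i} | grep -o 'mtu [0-9]*'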
3. Link discovery with LLDP
3.1 Install LLDP tooling
Note: lldptool (used below) ships in the lldpad package, while lldpcli (used later for verification) comes from lldpd.
apt-get install lldpd lldpad
3.2 LLDP config
echo "enabling lldp for interface: $i"
lldptool set-lldp -i $i adminStatus=rxtx
lldptool -T -i $i -V sysName enableTx=yes
lldptool -T -i $i -V portDesc enableTx=yes
lldptool -T -i $i -V sysDesc enableTx=yes
lldptool -T -i $i -V sysCap enableTx=yes
lldptool -T -i $i -V mngAddr enableTx=yes
lldptool -i $i -T -V portID subtype=PORT_ID_INTERFACE_NAME
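To verify an interface is transmitting/receiving LLDP and to inspect what the switch reports back (the lldpad service must be running; neighbor TLVs appear once the peer starts transmitting):
lldptool get-lldp -i $i adminStatus
lldptool -t -n -i $i    # get-tlv from the neighbor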
4. Complete script for all interfaces
#!/bin/bash
# Fabric interfaces: all ethN except the management NICs eth0/eth1
IFACES=$(ip -br addr | grep eth | grep -v -E 'eth0|eth1' | awk '{print $1}')
# Each interface lands on its own /24; the host octet identifies this node,
# so set octet per host (node 1 -> 1, node 2 -> 2, ...)
subnet=50
octet=1
for i in ${IFACES}; do
    # Bring the link up with the (conservative) 4200 MTU
    ip link set dev ${i} up
    ip link set dev ${i} mtu 4200
    ADDR="192.168.${subnet}.${octet}/24"
    ip addr add dev ${i} ${ADDR}
    subnet=$((subnet + 1))
    # Enable LLDP tx/rx and the standard TLVs for link discovery
    echo "enabling lldp for interface: $i"
    lldptool set-lldp -i $i adminStatus=rxtx
    lldptool -T -i $i -V sysName enableTx=yes
    lldptool -T -i $i -V portDesc enableTx=yes
    lldptool -T -i $i -V sysDesc enableTx=yes
    lldptool -T -i $i -V sysCap enableTx=yes
    lldptool -T -i $i -V mngAddr enableTx=yes
    lldptool -i $i -T -V portID subtype=PORT_ID_INTERFACE_NAME
done
# Verify addresses and discovered neighbors
ip -br addr
lldpcli show neighbors
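Once every node has run the script, a basic reachability sketch from node 1, assuming node 2 ran with octet=2 (adjust the subnet list to the number of fabric interfaces):
for subnet in 50 51 52 53; do ping -c1 -W1 192.168.${subnet}.2; done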
GPU Fabric validation
1. Link validation: every fabric interface should be up, with its RDMA device mapped and the expected address assigned.
2. IP output: compare ip -br addr across nodes to confirm the addressing plan.
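A sketch of those checks on each node:
ip -br addr              # interfaces UP with the expected 192.168.5X.Y/24 addresses
ibdev2netdev             # RDMA devices mapped to the fabric interfaces and Up
lldpcli show neighbors   # which switch port each link lands on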
3. RDMA Perftest
3.1 Install from apt
apt install perftest
3.2 Build from source
git clone https://github.com/linux-rdma/perftest
cd perftest
./autogen.sh
# CUDA_H_PATH builds in GPUDirect (--use_cuda) support
./configure CUDA_H_PATH=/usr/local/cuda/include/cuda.h
make -j32
3.3 Point-to-point test between two nodes
Server (node 1)
./ib_read_bw -a -q 20 --report_gbits -d mlx5_0
Client (node 2; the last argument is the server's fabric IP)
./ib_read_bw -a -q 20 --report_gbits -d mlx5_0 192.168.50.1
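Since perftest was configured with CUDA_H_PATH, the same test can stage buffers in GPU memory to exercise GPUDirect RDMA (requires nvidia-peermem to be loaded; the --use_cuda argument is the GPU index):
Server
./ib_read_bw -a -q 20 --report_gbits -d mlx5_0 --use_cuda=0
Client
./ib_read_bw -a -q 20 --report_gbits -d mlx5_0 --use_cuda=0 192.168.50.1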
3.4 Output
References:
- https://enterprise-support.nvidia.com/s/article/howto-change-port-type-in-mellanox-connectx-3-adapter
- https://docs.nvidia.com/holoscan/sdk-user-guide/set_up_gpudirect_rdma.html#enabling-rdma-on-the-connectx-smartnic
- https://enterprise-support.nvidia.com/s/article/howto-enable-lldp-on-linux-servers-for-link-discovery
- https://github.com/linux-rdma/perftest/tree/master