This guide describes how to set up the NVIDIA GPU fabric for multi-node, multi-GPU systems.

Authors: Liang Yan 

 

Prerequisites

1. Install the OFED driver

mlnxofedinstall --without-dkms --add-kernel-support --kernel "$(uname -r)" --without-fw-update --force --basic

 

2. Install the NVIDIA software stack

  • nvidia-driver
  • nvidia-fabricmanager
  • cuda-driver

     

3. Load the nvidia-peermem kernel module

This module ships with the NVIDIA driver but depends on the OFED driver. Make sure it has been built and is loaded.
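
One way to confirm that the module is loaded is to look for it in /proc/modules, where it is listed with an underscore as nvidia_peermem. The helper below is a minimal sketch; the function name peermem_loaded and its optional file argument are illustrative and not part of any NVIDIA tooling.

```shell
# Succeeds if nvidia_peermem appears in the given module list.
# Defaults to /proc/modules; another file can be passed for testing.
peermem_loaded() {
    local modules_file="${1:-/proc/modules}"
    grep -q '^nvidia_peermem ' "$modules_file"
}
```

A typical use on a node would be: peermem_loaded || sudo modprobe nvidia-peermem, followed by a second call to peermem_loaded to confirm the result.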

 

 

4. Make sure the nvidia-fabricmanager system service is running
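
On a systemd-based host, the service state can be checked with systemctl is-active. The wrapper below is a small sketch under that assumption; service_is_on is a hypothetical helper name.

```shell
# Succeeds if the given systemd unit reports "active".
service_is_on() {
    [ "$(systemctl is-active "$1" 2>/dev/null)" = "active" ]
}
```

For example: sudo systemctl enable --now nvidia-fabricmanager, then service_is_on nvidia-fabricmanager && echo "fabric manager is running".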

GPU Fabric setup

1. Make sure the GPU fabric NICs (CX-7) are in Ethernet mode

The fabric currently uses VPI adapters, which support both InfiniBand and Ethernet. Make sure the ports are in Ethernet mode.

To switch modes, either:

  • Change the setting in the BIOS during provisioning (preferred)
  • Change it from the command line (requires a reboot)

 

# PCI address of the first adapter only; repeat for each adapter on the node
mlx_pci=$(sudo ibdev2netdev -v | awk '{print $1}' | head -n1)
sudo mlxconfig -d "$mlx_pci" set LINK_TYPE_P1=ETH LINK_TYPE_P2=ETH
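
To confirm the setting, the output of mlxconfig query can be scanned for the LINK_TYPE_P<n> parameters. The sketch below assumes each parameter is printed as a name followed by a value such as ETH(2); the exact layout may differ between MFT versions, and all_ports_eth is a hypothetical helper name.

```shell
# Reads `mlxconfig -d <device> query` output on stdin.
# Succeeds only if at least one LINK_TYPE_P<n> line is present
# and every such line has a value starting with ETH.
all_ports_eth() {
    awk '
        $1 ~ /^LINK_TYPE_P[0-9]+$/ {
            seen = 1
            if ($2 !~ /^ETH/) bad = 1
        }
        END {
            if (seen && !bad) exit 0
            exit 1
        }
    '
}
```

For example: sudo mlxconfig -d "$mlx_pci" query | all_ports_eth && echo "all ports set to Ethernet".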

 

2. Bring up each interface and assign it an IP address. The MTU of 4200 is only an example; once the link is verified, it can be raised (for example, to 9000).

 

        # ${i} is the interface name; subnet and octet are set in the full script in step 4
        ip link set dev "${i}" up
        ip link set dev "${i}" mtu 4200
        ADDR="192.168.${subnet}.${octet}/24"
        ip addr add dev "${i}" "${ADDR}"
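
One way to verify that a given MTU works end to end is to ping a peer with fragmentation disabled and a payload sized to fill the MTU exactly. The payload is the MTU minus 28 bytes (20 bytes of IPv4 header plus 8 bytes of ICMP header). The sketch below only does that arithmetic; icmp_payload is a hypothetical helper name.

```shell
# Largest ICMP payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.
icmp_payload() {
    echo $(( $1 - 28 ))
}
```

For example, with a peer assumed to be at 192.168.50.2 (an illustrative address): ping -M do -c 3 -s "$(icmp_payload 4200)" 192.168.50.2. If the ping fails at 9000 but succeeds at 4200, some hop on the path does not yet accept the larger MTU.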

 

3. Link discovery with LLDP

3.1 Install lldpd

           

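apt-get install lldpd

Note: lldptool, used in step 3.2, is provided by the lldpad package. Install lldpad as well if lldptool is not found.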

 

3.2 Configure LLDP

        echo "enabling lldp for interface: $i"
        lldptool set-lldp -i "$i" adminStatus=rxtx
        lldptool -T -i "$i" -V sysName enableTx=yes
        lldptool -T -i "$i" -V portDesc enableTx=yes
        lldptool -T -i "$i" -V sysDesc enableTx=yes
        lldptool -T -i "$i" -V sysCap enableTx=yes
        lldptool -T -i "$i" -V mngAddr enableTx=yes
        lldptool -T -i "$i" -V portID subtype=PORT_ID_INTERFACE_NAME


4. Complete script

#!/bin/bash

# Fabric interfaces: every eth* interface except the ports eth0 and eth1
IFACES=$(ip -br addr | awk '$1 ~ /^eth/ && $1 != "eth0" && $1 != "eth1" {print $1}')

subnet=50   # third octet; incremented so each interface gets its own /24
octet=1     # host part; use a different value on each node

for i in ${IFACES}; do
        # Bring the link up and assign the address
        ip link set dev "${i}" up
        ip link set dev "${i}" mtu 4200
        ADDR="192.168.${subnet}.${octet}/24"
        ip addr add dev "${i}" "${ADDR}"
        subnet=$((subnet + 1))

        # Advertise this interface over LLDP
        echo "enabling lldp for interface: $i"
        lldptool set-lldp -i "$i" adminStatus=rxtx
        lldptool -T -i "$i" -V sysName enableTx=yes
        lldptool -T -i "$i" -V portDesc enableTx=yes
        lldptool -T -i "$i" -V sysDesc enableTx=yes
        lldptool -T -i "$i" -V sysCap enableTx=yes
        lldptool -T -i "$i" -V mngAddr enableTx=yes
        lldptool -T -i "$i" -V portID subtype=PORT_ID_INTERFACE_NAME
done

 

# Show the resulting addresses and the LLDP neighbors
ip -br addr
lldpcli show neighbors
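
The addressing scheme used by the script can be summarized as: the Nth fabric interface (counting from 0) lands in subnet 192.168.(50+N).0/24, and the host part identifies the node. The function below is a sketch of that rule only; fabric_addr and its rail/node parameters are illustrative names, not part of the script.

```shell
# Address for a given rail (0-based fabric interface index) and node number,
# following the script's scheme: third octet = 50 + rail, host part = node.
fabric_addr() {
    local rail="$1" node="$2"
    echo "192.168.$((50 + rail)).${node}/24"
}
```

For example, fabric_addr 0 1 gives the address of the first fabric interface on node 1, and fabric_addr 0 2 gives its peer on node 2 in the same subnet.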

 

GPU Fabric validation

1. Validation


2. IP Output

 

3. RDMA Perftest

3.1 Install from apt

apt install perftest

 

3.2 Build from source

git clone https://github.com/linux-rdma/perftest
cd perftest
./autogen.sh
# CUDA_H_PATH enables CUDA support in the build
./configure CUDA_H_PATH=/usr/local/cuda/include/cuda.h
make -j32

 

3.3 Point-to-point test between two nodes

Server

./ib_read_bw -a -q 20 --report_gbits -d mlx5_0

Client (192.168.50.1 is the server's fabric address)

./ib_read_bw -a -q 20 --report_gbits -d mlx5_0 192.168.50.1
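
The commands above exercise only mlx5_0. To cover every adapter, the same server command can be generated for each RDMA device listed under /sys/class/infiniband. This is a sketch: server_cmds is a hypothetical helper, and the sysfs listing may include adapters that are not part of the GPU fabric, so filter the result as needed.

```shell
# Print one ib_read_bw server command per RDMA device.
# Devices are read from sysfs; another directory can be passed for testing.
server_cmds() {
    local sysfs_dir="${1:-/sys/class/infiniband}" dev
    for dev in "$sysfs_dir"/*; do
        if [ -e "$dev" ]; then
            echo "./ib_read_bw -a -q 20 --report_gbits -d $(basename "$dev")"
        fi
    done
}
```

Each printed command is run on the server, with the matching client command pointed at the server's address on the corresponding subnet.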

 

3.4 Output

 

Others:

 

Reference: