NVIDIA GPU Passthrough (Tesla V100) on SLES

lyan 2021-02-18 18:07

Introduction

This document describes how to configure NVIDIA GPU passthrough for compute workloads using Tesla V100 or Quadro T1000 devices.

Scope:

GPU compute only (CUDA workloads)
Not focused on graphical desktop usage
Tested on SLES12SP3+ / SLES15

Requirements:

NVIDIA Tesla-class GPU (Maxwell / Pascal / Volta)
VT-d (Intel) or AMD-Vi (AMD) enabled in BIOS
Separate host display GPU or SSH access

Part One: Host Preparation

1. Host Environment Verification

1.1 Verify OS Version

cat /etc/issue

Expected:

SUSE Linux Enterprise Server 15 (x86_64)

---

1.2 Verify VT-d Support

dmesg | grep -e "Directed I/O"

Expected:

DMAR: Intel(R) Virtualization Technology for Directed I/O

If not enabled → enable VT-d in BIOS.

---

1.3 Verify GPU Devices

Check host display adapter:

lspci | grep -i vga

Check NVIDIA GPU:

lspci | grep -i nvidia

Example Tesla V100:

03:00.0 3D controller: NVIDIA Corporation GV100 [Tesla V100 PCIe]
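
The PCI address at the start of that line is needed again in later steps. A minimal sketch for pulling it out of the lspci output; the function name nvidia_pci_addrs is illustrative and not part of any tool:

```shell
# Print the PCI address (first field) of every NVIDIA device found in
# lspci-style output read from stdin.
nvidia_pci_addrs() {
    grep -i 'nvidia' | awk '{ print $1 }'
}

# Usage on the host:
#   lspci | nvidia_pci_addrs
```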

---

2. Enable IOMMU

Edit:

/etc/default/grub

For Intel:

GRUB_CMDLINE_LINUX="intel_iommu=on iommu=pt rd.driver.pre=vfio-pci"

For AMD:

GRUB_CMDLINE_LINUX="amd_iommu=on iommu=pt rd.driver.pre=vfio-pci"

Regenerate GRUB:

grub2-mkconfig -o /boot/grub2/grub.cfg

Reboot and verify:

dmesg | grep -e DMAR -e IOMMU
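
If the dmesg output is hard to read, it can help to first confirm that the new parameter actually reached the running kernel. A small sketch, assuming a POSIX shell; cmdline_has is a hypothetical helper name:

```shell
# Return 0 if the kernel command line given in $1 contains the exact
# parameter given in $2 as a whole word.
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage on the host after the reboot (use amd_iommu=on on AMD systems):
#   cmdline_has "$(cat /proc/cmdline)" intel_iommu=on && echo "parameter active"
```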

---

3. Blacklist Nouveau

Edit:

/etc/modprobe.d/50-blacklist.conf

Add:

blacklist nouveau
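
Once the host has been rebooted with the rebuilt initrd (step 5), it is worth checking that nouveau really stayed out. A minimal sketch that scans lsmod-style output; module_loaded is a hypothetical helper name:

```shell
# Return 0 if the module named in $1 appears in the first column of
# lsmod-style output read from stdin, 1 otherwise.
module_loaded() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}

# Usage on the host:
#   lsmod | module_loaded nouveau && echo "nouveau is still loaded"
```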

---

4. Bind GPU to VFIO

Find vendor/device ID:

lspci -nn | grep 03:00.0

Example:

[10de:1db4]

Create:

/etc/modprobe.d/vfio.conf

Add:

options vfio-pci ids=10de:1db4
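
The options line can also be generated from the lspci output instead of being typed by hand. A sketch under the assumption that GNU grep is available (it is on SLES); vfio_ids_line is a hypothetical helper name. It collects every [vendor:device] pair it sees, so extra functions of the same card end up in the same comma-separated list:

```shell
# Build the modprobe options line from `lspci -nn` output on stdin.
# Every [vvvv:dddd] pair is collected, de-duplicated and joined with commas.
vfio_ids_line() {
    ids=$(grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' \
        | cut -c2-10 \
        | sort -u \
        | paste -sd, -)
    # cut -c2-10 drops the surrounding brackets from "[10de:1db4]".
    [ -n "$ids" ] && printf 'options vfio-pci ids=%s\n' "$ids"
}

# Usage on the host (03:00 selects all functions of the example card):
#   lspci -nn -s 03:00 | vfio_ids_line
```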

⚠ Some consumer GPUs require adding HDMI audio device ID as well.

---

5. Load VFIO Modules

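Create the dracut configuration file:

/etc/dracut.conf.d/gpu-passthrough.conf

Add (dracut expects a space after the opening quote and before the closing quote):

add_drivers+=" vfio vfio_iommu_type1 vfio_pci vfio_virqfd "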

Rebuild initrd:

dracut --force

Verify after reboot:

lspci -k

Expected:

Kernel driver in use: vfio-pci
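
For a scripted check, the driver name can be extracted from the lspci -k output and compared directly. A minimal sketch; driver_in_use is a hypothetical helper name and 03:00.0 is the example address used in this guide:

```shell
# Print the value of the "Kernel driver in use:" field from `lspci -k`
# output read from stdin. Prints nothing if no driver is bound.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p'
}

# Usage on the host:
#   [ "$(lspci -k -s 03:00.0 | driver_in_use)" = "vfio-pci" ] && echo "bound to vfio-pci"
```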

---

6. Windows Guest MSR Fix

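Windows guests can crash when they access model-specific registers (MSRs) that KVM does not handle. Tell KVM to ignore such accesses.

Create:

/etc/modprobe.d/kvm.conf

Add:

options kvm ignore_msrs=1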

---

7. Verify IOMMU Group Isolation

find /sys/kernel/iommu_groups/*/devices/*
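
The raw find output is long on a real host. A sketch that reduces it to group number and PCI address pairs, so the GPU's neighbours are easy to spot; list_iommu_groups is a hypothetical helper name, and the optional argument exists only so the function can be tried against a copy of the tree:

```shell
# Print "group<TAB>pci-address" for every device under an iommu_groups tree.
# $1 is the tree root; it defaults to the real sysfs location.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # glob matched nothing: no groups exist
        group=${dev%/devices/*}        # strip "/devices/<address>"
        group=${group##*/}             # keep only the group number
        printf '%s\t%s\n' "$group" "${dev##*/}"
    done
}

# Usage on the host: show everything that shares a group with the GPU.
#   g=$(list_iommu_groups | awk '$2 == "0000:03:00.0" { print $1 }')
#   list_iommu_groups | awk -v g="$g" '$1 == g'
```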

Ensure GPU is isolated in its own group.

---

8. Enable UEFI (OVMF)

Install:

zypper in qemu-ovmf-x86_64

Verify firmware files:

rpm -ql qemu-ovmf-x86_64

Restart libvirtd:

systemctl restart libvirtd
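
When checking the firmware file list, the entries of interest are the code images that a UEFI guest boots from. A small sketch, assuming the package names its code images with a "-code.bin" suffix; verify this against the rpm -ql output on your system. ovmf_code_images is a hypothetical helper name:

```shell
# Keep only firmware code images from a package file list read from stdin.
# Assumption: code images end in "-code.bin".
ovmf_code_images() {
    grep -E -- '-code\.bin$'
}

# Usage on the host:
#   rpm -ql qemu-ovmf-x86_64 | ovmf_code_images
```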

Part Two: Guest Preparation

1. Generic VM Configuration

Enable UEFI mode
Use Q35 machine type
Add NVIDIA PCI device
Keep emulated GPU (QXL/VGA) during install
Use virtio drivers for disk/network
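
For the "Add NVIDIA PCI device" item, the device can be added from the command line instead of virt-manager. A sketch that prints a libvirt hostdev element for a given PCI address; hostdev_xml is a hypothetical helper name, and 03:00.0 is the example address from Part One:

```shell
# Print a libvirt <hostdev> element for a PCI address such as 03:00.0
# or 0000:03:00.0. A missing PCI domain is assumed to be 0000.
hostdev_xml() {
    addr=$1
    case $addr in
        *:*:*) ;;                      # domain already present
        *)     addr="0000:$addr" ;;
    esac
    domain=${addr%%:*}; rest=${addr#*:}
    bus=${rest%%:*};    rest=${rest#*:}
    slot=${rest%%.*};   func=${rest#*.}
    cat <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x$domain' bus='0x$bus' slot='0x$slot' function='0x$func'/>
  </source>
</hostdev>
EOF
}

# Usage on the host (replace <vm-name> with the libvirt domain name):
#   hostdev_xml 03:00.0 > gpu.xml
#   virsh attach-device <vm-name> gpu.xml --config
```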

---

2. Linux Guest Setup

2.1 Install NVIDIA Driver

Option A: Using official SLES repository

rpm -i nvidia-diag-driver-local-repo-sles*.rpm
zypper refresh
zypper install cuda-drivers
reboot

Option B: Using .run installer

Requirements:

gcc-c++
kernel-devel
Secure Boot disabled

---

2.2 CUDA Test


cd /usr/local/cuda/samples/0_Simple/simpleTemplates
make
./simpleTemplates

Expected:

GPU Device 0: Tesla V100-PCIE-16GB
Compare OK
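
The same check can be automated, for example in a provisioning script. A minimal sketch that assumes the sample prints lines in the format shown above; sample_passed is a hypothetical helper name:

```shell
# Return 0 if sample output on stdin names a GPU device and reports
# "Compare OK", 1 otherwise.
sample_passed() {
    awk '/^GPU Device [0-9]+:/ { dev = 1 }
         /Compare OK/          { ok = 1 }
         END                   { exit !(dev && ok) }'
}

# Usage in the guest:
#   ./simpleTemplates | sample_passed && echo "CUDA sample passed"
```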

---

2.3 Display Issue

After driver installation:

virt-manager display may disconnect
Use SSH instead
Or install VNC inside guest

Optional:


systemctl stop display-manager
systemctl disable display-manager

---

3. Windows Guest

Important: Hide the hypervisor from the NVIDIA driver by adding the following to the libvirt domain XML:

<features>
  <kvm>
    <hidden state='on'/>
  </kvm>
</features>

Install:

NVIDIA driver from official website
CUDA toolkit

Test CUDA samples in:

Program Files\NVIDIA GPU Computing Toolkit\CUDA\extras\demo_suite

Conclusion

GPU passthrough gives compute workloads close to bare-metal performance.

Requires strict IOMMU isolation (the GPU alone in its IOMMU group)
Best suited for HPC and ML workloads

If properly configured, Tesla-class GPUs work reliably with VFIO and OVMF-based virtual machines.
