NVIDIA GPU Passthrough (Tesla V100) on SLES

Introduction

This document describes how to configure NVIDIA GPU passthrough for compute workloads using Tesla V100 or Quadro T1000 devices.

Scope:

  • GPU compute only (CUDA workloads)
  • Not focused on graphical desktop usage
  • Tested on SLES 12 SP3 and later, and on SLES 15

Requirements:

  • NVIDIA Tesla-class GPU (Maxwell / Pascal / Volta)
  • VT-d enabled in BIOS
  • Separate host display GPU or SSH access

Part One: Host Preparation

1. Host Environment Verification

1.1 Verify OS Version

cat /etc/issue
Expected:
SUSE Linux Enterprise Server 15 (x86_64)
---

1.2 Verify VT-d Support

dmesg | grep -e "Directed I/O"
Expected:
DMAR: Intel(R) Virtualization Technology for Directed I/O
If there is no such line, enable VT-d in the BIOS and reboot.
---

1.3 Verify GPU Devices

Check host display adapter:
lspci | grep -i vga
Check NVIDIA GPU:
lspci | grep -i nvidia
Example Tesla V100:
03:00.0 3D controller: NVIDIA Corporation GV100 [Tesla V100 PCIe]
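The PCI address in the first column is needed again in later steps. As a minimal sketch, the lspci output can be filtered down to just that column; the function name nvidia_pci_addrs is made up for this example and is not part of any package.

```shell
# Hypothetical helper: reads `lspci` output on stdin and prints the PCI
# address (first column) of every line that mentions NVIDIA.
nvidia_pci_addrs() {
    awk 'tolower($0) ~ /nvidia/ { print $1 }'
}
```

Typical use would be `lspci | nvidia_pci_addrs`. A card that also exposes an audio function will produce more than one address.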
---

2. Enable IOMMU

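Edit:
/etc/default/grub
Add the options below to the kernel command line, keeping any parameters that are already set there.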
For Intel:
GRUB_CMDLINE_LINUX="intel_iommu=on iommu=pt rd.driver.pre=vfio-pci"
For AMD:
GRUB_CMDLINE_LINUX="amd_iommu=on iommu=pt rd.driver.pre=vfio-pci"
Regenerate GRUB:
grub2-mkconfig -o /boot/grub2/grub.cfg
Reboot and verify:
dmesg | grep -e DMAR -e IOMMU
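Besides dmesg, it may help to confirm that the new parameters actually reached the running kernel. The sketch below checks a command-line string for a list of parameters; cmdline_has is a hypothetical name, and the check is a plain whole-word match rather than a full parser.

```shell
# Hypothetical helper: succeeds only if the kernel command line given as the
# first argument contains every following argument as a whole word.
cmdline_has() {
    local cmdline="$1" param
    shift
    for param in "$@"; do
        case " $cmdline " in
            *" $param "*) ;;  # found, check the next one
            *) printf 'missing: %s\n' "$param" >&2; return 1 ;;
        esac
    done
}
```

On an Intel host this could be called as `cmdline_has "$(cat /proc/cmdline)" intel_iommu=on iommu=pt`.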
---

3. Blacklist Nouveau

Edit:
/etc/modprobe.d/50-blacklist.conf
Add:
blacklist nouveau
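When this step is scripted, the line should be added only once, even if the script runs repeatedly. A minimal sketch of an idempotent append follows; ensure_line is a hypothetical helper name.

```shell
# Hypothetical helper: appends LINE to FILE unless FILE already contains
# exactly that line. grep -x matches whole lines, -F treats LINE literally.
ensure_line() {
    local file="$1" line="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Run as root, for example: `ensure_line /etc/modprobe.d/50-blacklist.conf 'blacklist nouveau'`.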
---

4. Bind GPU to VFIO

Find vendor/device ID:
lspci -nn | grep 03:00.0
Example:
[10de:1db4]
Create:
/etc/modprobe.d/vfio.conf
Add:
options vfio-pci ids=10de:1db4
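The options line can also be derived from the lspci -nn output rather than typed by hand. The following sketch extracts every [vendor:device] pair from its input and joins them with commas; vfio_ids_line is a hypothetical name, and the sketch assumes the IDs appear in the bracketed form shown above.

```shell
# Hypothetical helper: reads `lspci -nn` lines on stdin and prints a
# vfio-pci options line listing every [vendor:device] ID found.
vfio_ids_line() {
    local ids
    ids="$(grep -oiE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' \
        | sed -e 's/^\[//' -e 's/\]$//' \
        | sort -u \
        | paste -sd, -)"
    [ -n "$ids" ] || return 1   # no IDs in the input
    printf 'options vfio-pci ids=%s\n' "$ids"
}
```

For example, `lspci -nn -s 03:00 | vfio_ids_line` would cover all functions of the card in slot 03:00. Review the result before writing it to /etc/modprobe.d/vfio.conf.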
⚠ Some consumer GPUs also require the device ID of their HDMI audio function; list it in the same ids= option, separated by a comma.
---

5. Load VFIO Modules

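Create the dracut configuration file:
/etc/dracut.conf.d/gpu-passthrough.conf
Add (dracut expects a space after the opening quote and before the closing quote):
add_drivers+=" vfio vfio_iommu_type1 vfio_pci vfio_virqfd "
On kernel 6.2 and later, vfio_virqfd is part of the vfio module and can be dropped from the list.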
Rebuild initrd:
dracut --force
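Before rebooting, it can be worth confirming that the modules actually landed in the new initrd. The sketch below reads a file listing, such as the one printed by lsinitrd (shipped with dracut), and reports any requested module without a matching .ko file. The name missing_initrd_modules is hypothetical, and the match on file names is an assumption about how the modules are packaged.

```shell
# Hypothetical helper: reads an initrd file listing on stdin and prints each
# module named as an argument that has no matching .ko file in the listing.
missing_initrd_modules() {
    local listing mod
    # Module files may use '-' where the module name uses '_'
    # (vfio-pci.ko vs vfio_pci), so normalise the listing first.
    listing="$(sed 's/-/_/g')"
    for mod in "$@"; do
        printf '%s\n' "$listing" | grep -q "/${mod}\.ko" || printf '%s\n' "$mod"
    done
}
```

Example: `lsinitrd | missing_initrd_modules vfio vfio_iommu_type1 vfio_pci`. Empty output means every listed module was found.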
Verify after reboot:
lspci -k -s 03:00.0
Expected:
Kernel driver in use: vfio-pci
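For a scripted check, the driver name can be pulled out of the lspci -k output and compared directly. This is a minimal sketch with a hypothetical function name, driver_in_use.

```shell
# Hypothetical helper: reads `lspci -k` output on stdin and prints the value
# of each "Kernel driver in use:" line.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p'
}
```

Example: `[ "$(lspci -k -s 03:00.0 | driver_in_use)" = vfio-pci ] && echo bound`. If the device has no driver bound, the function prints nothing.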
---

6. Windows Guest MSR Fix

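Windows guests may access model-specific registers (MSRs) that KVM does not emulate, which can crash the guest. Telling KVM to ignore such accesses avoids this.
Create:
/etc/modprobe.d/kvm.conf
Add:
options kvm ignore_msrs=1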
---

7. Verify IOMMU Group Isolation

find /sys/kernel/iommu_groups/*/devices/*
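The raw find output lists every device in every group. A small helper can narrow it to the devices that share a group with the GPU. The sketch below uses hypothetical function names; the optional root argument exists only so the functions can be pointed at a test directory.

```shell
# Hypothetical helper: prints "<group> <pci-address>" for every device found
# under the IOMMU groups directory (default: /sys/kernel/iommu_groups).
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}" dev group
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue          # glob matched nothing
        group="${dev%/devices/*}"          # strip "/devices/<addr>"
        group="${group##*/}"               # keep the group number
        printf '%s %s\n' "$group" "${dev##*/}"
    done
}

# Hypothetical helper: prints every device in the same group as ADDR
# (full form with PCI domain, e.g. 0000:03:00.0), including ADDR itself.
iommu_group_peers() {
    local addr="$1" root="${2:-/sys/kernel/iommu_groups}" group
    group="$(list_iommu_groups "$root" | awk -v a="$addr" '$2 == a { print $1 }')"
    [ -n "$group" ] || return 1            # device not found in any group
    list_iommu_groups "$root" | awk -v g="$group" '$1 == g { print $2 }'
}
```

Example: `iommu_group_peers 0000:03:00.0`. If the output contains only the GPU's own address, nothing else shares its group.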
Ensure the GPU is isolated in its own IOMMU group.
---

8. Enable UEFI (OVMF)

Install:
zypper in qemu-ovmf-x86_64
Verify firmware files:
rpm -ql qemu-ovmf-x86_64
Restart libvirtd:
systemctl restart libvirtd

Part Two: Guest Preparation

1. Generic VM Configuration

  • Enable UEFI mode
  • Use Q35 machine type
  • Add NVIDIA PCI device
  • Keep emulated GPU (QXL/VGA) during install
  • Use virtio drivers for disk/network
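When the PCI device is added through the domain XML rather than virt-manager, the element to add is a libvirt <hostdev>. As a sketch, the helper below prints one for an address in lspci form; hostdev_xml is a hypothetical name, PCI domain 0000 is assumed, and managed='yes' is one possible choice that leaves binding the device to libvirt.

```shell
# Hypothetical helper: prints a libvirt <hostdev> element for a PCI address
# given as bus:slot.function (e.g. 03:00.0). PCI domain 0000 is assumed.
hostdev_xml() {
    local addr="$1" bus rest slot func
    bus="${addr%%:*}"      # "03"
    rest="${addr#*:}"      # "00.0"
    slot="${rest%.*}"      # "00"
    func="${rest#*.}"      # "0"
    cat <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x${bus}' slot='0x${slot}' function='0x${func}'/>
  </source>
</hostdev>
EOF
}
```

Example: `hostdev_xml 03:00.0 > hostdev.xml`, then attach the file to the guest with `virsh attach-device <domain> hostdev.xml --config`.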
---

2. Linux Guest Setup

2.1 Install NVIDIA Driver

Option A: Using official SLES repository
rpm -i nvidia-diag-driver-local-repo-sles*.rpm
zypper refresh
zypper install cuda-drivers
reboot
Option B: Using .run installer
Requirements:
  • gcc-c++
  • kernel-devel
  • Secure Boot disabled
---

2.2 CUDA Test


cd /usr/local/cuda/samples/0_Simple/simpleTemplates
make
./simpleTemplates
Expected:
GPU Device 0: Tesla V100-PCIE-16GB
Compare OK
---

2.3 Display Issue

After driver installation:
  • virt-manager display may disconnect
  • Use SSH instead
  • Or install VNC inside guest
Optional:

systemctl stop display-manager
systemctl disable display-manager
---

3. Windows Guest

Important: Hide the hypervisor from the NVIDIA driver. In the libvirt domain XML:
<features>
  <kvm>
    <hidden state='on'/>
  </kvm>
</features>
Install:
  • NVIDIA driver from official website
  • CUDA toolkit
Test CUDA samples in:
Program Files\NVIDIA GPU Computing Toolkit\CUDA\extras\demo_suite

Conclusion

GPU passthrough provides near-native performance for compute workloads.

  • Performance ≈ bare metal
  • Requires strict IOMMU isolation
  • Best suited for HPC and ML workloads

If properly configured, Tesla-class GPUs work reliably with VFIO and OVMF-based virtual machines.
