baird:~/:[0]# cat /etc/issue
Welcome to SUSE Linux Enterprise Server 15 (x86_64) - Kernel \r (\l).
baird:~/:[0]# dmesg | grep -e "Directed I/O"
[ 12.819760] DMAR: Intel(R) Virtualization Technology for Directed I/O
If this line does not appear, reboot and enable VT-d (Directed I/O) in the BIOS/UEFI setup.
baird:~/:[0]# lspci | grep -i "vga"
07:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200e [Pilot] ServerEngines (SEP1) (rev 05)
With a Tesla V100:
baird:~/:[0]# lspci | grep -i nvidia
03:00.0 3D controller: NVIDIA Corporation GV100 [Tesla V100 PCIe] (rev a1)
With a T1000 Mobile (available on Dell 5540):
linux-5540:~ # lspci | grep -i nvidia
01:00.0 3D controller: NVIDIA Corporation TU117GLM [Quadro T1000 Mobile] (rev a1)
The IOMMU is disabled by default; you need to enable it at boot time through the GRUB configuration file:
For Intel:
vim /etc/default/grub
# Make this line look like this:
GRUB_CMDLINE_LINUX="intel_iommu=on iommu=pt rd.driver.pre=vfio-pci"
For AMD:
GRUB_CMDLINE_LINUX="iommu=pt amd_iommu=on rd.driver.pre=vfio-pci"
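The edit can also be scripted with sed. A minimal sketch, demonstrated on a throwaway copy so it is safe to try anywhere; on a real host the file is /etc/default/grub (back it up first), and note that the sketch assumes the variable is currently empty:

```shell
# Demo on a temporary copy; on a real host edit /etc/default/grub instead
f=$(mktemp)
echo 'GRUB_CMDLINE_LINUX=""' > "$f"
# Insert the IOMMU flags inside the (here empty) quotes -- Intel variant shown
sed -i 's/^GRUB_CMDLINE_LINUX="/&intel_iommu=on iommu=pt rd.driver.pre=vfio-pci/' "$f"
result=$(cat "$f")
echo "$result"
rm -f "$f"
```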
Then regenerate the grub configuration file:
grub2-mkconfig -o /boot/grub2/grub.cfg
After rebooting, check that the IOMMU is active:
dmesg | grep -e DMAR -e IOMMU
baird:~/:[0]# vim /etc/modprobe.d/50-blacklist.conf
Add at the end of the file:
blacklist nouveau
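Some guides additionally disable nouveau's kernel mode setting so it cannot grab the card early in boot. An optional variant of the blacklist file (the modeset option is an extra assumption, not required on every host):

```
blacklist nouveau
options nouveau modeset=0
```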
Find the vendor ID and device ID with "lspci -nn":
baird:~/:[0]# lspci -nn | grep 03:00.0
03:00.0 3D controller [0302]: NVIDIA Corporation GV100 [Tesla V100 PCIe] [10de:1db4] (rev a1)
Here is an example with a T1000:
linux-5540:/etc/modprobe.d # lspci -nn | grep 01:00
01:00.0 3D controller [0302]: NVIDIA Corporation TU117GLM [Quadro T1000 Mobile] [10de:1fb9] (rev a1)
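If you want to grab the vendor:device pair programmatically, a minimal sketch; the sample line below is copied from the V100 output above, so substitute your own "lspci -nn" line:

```shell
# Sample "lspci -nn" line (taken from the V100 example above)
line='03:00.0 3D controller [0302]: NVIDIA Corporation GV100 [Tesla V100 PCIe] [10de:1db4] (rev a1)'
# The vendor:device pair is the last xxxx:xxxx hex token on the line
id=$(printf '%s\n' "$line" | grep -oE '[0-9a-f]{4}:[0-9a-f]{4}' | tail -n 1)
echo "$id"
```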
baird:~/:[0]# cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:1db4
Note: double-check whether your card needs extra "ids". On consumer (GeForce) NVIDIA cards it is common that you also need to add the card's audio function to the list, or you won't be able to use the card.
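A hypothetical vfio.conf with both functions listed; the IDs below are placeholders, so look up the real ones for your GPU and its audio function with "lspci -nn":

```
# GPU function plus its HDMI audio function (IDs here are placeholders)
options vfio-pci ids=10de:xxxx,10de:yyyy
```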
Add the vfio drivers to your initrd:
baird:~/:[0]# cat /etc/dracut.conf.d/gpu-passthrough.conf
add_drivers+="vfio vfio_iommu_type1 vfio_pci vfio_virqfd"
Then regenerate the initrd file:
dracut --force /boot/initrd $(uname -r)
Or use a /etc/modules-load.d/vfio-pci.conf file which contains:
pci_stub
vfio
vfio_iommu_type1
vfio_pci
kvm
kvm_intel
After the reboot, check that you have something like the following:
linux-5540:~ # dmesg | grep vfio | grep 10
[ 2.672192] vfio_pci: add [10de:1fb9[ffffffff:ffffffff]] class 0x000000/00000000
Or load the module by hand with "modprobe vfio-pci".
6 Windows Guest and MSR
For a Windows guest you will probably need to tell KVM to ignore unhandled MSR (model-specific register) accesses, to avoid crashing the guest:
Create a file /etc/modprobe.d/kvm.conf and add:
options kvm ignore_msrs=1
baird:~/:[0]# find /sys/kernel/iommu_groups/*/devices/*
/sys/kernel/iommu_groups/47/devices/0000:03:00.0
/sys/kernel/iommu_groups/49/devices/0000:07:00.0
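To turn those paths into a readable "group -> device" listing, a small sketch; the demo feeds it the two paths shown above, while on a live system you would pass /sys/kernel/iommu_groups/*/devices/* instead:

```shell
# Print "group -> device" for each iommu_groups path given as an argument
print_groups() {
  for p in "$@"; do
    group=$(printf '%s\n' "$p" | cut -d/ -f5)   # 5th field of the path is the group number
    printf '%s -> %s\n' "$group" "${p##*/}"     # last component is the PCI address
  done
}
# Demo with the two paths from the output above
print_groups /sys/kernel/iommu_groups/47/devices/0000:03:00.0 \
             /sys/kernel/iommu_groups/49/devices/0000:07:00.0
```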
Check that the NVIDIA V100 card is bound to the vfio-pci driver:
baird:~/:[0]# lspci -k
03:00.0 3D controller: NVIDIA Corporation GV100 [Tesla V100 PCIe] (rev a1)
        Subsystem: NVIDIA Corporation Device 1214
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau
Install the OVMF (UEFI) firmware for QEMU:
zypper in qemu-ovmf
Then declare the firmware code/vars pairs in the nvram setting of /etc/libvirt/qemu.conf:
nvram = [
   "/usr/share/qemu/ovmf-x86_64-4m.bin:/usr/share/qemu/ovmf-x86_64-4m-vars.bin",
   "/usr/share/qemu/ovmf-x86_64-4m-code.bin:/usr/share/qemu/ovmf-x86_64-4m-vars.bin",
   "/usr/share/qemu/ovmf-x86_64-smm-ms-code.bin:/usr/share/qemu/ovmf-x86_64-smm-ms-vars.bin",
   "/usr/share/qemu/ovmf-x86_64-smm-opensuse-code.bin:/usr/share/qemu/ovmf-x86_64-smm-opensuse-vars.bin",
   "/usr/share/qemu/ovmf-x86_64-ms-4m-code.bin:/usr/share/qemu/ovmf-x86_64-ms-4m-vars.bin",
   "/usr/share/qemu/ovmf-x86_64-smm-suse-code.bin:/usr/share/qemu/ovmf-x86_64-smm-suse-vars.bin",
   "/usr/share/qemu/ovmf-x86_64-ms-code.bin:/usr/share/qemu/ovmf-x86_64-ms-vars.bin",
   "/usr/share/qemu/ovmf-x86_64-smm-code.bin:/usr/share/qemu/ovmf-x86_64-smm-vars.bin",
   "/usr/share/qemu/ovmf-x86_64-opensuse-4m-code.bin:/usr/share/qemu/ovmf-x86_64-opensuse-4m-vars.bin",
   "/usr/share/qemu/ovmf-x86_64-suse-4m-code.bin:/usr/share/qemu/ovmf-x86_64-suse-4m-vars.bin",
   "/usr/share/qemu/ovmf-x86_64-suse-code.bin:/usr/share/qemu/ovmf-x86_64-suse-vars.bin",
   "/usr/share/qemu/ovmf-x86_64-opensuse-code.bin:/usr/share/qemu/ovmf-x86_64-opensuse-vars.bin",
   "/usr/share/qemu/ovmf-x86_64-code.bin:/usr/share/qemu/ovmf-x86_64-vars.bin",
]
To get the list of the OVMF code (bin) files and vars files, run:
rpm -ql qemu-ovmf-x86_64
Restart libvirtd on the host:
systemctl restart libvirtd
1 Generic configuration
Graphics: Spice or VNC
Video device: QXL / VGA / Virtio
1.4 Add the pci host devices:
Use the PCI id of your nvidia card:
03:00.0
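In the libvirt guest XML this ends up as a <hostdev> entry. A sketch for address 03:00.0, with the bus/slot/function split out of the PCI ID (adjust to your own card's address):

```xml
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
  </source>
</hostdev>
```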
1.5 VM guest configuration
For best performance it is better to use the virtio drivers for network and storage.
If possible, use the Q35 machine type.
Download the driver for your card from the NVIDIA website, for example:
http://www.nvidia.com/download/driverResults.aspx/131159/en-us
- install gcc-c++ and kernel-devel
- be sure Secure Boot is disabled, because the NVIDIA driver modules are unsigned: disable "secure boot" with yast2, and in the EFI menu while booting the VM select the "sles" entry.
- launch the .run installer file
cd /usr/local/cuda-9.1/samples/0_Simple/simpleTemplates
make
/usr/local/cuda-9.1/samples/0_Simple/simpleTemplates/:[0]# ./simpleTemplates
runTest<float,32>
GPU Device 0: "Tesla V100-PCIE-16GB" with compute capability 7.0
CUDA device [Tesla V100-PCIE-16GB] has 80 Multi-Processors
Processing time: 495.006000 (ms)
Compare OK
runTest<int,64>
GPU Device 0: "Tesla V100-PCIE-16GB" with compute capability 7.0
CUDA device [Tesla V100-PCIE-16GB] has 80 Multi-Processors
Processing time: 0.203000 (ms)
Compare OK
[simpleTemplates] -> Test Results: 0 Failures
connection. You need to log in via SSH, change to the console interface, or install a dedicated VNC server inside the VM. To avoid a flickering screen you can disable the display manager with "systemctl stop display-manager ; systemctl disable display-manager".
3 Windows Guest
Install the Windows guest using libvirt or virt-manager. You need to hide the hypervisor from the NVIDIA driver installer by setting "<hidden state='on'/>" in the guest definition.
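The element lives under <features> in the guest XML; a minimal fragment:

```xml
<features>
  <kvm>
    <hidden state='on'/>
  </kvm>
</features>
```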
3.1 Download the driver from the NVIDIA website: https://www.nvidia.com/Download/index.aspx
3.2 Install the CUDA toolkit: https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64
3.3 You can find some NVIDIA demo samples in "Program Files\Nvidia GPU Computing Toolkit\CUDA\v10.2\extras\demo_suite"
Attached file win10.xml is an example of a libvirt XML Windows 10 Guest configuration.