OVS-DPDK: A Practical (and Kernel-Aware) Setup Guide
1. Background
DPDK (Data Plane Development Kit) is a set of user-space libraries and drivers designed for high-throughput, low-latency packet processing. Instead of relying on the kernel networking stack for every packet, DPDK enables polling-mode drivers (PMDs) in user space and uses optimizations such as:
- CPU pinning and isolation
- NUMA-aware memory placement
- Hugepages for stable TLB behavior and DMA mappings
- VFIO for secure, high-performance device assignment
This document walks through installing DPDK + Open vSwitch (OVS) from packages (when available), enabling OVS-DPDK, creating vhost-user ports, and attaching VMs via QEMU command line or libvirt. It also lists common issues you will likely hit in real deployments.
2. Pre-requirements
2.1 NIC must be supported by DPDK
Check the supported NIC list:
https://core.dpdk.org/supported/
2.2 Platform prerequisites
- CPU pinning (isolated cores for PMDs and VM vCPUs)
- NUMA awareness (recommended; not strictly required)
- Hugepages (required for most DPDK deployments)
- IOMMU/VFIO support (strongly recommended vs UIO)
2.3 Software baseline
- Kernel: 3.2 or newer (in practice, use a recent kernel for VFIO/IOMMU stability)
- glibc: 2.7 or newer
- QEMU: vhost-user works well with QEMU >= 2.2; vhost-user-client requires QEMU >= 2.7
3. Host Setup
3.1 CPU pinning and isolation
DPDK PMDs (in ovs-vswitchd) are polling threads and should run on dedicated cores. Similarly, VM vCPUs and the QEMU emulator thread should be pinned to avoid scheduler noise.
Recommended approach:
- Reserve a set of isolated cores for DPDK PMDs
- Reserve a separate set for VM vCPUs
- Pin QEMU emulator thread to the same NUMA node
Kernel boot hint (example):
GRUB_CMDLINE_LINUX_DEFAULT="quiet isolcpus=1,2,3 nohz_full=1,2,3 rcu_nocbs=1,2,3"
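After rebooting, it is worth confirming the kernel actually applied the isolation (a typo in the GRUB line fails silently). A small check, assuming the sysfs entry exposed by modern kernels:

```shell
# Print the CPUs the kernel has isolated via isolcpus.
# Empty output means no cores are isolated.
show_isolated() {
    if [ -r /sys/devices/system/cpu/isolated ]; then
        echo "isolated CPUs: $(cat /sys/devices/system/cpu/isolated)"
    else
        echo "isolated CPUs: (sysfs entry not available on this kernel)"
    fi
}
show_isolated
```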
3.2 NUMA awareness
NUMA is not strictly required, but performance can collapse if OVS-DPDK, VMs, and NIC DMA are spread across nodes.
To identify the NUMA node of a NIC:
cat /sys/class/net/ethX/device/numa_node
Goal: place ovs-vswitchd PMD threads, hugepage memory, and VM vCPUs on the same NUMA node as the physical NIC whenever possible.
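The per-NIC sysfs lookup above can be wrapped in a loop to survey all interfaces at once (a sketch; purely virtual interfaces such as lo have no PCI device and report nothing):

```shell
# Print the NUMA node for each network interface backed by a PCI device.
# A value of -1 means the platform did not report a node (common on
# single-socket machines).
for dev in /sys/class/net/*; do
    iface=$(basename "$dev")
    if [ -r "$dev/device/numa_node" ]; then
        echo "$iface: node $(cat "$dev/device/numa_node")"
    else
        echo "$iface: no PCI device / no NUMA info"
    fi
done
```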
3.3 Hugepages
3.3.1 Static (GRUB)
Example for 1G hugepages:
default_hugepagesz=1G hugepagesz=1G hugepages=4
Regenerate the GRUB config and reboot (path and tool vary by distro; Debian/Ubuntu use update-grub):
grub2-mkconfig -o /boot/grub2/grub.cfg
3.3.2 Dynamic allocation
Example for 2MB pages:
echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
On NUMA systems, allocate per-node:
echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
Example for 1G pages on a specific node:
echo 4 > /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
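Whichever method is used, verify the allocation actually succeeded: the kernel silently allocates fewer pages than requested when memory is fragmented. A quick check:

```shell
# Summarize hugepage allocation. Global totals come from /proc/meminfo;
# per-node counts live under /sys/devices/system/node.
grep -E 'HugePages_(Total|Free)|Hugepagesize' /proc/meminfo
for node in /sys/devices/system/node/node[0-9]*; do
    for sz in "$node"/hugepages/hugepages-*; do
        [ -d "$sz" ] || continue
        echo "$(basename "$node") $(basename "$sz"): $(cat "$sz/nr_hugepages")"
    done
done
```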
3.3.3 Mount hugepage filesystem
mkdir -p /mnt/huge
mount -t hugetlbfs nodev /mnt/huge
For 1G hugepages (optional mountpoint):
mkdir -p /mnt/huge_1GB
mount -t hugetlbfs -o pagesize=1G nodev /mnt/huge_1GB
Persist via /etc/fstab:
nodev /mnt/huge hugetlbfs defaults 0 0
nodev /mnt/huge_1GB hugetlbfs pagesize=1G 0 0
Note: If libvirt is used, restart it after hugepage changes:
systemctl restart libvirtd
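To confirm the mounts (including any hugetlbfs mount the distro set up automatically, such as /dev/hugepages), check /proc/mounts; the pagesize option shows which page size each mountpoint serves:

```shell
# List active hugetlbfs mounts, or say so if there are none.
if grep -q hugetlbfs /proc/mounts; then
    grep hugetlbfs /proc/mounts
else
    echo "no hugetlbfs mounts found"
fi
```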
3.4 VFIO / IOMMU
VFIO is preferred over UIO because it provides proper IOMMU isolation and typically better security and stability. Ensure VT-d/AMD-Vi is enabled in BIOS.
Add IOMMU flags via GRUB (Intel example):
intel_iommu=on iommu=pt
Rebuild grub and reboot, then validate:
dmesg | egrep -i "DMAR|IOMMU|VFIO"
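A more direct check than dmesg is the sysfs IOMMU group listing: if /sys/kernel/iommu_groups is empty, the IOMMU is not active and vfio-pci binding will fail. A sketch:

```shell
# Count IOMMU groups; zero means intel_iommu/amd_iommu is off on the
# kernel command line or VT-d/AMD-Vi is disabled in firmware.
groups=$(find /sys/kernel/iommu_groups -mindepth 1 -maxdepth 1 -type d 2>/dev/null | wc -l)
if [ "$groups" -gt 0 ]; then
    echo "IOMMU active: $groups groups"
else
    echo "IOMMU not active (no groups in /sys/kernel/iommu_groups)"
fi
```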
4. OVS + DPDK
On some distros, OVS is built with DPDK enabled; on others you may need to build from source. Verify support by checking the OVS version output:
ovs-vswitchd --version
Example (shows DPDK enabled):
ovs-vswitchd (Open vSwitch) 2.10.1
DPDK 18.02.2
4.1 DPDK: bind the NIC to a DPDK-capable driver
Modern recommendation: use vfio-pci.
modprobe vfio
modprobe vfio-pci
Then use the DPDK devbind tool (path varies by distro; common names include dpdk-devbind.py):
dpdk-devbind.py --status
dpdk-devbind.py --bind=vfio-pci 0000:01:00.0
dpdk-devbind.py --unbind 0000:01:00.0
Legacy (less preferred) option: igb_uio (requires UIO modules):
modprobe uio
modprobe igb_uio
dpdk-devbind.py --bind=igb_uio 0000:01:00.0
Some NICs require vendor drivers or specific firmware/PMD combos; always validate with DPDK docs for your NIC family (mlx/ixgbe/i40e, etc.).
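dpdk-devbind.py takes the PCI address (e.g. 0000:01:00.0), which is easiest to recover from the interface name via sysfs before the NIC disappears from the kernel. A sketch (iface_to_pci is a hypothetical helper, and eth0 is a placeholder interface name):

```shell
# Map a kernel interface name to its PCI address by resolving the
# sysfs device symlink. Prints nothing for virtual interfaces.
iface_to_pci() {
    link="/sys/class/net/$1/device"
    [ -e "$link" ] && basename "$(readlink -f "$link")"
}
iface_to_pci eth0 || true   # empty if eth0 does not exist on this host
```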
4.2 OVS-DPDK initialization
Initialize OVS and enable DPDK:
ovs-vsctl --no-wait init
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0x6
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=1024
On NUMA systems, dpdk-socket-mem takes a comma-separated per-node list (e.g. dpdk-socket-mem=1024,1024). Note that dpdk-lcore-mask covers non-PMD DPDK threads; PMD polling cores are selected separately via other_config:pmd-cpu-mask.
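The mask values are hex bitmaps of CPU core numbers. A small helper to compute one from a core list (cores_to_mask is a hypothetical convenience function, not part of DPDK or OVS):

```shell
# cores_to_mask: convert a comma-separated CPU list into the hex
# bitmask format expected by dpdk-lcore-mask / pmd-cpu-mask.
cores_to_mask() {
    mask=0
    for core in $(echo "$1" | tr ',' ' '); do
        mask=$(( mask | (1 << core) ))
    done
    printf '0x%x\n' "$mask"
}

cores_to_mask 1,2       # cores 1 and 2 -> 0x6
cores_to_mask 4,5,6,7   # cores 4-7    -> 0xf0
```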
Restart OVS services (method depends on distro/service unit names):
systemctl restart openvswitch
4.3 Bridge and ports (netdev datapath)
4.3.1 Create a DPDK bridge
ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
4.3.2 Add a DPDK physical port
ovs-vsctl add-port br0 dpdk-p0 \
-- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:01:00.0
Some NICs (e.g., ConnectX-3) may map multiple ports under one PCI function; use class/mac selection:
ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
options:dpdk-devargs="class=eth,mac=00:11:22:33:44:55"
4.3.3 Add vhost-user ports
OVS supports:
- dpdkvhostuser (OVS is server, QEMU is client) — older, less flexible
- dpdkvhostuserclient (OVS is client, QEMU is server) — preferred; supports OVS restart without forcing VM restart
Recommendation: use dpdkvhostuserclient when possible (requires QEMU >= 2.7).
vhost-user (OVS server):
ovs-vsctl add-port br0 vhost-user1 \
-- set Interface vhost-user1 type=dpdkvhostuser
vhost-user-client (OVS client):
ovs-vsctl add-port br0 vhostclient0 \
-- set Interface vhostclient0 type=dpdkvhostuserclient \
options:vhost-server-path=/var/run/openvswitch/vhostclient0
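For dpdkvhostuserclient, OVS connects to a socket that the VM side creates, so the path will not exist until QEMU starts; for dpdkvhostuser, OVS creates it immediately. A quick sanity check of the socket (check_vhost_sock is a hypothetical helper; the path is the example from above):

```shell
# Check whether a vhost-user socket exists and is actually a socket.
# Ownership matters: QEMU/libvirt must be able to open it.
check_vhost_sock() {
    if [ -S "$1" ]; then
        echo "socket present: $1"
        ls -l "$1"
    else
        echo "socket not present yet: $1"
    fi
}
check_vhost_sock /var/run/openvswitch/vhostclient0
```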
5. VM Setup
5.1 QEMU command line example
Key points:
- Back guest memory with hugepages
- Use vhost-user netdev + virtio-net-pci
- Pin vCPUs and emulator thread (not shown here: use taskset/cgroups/libvirt)
qemu-system-x86_64 \
-name vm-dpdk \
-enable-kvm -cpu host \
-m 4096 \
-object memory-backend-file,id=mem,size=4096M,mem-path=/mnt/huge,share=on \
-numa node,memdev=mem \
-mem-prealloc \
-smp sockets=1,cores=2 \
-drive file=/path/to/disk.img,if=virtio,format=qcow2 \
-chardev socket,id=char0,path=/var/run/openvswitch/vhostclient0,server=on,wait=off \
-netdev type=vhost-user,id=net0,chardev=char0,vhostforce=on \
-device virtio-net-pci,netdev=net0,mac=52:54:00:3c:d1:ae,mrg_rxbuf=on \
-nographic
Notes:
- For dpdkvhostuserclient, QEMU acts as the server for the socket (hence server=on in many setups).
- For dpdkvhostuser, QEMU acts as the client and connects to the OVS-managed socket.
5.2 Libvirt configuration
5.2.1 Hugepage backing
<memoryBacking>
<hugepages>
<page size='1048576' unit='KiB'/>
</hugepages>
</memoryBacking>
5.2.2 vhost-user interface
With dpdkvhostuserclient ports (recommended), OVS is the client, so the VM side owns the socket and libvirt must set mode='server' on the source element. With dpdkvhostuser ports, the roles are reversed and libvirt uses mode='client'. Always match the mode with your OVS port type.
<interface type='vhostuser'>
<mac address='52:54:00:55:55:56'/>
<source type='unix' path='/var/run/openvswitch/vhostclient0' mode='server'/>
<model type='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</interface>
5.2.3 CPU/NUMA tuning (typical example)
<vcpu placement='static'>6</vcpu>
<cputune>
<shares>4096</shares>
<vcpupin vcpu='0' cpuset='0'/>
<vcpupin vcpu='1' cpuset='2'/>
<vcpupin vcpu='2' cpuset='4'/>
<vcpupin vcpu='3' cpuset='6'/>
<emulatorpin cpuset='0,2,4,6'/>
</cputune>
<numatune>
<memory mode='strict' nodeset='0'/>
</numatune>
5.3 Guest OS: hugepages + CPU isolation (optional)
If your guest runs packet processing too, also configure hugepages and isolate vCPUs inside the guest:
GRUB_CMDLINE_LINUX_DEFAULT="quiet default_hugepagesz=1G hugepagesz=1G hugepages=2 isolcpus=1,2,3"
6. Common issues
- Hugepages not mounted: OVS-DPDK starts but PMD/memory allocation fails; the VM's memory-backend-file also fails.
- OVS not built with DPDK: ovs-vswitchd --version shows no DPDK; vhost-user port types are unavailable.
- NIC not bound correctly: the DPDK PMD cannot claim the device; check dpdk-devbind.py --status.
- Wrong NUMA placement: large latency/jitter; ensure NIC, PMD threads, and VM are on the same node.
- vhost socket permissions: QEMU/libvirt cannot open /var/run/openvswitch/...; fix ownership/SELinux/AppArmor as applicable.
- Port mode mismatch: dpdkvhostuser vs dpdkvhostuserclient imply opposite socket ownership (server/client); QEMU and OVS must agree.
7. Reference
- https://wiki.qemu.org/Documentation/vhost-user-ovs-dpdk
- http://docs.openvswitch.org/en/latest/topics/dpdk/vhost-user/
- https://github.com/qemu/qemu/blob/master/tests/vhost-user-test.c
- https://github.com/openvswitch/ovs/blob/master/Documentation/intro/install/dpdk.rst