Beginning-to-End DPDK Guide
This tutorial walks through installing DPDK and Open vSwitch with DPDK (OVS-DPDK) using Debian-built packages, then building a working setup with:
- OVS-DPDK running on the host
- A VM connected via vhost-user
- DPDK (testpmd) running inside the VM for forwarding tests
The instructions assume Intel Niantic (82599) NICs. Mellanox ConnectX-3 Pro NICs can work but are mostly untested with these packages; a workaround section is included.
Test Environment
Hardware / OS
- 1× HPE DL360 Gen9
- Debian Linux (kernel 4.4.7 at time of writing)
BIOS settings
- Power Profile: Maximum Performance (disables C-states / P-states)
- Intel Turbo Boost: Disabled
CPU Core Mask Model (Key Concepts)
Before configuring anything, make sure you understand how cores are assigned:
1) Linux kernel core
- Linux will always run on core 0
- Do not isolate core 0
2) DPDK EAL core
- The DPDK EAL runs on one core, controlled by DPDK_OPTS
- Example: -c 0x1 → EAL on core 0
3) OVS PMD cores
- OVS Poll Mode Drivers (PMDs) must run on isolated cores
- Configured via:
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=6
- The value 6 means 0x6 → binary 110 → cores 1 and 2
4) testpmd core masks
- testpmd uses a core mask too
- Example: -c 0x7 → cores 0, 1, and 2 are in the mask
- The lowest-numbered core becomes the master core (core 0); forwarding uses the other cores (1 and 2)
5) VM CPU isolation
- VM vCPUs should be isolated too
- Example VM core set: 3,4,5
- Also isolate their hyperthread siblings
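The mask arithmetic used throughout (pmd-cpu-mask, -c for EAL and testpmd) can be sketched with a small helper. This is a hypothetical shell function, not part of any DPDK package: it sets bit N of the mask for each core N and prints the result in hex.

```shell
# cores_to_mask: turn a comma-separated core list into a hex core mask.
# Hypothetical helper for illustration; bit N of the mask selects core N.
cores_to_mask() {
    local mask=0 core
    for core in $(echo "$1" | tr ',' ' '); do
        mask=$(( mask | (1 << core) ))
    done
    printf '0x%x\n' "$mask"
}

cores_to_mask 1,2    # prints 0x6 (binary 110 -> pmd-cpu-mask=6, cores 1 and 2)
cores_to_mask 0,1,2  # prints 0x7 (binary 111 -> testpmd -c 0x7)
```

Working the conversion in this direction (core list → mask) makes it easy to double-check that a mask you pass to OVS or testpmd really selects the cores you isolated.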
Hyperthreading note
This tutorial does not require disabling HT in BIOS. Instead, isolate logical cores and their HT pairs. Example: isolate 1,2 and their HT siblings 17,18 so you can test 1/2/4 logical-core configurations.
To view HT sibling pairs:
for i in $(ls /sys/devices/system/cpu/ | grep ^cpu | grep -v freq | grep -v idle); do
echo -e "$i\t" $(cat /sys/devices/system/cpu/$i/topology/thread_siblings_list)
done
Step 1 — Setup DPDK on the Host
1.1 Install packages and enable hugepages + core isolation
Install the Debian packages:
apt-get install linux-headers-amd64 openvswitch-switch-dpdk
update-alternatives --set ovs-vswitchd /usr/lib/openvswitch-switch-dpdk/ovs-vswitchd-dpdk
Edit GRUB kernel cmdline:
vim /etc/default/grub
Add hugepages and isolation (example):
GRUB_CMDLINE_LINUX_DEFAULT="quiet default_hugepagesz=1G hugepagesz=1G hugepages=16 isolcpus=1-5,17-21"
Apply and reboot:
update-grub
reboot
NUMA warning: all cores used for PMDs/VM/testpmd should be on the same NUMA node. Use lscpu to confirm.
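After the reboot it is worth confirming that the hugepage and isolation settings actually took effect. The expected values in the comments assume the example GRUB line above (16× 1G hugepages, isolcpus=1-5,17-21):

```shell
grep Huge /proc/meminfo   # HugePages_Total should read 16, Hugepagesize 1048576 kB
cat /proc/cmdline         # should contain the hugepages= and isolcpus= arguments
lscpu | grep -i numa      # shows which cores belong to each NUMA node
```

If HugePages_Total is 0, the kernel did not accept the GRUB arguments; re-check /etc/default/grub and rerun update-grub before continuing.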
1.2 Optional system tuning (low-latency-ish)
echo 0 > /proc/sys/kernel/randomize_va_space
echo 0 > /proc/sys/net/ipv4/ip_forward
rmmod ipmi_si
rmmod ipmi_msghandler
rmmod ipmi_devintf
rmmod ipc_ich
rmmod bridge
1.3 Bind NICs and sanity-check with testpmd
Stop OVS first:
systemctl stop openvswitch-switch
Load DPDK driver and bind NIC ports:
modprobe igb_uio
ls -la /sys/class/net
dpdk_nic_bind --status
# Example binds (PCI IDs are examples)
dpdk_nic_bind --bind=igb_uio 08:00.0
dpdk_nic_bind --bind=igb_uio 08:00.1
Run testpmd to validate DPDK:
testpmd -d /usr/lib/x86_64-linux-gnu/dpdk/librte_pmd_ixgbe.so.1.1 \
-c 0x7 -n 4 -- -i --nb-cores=2
If you get the prompt testpmd>, DPDK is likely working. Verify forwarding config (optional):
testpmd> show config fwd
testpmd> quit
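Beyond reaching the prompt, a couple of standard testpmd console commands confirm that both ports were detected and their links are up (shown here for reference):

```
testpmd> show port info all
testpmd> show port stats all
```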
Step 2 — Configure OVS-DPDK
2.1 Configure DPDK_OPTS (EAL core + memory)
Edit:
vim /etc/default/openvswitch-switch
Set DPDK_OPTS (EAL on core 0):
export DPDK_OPTS="--dpdk -c 0x1 -n 2 --socket-mem 4096"
Restart OVS:
systemctl restart openvswitch-switch
Create QEMU bridge config once:
mkdir -p /etc/qemu
touch /etc/qemu/bridge.conf
2.2 Create bridge and ports
ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk
ovs-vsctl add-port br0 vhost-user-0 -- set Interface vhost-user-0 type=dpdkvhostuser
ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuser
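A quick sanity check is useful here; if a dpdk port failed to attach, the error shows up in the interface record. A command sketch, assuming the port names used above and the default OVS run directory on these Debian packages:

```shell
ovs-vsctl show                    # all four ports should be listed under br0
ovs-vsctl list interface dpdk0 | grep -E '^(name|error|status)'
ls /var/run/openvswitch/          # the vhost-user-0/1 sockets should appear here
```

The socket paths are what QEMU will connect to later, so note where they land.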
2.3 Mellanox notes (if dpdk ports fail to add)
Some Mellanox setups require binding + module reload before OVS starts, and running testpmd
after OVS starts to “kick” the device.
If ovs-vsctl add-port ... type=dpdk fails:
- Delete the ports:
ovs-vsctl del-port br0 dpdk0
ovs-vsctl del-port br0 dpdk1
- Add pre-start steps to /etc/init.d/openvswitch-switch (example):
modprobe uio_pci_generic
dpdk_nic_bind --bind=uio_pci_generic 08:00.0
rmmod mlx4_en
rmmod mlx4_ib
rmmod mlx4_core
modprobe -a mlx4_en mlx4_ib mlx4_core
export DPDK_OPTS="--dpdk -c f -n 4 --socket-mem 4096"
- Add post-start testpmd trigger near the end:
echo quit | testpmd -d /usr/lib/x86_64-linux-gnu/dpdk/librte_pmd_mlx4.so.1.1 \
-c 0x3 -n 4 -w 0000:${PCI_ID}.0 -w 0000:${PCI_ID}.1 -- --nb-cores=2 -i
You may need to remove/re-add dpdk ports after reboot in some cases.
2.4 OpenFlow rules (dpdk ↔ vhost mapping)
Inspect ports:
ovs-ofctl show br0
Delete default flows (important):
ovs-ofctl del-flows br0
Add symmetric forwarding rules:
ovs-ofctl add-flow br0 in_port=1,action=output:4
ovs-ofctl add-flow br0 in_port=4,action=output:1
ovs-ofctl add-flow br0 in_port=2,action=output:3
ovs-ofctl add-flow br0 in_port=3,action=output:2
Verify:
ovs-ofctl dump-flows br0
Note: flows must be re-added after restarting openvswitch-switch.
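Since the flows are lost on every restart of openvswitch-switch, it helps to keep them in a small script and rerun it after each restart. This is a hypothetical helper; the port numbers assume the mapping shown above:

```shell
#!/bin/sh
# restore-flows.sh: re-install the dpdk <-> vhost-user flows after an OVS restart.
# Port numbers (1,2 = dpdk ports; 3,4 = vhost-user ports) assume the example
# setup; confirm them with `ovs-ofctl show br0` first.
set -e
ovs-ofctl del-flows br0
ovs-ofctl add-flow br0 in_port=1,action=output:4
ovs-ofctl add-flow br0 in_port=4,action=output:1
ovs-ofctl add-flow br0 in_port=2,action=output:3
ovs-ofctl add-flow br0 in_port=3,action=output:2
ovs-ofctl dump-flows br0
```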
2.5 NUMA placement decision
Goal: run VM on the same NUMA node as ovs-vswitchd and the NIC.
Check which NUMA node the NIC uses:
cat /sys/class/net/ethX/device/numa_node
Find ovs-vswitchd PID and check memory usage:
ps -ef | grep ovs-vswitchd
numastat <PID>
Step 3 — Run a VM on the Host
3.1 Create VM image (example)
qemu-img create -f qcow2 hlx-vm 10G
qemu-system-x86_64 -enable-kvm -smp 4 \
-cdrom /path/to/iso -boot d hlx-vm
3.2 VM hugepages + isolcpus
Inside the VM, set (example):
GRUB_CMDLINE_LINUX_DEFAULT="quiet default_hugepagesz=1G hugepagesz=1G hugepages=2 isolcpus=1,2,3"
Run update-grub and reboot the VM.
3.3 VM networking (static IP example)
auto lo
iface lo inet loopback
allow-hotplug eth0
iface eth0 inet static
address 192.168.122.155
gateway 192.168.122.1
netmask 255.255.255.0
Run the VM using your script (qemu_linux.pl) with vhost-user sockets and NUMA binding, e.g.:
perl scripts/qemu_linux.pl ... --numa-node 0 --core-list 3,4,5,6 ...
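If you are not using a wrapper script, a bare QEMU invocation with vhost-user needs hugepage-backed shared memory plus one chardev/netdev pair per socket. A minimal sketch, assuming the socket paths from Step 2, NUMA node 0, and the example image name; adjust memory size, SMP count, and paths to your setup:

```shell
numactl --membind=0 --cpunodebind=0 \
qemu-system-x86_64 -enable-kvm -m 4096 -smp 3 \
  -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \
  -numa node,memdev=mem -mem-prealloc \
  -chardev socket,id=char0,path=/var/run/openvswitch/vhost-user-0 \
  -netdev type=vhost-user,id=net0,chardev=char0,vhostforce \
  -device virtio-net-pci,netdev=net0 \
  -chardev socket,id=char1,path=/var/run/openvswitch/vhost-user-1 \
  -netdev type=vhost-user,id=net1,chardev=char1,vhostforce \
  -device virtio-net-pci,netdev=net1 \
  -drive file=hlx-vm,format=qcow2
```

The share=on memory backend is required: OVS-DPDK maps the guest memory to move packets over vhost-user. Pinning the vCPU threads to the isolated host cores (e.g. 3,4,5) can then be done with taskset, as in Step 4.4.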
Step 4 — Run DPDK (testpmd) in the VM
4.1 Install DPDK inside VM
If VM cannot reach external repos, copy a local repo tarball from host and add:
deb file:///home/me/ovs-dpdk-16.04 cattleprod main
Then:
apt-cdrom add
apt-get update
apt-get install dpdk
4.2 Bind virtio ports inside VM
You should see:
- eth0 (management)
- eth1, eth2 (vhost-user virtio ports)
Bind to igb_uio:
modprobe igb_uio
dpdk_nic_bind --status
dpdk_nic_bind --bind=igb_uio 00:04.0
dpdk_nic_bind --bind=igb_uio 00:05.0
4.3 Run testpmd in the VM
testpmd -d /usr/lib/x86_64-linux-gnu/dpdk/librte_pmd_virtio.so.1.1 \
-c 0x7 -n 4 -w 0000:00:04.0 -w 0000:00:05.0 \
-- --burst=64 --disable-hw-vlan --txd=2048 --rxd=2048 --txqflags=0xf00 -i
Start forwarding:
testpmd> set fwd mac_retry
testpmd> start
testpmd> show config fwd
4.4 Critical: pin the “forwarding” qemu thread to a different host core
A common issue: all qemu threads end up burning one core (often core 3), killing throughput.
After testpmd> start, find the qemu thread whose runtime increases and pin it to a different core.
On host, list qemu threads and see the CPU core column:
ps -eLF | grep -i qemu | less -S
Pin that specific thread PID to a different core:
taskset -pc 4 <THREAD_PID>
Re-check distribution:
ps -eLF | grep -i qemu | less -S
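To spot the thread worth moving, sorting qemu threads by CPU usage is quicker than eyeballing the full listing (ps field names: LWP = thread ID, PSR = core the thread last ran on):

```shell
# List qemu threads sorted by CPU usage; after `testpmd> start`, the busiest
# LWP is the forwarding/vhost thread to repin with taskset.
ps -eLo lwp,pcpu,psr,comm | grep -i qemu | sort -k2 -rn | head
```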
This step is often required to get stable throughput/latency results. At this point, the host OVS-DPDK + vhost-user + VM testpmd forwarding chain should be working, and you can proceed with traffic generation/testing (e.g., Spirent).