Beginning-To-End DPDK Guide

This tutorial walks through installing DPDK and Open vSwitch with DPDK (OVS-DPDK) using Debian-built packages, then building a working setup with:

  • OVS-DPDK running on the host
  • A VM connected via vhost-user
  • DPDK (testpmd) running inside the VM for forwarding tests

The instructions assume Intel Niantic NICs. Mellanox ConnectX-3 Pro NICs can work but are mostly untested with these packages; a workaround section is included.


Test Environment

Hardware / OS

  • HPE DL360 Gen9
  • Debian Linux (kernel 4.4.7 at time of writing)

BIOS settings

  • Power Profile: Maximum Performance (disables C-states / P-states)
  • Intel Turbo Boost: Disabled

CPU Core Mask Model (Key Concepts)

Before configuring anything, make sure you understand how cores are assigned:

1) Linux kernel core

  • Linux will always run on core 0
  • Do not isolate core 0

2) DPDK EAL core

  • The DPDK EAL runs on one core, controlled by DPDK_OPTS
  • Example: -c 0x1 → EAL on core 0

3) OVS PMD cores

  • OVS Poll Mode Drivers (PMDs) must run on isolated cores
  • Configured via:
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=6
  • 6 means 0x6 → binary 110 → cores 1 and 2

4) testpmd core masks

  • testpmd uses a core mask too
  • Example: -c 0x7 → cores 0,1,2 are in the mask
  • The lowest-numbered core in the mask (here core 0) becomes the master core; forwarding runs on the remaining cores (1 and 2)

5) VM CPU isolation

  • VM vCPUs should be isolated too
  • Example VM core set: 3,4,5
  • Also isolate their hyperthread siblings
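The hex masks above are just bitmaps over logical core numbers. A small helper makes the conversion explicit (this function is a hypothetical illustration, not part of DPDK or OVS):

```shell
# Hypothetical helper: turn a comma-separated core list into the hex
# mask that DPDK_OPTS, pmd-cpu-mask, and testpmd -c expect.
cores_to_mask() {
  local mask=0 core
  local IFS=','
  for core in $1; do
    mask=$(( mask | (1 << core) ))
  done
  printf '0x%x\n' "$mask"
}

cores_to_mask 1,2    # -> 0x6, i.e. pmd-cpu-mask=6
cores_to_mask 0,1,2  # -> 0x7, i.e. testpmd -c 0x7
```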

Hyperthreading note

This tutorial does not require disabling HT in BIOS. Instead, isolate logical cores and their HT pairs. Example: isolate 1,2 and their HT siblings 17,18 so you can test 1/2/4 logical-core configurations.

To view HT sibling pairs:

for i in $(ls /sys/devices/system/cpu/ | grep ^cpu | grep -v freq | grep -v idle); do
  echo -e "$i\t" $(cat /sys/devices/system/cpu/$i/topology/thread_siblings_list)
done
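Alternatively, recent util-linux versions of lscpu can print the same topology as a single table (assuming lscpu supports the --extended option):

```shell
# One row per logical CPU; HT siblings share the same CORE value
lscpu --extended=CPU,CORE,SOCKET,NODE
```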

Step 1 — Setup DPDK on the Host

1.1 Install packages and enable hugepages + core isolation

Install the Debian packages:

apt-get install linux-headers-amd64 openvswitch-switch-dpdk
update-alternatives --set ovs-vswitchd /usr/lib/openvswitch-switch-dpdk/ovs-vswitchd-dpdk

Edit GRUB kernel cmdline:

vim /etc/default/grub

Add hugepages and isolation (example):

GRUB_CMDLINE_LINUX_DEFAULT="quiet default_hugepagesz=1G hugepagesz=1G hugepages=16 isolcpus=1-5,17-21"

Apply and reboot:

update-grub
reboot

NUMA warning: all cores used for PMDs/VM/testpmd should be on the same NUMA node. Use lscpu to confirm.
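After the reboot, it is worth confirming the kernel actually picked up the settings before going further (a quick sanity check using standard procfs files):

```shell
# Hugepage pool as the kernel sees it; HugePages_Total should match hugepages=16
grep Huge /proc/meminfo

# The running kernel command line should contain the hugepage and isolcpus options
cat /proc/cmdline
```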

1.2 Optional system tuning (low-latency-ish)

echo 0 > /proc/sys/kernel/randomize_va_space
echo 0 > /proc/sys/net/ipv4/ip_forward

rmmod ipmi_si
rmmod ipmi_msghandler
rmmod ipmi_devintf
rmmod lpc_ich
rmmod bridge

1.3 Bind NICs and sanity-check with testpmd

Stop OVS first:

systemctl stop openvswitch-switch

Load DPDK driver and bind NIC ports:

modprobe igb_uio
ls -la /sys/class/net
dpdk_nic_bind --status

# Example binds (PCI IDs are examples)
dpdk_nic_bind --bind=igb_uio 08:00.0
dpdk_nic_bind --bind=igb_uio 08:00.1

Run testpmd to validate DPDK:

testpmd -d /usr/lib/x86_64-linux-gnu/dpdk/librte_pmd_ixgbe.so.1.1 \
  -c 0x7 -n 4 -- -i --nb-cores=2

If you get the prompt testpmd>, DPDK is likely working. Verify forwarding config (optional):

testpmd> show config fwd
testpmd> quit

Step 2 — Configure OVS-DPDK

2.1 Configure DPDK_OPTS (EAL core + memory)

Edit:

vim /etc/default/openvswitch-switch

Set DPDK_OPTS (EAL on core 0):

export DPDK_OPTS="--dpdk -c 0x1 -n 2 --socket-mem 4096"
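On a dual-socket host, --socket-mem takes a comma-separated per-NUMA-node list. To keep all of the DPDK memory on node 0 (matching the NUMA placement advice later in this guide), something like the following works:

```shell
# 4 GB of hugepage memory on NUMA node 0, none on node 1
export DPDK_OPTS="--dpdk -c 0x1 -n 2 --socket-mem 4096,0"
```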

Restart OVS:

systemctl restart openvswitch-switch

Create QEMU bridge config once:

mkdir -p /etc/qemu
touch /etc/qemu/bridge.conf

2.2 Create bridge and ports

ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk
ovs-vsctl add-port br0 vhost-user-0 -- set Interface vhost-user-0 type=dpdkvhostuser
ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuser

2.3 Mellanox notes (if dpdk ports fail to add)

Some Mellanox setups require binding + module reload before OVS starts, and running testpmd after OVS starts to “kick” the device.

If ovs-vsctl add-port ... type=dpdk fails:

  1. Delete the ports:
ovs-vsctl del-port br0 dpdk0
ovs-vsctl del-port br0 dpdk1
  2. Add pre-start steps to /etc/init.d/openvswitch-switch (example):
modprobe uio_pci_generic
dpdk_nic_bind --bind=uio_pci_generic 08:00.0

rmmod mlx4_en
rmmod mlx4_ib
rmmod mlx4_core
modprobe -a mlx4_en mlx4_ib mlx4_core

export DPDK_OPTS="--dpdk -c f -n 4 --socket-mem 4096"
  3. Add a post-start testpmd trigger near the end:
echo quit | testpmd -d /usr/lib/x86_64-linux-gnu/dpdk/librte_pmd_mlx4.so.1.1 \
  -c 0x3 -n 4 -w 0000:${PCI_ID}.0 -w 0000:${PCI_ID}.1 -- --nb-cores=2 -i

You may need to remove/re-add dpdk ports after reboot in some cases.

2.4 OpenFlow rules (dpdk ↔ vhost mapping)

Inspect ports:

ovs-ofctl show br0

Delete default flows (important):

ovs-ofctl del-flows br0

Add symmetric forwarding rules:

ovs-ofctl add-flow br0 in_port=1,action=output:4
ovs-ofctl add-flow br0 in_port=4,action=output:1
ovs-ofctl add-flow br0 in_port=2,action=output:3
ovs-ofctl add-flow br0 in_port=3,action=output:2
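The flow rules above hard-code OpenFlow port numbers 1-4, which can differ between setups. One way to double-check the name-to-ofport mapping before adding flows (a sketch using standard ovs-vsctl; assumes OVS is running with the ports created above):

```shell
# Print the OpenFlow port number assigned to each named port
for p in dpdk0 dpdk1 vhost-user-0 vhost-user-1; do
  echo "$p -> ofport $(ovs-vsctl get Interface "$p" ofport)"
done
```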

Verify:

ovs-ofctl dump-flows br0

Note: flows must be re-added after restarting openvswitch-switch.

2.5 NUMA placement decision

Goal: run VM on the same NUMA node as ovs-vswitchd and the NIC.

Check which NUMA node the NIC uses (replace ethX with your interface name):

cat /sys/class/net/ethX/device/numa_node

Find ovs-vswitchd PID and check memory usage:

ps -ef | grep ovs-vswitchd
numastat <PID>

Step 3 — Run a VM on the Host

3.1 Create VM image (example)

qemu-img create -f qcow2 hlx-vm 10G
qemu-system-x86_64 -enable-kvm -smp 4 -m 4096 \
  -cdrom /path/to/iso -boot d -hda hlx-vm

3.2 VM hugepages + isolcpus

Inside the VM, set (example):

GRUB_CMDLINE_LINUX_DEFAULT="quiet default_hugepagesz=1G hugepagesz=1G hugepages=2 isolcpus=1,2,3"

Run update-grub and reboot the VM.

3.3 VM networking (static IP example)

auto lo
iface lo inet loopback

allow-hotplug eth0
iface eth0 inet static
  address 192.168.122.155
  gateway 192.168.122.1
  netmask 255.255.255.0

Run the VM using your script (qemu_linux.pl) with vhost-user sockets and NUMA binding, e.g.:

perl scripts/qemu_linux.pl ... --numa-node 0 --core-list 3,4,5,6 ...

Step 4 — Run DPDK (testpmd) in the VM

4.1 Install DPDK inside VM

If the VM cannot reach external repos, copy a local repo tarball over from the host, unpack it, and point apt at it in /etc/apt/sources.list:

deb file:///home/me/ovs-dpdk-16.04 cattleprod main

Then:

apt-get update
apt-get install dpdk

4.2 Bind virtio ports inside VM

You should see:

  • eth0 (management)
  • eth1, eth2 (vhost-user virtio ports)

Bind to igb_uio:

modprobe igb_uio
dpdk_nic_bind --status
dpdk_nic_bind --bind=igb_uio 00:04.0
dpdk_nic_bind --bind=igb_uio 00:05.0

4.3 Run testpmd in the VM

testpmd -d /usr/lib/x86_64-linux-gnu/dpdk/librte_pmd_virtio.so.1.1 \
  -c 0x7 -n 4 -w 0000:00:04.0 -w 0000:00:05.0 \
  -- --burst=64 --disable-hw-vlan --txd=2048 --rxd=2048 --txqflags=0xf00 -i

Start forwarding:

testpmd> set fwd mac_retry
testpmd> start
testpmd> show config fwd
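With forwarding started, port counters are the quickest way to confirm traffic is actually flowing (standard testpmd console commands):

testpmd> show port stats all
testpmd> clear port stats all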

4.4 Critical: pin the “forwarding” qemu thread to a different host core

A common issue is that all qemu threads end up competing for a single core (often core 3), which kills throughput. After testpmd> start, find the qemu thread whose CPU time keeps increasing (the forwarding thread) and pin it to a different isolated core.

On host, list qemu threads and see the CPU core column:

ps -eLF | grep -i qemu | less -S

Pin that specific thread PID to a different core:

taskset -pc 4 <THREAD_PID>

Re-check distribution:

ps -eLF | grep -i qemu | less -S
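Identifying the right thread by eye can be tedious; sorting threads by accumulated CPU time makes the hot one obvious. In this sketch, column 1 (LWP) is the thread PID to pass to taskset and column 2 (PSR) is the core it currently runs on:

```shell
# Per-thread view of qemu, busiest threads first; the awk filter keeps
# the pipeline's exit status clean even when no qemu process is running
ps -eLo lwp,psr,time,comm --sort=-time | awk 'tolower($4) ~ /qemu/'
```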

This step is often required to get stable throughput/latency results. At this point, the host OVS-DPDK + vhost-user + VM testpmd forwarding chain should be working, and you can proceed with traffic generation/testing (e.g., Spirent).
