kprobe vs kretprobe — Code Deep Dive

1. Overview

The Linux kernel provides several dynamic tracing mechanisms, among which kprobe and kretprobe are the most fundamental and widely used. Both allow injecting callbacks into kernel functions at runtime, but they serve different purposes and trigger at different points in execution.

kprobe: Can insert breakpoints at any probeable kernel address, providing two callback points — pre_handler (before instruction execution) and post_handler (after instruction execution).

kretprobe: Designed specifically for function-level tracing, providing entry_handler (function entry) and handler (function return) callbacks, with a built-in per-instance private data channel.

2. kprobe post_handler In Depth

2.1 Implementation Mechanism

kprobe works by inserting an int3 breakpoint instruction at the target address. When the CPU hits that address, the following flow is triggered:

int3 trap fires → call pre_handler → single-step the original instruction → debug trap fires → call post_handler → resume normal execution

Key Point: What "can't get the return value" actually means
post_handler fires immediately after the probed instruction completes single-stepping — the function body hasn't finished executing yet. The rax register holds the intermediate state after that one instruction, not the function's final return value. "Can't get the return value" refers to the real kernel function's return value, not the probe code's own return value.
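
The hit sequence above can be modeled in plain user-space C. This is a sketch, not kernel code: toy_run, struct toy_probe, and the three insn_* steps are hypothetical names invented for illustration, and a "function" is reduced to a list of steps that each update a single rax value. It only shows why post_handler observes the state after one instruction rather than the function's return value.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model (user space, NOT kernel code): each "instruction" maps rax -> rax. */
typedef long (*toy_insn)(long rax);

struct toy_probe {
    size_t index;    /* index of the probed instruction */
    long seen_pre;   /* rax as seen by the modeled pre_handler */
    long seen_post;  /* rax as seen by the modeled post_handler */
};

long insn_load5(long rax)  { (void)rax; return 5; }
long insn_double(long rax) { return rax * 2; }
long insn_add3(long rax)   { return rax + 3; }

/* Run all instructions; at the probed one, record rax before and after it. */
long toy_run(const toy_insn *code, size_t n, struct toy_probe *p)
{
    assert(code != NULL);
    long rax = 0;
    for (size_t i = 0; i < n; i++) {
        int hit = (p != NULL && i == p->index);
        if (hit)
            p->seen_pre = rax;    /* models trap #1 + pre_handler */
        rax = code[i](rax);       /* models single-stepping the instruction */
        if (hit)
            p->seen_post = rax;   /* models trap #2 + post_handler */
    }
    return rax;                   /* the function's real return value */
}
```

With the steps { insn_load5, insn_double, insn_add3 } and a probe on index 1, the modeled post_handler sees 10 while the function itself goes on to return 13.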

2.2 Code Example

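#include <linux/kprobes.h>

// pre_handler: runs before the probed instruction executes
static int my_pre_handler(struct kprobe *p, struct pt_regs *regs)
{
    return 0;  // 0 lets kprobes continue with the normal flow
}

// post_handler: runs after the probed instruction has executed
static void my_post_handler(struct kprobe *p, struct pt_regs *regs,
                            unsigned long flags)
{
    // regs->ax is the intermediate state after this one instruction,
    // NOT the function return value!
}

// kprobe struct definition (the handlers must be declared before it)
static struct kprobe kp = {
    .symbol_name  = "do_sys_open",    // must exist in the running kernel's kallsyms
    .pre_handler  = my_pre_handler,   // before instruction
    .post_handler = my_post_handler,  // after instruction
};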

2.3 Characteristics & Limitations

Probe granularity: Any probeable kernel address (instruction-level)

Performance overhead: Two traps per hit (int3 + single-step debug exception) — relatively high

Return value access: Not possible. The function hasn't finished executing when post_handler fires

Concurrency safety: No built-in per-instance data isolation. Must manage global/per-cpu variables yourself

3. kprobe Arbitrary Address Probing

The core advantage of kprobe is the ability to probe any instruction inside a function, not just the entry/exit. Two ways to specify the probe point:

3.1 Method 1: symbol_name + offset

The kernel resolves the symbol via kallsyms, then adds your offset to get the final address. This is the most common way to probe mid-function.

struct kprobe kp = {
    .symbol_name = "do_sys_open",
    .offset = 0x42,  // 66 bytes into the function
};
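
How symbol_name + offset turns into a probe address can be sketched in user-space C. The symbol table, toy_resolve, and the size field are hypothetical stand-ins for what kallsyms provides, and the bounds check is a choice of this sketch rather than a statement about what register_kprobe() does internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one kallsyms entry (user-space model only). */
struct toy_sym {
    const char *name;
    uint64_t addr;   /* start address of the function */
    uint64_t size;   /* function length in bytes (assumed known) */
};

/*
 * Resolve name + offset to an absolute address.
 * Returns 0 if the symbol is unknown or the offset falls outside the function.
 */
uint64_t toy_resolve(const struct toy_sym *tab, size_t n,
                     const char *name, uint64_t offset)
{
    assert(tab != NULL && name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tab[i].name, name) != 0)
            continue;
        if (offset >= tab[i].size)
            return 0;               /* offset points past the function */
        return tab[i].addr + offset;
    }
    return 0;                       /* no such symbol */
}
```

With a table entry placing do_sys_open at 0xffffffff81200000, an offset of 0x42 resolves to 0xffffffff81200042, which is the address the breakpoint would be written to.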

3.2 Method 2: Raw Address

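// Set either .symbol_name or .addr, never both (register_kprobe() rejects that).
struct kprobe kp = {
    .addr = (kprobe_opcode_t *)0xffffffff81234567,  // illustrative address
};

// Or compute the address at runtime, e.g. inside the module's init function.
// Note: kallsyms_lookup_name() is no longer exported to modules since
// Linux 5.7, so this only works where the function is still available.
unsigned long addr = kallsyms_lookup_name("do_sys_open") + 0x10;
kp.addr = (kprobe_opcode_t *)addr;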

3.3 Practical Example: Probing Mid-Function

Disassemble to find the target instruction's offset:

ffffffff81200000 <do_sys_open>:
  81200000: push   rbp
  81200004: sub    rsp, 0x20
  8120000b: call   getname
  81200010: test   rax, rax  ← probe here (offset=0x10)
  81200013: je     error_path

Set .offset = 0x10 to probe test rax, rax. Because test does not modify rax, regs->ax in your pre_handler or post_handler will contain what getname() returned, so there is no need for a kretprobe on getname at all.

⚠ Constraint
The probe address must be a valid instruction boundary. Probing in the middle of a multi-byte instruction will corrupt it. The kernel does basic validation, but getting the offset right is your responsibility via disassembly.
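
The instruction-boundary rule can be illustrated by walking instruction lengths forward from the function start, which is the general idea behind such a check. This is a user-space sketch: toy_is_insn_boundary is a hypothetical helper, and the lengths you pass in are assumed to come from your own disassembly.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if offset is the first byte of an instruction, 0 otherwise.
 * lens[i] is the byte length of the i-th instruction, starting at offset 0.
 */
int toy_is_insn_boundary(const unsigned char *lens, size_t n, size_t offset)
{
    assert(lens != NULL);
    size_t pos = 0;
    for (size_t i = 0; i < n; i++) {
        if (pos == offset)
            return 1;           /* landed exactly on an instruction start */
        if (pos > offset)
            return 0;           /* walked past it: offset is mid-instruction */
        pos += lens[i];
    }
    return 0;                   /* offset is at or beyond the end of the code */
}
```

Taking the instruction starts from the illustrative listing (0x0, 0x4, 0xb, 0x10, 0x13) gives lengths 4, 7, 5, 3 plus an assumed 2 bytes for the final je. Offset 0x10 is then accepted, while 0x11 is rejected because it falls inside the test instruction.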

4. kretprobe In Depth

4.1 Implementation: Trampoline, NOT "kprobe at ret"

You might intuitively think kretprobe is "a kprobe placed at the ret instruction." It's not. A function can have multiple ret instructions (early returns, error paths, different branches) and the compiler may tail-call optimize some returns away. Statically finding all exit points would be fragile and impractical.

What you might think:          What actually happens:

kprobe at ret #1  ✘            kprobe at function ENTRY (offset 0)
kprobe at ret #2  ✘               ↓
kprobe at ret #3  ✘            Replace return address on stack
kprobe at ret #4  ✘               ↓
                               ALL ret paths land on trampoline

4.2 Internal Flow

// ① Function entry kprobe fires
entry_kprobe_hit:
    save  original_return_addr = stack[rsp]    // remember real caller
    stack[rsp] = &kretprobe_trampoline         // hijack return address

// ② Function executes normally, hits any ret...

// ③ ret jumps to trampoline instead of real caller
kretprobe_trampoline:
    call  user_handler(ri, regs)    // your callback, rax = return value
    jmp   original_return_addr      // restore, jump back to real caller

Correct Mental Model
kretprobe = kprobe at entry + stack return address replacement. One probe point covers all exit paths. Side effects: backtraces can be confused (the real return address is temporarily gone from the stack), and the handler never fires for functions that never return (e.g., do_exit(), panic()).

4.3 Code Example (with ri->data private data channel)

#include <linux/kprobes.h>
#include <linux/ktime.h>

// Private data struct: one copy per in-flight call (per-instance isolation)
struct my_data { u64 entry_time; unsigned long arg0; };

// entry_handler: record args + timestamp at function entry
static int my_entry(struct kretprobe_instance *ri, struct pt_regs *regs) {
    struct my_data *d = (struct my_data *)ri->data;
    d->entry_time = ktime_get_ns();  // ktime_get_ns() returns u64 nanoseconds
    d->arg0 = regs->di;  // first argument (x86_64)
    return 0;            // 0 = also run the return handler for this call
}

// handler: get return value + compute latency at function return
static int my_ret(struct kretprobe_instance *ri, struct pt_regs *regs) {
    struct my_data *d = (struct my_data *)ri->data;
    u64 duration = ktime_get_ns() - d->entry_time;
    long retval = regs_return_value(regs);
    pr_info("arg0=%lx ret=%ld took %llu ns\n", d->arg0, retval, duration);
    return 0;
}

static struct kretprobe krp = {
    .kp.symbol_name = "do_sys_open",
    .handler = my_ret, .entry_handler = my_entry,
    .data_size = sizeof(struct my_data), .maxactive = 20,
};

4.4 What kprobe Can Do That kretprobe Cannot

kretprobe is more powerful for function-level tracing, but kprobe has unique advantages:

1. Arbitrary address probing: kprobe can probe any instruction inside a function. kretprobe is locked to function boundaries.

2. No maxactive limit: kretprobe has a fixed instance pool. If concurrent calls exceed maxactive, the excess calls are not traced (they are only counted in nmissed). kprobe doesn't have this problem.

3. No return address manipulation: kretprobe modifies the stack, which confuses stack unwinders, backtraces, and tools like perf. kprobe's int3 + single-step approach is cleaner.

4. Works on non-returning functions: Functions like do_exit() and panic() never return, so kretprobe's handler will never fire. kprobe at the entry works fine.

Relationship Summary
kretprobe is a specialized tool built on top of kprobe (it literally contains a kprobe internally). It trades flexibility for convenience. kprobe gives you instruction-level surgery; kretprobe gives you function-level observability.

5. Execution Timeline

The diagram below shows the complete execution flow comparison between kprobe and kretprobe:

[Figure: Linux kernel dynamic tracing timeline, kprobe (pre_handler + post_handler) vs kretprobe (entry_handler + handler)]
kprobe (probes a single instruction, two traps): normal execution → ① int3 (trap #1, breakpoint fires) → pre_handler (callback before the instruction, can read regs) → ② single-step the original instruction → ③ debug trap (trap #2, single-step exception) → post_handler (callback after the instruction, sees its effect) → resume; the function body continues, so the function's return value is not available.
kretprobe (probes function entry + return, trampoline mechanism): normal execution → ① int3 at entry (internal kprobe) → entry_handler (records arguments + timestamp, writes ri->data) → ② return address replaced with the trampoline → function body executes normally (no extra overhead) → ③ ret jumps to the trampoline → handler (gets the return value, computes latency, reads ri->data) → resume; ri->data is the per-instance private data channel.

Figure 1: kprobe vs kretprobe execution timeline comparison

6. Core Comparison Tables

6.1 kprobe post_handler vs kretprobe handler

Dimension | kprobe post_handler | kretprobe handler
Trigger point | After probed instruction executes | When entire function returns
Mechanism | int3 + single-step | Return address replacement (trampoline)
Return value | ✘ Not accessible | ✔ regs_return_value(regs)
Latency measurement | ✘ Not suitable | ✔ entry + handler pair
Probe granularity | Any address (instruction-level) | Function-level only
Overhead | 2 traps per hit (~1μs) | 1 trap per hit (~0.5μs)
Private data | ✘ No built-in mechanism | ✔ ri->data per-instance

6.2 kretprobe entry_handler vs kprobe pre_handler

Dimension | kretprobe entry_handler | kprobe pre_handler
Trigger location | Function entry only | Any address
Private data passing | ✔ ri->data per-instance | ✘ No built-in mechanism
Paired return handler | ✔ Naturally paired | ✘ None
Concurrency safety | ✔ maxactive instances | Must handle yourself
Standalone use | ✘ Requires kretprobe | ✔ Can be used alone

7. Performance Overhead Analysis

Mechanism | Trap Count | Additional Operations | Typical Overhead
kprobe (pre+post) | 2 | int3 + single-step debug exception | ~1μs/hit
kretprobe | 1 (entry) | Replace/restore return address | ~0.5μs/hit
kprobe (pre only) | 1* | int3 only (*no post_handler) | ~0.5μs/hit
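
To turn the per-hit figures in the table into a CPU budget, multiply by the hit rate. The helper names below are hypothetical, and the ~1μs and ~0.5μs inputs are only the rough estimates from the table; real costs depend on hardware and kernel version, so treat the result as an order-of-magnitude check.

```c
#include <assert.h>
#include <stdint.h>

/* Nanoseconds of probe work per second of wall time, for one probe. */
uint64_t probe_busy_ns_per_sec(uint64_t hits_per_sec, uint64_t cost_ns_per_hit)
{
    /* Guard against overflow of the multiplication below. */
    assert(cost_ns_per_hit == 0 || hits_per_sec <= UINT64_MAX / cost_ns_per_hit);
    return hits_per_sec * cost_ns_per_hit;
}

/* The same figure expressed as parts-per-thousand of a single CPU. */
uint64_t probe_cpu_permille(uint64_t hits_per_sec, uint64_t cost_ns_per_hit)
{
    uint64_t busy = probe_busy_ns_per_sec(hits_per_sec, cost_ns_per_hit);
    return busy / 1000000;   /* 1 s = 1e9 ns, so 1 permille = 1e6 ns */
}
```

At 100,000 hits per second, a ~1μs pre+post kprobe works out to 100 permille (about 10% of one CPU), while a ~0.5μs probe works out to about 5%.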
⚠ Note: maxactive Limitation
kretprobe has a maxactive limit. When concurrent function calls exceed maxactive, excess calls are silently missed. In production, monitor the nmissed counter and set maxactive based on expected concurrency.

8. The Full Picture: Dynamic vs Static vs eBPF

kprobe/kretprobe are dynamic tracing mechanisms. The kernel also provides static tracing (tracepoint) and the eBPF programmable framework.

8.1 Hierarchy

Linux Kernel Tracing Mechanisms
├── Dynamic Tracing
│   ├── kprobe / kretprobe    ← any kernel function, inserted at runtime
│   └── uprobe / uretprobe    ← userspace functions
│
└── Static Tracing
    └── tracepoint            ← hooks pre-placed by kernel developers

8.2 Dynamic vs Static Tracing

Dimension | Dynamic (kprobe) | Static (tracepoint)
Probe points | Any probeable address | Pre-defined locations by kernel devs
Stability | May change across kernel versions | Relatively stable across versions (not a strict ABI guarantee)
Overhead | Higher (int3 trap) | Lower (compile-time, nop-like)
Parameter access | Via registers, need ABI knowledge | Structured parameters, direct access
Coverage | Nearly all kernel functions | Only pre-instrumented locations
Use case | Debugging, deep analysis | Production monitoring, stable tracing

8.3 eBPF Program Types → Tracing Mechanism Mapping

eBPF itself is not a tracing mechanism — it's a programmable framework that runs on top of probe points:

eBPF Program Type | Underlying Mechanism | Description
kprobe | kprobe | Dynamic probe at function entry
kretprobe | kretprobe | Dynamic probe at function return
tracepoint | tracepoint | Static tracepoint, structured parameters
raw_tracepoint | tracepoint (raw) | Bypasses parameter parsing, direct raw data access, faster
fentry / fexit | ftrace (direct) | Lighter function entry/exit probing, no int3

Note: tracepoint_return and raw_tracepoint_return are not separate program types in the mainline kernel. A tracepoint is a single event, so "return" coverage comes from attaching to a separate exit-side tracepoint where one exists (for example, the syscalls:sys_exit_* tracepoints).

tracepoint vs raw_tracepoint
tracepoint parses parameters into structured format before passing to the eBPF program — convenient but adds overhead. raw_tracepoint bypasses this layer, passing raw parameters directly — better performance but you must parse data structures yourself.

8.4 How to Choose

Production monitoring: Prefer tracepoint or fentry/fexit. Both are low overhead, and tracepoints are the more stable choice across kernel versions.

Deep debugging / performance analysis: Use kprobe/kretprobe. Probe anywhere, not limited to pre-placed hooks.

High-performance tracing: Use raw_tracepoint or fentry/fexit. Avoid parameter parsing overhead or int3 traps.

Userspace tracing: Use uprobe/uretprobe. Similar to kprobe but targets userspace programs.

9. Decision Guide

9.1 Use kprobe when

You need to probe arbitrary locations inside a function (not just entry/exit)
You only need to observe register state changes after a specific instruction
You're probing functions that never return (do_exit, panic)

9.2 Use kretprobe when

You need to capture function return values (return codes, pointers, error codes)
You need to measure function execution latency (entry_handler records start, handler computes delta)
You need to correlate arguments with return values (via ri->data private data channel)

9.3 Use tracepoint / eBPF when

Production environment needs stable ABI that won't break across kernel versions
You want structured parameter access without manually parsing registers
Performance-sensitive — you need the lowest overhead probing possible

10. Summary

kprobe post_handler: Probes the state after a single instruction executes. Fine-grained but cannot access the function's return value.

kretprobe: Probes the moment a function returns. Implemented by hijacking the return address to a trampoline at entry — NOT by placing a kprobe at ret.

kprobe arbitrary address: Via symbol_name + offset or raw addr, you can probe any instruction inside a function.

tracepoint / eBPF: Static tracing provides a more stable interface and lower overhead. eBPF is a programmable framework that runs on top of probe points.

Function latency & return values → kretprobe. Register state after a specific instruction → kprobe. Stable production tracing → tracepoint + eBPF.
