kprobe vs kretprobe — Code Deep Dive

lyan 2026-02-28 14:17

1. Overview

The Linux kernel provides several dynamic tracing mechanisms, among which kprobe and kretprobe are the most fundamental and widely used. Both allow injecting callbacks into kernel functions at runtime, but they serve different purposes and trigger at different points in execution.

kprobe: Can insert breakpoints at any probeable kernel address, providing two callback points — pre_handler (before instruction execution) and post_handler (after instruction execution).

kretprobe: Designed specifically for function-level tracing, providing entry_handler (function entry) and handler (function return) callbacks, with a built-in per-instance private data channel.

2. kprobe post_handler In Depth

2.1 Implementation Mechanism

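On x86, kprobe works by replacing the first byte of the instruction at the target address with an int3 breakpoint. When the CPU hits that address, the following flow is triggered (depending on kernel version, the second trap may be another int3 placed after an out-of-line copy of the instruction rather than a hardware single-step exception; the handler order is the same):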

① int3 trap fires → call pre_handler → ② single-step the original instruction → debug trap fires → call post_handler → ③ resume normal execution

Key Point: What "can't get the return value" actually means

post_handler fires immediately after the probed instruction completes single-stepping — the function body hasn't finished executing yet. The rax register holds the intermediate state after that one instruction, not the function's final return value. "Can't get the return value" refers to the real kernel function's return value, not the probe code's own return value.
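
To make the timing concrete, here is a minimal userspace sketch, not kernel code. It models a function as three steps that each rewrite a fake rax, with a probe on the middle step. All names (toy_regs, toy_probe, toy_run, the insn_* steps) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace toy model (assumption: not the kernel's implementation). */
struct toy_regs { long ax; };

typedef void (*toy_insn)(struct toy_regs *regs);

struct toy_probe {
    size_t index;       /* which instruction carries the probe */
    long seen_in_pre;   /* rax captured just before that instruction */
    long seen_in_post;  /* rax captured just after that instruction */
};

static void insn_load(struct toy_regs *r)  { r->ax = 7; }   /* intermediate value */
static void insn_scale(struct toy_regs *r) { r->ax *= 3; }  /* still intermediate */
static void insn_final(struct toy_regs *r) { r->ax -= 1; }  /* final return value */

static const toy_insn toy_function[] = { insn_load, insn_scale, insn_final };

/* Run every instruction; fire the pre/post hooks around the probed one. */
static long toy_run(const toy_insn *code, size_t n, struct toy_probe *p)
{
    struct toy_regs regs = { .ax = 0 };

    for (size_t i = 0; i < n; i++) {
        if (p && i == p->index)
            p->seen_in_pre = regs.ax;   /* where pre_handler would run */
        code[i](&regs);
        if (p && i == p->index)
            p->seen_in_post = regs.ax;  /* where post_handler would run */
    }
    return regs.ax;                     /* what the caller finally receives */
}
```

With the probe on index 1, the post hook records the value left by that one step, while the caller receives a different value produced by the remaining step. That gap is exactly what "can't get the return value" refers to.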

2.2 Code Example

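#include <linux/kprobes.h>

// post_handler callback: runs right after the probed instruction
static void my_post_handler(struct kprobe *p,
    struct pt_regs *regs, unsigned long flags) {
    // regs->ax is the intermediate state after the instruction,
    // NOT the function return value!
}

// kprobe struct definition; handlers must be declared before this point.
// my_pre_handler is assumed to be defined elsewhere with the signature
// int my_pre_handler(struct kprobe *p, struct pt_regs *regs).
static struct kprobe kp = {
    .symbol_name  = "do_sys_open",
    .pre_handler  = my_pre_handler,   // before instruction
    .post_handler = my_post_handler,  // after instruction
};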

2.3 Characteristics & Limitations

Probe granularity: Any probeable kernel address (instruction-level)

Performance overhead: Two traps per hit (int3 + single-step debug exception) — relatively high

Return value access: Not possible. The function hasn't finished executing when post_handler fires

Concurrency safety: No built-in per-instance data isolation. Must manage global/per-cpu variables yourself

3. kprobe Arbitrary Address Probing

The core advantage of kprobe is the ability to probe any instruction inside a function, not just the entry/exit. Two ways to specify the probe point:

3.1 Method 1: symbol_name + offset

The kernel resolves the symbol via kallsyms, then adds your offset to get the final address. This is the most common way to probe mid-function.

struct kprobe kp = {
    .symbol_name = "do_sys_open",
    .offset = 0x42,  // 66 bytes into the function
};
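
The symbol + offset arithmetic can be sketched in userspace. The table below is a hypothetical stand-in for kallsyms with made-up addresses and sizes, and the range check is a simplification chosen for this model, not a description of the kernel's exact validation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical symbol table; addresses and sizes are invented for the example. */
struct toy_symbol {
    const char *name;
    uint64_t addr;   /* start address of the function */
    uint64_t size;   /* length of the function body in bytes */
};

static const struct toy_symbol toy_kallsyms[] = {
    { "do_sys_open", UINT64_C(0xffffffff81200000), 0x80 },
    { "getname",     UINT64_C(0xffffffff81210000), 0x40 },
};

/* Return symbol address + offset, or 0 if the symbol is unknown or the
 * offset falls outside the function body. */
static uint64_t toy_resolve(const char *symbol, uint64_t offset)
{
    size_t count = sizeof(toy_kallsyms) / sizeof(toy_kallsyms[0]);

    for (size_t i = 0; i < count; i++) {
        const struct toy_symbol *s = &toy_kallsyms[i];

        if (strcmp(s->name, symbol) != 0)
            continue;
        if (offset >= s->size)
            return 0;             /* offset points past the function */
        return s->addr + offset;  /* final probe address */
    }
    return 0;                     /* no such symbol */
}
```

In this model, toy_resolve("do_sys_open", 0x42) yields the base address plus 66 bytes, which mirrors what .symbol_name plus .offset ask the kernel to compute.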

3.2 Method 2: Raw Address

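struct kprobe kp = {
    // A hard-coded address is only valid for one specific kernel build
    // and breaks under KASLR; shown here for illustration only.
    // Set either .addr or .symbol_name, never both.
    .addr = (kprobe_opcode_t *)0xffffffff81234567,
};

// Compute at runtime via kallsyms.
// Note: kallsyms_lookup_name() is no longer exported to modules since
// Linux 5.7; on newer kernels prefer symbol_name + offset (Method 1).
unsigned long addr = kallsyms_lookup_name("do_sys_open") + 0x10;
kp.addr = (kprobe_opcode_t *)addr;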

3.3 Practical Example: Probing Mid-Function

Disassemble to find the target instruction's offset:

ffffffff81200000 <do_sys_open>:
  81200000: push   rbp
  81200004: sub    rsp, 0x20
  8120000b: call   getname
  81200010: test   rax, rax  ← probe here (offset=0x10)
  81200013: je     error_path

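Set .offset = 0x10 to probe test rax, rax. The probe fires right after call getname has returned, so regs->ax holds what getname() returned. This is true in pre_handler and in post_handler alike, because test does not modify rax. No kretprobe on getname is needed.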

⚠ Constraint

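The probe address must be a valid instruction boundary. A breakpoint written into the middle of a multi-byte instruction would corrupt it. On x86 the kernel decodes the function from its start and refuses to register a probe that is not on an instruction boundary, but picking the right offset from a disassembly is still your responsibility.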
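
A boundary check of this kind can be sketched as a walk from the function start. This is a userspace model under stated assumptions: the lengths are the gaps between consecutive addresses in the listing in 3.3 (0x0, 0x4, 0xb, 0x10, 0x13), the final length of 2 is assumed, and is_insn_boundary is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Byte distance from each listed instruction to the next one.
 * The last entry (2) is an assumption; the listing does not show it. */
static const unsigned char insn_len[] = { 4, 7, 5, 3, 2 };

/* Walk instruction by instruction from offset 0. Return true only if
 * `offset` is exactly the start of one of the instructions. */
static bool is_insn_boundary(const unsigned char *lens, size_t count,
                             size_t offset)
{
    size_t pos = 0;

    for (size_t i = 0; i < count; i++) {
        if (pos == offset)
            return true;
        if (pos > offset)
            return false;   /* walked past it: offset is mid-instruction */
        pos += lens[i];
    }
    return false;           /* offset lies at or beyond the end of the data */
}
```

Offset 0x10 is accepted because the walk lands on it exactly. Offset 0x0c is rejected because the walk jumps from 0xb to 0x10 and never stops there.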

4. kretprobe In Depth

4.1 Implementation: Trampoline, NOT "kprobe at ret"

You might intuitively think kretprobe is "a kprobe placed at the ret instruction." It's not. A function can have multiple ret instructions (early returns, error paths, different branches) and the compiler may tail-call optimize some returns away. Statically finding all exit points would be fragile and impractical.

What you might think:          What actually happens:

kprobe at ret #1  ✘            kprobe at function ENTRY (offset 0)
kprobe at ret #2  ✘               ↓
kprobe at ret #3  ✘            Replace return address on stack
kprobe at ret #4  ✘               ↓
                               ALL ret paths land on trampoline

4.2 Internal Flow

// ① Function entry kprobe fires
entry_kprobe_hit:
    save  original_return_addr = stack[rsp]    // remember real caller
    stack[rsp] = &kretprobe_trampoline         // hijack return address

// ② Function executes normally, hits any ret...

// ③ ret jumps to trampoline instead of real caller
kretprobe_trampoline:
    call  user_handler(ri, regs)    // your callback, rax = return value
    jmp   original_return_addr      // restore, jump back to real caller

Correct Mental Model

kretprobe = kprobe at entry + stack return address replacement. One probe point covers all exit paths. Side effects: backtraces break (real return address is temporarily gone from the stack), and it doesn't work on functions that never return (e.g., do_exit(), panic()).
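
The return-address swap can be modeled in userspace. The sketch below is a toy, not the kernel's trampoline: the return slot is a plain variable, TOY_TRAMPOLINE is a made-up marker address, and every toy_* name is hypothetical. It shows why one entry hook covers every exit path.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TRAMPOLINE 0xdeadUL  /* made-up marker standing in for the trampoline */

struct toy_instance {
    unsigned long real_ret_addr; /* saved at entry, used again at return */
    long retval_seen;            /* value the return handler observed */
    int handler_calls;           /* how often the return handler ran */
};

/* Entry hook: remember the real return address, then overwrite the slot. */
static void toy_entry_hook(unsigned long *ret_slot, struct toy_instance *ri)
{
    ri->real_ret_addr = *ret_slot;
    *ret_slot = TOY_TRAMPOLINE;
}

/* Models `ret`: read the return slot. If it holds the trampoline marker,
 * run the handler and continue at the saved real address. */
static unsigned long toy_return(const unsigned long *ret_slot, long rax,
                                struct toy_instance *ri)
{
    unsigned long target = *ret_slot;

    if (target == TOY_TRAMPOLINE) {
        ri->retval_seen = rax;       /* handler sees the return value */
        ri->handler_calls++;
        target = ri->real_ret_addr;  /* resume in the real caller */
    }
    return target;
}

/* A function with two exit paths; both go through the same return slot. */
static unsigned long toy_call(int fail, unsigned long caller,
                              struct toy_instance *ri)
{
    unsigned long ret_slot = caller; /* what the caller's `call` pushed */

    toy_entry_hook(&ret_slot, ri);
    if (fail)
        return toy_return(&ret_slot, -2, ri); /* early error exit */
    return toy_return(&ret_slot, 42, ri);     /* normal exit */
}
```

Both the error path and the normal path end up in the handler and then resume at the caller's address, even though the entry hook never looked for a ret instruction.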

4.3 Code Example (with ri->data private data channel)

// Private data struct — per-instance isolation
struct my_data { u64 entry_ns; unsigned long arg0; };

// entry_handler: record args + timestamp at function entry
static int my_entry(struct kretprobe_instance *ri, struct pt_regs *regs) {
    struct my_data *d = (struct my_data *)ri->data;
    d->entry_ns = ktime_get_ns();  // ktime_get_ns() returns u64 nanoseconds
    d->arg0 = regs->di;  // first argument (x86_64)
    return 0;            // 0 = track this call; non-zero skips the return handler
}

// handler: get return value + compute latency at function return
static int my_ret(struct kretprobe_instance *ri, struct pt_regs *regs) {
    struct my_data *d = (struct my_data *)ri->data;
    u64 duration = ktime_get_ns() - d->entry_ns;
    long retval = regs_return_value(regs);
    pr_info("arg0=%lx ret=%ld took %llu ns\n", d->arg0, retval, duration);
    return 0;
}

static struct kretprobe krp = {
    .kp.symbol_name = "do_sys_open",
    .handler = my_ret, .entry_handler = my_entry,
    .data_size = sizeof(struct my_data), .maxactive = 20,
};

4.4 What kprobe Can Do That kretprobe Cannot

kretprobe is more powerful for function-level tracing, but kprobe has unique advantages:

1. Arbitrary address probing: kprobe can probe any instruction inside a function. kretprobe is locked to function boundaries.

2. No maxactive limit: kretprobe has a fixed instance pool. If concurrent calls exceed maxactive, excess calls are silently missed. kprobe doesn't have this problem.

3. No return address manipulation: kretprobe modifies the stack, which confuses stack unwinders, backtraces, and tools like perf. kprobe's int3 + single-step approach is cleaner.

4. Works on non-returning functions: Functions like do_exit() and panic() never return, so kretprobe's handler will never fire. kprobe at the entry works fine.

Relationship Summary

kretprobe is a specialized tool built on top of kprobe (it literally contains a kprobe internally). It trades flexibility for convenience. kprobe gives you instruction-level surgery; kretprobe gives you function-level observability.

5. Execution Timeline

The diagram below shows the complete execution flow comparison between kprobe and kretprobe:

Figure 1: Execution timeline comparison of kprobe and kretprobe

6. Core Comparison Tables

6.1 kprobe post_handler vs kretprobe handler

Dimension	kprobe post_handler	kretprobe handler
Trigger point	After probed instruction executes	When entire function returns
Mechanism	int3 + single-step	Return address replacement (trampoline)
Return value	✘ Not accessible	✔ regs_return_value(regs)
Latency measurement	✘ Not suitable	✔ entry + handler pair
Probe granularity	Any address (instruction-level)	Function-level only
Overhead	2 traps per hit (~1μs)	1 trap per hit (~0.5μs)
Private data	✘ No built-in mechanism	✔ ri->data per-instance

6.2 kretprobe entry_handler vs kprobe pre_handler

Dimension	kretprobe entry_handler	kprobe pre_handler
Trigger location	Function entry only	Any address
Private data passing	✔ ri->data per-instance	✘ No built-in mechanism
Paired return handler	✔ Naturally paired	✘ None
Concurrency safety	✔ maxactive instances	Must handle yourself
Standalone use	✘ Requires kretprobe	✔ Can be used alone

7. Performance Overhead Analysis

Mechanism	Trap Count	Additional Operations	Typical Overhead
kprobe (pre+post)	2	int3 + single-step debug exception	~1μs/hit
kretprobe	1 (entry)	Replace/restore return address	~0.5μs/hit
kprobe (pre only)	1*	int3 only (*no post_handler)	~0.5μs/hit
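
To turn the per-hit figures into a budget, multiply by the hit rate. The helper below is a small arithmetic sketch; probe_cpu_percent is a hypothetical name, and the per-hit costs are the rough figures from the table, not measurements.

```c
#include <assert.h>

/* Estimated share of one CPU spent on probe overhead, in percent.
 * hits_per_sec: how often the probed address is hit per second.
 * per_hit_ns:   cost of one hit in nanoseconds (1000 for the ~1μs row). */
static double probe_cpu_percent(double hits_per_sec, double per_hit_ns)
{
    assert(hits_per_sec >= 0.0 && per_hit_ns >= 0.0);

    /* ns of overhead per second, divided by 1e9 ns per second, as percent */
    return hits_per_sec * per_hit_ns / 1e9 * 100.0;
}
```

For example, a function hit 100,000 times per second costs about 10% of one CPU at ~1μs per hit and about 5% at ~0.5μs per hit, so on hot paths the difference between the rows is noticeable.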

⚠ Note: maxactive Limitation

kretprobe has a maxactive limit. When concurrent function calls exceed maxactive, excess calls are silently missed. In production, monitor the nmissed counter and set maxactive based on expected concurrency.
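
The effect of a fixed instance pool can be sketched in userspace. This is a simplified model, not the kernel's data structure: TOY_MAXACTIVE, toy_kretprobe, toy_enter, and toy_exit are hypothetical names, and the pool is deliberately tiny so that it runs dry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAXACTIVE 2  /* deliberately small for the demonstration */

struct toy_kretprobe {
    bool in_use[TOY_MAXACTIVE]; /* one slot per in-flight call */
    unsigned long nmissed;      /* calls skipped because no slot was free */
};

/* Function entry: take a free slot, or count a miss.
 * Returns the slot index, or -1 if this call goes untraced. */
static int toy_enter(struct toy_kretprobe *rp)
{
    for (int i = 0; i < TOY_MAXACTIVE; i++) {
        if (!rp->in_use[i]) {
            rp->in_use[i] = true;
            return i;
        }
    }
    rp->nmissed++;  /* no entry_handler and no return handler for this call */
    return -1;
}

/* Function return: release the slot that was taken at entry. */
static void toy_exit(struct toy_kretprobe *rp, int slot)
{
    assert(slot >= 0 && slot < TOY_MAXACTIVE && rp->in_use[slot]);
    rp->in_use[slot] = false;
}
```

With two slots and three overlapping calls, the third call is missed and nmissed becomes 1. Once one of the first two calls returns, the next call is traced again. A nonzero nmissed in production is the signal to raise maxactive.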

8. The Full Picture: Dynamic vs Static vs eBPF

kprobe/kretprobe are dynamic tracing mechanisms. The kernel also provides static tracing (tracepoint) and the eBPF programmable framework.

8.1 Hierarchy

Linux Kernel Tracing Mechanisms
├── Dynamic Tracing
│   ├── kprobe / kretprobe    ← any kernel function, inserted at runtime
│   └── uprobe / uretprobe    ← userspace functions
│
└── Static Tracing
    └── tracepoint            ← hooks pre-placed by kernel developers

8.2 Dynamic vs Static Tracing

Dimension	Dynamic (kprobe)	Static (tracepoint)
Probe points	Any probeable address	Pre-defined locations by kernel devs
Stability	May change across kernel versions	Comparatively stable across versions (maintained by kernel developers, though not a formal ABI guarantee)
Overhead	Higher (int3 trap)	Lower (compile-time, nop-like)
Parameter access	Via registers, need ABI knowledge	Structured parameters, direct access
Coverage	Nearly all kernel functions	Only pre-instrumented locations
Use case	Debugging, deep analysis	Production monitoring, stable tracing

8.3 eBPF Program Types → Tracing Mechanism Mapping

eBPF itself is not a tracing mechanism — it's a programmable framework that runs on top of probe points:

eBPF Program Type	Underlying Mechanism	Description
kprobe	kprobe	Dynamic probe at function entry
kretprobe	kretprobe	Dynamic probe at function return
tracepoint	tracepoint	Static tracepoint, structured parameters
raw_tracepoint	tracepoint (raw)	Bypasses parameter parsing, direct raw data access, faster
uprobe / uretprobe	uprobe / uretprobe	Dynamic probe at userspace function entry / return
fentry / fexit	ftrace (direct)	Lighter function entry/exit probing, no int3

tracepoint vs raw_tracepoint

tracepoint parses parameters into structured format before passing to the eBPF program — convenient but adds overhead. raw_tracepoint bypasses this layer, passing raw parameters directly — better performance but you must parse data structures yourself.

8.4 How to Choose

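Production monitoring: Prefer tracepoint or fentry/fexit for low overhead. Tracepoints are the more stable choice; fentry/fexit attach to kernel functions, so they can break across kernel versions in the same way kprobes can.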

Deep debugging / performance analysis: Use kprobe/kretprobe. Probe anywhere, not limited to pre-placed hooks.

High-performance tracing: Use raw_tracepoint or fentry/fexit. Avoid parameter parsing overhead or int3 traps.

Userspace tracing: Use uprobe/uretprobe. Similar to kprobe but targets userspace programs.

9. Decision Guide

9.1 Use kprobe when

You need to probe arbitrary locations inside a function (not just entry/exit)

You only need to observe register state changes after a specific instruction

You're probing functions that never return (do_exit, panic)

9.2 Use kretprobe when

You need to capture function return values (return codes, pointers, error codes)

You need to measure function execution latency (entry_handler records start, handler computes delta)

You need to correlate arguments with return values (via ri->data private data channel)

9.3 Use tracepoint / eBPF when

You run in production and need an interface that is unlikely to break across kernel versions

You want structured parameter access without manually parsing registers

Performance is critical and you need the lowest-overhead probing available

10. Summary

kprobe post_handler: Probes the state after a single instruction executes. Fine-grained but cannot access the function's return value.

kretprobe: Probes the moment a function returns. Implemented by hijacking the return address to a trampoline at entry — NOT by placing a kprobe at ret.

kprobe arbitrary address: Via symbol_name + offset or raw addr, you can probe any instruction inside a function.

tracepoint / eBPF: Static tracing provides stable ABI and lower overhead. eBPF is a programmable framework that runs on top of probe points.

Function latency & return values → kretprobe. Register state after a specific instruction → kprobe. Stable production tracing → tracepoint + eBPF.