kprobe vs kretprobe — Code Deep Dive
1. Overview
The Linux kernel provides several dynamic tracing mechanisms, among which kprobe and kretprobe are the most fundamental and widely used. Both allow injecting callbacks into kernel functions at runtime, but they serve different purposes and trigger at different points in execution.
kprobe: Can insert breakpoints at any probeable kernel address, providing two callback points — pre_handler (before instruction execution) and post_handler (after instruction execution).
kretprobe: Designed specifically for function-level tracing, providing entry_handler (function entry) and handler (function return) callbacks, with a built-in per-instance private data channel.
2. kprobe post_handler In Depth
2.1 Implementation Mechanism
kprobe works by inserting an int3 breakpoint instruction at the target address. When the CPU hits that address, the following flow is triggered: the int3 trap fires, pre_handler runs, the original instruction is single-stepped, post_handler runs, and normal execution resumes.
post_handler fires immediately after the probed instruction completes single-stepping, while the function body is still executing. The rax register holds the intermediate state after that one instruction, not the function's final return value. "Can't get the return value" refers to the real kernel function's return value, not the probe code's own return value.
2.2 Code Example
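#include <linux/kprobes.h>

// pre_handler callback: runs before the probed instruction
static int my_pre_handler(struct kprobe *p, struct pt_regs *regs)
{
    return 0; // 0 lets the kprobe continue normally
}
// post_handler callback: runs after the probed instruction
static void my_post_handler(struct kprobe *p,
                            struct pt_regs *regs, unsigned long flags)
{
    // regs->ax is the intermediate state after the instruction
    // NOT the function return value!
}
// kprobe struct definition (the handlers must be declared before this point)
static struct kprobe kp = {
    .symbol_name = "do_sys_open",
    .pre_handler = my_pre_handler,   // before instruction
    .post_handler = my_post_handler, // after instruction
};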
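To make the "intermediate state" point concrete, here is a minimal userspace model, not kernel code. It assumes a function can be pictured as a list of steps that each update a fake rax. Every name in it (step_fn, model_probe, run_with_probe) is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: each step stands in for one machine instruction
 * that updates a fake rax register. None of these names exist in the kernel. */
typedef long (*step_fn)(long rax);

struct model_probe {
    size_t index;     /* which step is probed (stands in for the address) */
    long pre_rax;     /* rax as seen by the pre_handler stand-in */
    long post_rax;    /* rax as seen by the post_handler stand-in */
};

long step_load(long rax)   { (void)rax; return 7; }  /* rax = 7  */
long step_double(long rax) { return rax * 2; }       /* rax *= 2 */
long step_add3(long rax)   { return rax + 3; }       /* rax += 3 */

/* Run every step; around the probed step, record rax before and after. */
long run_with_probe(const step_fn *steps, size_t n, struct model_probe *p)
{
    long rax = 0;
    assert(p->index < n);
    for (size_t i = 0; i < n; i++) {
        if (i == p->index)
            p->pre_rax = rax;    /* "pre_handler": before the instruction */
        rax = steps[i](rax);
        if (i == p->index)
            p->post_rax = rax;   /* "post_handler": after that one instruction */
    }
    return rax;                  /* the function's real return value */
}
```

Probing the middle step of load, double, add3 records 7 before and 14 after, while the function itself goes on to return 17. That gap between 14 and 17 is the gap between what post_handler sees and the real return value.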
2.3 Characteristics & Limitations
Probe granularity: Any probeable kernel address (instruction-level)
Performance overhead: Two traps per hit (int3 + single-step debug exception) — relatively high
Return value access: Not possible. The function hasn't finished executing when post_handler fires
Concurrency safety: No built-in per-instance data isolation. Must manage global/per-cpu variables yourself
3. kprobe Arbitrary Address Probing
The core advantage of kprobe is the ability to probe any instruction inside a function, not just the entry/exit. Two ways to specify the probe point:
3.1 Method 1: symbol_name + offset
The kernel resolves the symbol via kallsyms, then adds your offset to get the final address. This is the most common way to probe mid-function.
struct kprobe kp = {
.symbol_name = "do_sys_open",
.offset = 0x42, // 66 bytes into the function
};
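The symbol-plus-offset arithmetic can be sketched in userspace as follows. This is only a model under stated assumptions: the symbol table, its addresses and sizes, and the names model_sym and model_resolve are all made up for illustration. The real kernel applies further checks (for example, rejecting blacklisted regions) that are omitted here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a kallsyms entry; the addresses are made up. */
struct model_sym {
    const char *name;
    uint64_t    start;   /* address of the first instruction */
    uint64_t    size;    /* length of the function in bytes */
};

/* Resolve name + offset to an address. Returns 0 if the symbol is unknown
 * or the offset falls outside the function. */
uint64_t model_resolve(const struct model_sym *tab, size_t n,
                       const char *name, uint64_t offset)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tab[i].name, name) != 0)
            continue;
        if (offset >= tab[i].size)
            return 0;                 /* past the end of the function */
        return tab[i].start + offset;
    }
    return 0;                         /* no such symbol */
}
```

With a table entry placing do_sys_open at 0xffffffff81200000, an offset of 0x42 resolves to 0xffffffff81200042. An offset at or past the function's size is rejected.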
3.2 Method 2: Raw Address
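struct kprobe kp = {
    .addr = (kprobe_opcode_t *)0xffffffff81234567, // example address only
};
// Or compute at runtime via kallsyms (inside a function, before register_kprobe).
// Note: kallsyms_lookup_name() has not been exported to modules since Linux 5.7.
unsigned long addr = kallsyms_lookup_name("do_sys_open") + 0x10;
kp.addr = (kprobe_opcode_t *)addr;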
3.3 Practical Example: Probing Mid-Function
Disassemble to find the target instruction's offset:
ffffffff81200000 <do_sys_open>:
81200000: push rbp
81200004: sub rsp, 0x20
8120000b: call getname
81200010: test rax, rax ← probe here (offset=0x10)
81200013: je error_path
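Set .offset = 0x10 to probe test rax, rax. The call to getname() has just returned at that point, so regs->ax already holds its return value in pre_handler, and still does in post_handler because test does not modify rax. There is no need for a kretprobe on getname at all.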
4. kretprobe In Depth
4.1 Implementation: Trampoline, NOT "kprobe at ret"
You might intuitively think kretprobe is "a kprobe placed at the ret instruction." It's not. A function can have multiple ret instructions (early returns, error paths, different branches) and the compiler may tail-call optimize some returns away. Statically finding all exit points would be fragile and impractical.
What you might think:
kprobe at ret #1 ✘
kprobe at ret #2 ✘
kprobe at ret #3 ✘
kprobe at ret #4 ✘
What actually happens:
kprobe at function ENTRY (offset 0)
↓
Replace return address on stack
↓
ALL ret paths land on trampoline
4.2 Internal Flow
// ① Function entry kprobe fires
entry_kprobe_hit:
save original_return_addr = stack[rsp] // remember real caller
stack[rsp] = &kretprobe_trampoline // hijack return address
// ② Function executes normally, hits any ret...
// ③ ret jumps to trampoline instead of real caller
kretprobe_trampoline:
call user_handler(ri, regs) // your callback, rax = return value
jmp original_return_addr // restore, jump back to real caller
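The flow above can be imitated in plain userspace C. The sketch below is a model only: the "stack slot" is an ordinary variable, the trampoline address is a made-up constant, and all names (MODEL_TRAMPOLINE, model_instance, model_call) are hypothetical. It shows that one entry-side hijack catches every return path.

```c
#include <assert.h>

#define MODEL_TRAMPOLINE 0xdeadL   /* made-up address of the trampoline */

struct model_instance {
    long saved_ret_addr;   /* real caller address, saved at entry */
    long seen_retval;      /* return value observed by the handler */
    int  handler_calls;    /* how many times the return handler ran */
};

/* Entry probe: remember the real return address, then hijack the slot. */
void model_entry_hit(long *ret_slot, struct model_instance *ri)
{
    ri->saved_ret_addr = *ret_slot;
    *ret_slot = MODEL_TRAMPOLINE;
}

/* The probed function body, with two different return paths. */
long model_target(int take_error_path)
{
    if (take_error_path)
        return -2;         /* early "ret" */
    return 42;             /* normal "ret" */
}

/* Simulate call + ret; returns the address execution resumes at. */
long model_call(long caller_addr, int take_error_path,
                struct model_instance *ri)
{
    long ret_slot = caller_addr;       /* "call" pushes the return address */
    model_entry_hit(&ret_slot, ri);    /* kprobe at function entry */
    long rax = model_target(take_error_path);
    long next = ret_slot;              /* "ret" pops whatever is in the slot */
    if (next == MODEL_TRAMPOLINE) {    /* every ret path lands here */
        ri->seen_retval = rax;         /* handler sees the return value */
        ri->handler_calls++;
        next = ri->saved_ret_addr;     /* jump back to the real caller */
    }
    return next;
}
```

Whether the target takes the normal path or the error path, the handler runs once and execution resumes at the original caller address. No knowledge of individual ret instructions is needed.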
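Note: the handler runs only when the function returns, so it never fires for functions that do not return (do_exit(), panic()).
4.3 Code Example (with ri->data private data channel)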
// Private data struct — per-instance isolation
struct my_data { u64 entry_time; unsigned long arg0; }; // ktime_get_ns() returns u64 nanoseconds
// entry_handler: record args + timestamp at function entry
static int my_entry(struct kretprobe_instance *ri, struct pt_regs *regs) {
struct my_data *d = (struct my_data *)ri->data;
d->entry_time = ktime_get_ns();
d->arg0 = regs->di; // first argument (x86_64)
return 0;
}
// handler: get return value + compute latency at function return
static int my_ret(struct kretprobe_instance *ri, struct pt_regs *regs) {
    struct my_data *d = (struct my_data *)ri->data;
    u64 duration = ktime_get_ns() - d->entry_time;
    long retval = regs_return_value(regs);
    // Report what was captured so the values are actually used
    pr_info("arg0=%lx ret=%ld took %llu ns\n", d->arg0, retval, duration);
    return 0;
}
static struct kretprobe krp = {
.kp.symbol_name = "do_sys_open",
.handler = my_ret, .entry_handler = my_entry,
.data_size = sizeof(struct my_data), .maxactive = 20,
};
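Why the per-instance channel matters is easiest to see with two overlapping calls. The following userspace sketch is a model only, with hypothetical names and hand-picked timestamps. It compares a single global entry timestamp against a per-call record that plays the role of ri->data.

```c
#include <assert.h>
#include <stdint.h>

uint64_t model_global_entry_ns;          /* shared by every call: racy */

struct model_ri { uint64_t entry_ns; };  /* per-call record, like ri->data */

/* Global-variable version: a later entry overwrites an earlier one. */
void global_enter(uint64_t now_ns)
{
    model_global_entry_ns = now_ns;
}

uint64_t global_return(uint64_t now_ns)
{
    return now_ns - model_global_entry_ns;
}

/* Per-instance version: each call carries its own entry timestamp. */
void inst_enter(struct model_ri *ri, uint64_t now_ns)
{
    ri->entry_ns = now_ns;
}

uint64_t inst_return(const struct model_ri *ri, uint64_t now_ns)
{
    return now_ns - ri->entry_ns;
}
```

Suppose call A enters at t=10 and call B enters at t=20 before A returns at t=50. The global version reports 30 ns for A because B overwrote the timestamp. The per-instance version reports the correct 40 ns.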
4.4 What kprobe Can Do That kretprobe Cannot
kretprobe is more powerful for function-level tracing, but kprobe has unique advantages:
1. Arbitrary address probing: kprobe can probe any instruction inside a function. kretprobe is locked to function boundaries.
2. No maxactive limit: kretprobe has a fixed instance pool. If concurrent calls exceed maxactive, excess calls are silently missed. kprobe doesn't have this problem.
3. No return address manipulation: kretprobe modifies the stack, which confuses stack unwinders, backtraces, and tools like perf. kprobe's int3 + single-step approach is cleaner.
4. Works on non-returning functions: Functions like do_exit() and panic() never return, so kretprobe's handler will never fire. kprobe at the entry works fine.
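The maxactive behaviour in point 2 can be pictured with a small userspace model of a fixed instance pool. This is a sketch, not the kernel's implementation: the pool size, the struct layout, and the names model_enter and model_return are assumptions made for illustration. Only the idea of a miss counter mirrors the real nmissed field.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAXACTIVE 2   /* deliberately tiny so the limit is easy to hit */

struct model_kretprobe {
    bool in_use[MODEL_MAXACTIVE];  /* one slot per in-flight call */
    int  nmissed;                  /* calls that found no free slot */
};

/* Function entry: claim a free instance; returns its index or -1 on a miss. */
int model_enter(struct model_kretprobe *rp)
{
    for (int i = 0; i < MODEL_MAXACTIVE; i++) {
        if (!rp->in_use[i]) {
            rp->in_use[i] = true;
            return i;
        }
    }
    rp->nmissed++;   /* pool exhausted: this call's return is not traced */
    return -1;
}

/* Function return: give the instance back to the pool. */
void model_return(struct model_kretprobe *rp, int slot)
{
    if (slot >= 0 && slot < MODEL_MAXACTIVE)
        rp->in_use[slot] = false;
}
```

With two slots, a third overlapping call is counted in nmissed and otherwise ignored. Once an earlier call returns, its slot becomes available again.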
5. Execution Timeline
The diagram below shows the complete execution flow comparison between kprobe and kretprobe:
Figure 1: kprobe vs. kretprobe execution timeline comparison
6. Core Comparison Tables
6.1 kprobe post_handler vs kretprobe handler
| Dimension | kprobe post_handler | kretprobe handler |
|---|---|---|
| Trigger point | After probed instruction executes | When entire function returns |
| Mechanism | int3 + single-step | Return address replacement (trampoline) |
| Return value | ✘ Not accessible | ✔ regs_return_value(regs) |
| Latency measurement | ✘ Not suitable | ✔ entry + handler pair |
| Probe granularity | Any address (instruction-level) | Function-level only |
| Overhead | 2 traps per hit (~1μs) | 1 trap per hit (~0.5μs) |
| Private data | ✘ No built-in mechanism | ✔ ri->data per-instance |
6.2 kretprobe entry_handler vs kprobe pre_handler
| Dimension | kretprobe entry_handler | kprobe pre_handler |
|---|---|---|
| Trigger location | Function entry only | Any address |
| Private data passing | ✔ ri->data per-instance | ✘ No built-in mechanism |
| Paired return handler | ✔ Naturally paired | ✘ None |
| Concurrency safety | ✔ maxactive instances | Must handle yourself |
| Standalone use | ✘ Requires kretprobe | ✔ Can be used alone |
7. Performance Overhead Analysis
| Mechanism | Trap Count | Additional Operations | Typical Overhead |
|---|---|---|---|
| kprobe (pre+post) | 2 | int3 + single-step debug exception | ~1μs/hit |
| kretprobe | 1 (entry) | Replace/restore return address | ~0.5μs/hit |
| kprobe (pre only) | 1* | int3 only (*no post_handler) | ~0.5μs/hit |
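To turn the per-hit figures in the table into a budget, multiply by the hit rate. The helper below is a back-of-the-envelope sketch. The function names are hypothetical, and the ~1μs and ~0.5μs inputs are the rough estimates from the table rather than measurements.

```c
#include <assert.h>
#include <stdint.h>

/* Probe time consumed per second of wall time, in nanoseconds. */
uint64_t probe_ns_per_sec(uint64_t hits_per_sec, uint64_t cost_ns_per_hit)
{
    return hits_per_sec * cost_ns_per_hit;
}

/* Same figure as a whole-number percentage of one CPU (1 s = 1e9 ns). */
uint64_t probe_cpu_percent(uint64_t hits_per_sec, uint64_t cost_ns_per_hit)
{
    return probe_ns_per_sec(hits_per_sec, cost_ns_per_hit) * 100
           / 1000000000ULL;
}
```

At 100,000 hits per second, a 1000 ns probe costs about 10% of one CPU and a 500 ns probe about 5%. This is why hot-path functions deserve the cheaper mechanisms.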
maxactive limit: When concurrent function calls exceed maxactive, excess calls are silently missed. In production, monitor the nmissed counter and set maxactive based on expected concurrency.
8. The Full Picture: Dynamic vs Static vs eBPF
kprobe/kretprobe are dynamic tracing mechanisms. The kernel also provides static tracing (tracepoint) and the eBPF programmable framework.
8.1 Hierarchy
Linux Kernel Tracing Mechanisms
├── Dynamic Tracing
│ ├── kprobe / kretprobe ← any kernel function, inserted at runtime
│ └── uprobe / uretprobe ← userspace functions
│
└── Static Tracing
└── tracepoint ← hooks pre-placed by kernel developers
8.2 Dynamic vs Static Tracing
| Dimension | Dynamic (kprobe) | Static (tracepoint) |
|---|---|---|
| Probe points | Any probeable address | Pre-defined locations by kernel devs |
| Stability | May change across kernel versions | ABI-stable, cross-version compatible |
| Overhead | Higher (int3 trap) | Lower (compile-time, nop-like) |
| Parameter access | Via registers, need ABI knowledge | Structured parameters, direct access |
| Coverage | Nearly all kernel functions | Only pre-instrumented locations |
| Use case | Debugging, deep analysis | Production monitoring, stable tracing |
8.3 eBPF Program Types → Tracing Mechanism Mapping
eBPF itself is not a tracing mechanism — it's a programmable framework that runs on top of probe points:
| eBPF Program Type | Underlying Mechanism | Description |
|---|---|---|
| kprobe | kprobe | Dynamic probe at function entry |
| kretprobe | kretprobe | Dynamic probe at function return |
| tracepoint | tracepoint | Static tracepoint, structured parameters |
| raw_tracepoint | tracepoint (raw) | Bypasses parameter parsing, direct raw data access, faster |
| tracepoint (exit events) | tracepoint | No separate "return" program type exists; exit-side events such as syscalls:sys_exit_* are ordinary tracepoints |
| fentry / fexit | ftrace (direct) | Lighter function entry/exit probing, no int3 |
tracepoint parses parameters into a structured format before passing them to the eBPF program, which is convenient but adds overhead. raw_tracepoint bypasses this layer and passes the raw parameters directly, which performs better but means you must parse the data structures yourself.
8.4 How to Choose
Production monitoring: Prefer tracepoint / fentry/fexit. ABI-stable and low overhead.
Deep debugging / performance analysis: Use kprobe/kretprobe. Probe anywhere, not limited to pre-placed hooks.
High-performance tracing: Use raw_tracepoint or fentry/fexit. Avoid parameter parsing overhead or int3 traps.
Userspace tracing: Use uprobe/uretprobe. Similar to kprobe but targets userspace programs.
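The guidance above can be written down as a small decision helper. This is one possible encoding, not an official rule set. The struct fields, the function name pick_mechanism, and the order in which the questions are asked are all assumptions made for the sake of the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical description of what a tracing task needs. */
struct trace_need {
    bool userspace_target;    /* tracing a userspace program */
    bool production;          /* needs stable, low-overhead hooks */
    bool mid_function;        /* must probe inside a function body */
    bool need_return_value;   /* wants the return value or latency */
};

/* Map a set of needs to the mechanism the text recommends. */
const char *pick_mechanism(const struct trace_need *n)
{
    if (n->userspace_target)
        return "uprobe/uretprobe";
    if (n->mid_function)
        return "kprobe";      /* only kprobe reaches arbitrary addresses */
    if (n->production)
        return "tracepoint or fentry/fexit";
    return n->need_return_value ? "kretprobe" : "kprobe";
}
```

A debugging session that needs a function's return value maps to kretprobe, for example, while the same need in production maps to tracepoint or fentry/fexit.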
9. Decision Guide
9.1 Use kprobe when
9.2 Use kretprobe when
ri->data private data channel)
9.3 Use tracepoint / eBPF when
10. Summary
kprobe post_handler: Probes the state after a single instruction executes. Fine-grained but cannot access the function's return value.
kretprobe: Probes the moment a function returns. Implemented by hijacking the return address to a trampoline at entry — NOT by placing a kprobe at ret.
kprobe arbitrary address: Via symbol_name + offset or raw addr, you can probe any instruction inside a function.
tracepoint / eBPF: Static tracing provides stable ABI and lower overhead. eBPF is a programmable framework that runs on top of probe points.
Function latency & return values → kretprobe. Register state after a specific instruction → kprobe. Stable production tracing → tracepoint + eBPF.