I tried different pieces of code which uses __builtin_frame_address(1)
(with both gcc version 7.5.0 and 10.3.0) to verify whether it works as
expected on riscv64. The result is negative.
What the compiler had generated is as below:
31 fp = (unsigned long)__builtin_frame_address(1);
0xffffffff80006024 <+200>: ld s1,0(s0)
It takes '0(s0)' as the address of frame 1 (caller), but the actual address
should be '-16(s0)'.
| ... | <-+
+-----------------+ |
| return address | |
| previous fp | |
| saved registers | |
| local variables | |
$fp --> | ... | |
+-----------------+ |
| return address | |
| previous fp --------+
| saved registers |
$sp --> | local variables |
+-----------------+
This leads the kernel can not dump the full stack trace on riscv.
[ 7.222126][ T1] Call Trace:
[ 7.222804][ T1] [<
ffffffff80006058>] dump_backtrace+0x2c/0x3a
This problem is not exposed on most riscv builds just because the '0(s0)'
occasionally is the address frame 2 (caller's caller), if only ra and fp
are stored in frame 1 (caller).
| ... | <-+
+-----------------+ |
| return address | |
$fp --> | previous fp | |
+-----------------+ |
| return address | |
| previous fp --------+
| saved registers |
$sp --> | local variables |
+-----------------+
This could be a *bug* of gcc that should be fixed. But as noted in gcc
manual "Calling this function with a nonzero argument can have
unpredictable effects, including crashing the calling program.", let's
remove the '__builtin_frame_address(1)' in backtrace code.
With this fix now it can show full stack trace:
[ 10.444838][ T1] Call Trace:
[ 10.446199][ T1] [<
ffffffff8000606c>] dump_backtrace+0x2c/0x3a
[ 10.447711][ T1] [<
ffffffff800060ac>] show_stack+0x32/0x3e
[ 10.448710][ T1] [<
ffffffff80a005c0>] dump_stack_lvl+0x58/0x7a
[ 10.449941][ T1] [<
ffffffff80a005f6>] dump_stack+0x14/0x1c
[ 10.450929][ T1] [<
ffffffff804c04ee>] ubsan_epilogue+0x10/0x5a
[ 10.451869][ T1] [<
ffffffff804c092e>] __ubsan_handle_load_invalid_value+0x6c/0x78
[ 10.453049][ T1] [<
ffffffff8018f834>] __pagevec_release+0x62/0x64
[ 10.455476][ T1] [<
ffffffff80190830>] truncate_inode_pages_range+0x132/0x5be
[ 10.456798][ T1] [<
ffffffff80190ce0>] truncate_inode_pages+0x24/0x30
[ 10.457853][ T1] [<
ffffffff8045bb04>] kill_bdev+0x32/0x3c
...
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Fixes:
eac2f3059e02 ("riscv: stacktrace: fix the riscv stacktrace when CONFIG_FRAME_POINTER enabled")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
bool (*fn)(void *, unsigned long), void *arg)
{
unsigned long fp, sp, pc;
+ int level = 0;
if (regs) {
fp = frame_pointer(regs);
sp = user_stack_pointer(regs);
pc = instruction_pointer(regs);
} else if (task == NULL || task == current) {
- fp = (unsigned long)__builtin_frame_address(1);
- sp = (unsigned long)__builtin_frame_address(0);
- pc = (unsigned long)__builtin_return_address(0);
+ fp = (unsigned long)__builtin_frame_address(0);
+ sp = sp_in_global;
+ pc = (unsigned long)walk_stackframe;
} else {
/* task blocked in __switch_to */
fp = task->thread.s[0];
unsigned long low, high;
struct stackframe *frame;
- if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc)))
+ if (unlikely(!__kernel_text_address(pc) || (level++ >= 1 && !fn(arg, pc))))
break;
/* Validate frame pointer */