forked from GrapheneOS/linux-hardened
-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Failed compiling kernel 5.15.25 using gcc 11.2.0 #65
Comments
You can also "fix" the issue by compiling with CONFIG_WERROR disabled. This is done on the arch linux package so I assume it is fine. |
@pete842 See commit 86cffecdeaa2, which disabled this warning in v5.16 and was not backported to the 5.15 stable branch. |
@tsautereau-anssi thank you, I will have a look 👌 |
anthraxx
pushed a commit
that referenced
this issue
Nov 3, 2022
[ Upstream commit 230db82 ] When a console stack dump is initiated with CONFIG_GCOV_PROFILE_ALL enabled, show_trace_log_lvl() gets out of sync with the ORC unwinder, causing the stack trace to show all text addresses as unreliable: # echo l > /proc/sysrq-trigger [ 477.521031] sysrq: Show backtrace of all active CPUs [ 477.523813] NMI backtrace for cpu 0 [ 477.524492] CPU: 0 PID: 1021 Comm: bash Not tainted 6.0.0 #65 [ 477.525295] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014 [ 477.526439] Call Trace: [ 477.526854] <TASK> [ 477.527216] ? dump_stack_lvl+0xc7/0x114 [ 477.527801] ? dump_stack+0x13/0x1f [ 477.528331] ? nmi_cpu_backtrace.cold+0xb5/0x10d [ 477.528998] ? lapic_can_unplug_cpu+0xa0/0xa0 [ 477.529641] ? nmi_trigger_cpumask_backtrace+0x16a/0x1f0 [ 477.530393] ? arch_trigger_cpumask_backtrace+0x1d/0x30 [ 477.531136] ? sysrq_handle_showallcpus+0x1b/0x30 [ 477.531818] ? __handle_sysrq.cold+0x4e/0x1ae [ 477.532451] ? write_sysrq_trigger+0x63/0x80 [ 477.533080] ? proc_reg_write+0x92/0x110 [ 477.533663] ? vfs_write+0x174/0x530 [ 477.534265] ? handle_mm_fault+0x16f/0x500 [ 477.534940] ? ksys_write+0x7b/0x170 [ 477.535543] ? __x64_sys_write+0x1d/0x30 [ 477.536191] ? do_syscall_64+0x6b/0x100 [ 477.536809] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 477.537609] </TASK> This happens when the compiled code for show_stack() has a single word on the stack, and doesn't use a tail call to show_stack_log_lvl(). (CONFIG_GCOV_PROFILE_ALL=y is the only known case of this.) Then the __unwind_start() skip logic hits an off-by-one bug and fails to unwind all the way to the intended starting frame. Fix it by reverting the following commit: f1d9a2a ("x86/unwind/orc: Don't skip the first frame for inactive tasks") The original justification for that commit no longer exists. That original issue was later fixed in a different way, with the following commit: f2ac57a ("x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels") Fixes: f1d9a2a ("x86/unwind/orc: Don't skip the first frame for inactive tasks") Signed-off-by: Chen Zhongjin <[email protected]> [jpoimboe: rewrite commit log] Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
anthraxx
pushed a commit
that referenced
this issue
Nov 3, 2022
[ Upstream commit 230db82 ] When a console stack dump is initiated with CONFIG_GCOV_PROFILE_ALL enabled, show_trace_log_lvl() gets out of sync with the ORC unwinder, causing the stack trace to show all text addresses as unreliable: # echo l > /proc/sysrq-trigger [ 477.521031] sysrq: Show backtrace of all active CPUs [ 477.523813] NMI backtrace for cpu 0 [ 477.524492] CPU: 0 PID: 1021 Comm: bash Not tainted 6.0.0 #65 [ 477.525295] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014 [ 477.526439] Call Trace: [ 477.526854] <TASK> [ 477.527216] ? dump_stack_lvl+0xc7/0x114 [ 477.527801] ? dump_stack+0x13/0x1f [ 477.528331] ? nmi_cpu_backtrace.cold+0xb5/0x10d [ 477.528998] ? lapic_can_unplug_cpu+0xa0/0xa0 [ 477.529641] ? nmi_trigger_cpumask_backtrace+0x16a/0x1f0 [ 477.530393] ? arch_trigger_cpumask_backtrace+0x1d/0x30 [ 477.531136] ? sysrq_handle_showallcpus+0x1b/0x30 [ 477.531818] ? __handle_sysrq.cold+0x4e/0x1ae [ 477.532451] ? write_sysrq_trigger+0x63/0x80 [ 477.533080] ? proc_reg_write+0x92/0x110 [ 477.533663] ? vfs_write+0x174/0x530 [ 477.534265] ? handle_mm_fault+0x16f/0x500 [ 477.534940] ? ksys_write+0x7b/0x170 [ 477.535543] ? __x64_sys_write+0x1d/0x30 [ 477.536191] ? do_syscall_64+0x6b/0x100 [ 477.536809] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 477.537609] </TASK> This happens when the compiled code for show_stack() has a single word on the stack, and doesn't use a tail call to show_stack_log_lvl(). (CONFIG_GCOV_PROFILE_ALL=y is the only known case of this.) Then the __unwind_start() skip logic hits an off-by-one bug and fails to unwind all the way to the intended starting frame. Fix it by reverting the following commit: f1d9a2a ("x86/unwind/orc: Don't skip the first frame for inactive tasks") The original justification for that commit no longer exists. That original issue was later fixed in a different way, with the following commit: f2ac57a ("x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels") Fixes: f1d9a2a ("x86/unwind/orc: Don't skip the first frame for inactive tasks") Signed-off-by: Chen Zhongjin <[email protected]> [jpoimboe: rewrite commit log] Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
anthraxx
pushed a commit
that referenced
this issue
Nov 3, 2022
[ Upstream commit 230db82 ] When a console stack dump is initiated with CONFIG_GCOV_PROFILE_ALL enabled, show_trace_log_lvl() gets out of sync with the ORC unwinder, causing the stack trace to show all text addresses as unreliable: # echo l > /proc/sysrq-trigger [ 477.521031] sysrq: Show backtrace of all active CPUs [ 477.523813] NMI backtrace for cpu 0 [ 477.524492] CPU: 0 PID: 1021 Comm: bash Not tainted 6.0.0 #65 [ 477.525295] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014 [ 477.526439] Call Trace: [ 477.526854] <TASK> [ 477.527216] ? dump_stack_lvl+0xc7/0x114 [ 477.527801] ? dump_stack+0x13/0x1f [ 477.528331] ? nmi_cpu_backtrace.cold+0xb5/0x10d [ 477.528998] ? lapic_can_unplug_cpu+0xa0/0xa0 [ 477.529641] ? nmi_trigger_cpumask_backtrace+0x16a/0x1f0 [ 477.530393] ? arch_trigger_cpumask_backtrace+0x1d/0x30 [ 477.531136] ? sysrq_handle_showallcpus+0x1b/0x30 [ 477.531818] ? __handle_sysrq.cold+0x4e/0x1ae [ 477.532451] ? write_sysrq_trigger+0x63/0x80 [ 477.533080] ? proc_reg_write+0x92/0x110 [ 477.533663] ? vfs_write+0x174/0x530 [ 477.534265] ? handle_mm_fault+0x16f/0x500 [ 477.534940] ? ksys_write+0x7b/0x170 [ 477.535543] ? __x64_sys_write+0x1d/0x30 [ 477.536191] ? do_syscall_64+0x6b/0x100 [ 477.536809] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 477.537609] </TASK> This happens when the compiled code for show_stack() has a single word on the stack, and doesn't use a tail call to show_stack_log_lvl(). (CONFIG_GCOV_PROFILE_ALL=y is the only known case of this.) Then the __unwind_start() skip logic hits an off-by-one bug and fails to unwind all the way to the intended starting frame. Fix it by reverting the following commit: f1d9a2a ("x86/unwind/orc: Don't skip the first frame for inactive tasks") The original justification for that commit no longer exists. That original issue was later fixed in a different way, with the following commit: f2ac57a ("x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels") Fixes: f1d9a2a ("x86/unwind/orc: Don't skip the first frame for inactive tasks") Signed-off-by: Chen Zhongjin <[email protected]> [jpoimboe: rewrite commit log] Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
anthraxx
pushed a commit
that referenced
this issue
Nov 3, 2022
[ Upstream commit 230db82 ] When a console stack dump is initiated with CONFIG_GCOV_PROFILE_ALL enabled, show_trace_log_lvl() gets out of sync with the ORC unwinder, causing the stack trace to show all text addresses as unreliable: # echo l > /proc/sysrq-trigger [ 477.521031] sysrq: Show backtrace of all active CPUs [ 477.523813] NMI backtrace for cpu 0 [ 477.524492] CPU: 0 PID: 1021 Comm: bash Not tainted 6.0.0 #65 [ 477.525295] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014 [ 477.526439] Call Trace: [ 477.526854] <TASK> [ 477.527216] ? dump_stack_lvl+0xc7/0x114 [ 477.527801] ? dump_stack+0x13/0x1f [ 477.528331] ? nmi_cpu_backtrace.cold+0xb5/0x10d [ 477.528998] ? lapic_can_unplug_cpu+0xa0/0xa0 [ 477.529641] ? nmi_trigger_cpumask_backtrace+0x16a/0x1f0 [ 477.530393] ? arch_trigger_cpumask_backtrace+0x1d/0x30 [ 477.531136] ? sysrq_handle_showallcpus+0x1b/0x30 [ 477.531818] ? __handle_sysrq.cold+0x4e/0x1ae [ 477.532451] ? write_sysrq_trigger+0x63/0x80 [ 477.533080] ? proc_reg_write+0x92/0x110 [ 477.533663] ? vfs_write+0x174/0x530 [ 477.534265] ? handle_mm_fault+0x16f/0x500 [ 477.534940] ? ksys_write+0x7b/0x170 [ 477.535543] ? __x64_sys_write+0x1d/0x30 [ 477.536191] ? do_syscall_64+0x6b/0x100 [ 477.536809] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 477.537609] </TASK> This happens when the compiled code for show_stack() has a single word on the stack, and doesn't use a tail call to show_stack_log_lvl(). (CONFIG_GCOV_PROFILE_ALL=y is the only known case of this.) Then the __unwind_start() skip logic hits an off-by-one bug and fails to unwind all the way to the intended starting frame. Fix it by reverting the following commit: f1d9a2a ("x86/unwind/orc: Don't skip the first frame for inactive tasks") The original justification for that commit no longer exists. That original issue was later fixed in a different way, with the following commit: f2ac57a ("x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels") Fixes: f1d9a2a ("x86/unwind/orc: Don't skip the first frame for inactive tasks") Signed-off-by: Chen Zhongjin <[email protected]> [jpoimboe: rewrite commit log] Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
anthraxx
pushed a commit
that referenced
this issue
Nov 3, 2022
[ Upstream commit 230db82 ] When a console stack dump is initiated with CONFIG_GCOV_PROFILE_ALL enabled, show_trace_log_lvl() gets out of sync with the ORC unwinder, causing the stack trace to show all text addresses as unreliable: # echo l > /proc/sysrq-trigger [ 477.521031] sysrq: Show backtrace of all active CPUs [ 477.523813] NMI backtrace for cpu 0 [ 477.524492] CPU: 0 PID: 1021 Comm: bash Not tainted 6.0.0 #65 [ 477.525295] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014 [ 477.526439] Call Trace: [ 477.526854] <TASK> [ 477.527216] ? dump_stack_lvl+0xc7/0x114 [ 477.527801] ? dump_stack+0x13/0x1f [ 477.528331] ? nmi_cpu_backtrace.cold+0xb5/0x10d [ 477.528998] ? lapic_can_unplug_cpu+0xa0/0xa0 [ 477.529641] ? nmi_trigger_cpumask_backtrace+0x16a/0x1f0 [ 477.530393] ? arch_trigger_cpumask_backtrace+0x1d/0x30 [ 477.531136] ? sysrq_handle_showallcpus+0x1b/0x30 [ 477.531818] ? __handle_sysrq.cold+0x4e/0x1ae [ 477.532451] ? write_sysrq_trigger+0x63/0x80 [ 477.533080] ? proc_reg_write+0x92/0x110 [ 477.533663] ? vfs_write+0x174/0x530 [ 477.534265] ? handle_mm_fault+0x16f/0x500 [ 477.534940] ? ksys_write+0x7b/0x170 [ 477.535543] ? __x64_sys_write+0x1d/0x30 [ 477.536191] ? do_syscall_64+0x6b/0x100 [ 477.536809] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 477.537609] </TASK> This happens when the compiled code for show_stack() has a single word on the stack, and doesn't use a tail call to show_stack_log_lvl(). (CONFIG_GCOV_PROFILE_ALL=y is the only known case of this.) Then the __unwind_start() skip logic hits an off-by-one bug and fails to unwind all the way to the intended starting frame. Fix it by reverting the following commit: f1d9a2a ("x86/unwind/orc: Don't skip the first frame for inactive tasks") The original justification for that commit no longer exists. That original issue was later fixed in a different way, with the following commit: f2ac57a ("x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels") Fixes: f1d9a2a ("x86/unwind/orc: Don't skip the first frame for inactive tasks") Signed-off-by: Chen Zhongjin <[email protected]> [jpoimboe: rewrite commit log] Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
anthraxx
pushed a commit
that referenced
this issue
Nov 3, 2022
[ Upstream commit 230db82 ] When a console stack dump is initiated with CONFIG_GCOV_PROFILE_ALL enabled, show_trace_log_lvl() gets out of sync with the ORC unwinder, causing the stack trace to show all text addresses as unreliable: # echo l > /proc/sysrq-trigger [ 477.521031] sysrq: Show backtrace of all active CPUs [ 477.523813] NMI backtrace for cpu 0 [ 477.524492] CPU: 0 PID: 1021 Comm: bash Not tainted 6.0.0 #65 [ 477.525295] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014 [ 477.526439] Call Trace: [ 477.526854] <TASK> [ 477.527216] ? dump_stack_lvl+0xc7/0x114 [ 477.527801] ? dump_stack+0x13/0x1f [ 477.528331] ? nmi_cpu_backtrace.cold+0xb5/0x10d [ 477.528998] ? lapic_can_unplug_cpu+0xa0/0xa0 [ 477.529641] ? nmi_trigger_cpumask_backtrace+0x16a/0x1f0 [ 477.530393] ? arch_trigger_cpumask_backtrace+0x1d/0x30 [ 477.531136] ? sysrq_handle_showallcpus+0x1b/0x30 [ 477.531818] ? __handle_sysrq.cold+0x4e/0x1ae [ 477.532451] ? write_sysrq_trigger+0x63/0x80 [ 477.533080] ? proc_reg_write+0x92/0x110 [ 477.533663] ? vfs_write+0x174/0x530 [ 477.534265] ? handle_mm_fault+0x16f/0x500 [ 477.534940] ? ksys_write+0x7b/0x170 [ 477.535543] ? __x64_sys_write+0x1d/0x30 [ 477.536191] ? do_syscall_64+0x6b/0x100 [ 477.536809] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 477.537609] </TASK> This happens when the compiled code for show_stack() has a single word on the stack, and doesn't use a tail call to show_stack_log_lvl(). (CONFIG_GCOV_PROFILE_ALL=y is the only known case of this.) Then the __unwind_start() skip logic hits an off-by-one bug and fails to unwind all the way to the intended starting frame. Fix it by reverting the following commit: f1d9a2a ("x86/unwind/orc: Don't skip the first frame for inactive tasks") The original justification for that commit no longer exists. That original issue was later fixed in a different way, with the following commit: f2ac57a ("x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels") Fixes: f1d9a2a ("x86/unwind/orc: Don't skip the first frame for inactive tasks") Signed-off-by: Chen Zhongjin <[email protected]> [jpoimboe: rewrite commit log] Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
anthraxx
pushed a commit
that referenced
this issue
Dec 8, 2024
[ Upstream commit ea4c990fa9e19ffef0648e40c566b94ba5ab31be ] When the qp is in error state, the status of WQEs in the queue should be set to error. Or else the following will appear. [ 920.617269] WARNING: CPU: 1 PID: 21 at drivers/infiniband/sw/rxe/rxe_comp.c:756 rxe_completer+0x989/0xcc0 [rdma_rxe] [ 920.617744] Modules linked in: rnbd_client(O) rtrs_client(O) rtrs_core(O) rdma_ucm rdma_cm iw_cm ib_cm crc32_generic rdma_rxe ip6_udp_tunnel udp_tunnel ib_uverbs ib_core loop brd null_blk ipv6 [ 920.618516] CPU: 1 PID: 21 Comm: ksoftirqd/1 Tainted: G O 6.1.113-storage+ #65 [ 920.618986] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 920.619396] RIP: 0010:rxe_completer+0x989/0xcc0 [rdma_rxe] [ 920.619658] Code: 0f b6 84 24 3a 02 00 00 41 89 84 24 44 04 00 00 e9 2a f7 ff ff 39 ca bb 03 00 00 00 b8 0e 00 00 00 48 0f 45 d8 e9 15 f7 ff ff <0f> 0b e9 cb f8 ff ff 41 bf f5 ff ff ff e9 08 f8 ff ff 49 8d bc 24 [ 920.620482] RSP: 0018:ffff97b7c00bbc38 EFLAGS: 00010246 [ 920.620817] RAX: 0000000000000000 RBX: 000000000000000c RCX: 0000000000000008 [ 920.621183] RDX: ffff960dc396ebc0 RSI: 0000000000005400 RDI: ffff960dc4e2fbac [ 920.621548] RBP: 0000000000000000 R08: 0000000000000001 R09: ffffffffac406450 [ 920.621884] R10: ffffffffac4060c0 R11: 0000000000000001 R12: ffff960dc4e2f800 [ 920.622254] R13: ffff960dc4e2f928 R14: ffff97b7c029c580 R15: 0000000000000000 [ 920.622609] FS: 0000000000000000(0000) GS:ffff960ef7d00000(0000) knlGS:0000000000000000 [ 920.622979] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 920.623245] CR2: 00007fa056965e90 CR3: 00000001107f1000 CR4: 00000000000006e0 [ 920.623680] Call Trace: [ 920.623815] <TASK> [ 920.623933] ? __warn+0x79/0xc0 [ 920.624116] ? rxe_completer+0x989/0xcc0 [rdma_rxe] [ 920.624356] ? report_bug+0xfb/0x150 [ 920.624594] ? handle_bug+0x3c/0x60 [ 920.624796] ? exc_invalid_op+0x14/0x70 [ 920.624976] ? asm_exc_invalid_op+0x16/0x20 [ 920.625203] ? rxe_completer+0x989/0xcc0 [rdma_rxe] [ 920.625474] ? rxe_completer+0x329/0xcc0 [rdma_rxe] [ 920.625749] rxe_do_task+0x80/0x110 [rdma_rxe] [ 920.626037] rxe_requester+0x625/0xde0 [rdma_rxe] [ 920.626310] ? rxe_cq_post+0xe2/0x180 [rdma_rxe] [ 920.626583] ? do_complete+0x18d/0x220 [rdma_rxe] [ 920.626812] ? rxe_completer+0x1a3/0xcc0 [rdma_rxe] [ 920.627050] rxe_do_task+0x80/0x110 [rdma_rxe] [ 920.627285] tasklet_action_common.constprop.0+0xa4/0x120 [ 920.627522] handle_softirqs+0xc2/0x250 [ 920.627728] ? sort_range+0x20/0x20 [ 920.627942] run_ksoftirqd+0x1f/0x30 [ 920.628158] smpboot_thread_fn+0xc7/0x1b0 [ 920.628334] kthread+0xd6/0x100 [ 920.628504] ? kthread_complete_and_exit+0x20/0x20 [ 920.628709] ret_from_fork+0x1f/0x30 [ 920.628892] </TASK> Fixes: ae720bd ("RDMA/rxe: Generate error completion for error requester QP state") Signed-off-by: Zhu Yanjun <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
anthraxx
pushed a commit
that referenced
this issue
Dec 9, 2024
[ Upstream commit ea4c990fa9e19ffef0648e40c566b94ba5ab31be ] When the qp is in error state, the status of WQEs in the queue should be set to error. Or else the following will appear. [ 920.617269] WARNING: CPU: 1 PID: 21 at drivers/infiniband/sw/rxe/rxe_comp.c:756 rxe_completer+0x989/0xcc0 [rdma_rxe] [ 920.617744] Modules linked in: rnbd_client(O) rtrs_client(O) rtrs_core(O) rdma_ucm rdma_cm iw_cm ib_cm crc32_generic rdma_rxe ip6_udp_tunnel udp_tunnel ib_uverbs ib_core loop brd null_blk ipv6 [ 920.618516] CPU: 1 PID: 21 Comm: ksoftirqd/1 Tainted: G O 6.1.113-storage+ #65 [ 920.618986] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 920.619396] RIP: 0010:rxe_completer+0x989/0xcc0 [rdma_rxe] [ 920.619658] Code: 0f b6 84 24 3a 02 00 00 41 89 84 24 44 04 00 00 e9 2a f7 ff ff 39 ca bb 03 00 00 00 b8 0e 00 00 00 48 0f 45 d8 e9 15 f7 ff ff <0f> 0b e9 cb f8 ff ff 41 bf f5 ff ff ff e9 08 f8 ff ff 49 8d bc 24 [ 920.620482] RSP: 0018:ffff97b7c00bbc38 EFLAGS: 00010246 [ 920.620817] RAX: 0000000000000000 RBX: 000000000000000c RCX: 0000000000000008 [ 920.621183] RDX: ffff960dc396ebc0 RSI: 0000000000005400 RDI: ffff960dc4e2fbac [ 920.621548] RBP: 0000000000000000 R08: 0000000000000001 R09: ffffffffac406450 [ 920.621884] R10: ffffffffac4060c0 R11: 0000000000000001 R12: ffff960dc4e2f800 [ 920.622254] R13: ffff960dc4e2f928 R14: ffff97b7c029c580 R15: 0000000000000000 [ 920.622609] FS: 0000000000000000(0000) GS:ffff960ef7d00000(0000) knlGS:0000000000000000 [ 920.622979] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 920.623245] CR2: 00007fa056965e90 CR3: 00000001107f1000 CR4: 00000000000006e0 [ 920.623680] Call Trace: [ 920.623815] <TASK> [ 920.623933] ? __warn+0x79/0xc0 [ 920.624116] ? rxe_completer+0x989/0xcc0 [rdma_rxe] [ 920.624356] ? report_bug+0xfb/0x150 [ 920.624594] ? handle_bug+0x3c/0x60 [ 920.624796] ? exc_invalid_op+0x14/0x70 [ 920.624976] ? asm_exc_invalid_op+0x16/0x20 [ 920.625203] ? rxe_completer+0x989/0xcc0 [rdma_rxe] [ 920.625474] ? rxe_completer+0x329/0xcc0 [rdma_rxe] [ 920.625749] rxe_do_task+0x80/0x110 [rdma_rxe] [ 920.626037] rxe_requester+0x625/0xde0 [rdma_rxe] [ 920.626310] ? rxe_cq_post+0xe2/0x180 [rdma_rxe] [ 920.626583] ? do_complete+0x18d/0x220 [rdma_rxe] [ 920.626812] ? rxe_completer+0x1a3/0xcc0 [rdma_rxe] [ 920.627050] rxe_do_task+0x80/0x110 [rdma_rxe] [ 920.627285] tasklet_action_common.constprop.0+0xa4/0x120 [ 920.627522] handle_softirqs+0xc2/0x250 [ 920.627728] ? sort_range+0x20/0x20 [ 920.627942] run_ksoftirqd+0x1f/0x30 [ 920.628158] smpboot_thread_fn+0xc7/0x1b0 [ 920.628334] kthread+0xd6/0x100 [ 920.628504] ? kthread_complete_and_exit+0x20/0x20 [ 920.628709] ret_from_fork+0x1f/0x30 [ 920.628892] </TASK> Fixes: ae720bd ("RDMA/rxe: Generate error completion for error requester QP state") Signed-off-by: Zhu Yanjun <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
anthraxx
pushed a commit
that referenced
this issue
Dec 9, 2024
[ Upstream commit ea4c990fa9e19ffef0648e40c566b94ba5ab31be ] When the qp is in error state, the status of WQEs in the queue should be set to error. Or else the following will appear. [ 920.617269] WARNING: CPU: 1 PID: 21 at drivers/infiniband/sw/rxe/rxe_comp.c:756 rxe_completer+0x989/0xcc0 [rdma_rxe] [ 920.617744] Modules linked in: rnbd_client(O) rtrs_client(O) rtrs_core(O) rdma_ucm rdma_cm iw_cm ib_cm crc32_generic rdma_rxe ip6_udp_tunnel udp_tunnel ib_uverbs ib_core loop brd null_blk ipv6 [ 920.618516] CPU: 1 PID: 21 Comm: ksoftirqd/1 Tainted: G O 6.1.113-storage+ #65 [ 920.618986] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 920.619396] RIP: 0010:rxe_completer+0x989/0xcc0 [rdma_rxe] [ 920.619658] Code: 0f b6 84 24 3a 02 00 00 41 89 84 24 44 04 00 00 e9 2a f7 ff ff 39 ca bb 03 00 00 00 b8 0e 00 00 00 48 0f 45 d8 e9 15 f7 ff ff <0f> 0b e9 cb f8 ff ff 41 bf f5 ff ff ff e9 08 f8 ff ff 49 8d bc 24 [ 920.620482] RSP: 0018:ffff97b7c00bbc38 EFLAGS: 00010246 [ 920.620817] RAX: 0000000000000000 RBX: 000000000000000c RCX: 0000000000000008 [ 920.621183] RDX: ffff960dc396ebc0 RSI: 0000000000005400 RDI: ffff960dc4e2fbac [ 920.621548] RBP: 0000000000000000 R08: 0000000000000001 R09: ffffffffac406450 [ 920.621884] R10: ffffffffac4060c0 R11: 0000000000000001 R12: ffff960dc4e2f800 [ 920.622254] R13: ffff960dc4e2f928 R14: ffff97b7c029c580 R15: 0000000000000000 [ 920.622609] FS: 0000000000000000(0000) GS:ffff960ef7d00000(0000) knlGS:0000000000000000 [ 920.622979] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 920.623245] CR2: 00007fa056965e90 CR3: 00000001107f1000 CR4: 00000000000006e0 [ 920.623680] Call Trace: [ 920.623815] <TASK> [ 920.623933] ? __warn+0x79/0xc0 [ 920.624116] ? rxe_completer+0x989/0xcc0 [rdma_rxe] [ 920.624356] ? report_bug+0xfb/0x150 [ 920.624594] ? handle_bug+0x3c/0x60 [ 920.624796] ? exc_invalid_op+0x14/0x70 [ 920.624976] ? asm_exc_invalid_op+0x16/0x20 [ 920.625203] ? rxe_completer+0x989/0xcc0 [rdma_rxe] [ 920.625474] ? rxe_completer+0x329/0xcc0 [rdma_rxe] [ 920.625749] rxe_do_task+0x80/0x110 [rdma_rxe] [ 920.626037] rxe_requester+0x625/0xde0 [rdma_rxe] [ 920.626310] ? rxe_cq_post+0xe2/0x180 [rdma_rxe] [ 920.626583] ? do_complete+0x18d/0x220 [rdma_rxe] [ 920.626812] ? rxe_completer+0x1a3/0xcc0 [rdma_rxe] [ 920.627050] rxe_do_task+0x80/0x110 [rdma_rxe] [ 920.627285] tasklet_action_common.constprop.0+0xa4/0x120 [ 920.627522] handle_softirqs+0xc2/0x250 [ 920.627728] ? sort_range+0x20/0x20 [ 920.627942] run_ksoftirqd+0x1f/0x30 [ 920.628158] smpboot_thread_fn+0xc7/0x1b0 [ 920.628334] kthread+0xd6/0x100 [ 920.628504] ? kthread_complete_and_exit+0x20/0x20 [ 920.628709] ret_from_fork+0x1f/0x30 [ 920.628892] </TASK> Fixes: ae720bd ("RDMA/rxe: Generate error completion for error requester QP state") Signed-off-by: Zhu Yanjun <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi,
It seems after
gcc
update from version 10.3.0 to 11.2.0, I am not able to compile the kernel anymoreWhen I try to compile the linux kernel 5.15.25 with linux-hardened patches using gcc 11.2.0, I get the following output:
As this suggests, the only impact of the patch on this issue is the addition of
__attribute__((alloc_size(1)))
at function__kmalloc(size_t size, gfp_t flags)
in./include/linux/slab.h:428
. I can confirm that removing this and this only does "fix the problem".Everything works fine when using gcc version 10.4.0.
I'm fully aware that this is not directly linked to the patch itself, and it most likely has to do with a gcc regression or something.
However, I was hopping not being alone with this issue, and hopefully find some help here.
My guess is that this is a bogus warning / false positive.
The text was updated successfully, but these errors were encountered: