-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Programs compiled with "-fsanitize=address" crashes with: "Shadow memory range interleaves with an existing memory mapping. ASan cannot proceed correctly. ABORTING." #69
Comments
Looking at the build log you provided, I guess it's caused by our commit Revert "mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes". |
I tested a basically no-op program:
If compiled with
|
I think this may be expected as the issues with asan were the reason behind reverting the patch. This is mentioned in revert description There is a tracking issue on KSPP about working around the problem but no progress so far. |
@Bernhard40 thanks for filling in the info here. The mentioned revert and the reason related to ASan expectations/limitations is indeed the culprit why this doesn't work with linux-hardened. This unfortunately is expected until upstream finds a solution to this issue. |
…te() [ Upstream commit f020d95 ] When peer delete failed in a disconnect operation, use-after-free detected by KFENCE in below log. It is because for each vdev_id and address, it has only one struct ath10k_peer, it is allocated in ath10k_peer_map_event(). When connected to an AP, it has more than one HTT_T2H_MSG_TYPE_PEER_MAP reported from firmware, then the array peer_map of struct ath10k will be set muti-elements to the same ath10k_peer in ath10k_peer_map_event(). When peer delete failed in ath10k_sta_state(), the ath10k_peer will be free for the 1st peer id in array peer_map of struct ath10k, and then use-after-free happened for the 2nd peer id because they map to the same ath10k_peer. And clean up all peers in array peer_map for the ath10k_peer, then user-after-free disappeared peer map event log: [ 306.911021] wlan0: authenticate with b0:2a:43:e6:75:0e [ 306.957187] ath10k_pci 0000:01:00.0: mac vdev 0 peer create b0:2a:43:e6:75:0e (new sta) sta 1 / 32 peer 1 / 33 [ 306.957395] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 306.957404] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 306.986924] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 166 peer unmap event log: [ 435.715691] wlan0: deauthenticating from b0:2a:43:e6:75:0e by local choice (Reason: 3=DEAUTH_LEAVING) [ 435.716802] ath10k_pci 0000:01:00.0: mac vdev 0 peer delete b0:2a:43:e6:75:0e sta ffff990e0e9c2b50 (sta gone) [ 435.717177] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 435.717186] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 435.717193] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 166 use-after-free log: [21705.888627] wlan0: deauthenticating from d0:76:8f:82:be:75 by local choice (Reason: 3=DEAUTH_LEAVING) [21713.799910] ath10k_pci 0000:01:00.0: failed to delete peer d0:76:8f:82:be:75 for vdev 0: -110 [21713.799925] ath10k_pci 0000:01:00.0: found sta peer d0:76:8f:82:be:75 (ptr 0000000000000000 id 102) entry on vdev 0 after it was supposedly removed [21713.799968] ================================================================== [21713.799991] BUG: KFENCE: use-after-free read in ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.799991] [21713.799997] Use-after-free read at 0x00000000abe1c75e (in kfence-#69): [21713.800010] ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.800041] drv_sta_state+0x115/0x677 [mac80211] [21713.800059] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800076] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800093] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800110] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800137] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800153] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800161] genl_rcv_msg+0x38e/0x3be [21713.800166] netlink_rcv_skb+0x89/0xf7 [21713.800171] genl_rcv+0x28/0x36 [21713.800176] netlink_unicast+0x179/0x24b [21713.800181] netlink_sendmsg+0x3a0/0x40e [21713.800187] sock_sendmsg+0x72/0x76 [21713.800192] ____sys_sendmsg+0x16d/0x1e3 [21713.800196] ___sys_sendmsg+0x95/0xd1 [21713.800200] __sys_sendmsg+0x85/0xbf [21713.800205] do_syscall_64+0x43/0x55 [21713.800210] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [21713.800213] [21713.800219] kfence-#69: 0x000000009149b0d5-0x000000004c0697fb, size=1064, cache=kmalloc-2k [21713.800219] [21713.800224] allocated by task 13 on cpu 0 at 21705.501373s: [21713.800241] ath10k_peer_map_event+0x7e/0x154 [ath10k_core] [21713.800254] ath10k_htt_t2h_msg_handler+0x586/0x1039 [ath10k_core] [21713.800265] ath10k_htt_htc_t2h_msg_handler+0x12/0x28 [ath10k_core] [21713.800277] ath10k_htc_rx_completion_handler+0x14c/0x1b5 [ath10k_core] [21713.800283] ath10k_pci_process_rx_cb+0x195/0x1df [ath10k_pci] [21713.800294] ath10k_ce_per_engine_service+0x55/0x74 [ath10k_core] [21713.800305] ath10k_ce_per_engine_service_any+0x76/0x84 [ath10k_core] [21713.800310] ath10k_pci_napi_poll+0x49/0x144 [ath10k_pci] [21713.800316] net_rx_action+0xdc/0x361 [21713.800320] __do_softirq+0x163/0x29a [21713.800325] asm_call_irq_on_stack+0x12/0x20 [21713.800331] do_softirq_own_stack+0x3c/0x48 [21713.800337] __irq_exit_rcu+0x9b/0x9d [21713.800342] common_interrupt+0xc9/0x14d [21713.800346] asm_common_interrupt+0x1e/0x40 [21713.800351] ksoftirqd_should_run+0x5/0x16 [21713.800357] smpboot_thread_fn+0x148/0x211 [21713.800362] kthread+0x150/0x15f [21713.800367] ret_from_fork+0x22/0x30 [21713.800370] [21713.800374] freed by task 708 on cpu 1 at 21713.799953s: [21713.800498] ath10k_sta_state+0x2c6/0xb8a [ath10k_core] [21713.800515] drv_sta_state+0x115/0x677 [mac80211] [21713.800532] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800548] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800565] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800581] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800598] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800614] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800619] genl_rcv_msg+0x38e/0x3be [21713.800623] netlink_rcv_skb+0x89/0xf7 [21713.800628] genl_rcv+0x28/0x36 [21713.800632] netlink_unicast+0x179/0x24b [21713.800637] netlink_sendmsg+0x3a0/0x40e [21713.800642] sock_sendmsg+0x72/0x76 [21713.800646] ____sys_sendmsg+0x16d/0x1e3 [21713.800651] ___sys_sendmsg+0x95/0xd1 [21713.800655] __sys_sendmsg+0x85/0xbf [21713.800659] do_syscall_64+0x43/0x55 [21713.800663] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1 Fixes: d0eeafa ("ath10k: Clean up peer when sta goes away.") Signed-off-by: Wen Gong <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
…te() [ Upstream commit f020d95 ] When peer delete failed in a disconnect operation, use-after-free detected by KFENCE in below log. It is because for each vdev_id and address, it has only one struct ath10k_peer, it is allocated in ath10k_peer_map_event(). When connected to an AP, it has more than one HTT_T2H_MSG_TYPE_PEER_MAP reported from firmware, then the array peer_map of struct ath10k will be set muti-elements to the same ath10k_peer in ath10k_peer_map_event(). When peer delete failed in ath10k_sta_state(), the ath10k_peer will be free for the 1st peer id in array peer_map of struct ath10k, and then use-after-free happened for the 2nd peer id because they map to the same ath10k_peer. And clean up all peers in array peer_map for the ath10k_peer, then user-after-free disappeared peer map event log: [ 306.911021] wlan0: authenticate with b0:2a:43:e6:75:0e [ 306.957187] ath10k_pci 0000:01:00.0: mac vdev 0 peer create b0:2a:43:e6:75:0e (new sta) sta 1 / 32 peer 1 / 33 [ 306.957395] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 306.957404] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 306.986924] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 166 peer unmap event log: [ 435.715691] wlan0: deauthenticating from b0:2a:43:e6:75:0e by local choice (Reason: 3=DEAUTH_LEAVING) [ 435.716802] ath10k_pci 0000:01:00.0: mac vdev 0 peer delete b0:2a:43:e6:75:0e sta ffff990e0e9c2b50 (sta gone) [ 435.717177] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 435.717186] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 435.717193] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 166 use-after-free log: [21705.888627] wlan0: deauthenticating from d0:76:8f:82:be:75 by local choice (Reason: 3=DEAUTH_LEAVING) [21713.799910] ath10k_pci 0000:01:00.0: failed to delete peer d0:76:8f:82:be:75 for vdev 0: -110 [21713.799925] ath10k_pci 0000:01:00.0: found sta peer d0:76:8f:82:be:75 (ptr 0000000000000000 id 102) entry on vdev 0 after it was supposedly removed [21713.799968] ================================================================== [21713.799991] BUG: KFENCE: use-after-free read in ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.799991] [21713.799997] Use-after-free read at 0x00000000abe1c75e (in kfence-#69): [21713.800010] ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.800041] drv_sta_state+0x115/0x677 [mac80211] [21713.800059] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800076] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800093] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800110] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800137] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800153] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800161] genl_rcv_msg+0x38e/0x3be [21713.800166] netlink_rcv_skb+0x89/0xf7 [21713.800171] genl_rcv+0x28/0x36 [21713.800176] netlink_unicast+0x179/0x24b [21713.800181] netlink_sendmsg+0x3a0/0x40e [21713.800187] sock_sendmsg+0x72/0x76 [21713.800192] ____sys_sendmsg+0x16d/0x1e3 [21713.800196] ___sys_sendmsg+0x95/0xd1 [21713.800200] __sys_sendmsg+0x85/0xbf [21713.800205] do_syscall_64+0x43/0x55 [21713.800210] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [21713.800213] [21713.800219] kfence-#69: 0x000000009149b0d5-0x000000004c0697fb, size=1064, cache=kmalloc-2k [21713.800219] [21713.800224] allocated by task 13 on cpu 0 at 21705.501373s: [21713.800241] ath10k_peer_map_event+0x7e/0x154 [ath10k_core] [21713.800254] ath10k_htt_t2h_msg_handler+0x586/0x1039 [ath10k_core] [21713.800265] ath10k_htt_htc_t2h_msg_handler+0x12/0x28 [ath10k_core] [21713.800277] ath10k_htc_rx_completion_handler+0x14c/0x1b5 [ath10k_core] [21713.800283] ath10k_pci_process_rx_cb+0x195/0x1df [ath10k_pci] [21713.800294] ath10k_ce_per_engine_service+0x55/0x74 [ath10k_core] [21713.800305] ath10k_ce_per_engine_service_any+0x76/0x84 [ath10k_core] [21713.800310] ath10k_pci_napi_poll+0x49/0x144 [ath10k_pci] [21713.800316] net_rx_action+0xdc/0x361 [21713.800320] __do_softirq+0x163/0x29a [21713.800325] asm_call_irq_on_stack+0x12/0x20 [21713.800331] do_softirq_own_stack+0x3c/0x48 [21713.800337] __irq_exit_rcu+0x9b/0x9d [21713.800342] common_interrupt+0xc9/0x14d [21713.800346] asm_common_interrupt+0x1e/0x40 [21713.800351] ksoftirqd_should_run+0x5/0x16 [21713.800357] smpboot_thread_fn+0x148/0x211 [21713.800362] kthread+0x150/0x15f [21713.800367] ret_from_fork+0x22/0x30 [21713.800370] [21713.800374] freed by task 708 on cpu 1 at 21713.799953s: [21713.800498] ath10k_sta_state+0x2c6/0xb8a [ath10k_core] [21713.800515] drv_sta_state+0x115/0x677 [mac80211] [21713.800532] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800548] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800565] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800581] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800598] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800614] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800619] genl_rcv_msg+0x38e/0x3be [21713.800623] netlink_rcv_skb+0x89/0xf7 [21713.800628] genl_rcv+0x28/0x36 [21713.800632] netlink_unicast+0x179/0x24b [21713.800637] netlink_sendmsg+0x3a0/0x40e [21713.800642] sock_sendmsg+0x72/0x76 [21713.800646] ____sys_sendmsg+0x16d/0x1e3 [21713.800651] ___sys_sendmsg+0x95/0xd1 [21713.800655] __sys_sendmsg+0x85/0xbf [21713.800659] do_syscall_64+0x43/0x55 [21713.800663] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1 Fixes: d0eeafa ("ath10k: Clean up peer when sta goes away.") Signed-off-by: Wen Gong <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
…te() [ Upstream commit f020d95 ] When peer delete failed in a disconnect operation, use-after-free detected by KFENCE in below log. It is because for each vdev_id and address, it has only one struct ath10k_peer, it is allocated in ath10k_peer_map_event(). When connected to an AP, it has more than one HTT_T2H_MSG_TYPE_PEER_MAP reported from firmware, then the array peer_map of struct ath10k will be set muti-elements to the same ath10k_peer in ath10k_peer_map_event(). When peer delete failed in ath10k_sta_state(), the ath10k_peer will be free for the 1st peer id in array peer_map of struct ath10k, and then use-after-free happened for the 2nd peer id because they map to the same ath10k_peer. And clean up all peers in array peer_map for the ath10k_peer, then user-after-free disappeared peer map event log: [ 306.911021] wlan0: authenticate with b0:2a:43:e6:75:0e [ 306.957187] ath10k_pci 0000:01:00.0: mac vdev 0 peer create b0:2a:43:e6:75:0e (new sta) sta 1 / 32 peer 1 / 33 [ 306.957395] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 306.957404] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 306.986924] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 166 peer unmap event log: [ 435.715691] wlan0: deauthenticating from b0:2a:43:e6:75:0e by local choice (Reason: 3=DEAUTH_LEAVING) [ 435.716802] ath10k_pci 0000:01:00.0: mac vdev 0 peer delete b0:2a:43:e6:75:0e sta ffff990e0e9c2b50 (sta gone) [ 435.717177] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 435.717186] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 435.717193] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 166 use-after-free log: [21705.888627] wlan0: deauthenticating from d0:76:8f:82:be:75 by local choice (Reason: 3=DEAUTH_LEAVING) [21713.799910] ath10k_pci 0000:01:00.0: failed to delete peer d0:76:8f:82:be:75 for vdev 0: -110 [21713.799925] ath10k_pci 0000:01:00.0: found sta peer d0:76:8f:82:be:75 (ptr 0000000000000000 id 102) entry on vdev 0 after it was supposedly removed [21713.799968] ================================================================== [21713.799991] BUG: KFENCE: use-after-free read in ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.799991] [21713.799997] Use-after-free read at 0x00000000abe1c75e (in kfence-#69): [21713.800010] ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.800041] drv_sta_state+0x115/0x677 [mac80211] [21713.800059] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800076] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800093] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800110] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800137] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800153] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800161] genl_rcv_msg+0x38e/0x3be [21713.800166] netlink_rcv_skb+0x89/0xf7 [21713.800171] genl_rcv+0x28/0x36 [21713.800176] netlink_unicast+0x179/0x24b [21713.800181] netlink_sendmsg+0x3a0/0x40e [21713.800187] sock_sendmsg+0x72/0x76 [21713.800192] ____sys_sendmsg+0x16d/0x1e3 [21713.800196] ___sys_sendmsg+0x95/0xd1 [21713.800200] __sys_sendmsg+0x85/0xbf [21713.800205] do_syscall_64+0x43/0x55 [21713.800210] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [21713.800213] [21713.800219] kfence-#69: 0x000000009149b0d5-0x000000004c0697fb, size=1064, cache=kmalloc-2k [21713.800219] [21713.800224] allocated by task 13 on cpu 0 at 21705.501373s: [21713.800241] ath10k_peer_map_event+0x7e/0x154 [ath10k_core] [21713.800254] ath10k_htt_t2h_msg_handler+0x586/0x1039 [ath10k_core] [21713.800265] ath10k_htt_htc_t2h_msg_handler+0x12/0x28 [ath10k_core] [21713.800277] ath10k_htc_rx_completion_handler+0x14c/0x1b5 [ath10k_core] [21713.800283] ath10k_pci_process_rx_cb+0x195/0x1df [ath10k_pci] [21713.800294] ath10k_ce_per_engine_service+0x55/0x74 [ath10k_core] [21713.800305] ath10k_ce_per_engine_service_any+0x76/0x84 [ath10k_core] [21713.800310] ath10k_pci_napi_poll+0x49/0x144 [ath10k_pci] [21713.800316] net_rx_action+0xdc/0x361 [21713.800320] __do_softirq+0x163/0x29a [21713.800325] asm_call_irq_on_stack+0x12/0x20 [21713.800331] do_softirq_own_stack+0x3c/0x48 [21713.800337] __irq_exit_rcu+0x9b/0x9d [21713.800342] common_interrupt+0xc9/0x14d [21713.800346] asm_common_interrupt+0x1e/0x40 [21713.800351] ksoftirqd_should_run+0x5/0x16 [21713.800357] smpboot_thread_fn+0x148/0x211 [21713.800362] kthread+0x150/0x15f [21713.800367] ret_from_fork+0x22/0x30 [21713.800370] [21713.800374] freed by task 708 on cpu 1 at 21713.799953s: [21713.800498] ath10k_sta_state+0x2c6/0xb8a [ath10k_core] [21713.800515] drv_sta_state+0x115/0x677 [mac80211] [21713.800532] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800548] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800565] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800581] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800598] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800614] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800619] genl_rcv_msg+0x38e/0x3be [21713.800623] netlink_rcv_skb+0x89/0xf7 [21713.800628] genl_rcv+0x28/0x36 [21713.800632] netlink_unicast+0x179/0x24b [21713.800637] netlink_sendmsg+0x3a0/0x40e [21713.800642] sock_sendmsg+0x72/0x76 [21713.800646] ____sys_sendmsg+0x16d/0x1e3 [21713.800651] ___sys_sendmsg+0x95/0xd1 [21713.800655] __sys_sendmsg+0x85/0xbf [21713.800659] do_syscall_64+0x43/0x55 [21713.800663] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1 Fixes: d0eeafa ("ath10k: Clean up peer when sta goes away.") Signed-off-by: Wen Gong <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
…te() [ Upstream commit f020d95 ] When peer delete failed in a disconnect operation, use-after-free detected by KFENCE in below log. It is because for each vdev_id and address, it has only one struct ath10k_peer, it is allocated in ath10k_peer_map_event(). When connected to an AP, it has more than one HTT_T2H_MSG_TYPE_PEER_MAP reported from firmware, then the array peer_map of struct ath10k will be set muti-elements to the same ath10k_peer in ath10k_peer_map_event(). When peer delete failed in ath10k_sta_state(), the ath10k_peer will be free for the 1st peer id in array peer_map of struct ath10k, and then use-after-free happened for the 2nd peer id because they map to the same ath10k_peer. And clean up all peers in array peer_map for the ath10k_peer, then user-after-free disappeared peer map event log: [ 306.911021] wlan0: authenticate with b0:2a:43:e6:75:0e [ 306.957187] ath10k_pci 0000:01:00.0: mac vdev 0 peer create b0:2a:43:e6:75:0e (new sta) sta 1 / 32 peer 1 / 33 [ 306.957395] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 306.957404] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 306.986924] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 166 peer unmap event log: [ 435.715691] wlan0: deauthenticating from b0:2a:43:e6:75:0e by local choice (Reason: 3=DEAUTH_LEAVING) [ 435.716802] ath10k_pci 0000:01:00.0: mac vdev 0 peer delete b0:2a:43:e6:75:0e sta ffff990e0e9c2b50 (sta gone) [ 435.717177] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 435.717186] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 435.717193] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 166 use-after-free log: [21705.888627] wlan0: deauthenticating from d0:76:8f:82:be:75 by local choice (Reason: 3=DEAUTH_LEAVING) [21713.799910] ath10k_pci 0000:01:00.0: failed to delete peer d0:76:8f:82:be:75 for vdev 0: -110 [21713.799925] ath10k_pci 0000:01:00.0: found sta peer d0:76:8f:82:be:75 (ptr 0000000000000000 id 102) entry on vdev 0 after it was supposedly removed [21713.799968] ================================================================== [21713.799991] BUG: KFENCE: use-after-free read in ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.799991] [21713.799997] Use-after-free read at 0x00000000abe1c75e (in kfence-#69): [21713.800010] ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.800041] drv_sta_state+0x115/0x677 [mac80211] [21713.800059] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800076] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800093] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800110] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800137] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800153] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800161] genl_rcv_msg+0x38e/0x3be [21713.800166] netlink_rcv_skb+0x89/0xf7 [21713.800171] genl_rcv+0x28/0x36 [21713.800176] netlink_unicast+0x179/0x24b [21713.800181] netlink_sendmsg+0x3a0/0x40e [21713.800187] sock_sendmsg+0x72/0x76 [21713.800192] ____sys_sendmsg+0x16d/0x1e3 [21713.800196] ___sys_sendmsg+0x95/0xd1 [21713.800200] __sys_sendmsg+0x85/0xbf [21713.800205] do_syscall_64+0x43/0x55 [21713.800210] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [21713.800213] [21713.800219] kfence-#69: 0x000000009149b0d5-0x000000004c0697fb, size=1064, cache=kmalloc-2k [21713.800219] [21713.800224] allocated by task 13 on cpu 0 at 21705.501373s: [21713.800241] ath10k_peer_map_event+0x7e/0x154 [ath10k_core] [21713.800254] ath10k_htt_t2h_msg_handler+0x586/0x1039 [ath10k_core] [21713.800265] ath10k_htt_htc_t2h_msg_handler+0x12/0x28 [ath10k_core] [21713.800277] ath10k_htc_rx_completion_handler+0x14c/0x1b5 [ath10k_core] [21713.800283] ath10k_pci_process_rx_cb+0x195/0x1df [ath10k_pci] [21713.800294] ath10k_ce_per_engine_service+0x55/0x74 [ath10k_core] [21713.800305] ath10k_ce_per_engine_service_any+0x76/0x84 [ath10k_core] [21713.800310] ath10k_pci_napi_poll+0x49/0x144 [ath10k_pci] [21713.800316] net_rx_action+0xdc/0x361 [21713.800320] __do_softirq+0x163/0x29a [21713.800325] asm_call_irq_on_stack+0x12/0x20 [21713.800331] do_softirq_own_stack+0x3c/0x48 [21713.800337] __irq_exit_rcu+0x9b/0x9d [21713.800342] common_interrupt+0xc9/0x14d [21713.800346] asm_common_interrupt+0x1e/0x40 [21713.800351] ksoftirqd_should_run+0x5/0x16 [21713.800357] smpboot_thread_fn+0x148/0x211 [21713.800362] kthread+0x150/0x15f [21713.800367] ret_from_fork+0x22/0x30 [21713.800370] [21713.800374] freed by task 708 on cpu 1 at 21713.799953s: [21713.800498] ath10k_sta_state+0x2c6/0xb8a [ath10k_core] [21713.800515] drv_sta_state+0x115/0x677 [mac80211] [21713.800532] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800548] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800565] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800581] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800598] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800614] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800619] genl_rcv_msg+0x38e/0x3be [21713.800623] netlink_rcv_skb+0x89/0xf7 [21713.800628] genl_rcv+0x28/0x36 [21713.800632] netlink_unicast+0x179/0x24b [21713.800637] netlink_sendmsg+0x3a0/0x40e [21713.800642] sock_sendmsg+0x72/0x76 [21713.800646] ____sys_sendmsg+0x16d/0x1e3 [21713.800651] ___sys_sendmsg+0x95/0xd1 [21713.800655] __sys_sendmsg+0x85/0xbf [21713.800659] do_syscall_64+0x43/0x55 [21713.800663] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1 Fixes: d0eeafa ("ath10k: Clean up peer when sta goes away.") Signed-off-by: Wen Gong <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
…te() [ Upstream commit f020d95 ] When peer delete failed in a disconnect operation, use-after-free detected by KFENCE in below log. It is because for each vdev_id and address, it has only one struct ath10k_peer, it is allocated in ath10k_peer_map_event(). When connected to an AP, it has more than one HTT_T2H_MSG_TYPE_PEER_MAP reported from firmware, then the array peer_map of struct ath10k will be set muti-elements to the same ath10k_peer in ath10k_peer_map_event(). When peer delete failed in ath10k_sta_state(), the ath10k_peer will be free for the 1st peer id in array peer_map of struct ath10k, and then use-after-free happened for the 2nd peer id because they map to the same ath10k_peer. And clean up all peers in array peer_map for the ath10k_peer, then user-after-free disappeared peer map event log: [ 306.911021] wlan0: authenticate with b0:2a:43:e6:75:0e [ 306.957187] ath10k_pci 0000:01:00.0: mac vdev 0 peer create b0:2a:43:e6:75:0e (new sta) sta 1 / 32 peer 1 / 33 [ 306.957395] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 306.957404] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 306.986924] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 166 peer unmap event log: [ 435.715691] wlan0: deauthenticating from b0:2a:43:e6:75:0e by local choice (Reason: 3=DEAUTH_LEAVING) [ 435.716802] ath10k_pci 0000:01:00.0: mac vdev 0 peer delete b0:2a:43:e6:75:0e sta ffff990e0e9c2b50 (sta gone) [ 435.717177] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 435.717186] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 435.717193] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 166 use-after-free log: [21705.888627] wlan0: deauthenticating from d0:76:8f:82:be:75 by local choice (Reason: 3=DEAUTH_LEAVING) [21713.799910] ath10k_pci 0000:01:00.0: failed to delete peer d0:76:8f:82:be:75 for vdev 0: -110 [21713.799925] ath10k_pci 0000:01:00.0: found sta peer d0:76:8f:82:be:75 (ptr 0000000000000000 id 102) entry on vdev 0 after it was supposedly removed [21713.799968] ================================================================== [21713.799991] BUG: KFENCE: use-after-free read in ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.799991] [21713.799997] Use-after-free read at 0x00000000abe1c75e (in kfence-#69): [21713.800010] ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.800041] drv_sta_state+0x115/0x677 [mac80211] [21713.800059] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800076] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800093] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800110] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800137] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800153] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800161] genl_rcv_msg+0x38e/0x3be [21713.800166] netlink_rcv_skb+0x89/0xf7 [21713.800171] genl_rcv+0x28/0x36 [21713.800176] netlink_unicast+0x179/0x24b [21713.800181] netlink_sendmsg+0x3a0/0x40e [21713.800187] sock_sendmsg+0x72/0x76 [21713.800192] ____sys_sendmsg+0x16d/0x1e3 [21713.800196] ___sys_sendmsg+0x95/0xd1 [21713.800200] __sys_sendmsg+0x85/0xbf [21713.800205] do_syscall_64+0x43/0x55 [21713.800210] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [21713.800213] [21713.800219] kfence-#69: 0x000000009149b0d5-0x000000004c0697fb, size=1064, cache=kmalloc-2k [21713.800219] [21713.800224] allocated by task 13 on cpu 0 at 21705.501373s: [21713.800241] ath10k_peer_map_event+0x7e/0x154 [ath10k_core] [21713.800254] ath10k_htt_t2h_msg_handler+0x586/0x1039 [ath10k_core] [21713.800265] ath10k_htt_htc_t2h_msg_handler+0x12/0x28 [ath10k_core] [21713.800277] ath10k_htc_rx_completion_handler+0x14c/0x1b5 [ath10k_core] [21713.800283] ath10k_pci_process_rx_cb+0x195/0x1df [ath10k_pci] [21713.800294] ath10k_ce_per_engine_service+0x55/0x74 [ath10k_core] [21713.800305] ath10k_ce_per_engine_service_any+0x76/0x84 [ath10k_core] [21713.800310] ath10k_pci_napi_poll+0x49/0x144 [ath10k_pci] [21713.800316] net_rx_action+0xdc/0x361 [21713.800320] __do_softirq+0x163/0x29a [21713.800325] asm_call_irq_on_stack+0x12/0x20 [21713.800331] do_softirq_own_stack+0x3c/0x48 [21713.800337] __irq_exit_rcu+0x9b/0x9d [21713.800342] common_interrupt+0xc9/0x14d [21713.800346] asm_common_interrupt+0x1e/0x40 [21713.800351] ksoftirqd_should_run+0x5/0x16 [21713.800357] smpboot_thread_fn+0x148/0x211 [21713.800362] kthread+0x150/0x15f [21713.800367] ret_from_fork+0x22/0x30 [21713.800370] [21713.800374] freed by task 708 on cpu 1 at 21713.799953s: [21713.800498] ath10k_sta_state+0x2c6/0xb8a [ath10k_core] [21713.800515] drv_sta_state+0x115/0x677 [mac80211] [21713.800532] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800548] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800565] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800581] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800598] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800614] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800619] genl_rcv_msg+0x38e/0x3be [21713.800623] netlink_rcv_skb+0x89/0xf7 [21713.800628] genl_rcv+0x28/0x36 [21713.800632] netlink_unicast+0x179/0x24b [21713.800637] netlink_sendmsg+0x3a0/0x40e [21713.800642] sock_sendmsg+0x72/0x76 [21713.800646] ____sys_sendmsg+0x16d/0x1e3 [21713.800651] ___sys_sendmsg+0x95/0xd1 [21713.800655] __sys_sendmsg+0x85/0xbf [21713.800659] do_syscall_64+0x43/0x55 [21713.800663] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1 Fixes: d0eeafa ("ath10k: Clean up peer when sta goes away.") Signed-off-by: Wen Gong <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
…te() [ Upstream commit f020d95 ] When peer delete failed in a disconnect operation, use-after-free detected by KFENCE in below log. It is because for each vdev_id and address, it has only one struct ath10k_peer, it is allocated in ath10k_peer_map_event(). When connected to an AP, it has more than one HTT_T2H_MSG_TYPE_PEER_MAP reported from firmware, then the array peer_map of struct ath10k will be set muti-elements to the same ath10k_peer in ath10k_peer_map_event(). When peer delete failed in ath10k_sta_state(), the ath10k_peer will be free for the 1st peer id in array peer_map of struct ath10k, and then use-after-free happened for the 2nd peer id because they map to the same ath10k_peer. And clean up all peers in array peer_map for the ath10k_peer, then user-after-free disappeared peer map event log: [ 306.911021] wlan0: authenticate with b0:2a:43:e6:75:0e [ 306.957187] ath10k_pci 0000:01:00.0: mac vdev 0 peer create b0:2a:43:e6:75:0e (new sta) sta 1 / 32 peer 1 / 33 [ 306.957395] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 306.957404] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 306.986924] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 166 peer unmap event log: [ 435.715691] wlan0: deauthenticating from b0:2a:43:e6:75:0e by local choice (Reason: 3=DEAUTH_LEAVING) [ 435.716802] ath10k_pci 0000:01:00.0: mac vdev 0 peer delete b0:2a:43:e6:75:0e sta ffff990e0e9c2b50 (sta gone) [ 435.717177] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 435.717186] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 435.717193] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 166 use-after-free log: [21705.888627] wlan0: deauthenticating from d0:76:8f:82:be:75 by local choice (Reason: 3=DEAUTH_LEAVING) [21713.799910] ath10k_pci 0000:01:00.0: failed to delete peer d0:76:8f:82:be:75 for vdev 0: -110 [21713.799925] ath10k_pci 0000:01:00.0: found sta peer d0:76:8f:82:be:75 (ptr 0000000000000000 id 102) entry on vdev 0 after it was supposedly removed [21713.799968] ================================================================== [21713.799991] BUG: KFENCE: use-after-free read in ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.799991] [21713.799997] Use-after-free read at 0x00000000abe1c75e (in kfence-#69): [21713.800010] ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.800041] drv_sta_state+0x115/0x677 [mac80211] [21713.800059] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800076] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800093] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800110] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800137] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800153] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800161] genl_rcv_msg+0x38e/0x3be [21713.800166] netlink_rcv_skb+0x89/0xf7 [21713.800171] genl_rcv+0x28/0x36 [21713.800176] netlink_unicast+0x179/0x24b [21713.800181] netlink_sendmsg+0x3a0/0x40e [21713.800187] sock_sendmsg+0x72/0x76 [21713.800192] ____sys_sendmsg+0x16d/0x1e3 [21713.800196] ___sys_sendmsg+0x95/0xd1 [21713.800200] __sys_sendmsg+0x85/0xbf [21713.800205] do_syscall_64+0x43/0x55 [21713.800210] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [21713.800213] [21713.800219] kfence-#69: 0x000000009149b0d5-0x000000004c0697fb, size=1064, cache=kmalloc-2k [21713.800219] [21713.800224] allocated by task 13 on cpu 0 at 21705.501373s: [21713.800241] ath10k_peer_map_event+0x7e/0x154 [ath10k_core] [21713.800254] ath10k_htt_t2h_msg_handler+0x586/0x1039 [ath10k_core] [21713.800265] ath10k_htt_htc_t2h_msg_handler+0x12/0x28 [ath10k_core] [21713.800277] ath10k_htc_rx_completion_handler+0x14c/0x1b5 [ath10k_core] [21713.800283] ath10k_pci_process_rx_cb+0x195/0x1df [ath10k_pci] [21713.800294] ath10k_ce_per_engine_service+0x55/0x74 [ath10k_core] [21713.800305] ath10k_ce_per_engine_service_any+0x76/0x84 [ath10k_core] [21713.800310] ath10k_pci_napi_poll+0x49/0x144 [ath10k_pci] [21713.800316] net_rx_action+0xdc/0x361 [21713.800320] __do_softirq+0x163/0x29a [21713.800325] asm_call_irq_on_stack+0x12/0x20 [21713.800331] do_softirq_own_stack+0x3c/0x48 [21713.800337] __irq_exit_rcu+0x9b/0x9d [21713.800342] common_interrupt+0xc9/0x14d [21713.800346] asm_common_interrupt+0x1e/0x40 [21713.800351] ksoftirqd_should_run+0x5/0x16 [21713.800357] smpboot_thread_fn+0x148/0x211 [21713.800362] kthread+0x150/0x15f [21713.800367] ret_from_fork+0x22/0x30 [21713.800370] [21713.800374] freed by task 708 on cpu 1 at 21713.799953s: [21713.800498] ath10k_sta_state+0x2c6/0xb8a [ath10k_core] [21713.800515] drv_sta_state+0x115/0x677 [mac80211] [21713.800532] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800548] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800565] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800581] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800598] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800614] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800619] genl_rcv_msg+0x38e/0x3be [21713.800623] netlink_rcv_skb+0x89/0xf7 [21713.800628] genl_rcv+0x28/0x36 [21713.800632] netlink_unicast+0x179/0x24b [21713.800637] netlink_sendmsg+0x3a0/0x40e [21713.800642] sock_sendmsg+0x72/0x76 [21713.800646] ____sys_sendmsg+0x16d/0x1e3 [21713.800651] ___sys_sendmsg+0x95/0xd1 [21713.800655] __sys_sendmsg+0x85/0xbf [21713.800659] do_syscall_64+0x43/0x55 [21713.800663] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1 Fixes: d0eeafa ("ath10k: Clean up peer when sta goes away.") Signed-off-by: Wen Gong <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
…te() [ Upstream commit f020d95 ] When peer delete failed in a disconnect operation, use-after-free detected by KFENCE in below log. It is because for each vdev_id and address, it has only one struct ath10k_peer, it is allocated in ath10k_peer_map_event(). When connected to an AP, it has more than one HTT_T2H_MSG_TYPE_PEER_MAP reported from firmware, then the array peer_map of struct ath10k will be set muti-elements to the same ath10k_peer in ath10k_peer_map_event(). When peer delete failed in ath10k_sta_state(), the ath10k_peer will be free for the 1st peer id in array peer_map of struct ath10k, and then use-after-free happened for the 2nd peer id because they map to the same ath10k_peer. And clean up all peers in array peer_map for the ath10k_peer, then user-after-free disappeared peer map event log: [ 306.911021] wlan0: authenticate with b0:2a:43:e6:75:0e [ 306.957187] ath10k_pci 0000:01:00.0: mac vdev 0 peer create b0:2a:43:e6:75:0e (new sta) sta 1 / 32 peer 1 / 33 [ 306.957395] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 306.957404] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 306.986924] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 166 peer unmap event log: [ 435.715691] wlan0: deauthenticating from b0:2a:43:e6:75:0e by local choice (Reason: 3=DEAUTH_LEAVING) [ 435.716802] ath10k_pci 0000:01:00.0: mac vdev 0 peer delete b0:2a:43:e6:75:0e sta ffff990e0e9c2b50 (sta gone) [ 435.717177] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 246 [ 435.717186] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 198 [ 435.717193] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 166 use-after-free log: [21705.888627] wlan0: deauthenticating from d0:76:8f:82:be:75 by local choice (Reason: 3=DEAUTH_LEAVING) [21713.799910] ath10k_pci 0000:01:00.0: failed to delete peer d0:76:8f:82:be:75 for vdev 0: -110 [21713.799925] ath10k_pci 0000:01:00.0: found sta peer d0:76:8f:82:be:75 (ptr 0000000000000000 id 102) entry on vdev 0 after it was supposedly removed [21713.799968] ================================================================== [21713.799991] BUG: KFENCE: use-after-free read in ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.799991] [21713.799997] Use-after-free read at 0x00000000abe1c75e (in kfence-#69): [21713.800010] ath10k_sta_state+0x265/0xb8a [ath10k_core] [21713.800041] drv_sta_state+0x115/0x677 [mac80211] [21713.800059] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800076] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800093] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800110] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800137] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800153] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800161] genl_rcv_msg+0x38e/0x3be [21713.800166] netlink_rcv_skb+0x89/0xf7 [21713.800171] genl_rcv+0x28/0x36 [21713.800176] netlink_unicast+0x179/0x24b [21713.800181] netlink_sendmsg+0x3a0/0x40e [21713.800187] sock_sendmsg+0x72/0x76 [21713.800192] ____sys_sendmsg+0x16d/0x1e3 [21713.800196] ___sys_sendmsg+0x95/0xd1 [21713.800200] __sys_sendmsg+0x85/0xbf [21713.800205] do_syscall_64+0x43/0x55 [21713.800210] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [21713.800213] [21713.800219] kfence-#69: 0x000000009149b0d5-0x000000004c0697fb, size=1064, cache=kmalloc-2k [21713.800219] [21713.800224] allocated by task 13 on cpu 0 at 21705.501373s: [21713.800241] ath10k_peer_map_event+0x7e/0x154 [ath10k_core] [21713.800254] ath10k_htt_t2h_msg_handler+0x586/0x1039 [ath10k_core] [21713.800265] ath10k_htt_htc_t2h_msg_handler+0x12/0x28 [ath10k_core] [21713.800277] ath10k_htc_rx_completion_handler+0x14c/0x1b5 [ath10k_core] [21713.800283] ath10k_pci_process_rx_cb+0x195/0x1df [ath10k_pci] [21713.800294] ath10k_ce_per_engine_service+0x55/0x74 [ath10k_core] [21713.800305] ath10k_ce_per_engine_service_any+0x76/0x84 [ath10k_core] [21713.800310] ath10k_pci_napi_poll+0x49/0x144 [ath10k_pci] [21713.800316] net_rx_action+0xdc/0x361 [21713.800320] __do_softirq+0x163/0x29a [21713.800325] asm_call_irq_on_stack+0x12/0x20 [21713.800331] do_softirq_own_stack+0x3c/0x48 [21713.800337] __irq_exit_rcu+0x9b/0x9d [21713.800342] common_interrupt+0xc9/0x14d [21713.800346] asm_common_interrupt+0x1e/0x40 [21713.800351] ksoftirqd_should_run+0x5/0x16 [21713.800357] smpboot_thread_fn+0x148/0x211 [21713.800362] kthread+0x150/0x15f [21713.800367] ret_from_fork+0x22/0x30 [21713.800370] [21713.800374] freed by task 708 on cpu 1 at 21713.799953s: [21713.800498] ath10k_sta_state+0x2c6/0xb8a [ath10k_core] [21713.800515] drv_sta_state+0x115/0x677 [mac80211] [21713.800532] __sta_info_destroy_part2+0xb1/0x133 [mac80211] [21713.800548] __sta_info_flush+0x11d/0x162 [mac80211] [21713.800565] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211] [21713.800581] ieee80211_mgd_deauth+0x26c/0x29b [mac80211] [21713.800598] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211] [21713.800614] nl80211_deauthenticate+0xf8/0x121 [cfg80211] [21713.800619] genl_rcv_msg+0x38e/0x3be [21713.800623] netlink_rcv_skb+0x89/0xf7 [21713.800628] genl_rcv+0x28/0x36 [21713.800632] netlink_unicast+0x179/0x24b [21713.800637] netlink_sendmsg+0x3a0/0x40e [21713.800642] sock_sendmsg+0x72/0x76 [21713.800646] ____sys_sendmsg+0x16d/0x1e3 [21713.800651] ___sys_sendmsg+0x95/0xd1 [21713.800655] __sys_sendmsg+0x85/0xbf [21713.800659] do_syscall_64+0x43/0x55 [21713.800663] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1 Fixes: d0eeafa ("ath10k: Clean up peer when sta goes away.") Signed-off-by: Wen Gong <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 7015843 ] Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT configs. In this particular case, the base VM is AMD with 166 cpus, and I run selftests with regular qemu on top of that and indeed send_signal test failed. I also tried with an Intel box with 80 cpus and there is no issue. The main qemu command line includes: -enable-kvm -smp 16 -cpu host The failure log looks like: $ ./test_progs -t send_signal [ 48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225] [ 48.503622] Modules linked in: bpf_testmod(O) [ 48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G O 6.9.0-08561-g2c1713a8f1c9-dirty #69 [ 48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 [ 48.511635] RIP: 0010:handle_softirqs+0x71/0x290 [ 48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb [ 48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246 [ 48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0 [ 48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000 [ 48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000 [ 48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280 [ 48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 48.528525] FS: 00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000 [ 48.531600] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0 [ 48.537538] Call Trace: [ 48.537538] <IRQ> [ 48.537538] ? watchdog_timer_fn+0x1cd/0x250 [ 48.539590] ? lockup_detector_update_enable+0x50/0x50 [ 48.539590] ? __hrtimer_run_queues+0xff/0x280 [ 48.542520] ? hrtimer_interrupt+0x103/0x230 [ 48.544524] ? __sysvec_apic_timer_interrupt+0x4f/0x140 [ 48.545522] ? sysvec_apic_timer_interrupt+0x3a/0x90 [ 48.547612] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.547612] ? handle_softirqs+0x71/0x290 [ 48.547612] irq_exit_rcu+0x63/0x80 [ 48.551585] sysvec_apic_timer_interrupt+0x75/0x90 [ 48.552521] </IRQ> [ 48.553529] <TASK> [ 48.553529] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260 [ 48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74 [ 48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282 [ 48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620 [ 48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00 [ 48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000 [ 48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000 [ 48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00 [ 48.575614] ? finish_task_switch.isra.0+0x8d/0x260 [ 48.576523] __schedule+0x364/0xac0 [ 48.577535] schedule+0x2e/0x110 [ 48.578555] pipe_read+0x301/0x400 [ 48.579589] ? destroy_sched_domains_rcu+0x30/0x30 [ 48.579589] vfs_read+0x2b3/0x2f0 [ 48.579589] ksys_read+0x8b/0xc0 [ 48.583590] do_syscall_64+0x3d/0xc0 [ 48.583590] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 48.586525] RIP: 0033:0x7f2f28703fa1 [ 48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 [ 48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1 [ 48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006 [ 48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000 [ 48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0 [ 48.605527] </TASK> In the test, two processes are communicating through pipe. Further debugging with strace found that the above splat is triggered as read() syscall could not receive the data even if the corresponding write() syscall in another process successfully wrote data into the pipe. The failed subtest is "send_signal_perf". The corresponding perf event has sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every overflow event will trigger a call to the BPF program. So I suspect this may overwhelm the system. So I increased the sample_period to 100,000 and the test passed. The sample_period 10,000 still has the test failed. In other parts of selftest, e.g., [1], sample_freq is used instead. So I decided to use sample_freq = 1,000 since the test can pass as well. [1] https://lore.kernel.org/bpf/[email protected]/ Reported-by: Alexei Starovoitov <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 7015843 ] Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT configs. In this particular case, the base VM is AMD with 166 cpus, and I run selftests with regular qemu on top of that and indeed send_signal test failed. I also tried with an Intel box with 80 cpus and there is no issue. The main qemu command line includes: -enable-kvm -smp 16 -cpu host The failure log looks like: $ ./test_progs -t send_signal [ 48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225] [ 48.503622] Modules linked in: bpf_testmod(O) [ 48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G O 6.9.0-08561-g2c1713a8f1c9-dirty #69 [ 48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 [ 48.511635] RIP: 0010:handle_softirqs+0x71/0x290 [ 48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb [ 48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246 [ 48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0 [ 48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000 [ 48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000 [ 48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280 [ 48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 48.528525] FS: 00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000 [ 48.531600] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0 [ 48.537538] Call Trace: [ 48.537538] <IRQ> [ 48.537538] ? watchdog_timer_fn+0x1cd/0x250 [ 48.539590] ? lockup_detector_update_enable+0x50/0x50 [ 48.539590] ? __hrtimer_run_queues+0xff/0x280 [ 48.542520] ? hrtimer_interrupt+0x103/0x230 [ 48.544524] ? __sysvec_apic_timer_interrupt+0x4f/0x140 [ 48.545522] ? sysvec_apic_timer_interrupt+0x3a/0x90 [ 48.547612] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.547612] ? handle_softirqs+0x71/0x290 [ 48.547612] irq_exit_rcu+0x63/0x80 [ 48.551585] sysvec_apic_timer_interrupt+0x75/0x90 [ 48.552521] </IRQ> [ 48.553529] <TASK> [ 48.553529] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260 [ 48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74 [ 48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282 [ 48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620 [ 48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00 [ 48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000 [ 48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000 [ 48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00 [ 48.575614] ? finish_task_switch.isra.0+0x8d/0x260 [ 48.576523] __schedule+0x364/0xac0 [ 48.577535] schedule+0x2e/0x110 [ 48.578555] pipe_read+0x301/0x400 [ 48.579589] ? destroy_sched_domains_rcu+0x30/0x30 [ 48.579589] vfs_read+0x2b3/0x2f0 [ 48.579589] ksys_read+0x8b/0xc0 [ 48.583590] do_syscall_64+0x3d/0xc0 [ 48.583590] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 48.586525] RIP: 0033:0x7f2f28703fa1 [ 48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 [ 48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1 [ 48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006 [ 48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000 [ 48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0 [ 48.605527] </TASK> In the test, two processes are communicating through pipe. Further debugging with strace found that the above splat is triggered as read() syscall could not receive the data even if the corresponding write() syscall in another process successfully wrote data into the pipe. The failed subtest is "send_signal_perf". The corresponding perf event has sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every overflow event will trigger a call to the BPF program. So I suspect this may overwhelm the system. So I increased the sample_period to 100,000 and the test passed. The sample_period 10,000 still has the test failed. In other parts of selftest, e.g., [1], sample_freq is used instead. So I decided to use sample_freq = 1,000 since the test can pass as well. [1] https://lore.kernel.org/bpf/[email protected]/ Reported-by: Alexei Starovoitov <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 7015843 ] Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT configs. In this particular case, the base VM is AMD with 166 cpus, and I run selftests with regular qemu on top of that and indeed send_signal test failed. I also tried with an Intel box with 80 cpus and there is no issue. The main qemu command line includes: -enable-kvm -smp 16 -cpu host The failure log looks like: $ ./test_progs -t send_signal [ 48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225] [ 48.503622] Modules linked in: bpf_testmod(O) [ 48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G O 6.9.0-08561-g2c1713a8f1c9-dirty #69 [ 48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 [ 48.511635] RIP: 0010:handle_softirqs+0x71/0x290 [ 48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb [ 48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246 [ 48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0 [ 48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000 [ 48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000 [ 48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280 [ 48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 48.528525] FS: 00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000 [ 48.531600] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0 [ 48.537538] Call Trace: [ 48.537538] <IRQ> [ 48.537538] ? watchdog_timer_fn+0x1cd/0x250 [ 48.539590] ? lockup_detector_update_enable+0x50/0x50 [ 48.539590] ? __hrtimer_run_queues+0xff/0x280 [ 48.542520] ? hrtimer_interrupt+0x103/0x230 [ 48.544524] ? __sysvec_apic_timer_interrupt+0x4f/0x140 [ 48.545522] ? sysvec_apic_timer_interrupt+0x3a/0x90 [ 48.547612] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.547612] ? handle_softirqs+0x71/0x290 [ 48.547612] irq_exit_rcu+0x63/0x80 [ 48.551585] sysvec_apic_timer_interrupt+0x75/0x90 [ 48.552521] </IRQ> [ 48.553529] <TASK> [ 48.553529] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260 [ 48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74 [ 48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282 [ 48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620 [ 48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00 [ 48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000 [ 48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000 [ 48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00 [ 48.575614] ? finish_task_switch.isra.0+0x8d/0x260 [ 48.576523] __schedule+0x364/0xac0 [ 48.577535] schedule+0x2e/0x110 [ 48.578555] pipe_read+0x301/0x400 [ 48.579589] ? destroy_sched_domains_rcu+0x30/0x30 [ 48.579589] vfs_read+0x2b3/0x2f0 [ 48.579589] ksys_read+0x8b/0xc0 [ 48.583590] do_syscall_64+0x3d/0xc0 [ 48.583590] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 48.586525] RIP: 0033:0x7f2f28703fa1 [ 48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 [ 48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1 [ 48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006 [ 48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000 [ 48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0 [ 48.605527] </TASK> In the test, two processes are communicating through pipe. Further debugging with strace found that the above splat is triggered as read() syscall could not receive the data even if the corresponding write() syscall in another process successfully wrote data into the pipe. The failed subtest is "send_signal_perf". The corresponding perf event has sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every overflow event will trigger a call to the BPF program. So I suspect this may overwhelm the system. So I increased the sample_period to 100,000 and the test passed. The sample_period 10,000 still has the test failed. In other parts of selftest, e.g., [1], sample_freq is used instead. So I decided to use sample_freq = 1,000 since the test can pass as well. [1] https://lore.kernel.org/bpf/[email protected]/ Reported-by: Alexei Starovoitov <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 7015843 ] Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT configs. In this particular case, the base VM is AMD with 166 cpus, and I run selftests with regular qemu on top of that and indeed send_signal test failed. I also tried with an Intel box with 80 cpus and there is no issue. The main qemu command line includes: -enable-kvm -smp 16 -cpu host The failure log looks like: $ ./test_progs -t send_signal [ 48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225] [ 48.503622] Modules linked in: bpf_testmod(O) [ 48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G O 6.9.0-08561-g2c1713a8f1c9-dirty #69 [ 48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 [ 48.511635] RIP: 0010:handle_softirqs+0x71/0x290 [ 48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb [ 48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246 [ 48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0 [ 48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000 [ 48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000 [ 48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280 [ 48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 48.528525] FS: 00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000 [ 48.531600] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0 [ 48.537538] Call Trace: [ 48.537538] <IRQ> [ 48.537538] ? watchdog_timer_fn+0x1cd/0x250 [ 48.539590] ? lockup_detector_update_enable+0x50/0x50 [ 48.539590] ? __hrtimer_run_queues+0xff/0x280 [ 48.542520] ? hrtimer_interrupt+0x103/0x230 [ 48.544524] ? __sysvec_apic_timer_interrupt+0x4f/0x140 [ 48.545522] ? sysvec_apic_timer_interrupt+0x3a/0x90 [ 48.547612] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.547612] ? handle_softirqs+0x71/0x290 [ 48.547612] irq_exit_rcu+0x63/0x80 [ 48.551585] sysvec_apic_timer_interrupt+0x75/0x90 [ 48.552521] </IRQ> [ 48.553529] <TASK> [ 48.553529] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260 [ 48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74 [ 48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282 [ 48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620 [ 48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00 [ 48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000 [ 48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000 [ 48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00 [ 48.575614] ? finish_task_switch.isra.0+0x8d/0x260 [ 48.576523] __schedule+0x364/0xac0 [ 48.577535] schedule+0x2e/0x110 [ 48.578555] pipe_read+0x301/0x400 [ 48.579589] ? destroy_sched_domains_rcu+0x30/0x30 [ 48.579589] vfs_read+0x2b3/0x2f0 [ 48.579589] ksys_read+0x8b/0xc0 [ 48.583590] do_syscall_64+0x3d/0xc0 [ 48.583590] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 48.586525] RIP: 0033:0x7f2f28703fa1 [ 48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 [ 48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1 [ 48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006 [ 48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000 [ 48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0 [ 48.605527] </TASK> In the test, two processes are communicating through pipe. Further debugging with strace found that the above splat is triggered as read() syscall could not receive the data even if the corresponding write() syscall in another process successfully wrote data into the pipe. The failed subtest is "send_signal_perf". The corresponding perf event has sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every overflow event will trigger a call to the BPF program. So I suspect this may overwhelm the system. So I increased the sample_period to 100,000 and the test passed. The sample_period 10,000 still has the test failed. In other parts of selftest, e.g., [1], sample_freq is used instead. So I decided to use sample_freq = 1,000 since the test can pass as well. [1] https://lore.kernel.org/bpf/[email protected]/ Reported-by: Alexei Starovoitov <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 7015843 ] Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT configs. In this particular case, the base VM is AMD with 166 cpus, and I run selftests with regular qemu on top of that and indeed send_signal test failed. I also tried with an Intel box with 80 cpus and there is no issue. The main qemu command line includes: -enable-kvm -smp 16 -cpu host The failure log looks like: $ ./test_progs -t send_signal [ 48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225] [ 48.503622] Modules linked in: bpf_testmod(O) [ 48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G O 6.9.0-08561-g2c1713a8f1c9-dirty #69 [ 48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 [ 48.511635] RIP: 0010:handle_softirqs+0x71/0x290 [ 48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb [ 48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246 [ 48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0 [ 48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000 [ 48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000 [ 48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280 [ 48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 48.528525] FS: 00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000 [ 48.531600] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0 [ 48.537538] Call Trace: [ 48.537538] <IRQ> [ 48.537538] ? watchdog_timer_fn+0x1cd/0x250 [ 48.539590] ? lockup_detector_update_enable+0x50/0x50 [ 48.539590] ? __hrtimer_run_queues+0xff/0x280 [ 48.542520] ? hrtimer_interrupt+0x103/0x230 [ 48.544524] ? __sysvec_apic_timer_interrupt+0x4f/0x140 [ 48.545522] ? sysvec_apic_timer_interrupt+0x3a/0x90 [ 48.547612] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.547612] ? handle_softirqs+0x71/0x290 [ 48.547612] irq_exit_rcu+0x63/0x80 [ 48.551585] sysvec_apic_timer_interrupt+0x75/0x90 [ 48.552521] </IRQ> [ 48.553529] <TASK> [ 48.553529] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260 [ 48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74 [ 48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282 [ 48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620 [ 48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00 [ 48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000 [ 48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000 [ 48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00 [ 48.575614] ? finish_task_switch.isra.0+0x8d/0x260 [ 48.576523] __schedule+0x364/0xac0 [ 48.577535] schedule+0x2e/0x110 [ 48.578555] pipe_read+0x301/0x400 [ 48.579589] ? destroy_sched_domains_rcu+0x30/0x30 [ 48.579589] vfs_read+0x2b3/0x2f0 [ 48.579589] ksys_read+0x8b/0xc0 [ 48.583590] do_syscall_64+0x3d/0xc0 [ 48.583590] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 48.586525] RIP: 0033:0x7f2f28703fa1 [ 48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 [ 48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1 [ 48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006 [ 48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000 [ 48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0 [ 48.605527] </TASK> In the test, two processes are communicating through pipe. Further debugging with strace found that the above splat is triggered as read() syscall could not receive the data even if the corresponding write() syscall in another process successfully wrote data into the pipe. The failed subtest is "send_signal_perf". The corresponding perf event has sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every overflow event will trigger a call to the BPF program. So I suspect this may overwhelm the system. So I increased the sample_period to 100,000 and the test passed. The sample_period 10,000 still has the test failed. In other parts of selftest, e.g., [1], sample_freq is used instead. So I decided to use sample_freq = 1,000 since the test can pass as well. [1] https://lore.kernel.org/bpf/[email protected]/ Reported-by: Alexei Starovoitov <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 7015843 ] Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT configs. In this particular case, the base VM is AMD with 166 cpus, and I run selftests with regular qemu on top of that and indeed send_signal test failed. I also tried with an Intel box with 80 cpus and there is no issue. The main qemu command line includes: -enable-kvm -smp 16 -cpu host The failure log looks like: $ ./test_progs -t send_signal [ 48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225] [ 48.503622] Modules linked in: bpf_testmod(O) [ 48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G O 6.9.0-08561-g2c1713a8f1c9-dirty #69 [ 48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 [ 48.511635] RIP: 0010:handle_softirqs+0x71/0x290 [ 48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb [ 48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246 [ 48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0 [ 48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000 [ 48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000 [ 48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280 [ 48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 48.528525] FS: 00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000 [ 48.531600] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0 [ 48.537538] Call Trace: [ 48.537538] <IRQ> [ 48.537538] ? watchdog_timer_fn+0x1cd/0x250 [ 48.539590] ? lockup_detector_update_enable+0x50/0x50 [ 48.539590] ? __hrtimer_run_queues+0xff/0x280 [ 48.542520] ? hrtimer_interrupt+0x103/0x230 [ 48.544524] ? __sysvec_apic_timer_interrupt+0x4f/0x140 [ 48.545522] ? sysvec_apic_timer_interrupt+0x3a/0x90 [ 48.547612] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.547612] ? handle_softirqs+0x71/0x290 [ 48.547612] irq_exit_rcu+0x63/0x80 [ 48.551585] sysvec_apic_timer_interrupt+0x75/0x90 [ 48.552521] </IRQ> [ 48.553529] <TASK> [ 48.553529] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260 [ 48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74 [ 48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282 [ 48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620 [ 48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00 [ 48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000 [ 48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000 [ 48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00 [ 48.575614] ? finish_task_switch.isra.0+0x8d/0x260 [ 48.576523] __schedule+0x364/0xac0 [ 48.577535] schedule+0x2e/0x110 [ 48.578555] pipe_read+0x301/0x400 [ 48.579589] ? destroy_sched_domains_rcu+0x30/0x30 [ 48.579589] vfs_read+0x2b3/0x2f0 [ 48.579589] ksys_read+0x8b/0xc0 [ 48.583590] do_syscall_64+0x3d/0xc0 [ 48.583590] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 48.586525] RIP: 0033:0x7f2f28703fa1 [ 48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 [ 48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1 [ 48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006 [ 48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000 [ 48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0 [ 48.605527] </TASK> In the test, two processes are communicating through pipe. Further debugging with strace found that the above splat is triggered as read() syscall could not receive the data even if the corresponding write() syscall in another process successfully wrote data into the pipe. The failed subtest is "send_signal_perf". The corresponding perf event has sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every overflow event will trigger a call to the BPF program. So I suspect this may overwhelm the system. So I increased the sample_period to 100,000 and the test passed. The sample_period 10,000 still has the test failed. In other parts of selftest, e.g., [1], sample_freq is used instead. So I decided to use sample_freq = 1,000 since the test can pass as well. [1] https://lore.kernel.org/bpf/[email protected]/ Reported-by: Alexei Starovoitov <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
commit 409dc51 upstream. The flexspi on imx8ulp only has 16 LUTs, and imx8mm flexspi has 32 LUTs, so correct the compatible string here, otherwise will meet below error: [ 1.119072] ------------[ cut here ]------------ [ 1.123926] WARNING: CPU: 0 PID: 1 at drivers/spi/spi-nxp-fspi.c:855 nxp_fspi_exec_op+0xb04/0xb64 [ 1.133239] Modules linked in: [ 1.136448] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc6-next-20240902-00001-g131bf9439dd9 #69 [ 1.146821] Hardware name: NXP i.MX8ULP EVK (DT) [ 1.151647] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 1.158931] pc : nxp_fspi_exec_op+0xb04/0xb64 [ 1.163496] lr : nxp_fspi_exec_op+0xa34/0xb64 [ 1.168060] sp : ffff80008002b2a0 [ 1.171526] x29: ffff80008002b2d0 x28: 0000000000000000 x27: 0000000000000000 [ 1.179002] x26: ffff2eb645542580 x25: ffff800080610014 x24: ffff800080610000 [ 1.186480] x23: ffff2eb645548080 x22: 0000000000000006 x21: ffff2eb6455425e0 [ 1.193956] x20: 0000000000000000 x19: ffff80008002b5e0 x18: ffffffffffffffff [ 1.201432] x17: ffff2eb644467508 x16: 0000000000000138 x15: 0000000000000002 [ 1.208907] x14: 0000000000000000 x13: ffff2eb6400d8080 x12: 00000000ffffff00 [ 1.216378] x11: 0000000000000000 x10: ffff2eb6400d8080 x9 : ffff2eb697adca80 [ 1.223850] x8 : ffff2eb697ad3cc0 x7 : 0000000100000000 x6 : 0000000000000001 [ 1.231324] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000007a6 [ 1.238795] x2 : 0000000000000000 x1 : 00000000000001ce x0 : 00000000ffffff92 [ 1.246267] Call trace: [ 1.248824] nxp_fspi_exec_op+0xb04/0xb64 [ 1.253031] spi_mem_exec_op+0x3a0/0x430 [ 1.257139] spi_nor_read_id+0x80/0xcc [ 1.261065] spi_nor_scan+0x1ec/0xf10 [ 1.264901] spi_nor_probe+0x108/0x2fc [ 1.268828] spi_mem_probe+0x6c/0xbc [ 1.272574] spi_probe+0x84/0xe4 [ 1.275958] really_probe+0xbc/0x29c [ 1.279713] __driver_probe_device+0x78/0x12c [ 1.284277] driver_probe_device+0xd8/0x15c [ 1.288660] __device_attach_driver+0xb8/0x134 [ 1.293316] bus_for_each_drv+0x88/0xe8 [ 1.297337] __device_attach+0xa0/0x190 [ 1.301353] device_initial_probe+0x14/0x20 [ 1.305734] bus_probe_device+0xac/0xb0 [ 1.309752] device_add+0x5d0/0x790 [ 1.313408] __spi_add_device+0x134/0x204 [ 1.317606] of_register_spi_device+0x3b4/0x590 [ 1.322348] spi_register_controller+0x47c/0x754 [ 1.327181] devm_spi_register_controller+0x4c/0xa4 [ 1.332289] nxp_fspi_probe+0x1cc/0x2b0 [ 1.336307] platform_probe+0x68/0xc4 [ 1.340145] really_probe+0xbc/0x29c [ 1.343893] __driver_probe_device+0x78/0x12c [ 1.348457] driver_probe_device+0xd8/0x15c [ 1.352838] __driver_attach+0x90/0x19c [ 1.356857] bus_for_each_dev+0x7c/0xdc [ 1.360877] driver_attach+0x24/0x30 [ 1.364624] bus_add_driver+0xe4/0x208 [ 1.368552] driver_register+0x5c/0x124 [ 1.372573] __platform_driver_register+0x28/0x34 [ 1.377497] nxp_fspi_driver_init+0x1c/0x28 [ 1.381888] do_one_initcall+0x80/0x1c8 [ 1.385908] kernel_init_freeable+0x1c4/0x28c [ 1.390472] kernel_init+0x20/0x1d8 [ 1.394138] ret_from_fork+0x10/0x20 [ 1.397885] ---[ end trace 0000000000000000 ]--- [ 1.407908] ------------[ cut here ]------------ Fixes: ef89fd5 ("arm64: dts: imx8ulp: add flexspi node") Cc: [email protected] Signed-off-by: Haibo Chen <[email protected]> Signed-off-by: Shawn Guo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 409dc51 upstream. The flexspi on imx8ulp only has 16 LUTs, and imx8mm flexspi has 32 LUTs, so correct the compatible string here, otherwise will meet below error: [ 1.119072] ------------[ cut here ]------------ [ 1.123926] WARNING: CPU: 0 PID: 1 at drivers/spi/spi-nxp-fspi.c:855 nxp_fspi_exec_op+0xb04/0xb64 [ 1.133239] Modules linked in: [ 1.136448] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc6-next-20240902-00001-g131bf9439dd9 #69 [ 1.146821] Hardware name: NXP i.MX8ULP EVK (DT) [ 1.151647] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 1.158931] pc : nxp_fspi_exec_op+0xb04/0xb64 [ 1.163496] lr : nxp_fspi_exec_op+0xa34/0xb64 [ 1.168060] sp : ffff80008002b2a0 [ 1.171526] x29: ffff80008002b2d0 x28: 0000000000000000 x27: 0000000000000000 [ 1.179002] x26: ffff2eb645542580 x25: ffff800080610014 x24: ffff800080610000 [ 1.186480] x23: ffff2eb645548080 x22: 0000000000000006 x21: ffff2eb6455425e0 [ 1.193956] x20: 0000000000000000 x19: ffff80008002b5e0 x18: ffffffffffffffff [ 1.201432] x17: ffff2eb644467508 x16: 0000000000000138 x15: 0000000000000002 [ 1.208907] x14: 0000000000000000 x13: ffff2eb6400d8080 x12: 00000000ffffff00 [ 1.216378] x11: 0000000000000000 x10: ffff2eb6400d8080 x9 : ffff2eb697adca80 [ 1.223850] x8 : ffff2eb697ad3cc0 x7 : 0000000100000000 x6 : 0000000000000001 [ 1.231324] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000007a6 [ 1.238795] x2 : 0000000000000000 x1 : 00000000000001ce x0 : 00000000ffffff92 [ 1.246267] Call trace: [ 1.248824] nxp_fspi_exec_op+0xb04/0xb64 [ 1.253031] spi_mem_exec_op+0x3a0/0x430 [ 1.257139] spi_nor_read_id+0x80/0xcc [ 1.261065] spi_nor_scan+0x1ec/0xf10 [ 1.264901] spi_nor_probe+0x108/0x2fc [ 1.268828] spi_mem_probe+0x6c/0xbc [ 1.272574] spi_probe+0x84/0xe4 [ 1.275958] really_probe+0xbc/0x29c [ 1.279713] __driver_probe_device+0x78/0x12c [ 1.284277] driver_probe_device+0xd8/0x15c [ 1.288660] __device_attach_driver+0xb8/0x134 [ 1.293316] bus_for_each_drv+0x88/0xe8 [ 1.297337] __device_attach+0xa0/0x190 [ 1.301353] device_initial_probe+0x14/0x20 [ 1.305734] bus_probe_device+0xac/0xb0 [ 1.309752] device_add+0x5d0/0x790 [ 1.313408] __spi_add_device+0x134/0x204 [ 1.317606] of_register_spi_device+0x3b4/0x590 [ 1.322348] spi_register_controller+0x47c/0x754 [ 1.327181] devm_spi_register_controller+0x4c/0xa4 [ 1.332289] nxp_fspi_probe+0x1cc/0x2b0 [ 1.336307] platform_probe+0x68/0xc4 [ 1.340145] really_probe+0xbc/0x29c [ 1.343893] __driver_probe_device+0x78/0x12c [ 1.348457] driver_probe_device+0xd8/0x15c [ 1.352838] __driver_attach+0x90/0x19c [ 1.356857] bus_for_each_dev+0x7c/0xdc [ 1.360877] driver_attach+0x24/0x30 [ 1.364624] bus_add_driver+0xe4/0x208 [ 1.368552] driver_register+0x5c/0x124 [ 1.372573] __platform_driver_register+0x28/0x34 [ 1.377497] nxp_fspi_driver_init+0x1c/0x28 [ 1.381888] do_one_initcall+0x80/0x1c8 [ 1.385908] kernel_init_freeable+0x1c4/0x28c [ 1.390472] kernel_init+0x20/0x1d8 [ 1.394138] ret_from_fork+0x10/0x20 [ 1.397885] ---[ end trace 0000000000000000 ]--- [ 1.407908] ------------[ cut here ]------------ Fixes: ef89fd5 ("arm64: dts: imx8ulp: add flexspi node") Cc: [email protected] Signed-off-by: Haibo Chen <[email protected]> Signed-off-by: Shawn Guo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
The flexspi on imx8ulp only has 16 LUTs, and imx8mm flexspi has 32 LUTs, so correct the compatible string here, otherwise will meet below error: [ 1.119072] ------------[ cut here ]------------ [ 1.123926] WARNING: CPU: 0 PID: 1 at drivers/spi/spi-nxp-fspi.c:855 nxp_fspi_exec_op+0xb04/0xb64 [ 1.133239] Modules linked in: [ 1.136448] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc6-next-20240902-00001-g131bf9439dd9 #69 [ 1.146821] Hardware name: NXP i.MX8ULP EVK (DT) [ 1.151647] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 1.158931] pc : nxp_fspi_exec_op+0xb04/0xb64 [ 1.163496] lr : nxp_fspi_exec_op+0xa34/0xb64 [ 1.168060] sp : ffff80008002b2a0 [ 1.171526] x29: ffff80008002b2d0 x28: 0000000000000000 x27: 0000000000000000 [ 1.179002] x26: ffff2eb645542580 x25: ffff800080610014 x24: ffff800080610000 [ 1.186480] x23: ffff2eb645548080 x22: 0000000000000006 x21: ffff2eb6455425e0 [ 1.193956] x20: 0000000000000000 x19: ffff80008002b5e0 x18: ffffffffffffffff [ 1.201432] x17: ffff2eb644467508 x16: 0000000000000138 x15: 0000000000000002 [ 1.208907] x14: 0000000000000000 x13: ffff2eb6400d8080 x12: 00000000ffffff00 [ 1.216378] x11: 0000000000000000 x10: ffff2eb6400d8080 x9 : ffff2eb697adca80 [ 1.223850] x8 : ffff2eb697ad3cc0 x7 : 0000000100000000 x6 : 0000000000000001 [ 1.231324] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000007a6 [ 1.238795] x2 : 0000000000000000 x1 : 00000000000001ce x0 : 00000000ffffff92 [ 1.246267] Call trace: [ 1.248824] nxp_fspi_exec_op+0xb04/0xb64 [ 1.253031] spi_mem_exec_op+0x3a0/0x430 [ 1.257139] spi_nor_read_id+0x80/0xcc [ 1.261065] spi_nor_scan+0x1ec/0xf10 [ 1.264901] spi_nor_probe+0x108/0x2fc [ 1.268828] spi_mem_probe+0x6c/0xbc [ 1.272574] spi_probe+0x84/0xe4 [ 1.275958] really_probe+0xbc/0x29c [ 1.279713] __driver_probe_device+0x78/0x12c [ 1.284277] driver_probe_device+0xd8/0x15c [ 1.288660] __device_attach_driver+0xb8/0x134 [ 1.293316] bus_for_each_drv+0x88/0xe8 [ 1.297337] __device_attach+0xa0/0x190 [ 1.301353] device_initial_probe+0x14/0x20 [ 1.305734] bus_probe_device+0xac/0xb0 [ 1.309752] device_add+0x5d0/0x790 [ 1.313408] __spi_add_device+0x134/0x204 [ 1.317606] of_register_spi_device+0x3b4/0x590 [ 1.322348] spi_register_controller+0x47c/0x754 [ 1.327181] devm_spi_register_controller+0x4c/0xa4 [ 1.332289] nxp_fspi_probe+0x1cc/0x2b0 [ 1.336307] platform_probe+0x68/0xc4 [ 1.340145] really_probe+0xbc/0x29c [ 1.343893] __driver_probe_device+0x78/0x12c [ 1.348457] driver_probe_device+0xd8/0x15c [ 1.352838] __driver_attach+0x90/0x19c [ 1.356857] bus_for_each_dev+0x7c/0xdc [ 1.360877] driver_attach+0x24/0x30 [ 1.364624] bus_add_driver+0xe4/0x208 [ 1.368552] driver_register+0x5c/0x124 [ 1.372573] __platform_driver_register+0x28/0x34 [ 1.377497] nxp_fspi_driver_init+0x1c/0x28 [ 1.381888] do_one_initcall+0x80/0x1c8 [ 1.385908] kernel_init_freeable+0x1c4/0x28c [ 1.390472] kernel_init+0x20/0x1d8 [ 1.394138] ret_from_fork+0x10/0x20 [ 1.397885] ---[ end trace 0000000000000000 ]--- [ 1.407908] ------------[ cut here ]------------ Fixes: ef89fd5 ("arm64: dts: imx8ulp: add flexspi node") Cc: [email protected] Signed-off-by: Haibo Chen <[email protected]> Signed-off-by: Shawn Guo <[email protected]>
Bug can be reproduced by building the AUR package
blender-git
with the following patch applied to its PKGBUILD (to make it build in Debug mode):https://gist.github.com/Stephen-Seo/88b7453a47e38f7cecf26dfde43ecf58
The following appears in the build log at error:
https://gist.github.com/Stephen-Seo/897d12b778ab44c304ae535a23635871
The bug can be reproduced without rebuilding by going into "$srcdir/build/bin/" and running the "datatoc" binary with no arguments within that directory. The source of this file is located at "$srcdir/blender/source/blender/datatoc/datattoc.c".
The
datatoc
binary builds and runs fine on stock "linux" kernel, but not on "linux-hardened", even if the binary was built when booted into the stock "linux" kernel.EDIT: The relevant entry in "compile_commands.json" generated by cmake is:
(one may have to change some directory paths to get this to work. Also, this program appears to have no dependencies, so one should be able to compile it directly into an executable instead of an object file.)
The text was updated successfully, but these errors were encountered: