Filtered by vendor Linux
Subscriptions
Total
14339 CVE
CVE | Vendors | Products | Updated | CVSS v3.1 |
---|---|---|---|---|
CVE-2023-53669 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 7.0 High |
In the Linux kernel, the following vulnerability has been resolved: tcp: fix skb_copy_ubufs() vs BIG TCP David Ahern reported crashes in skb_copy_ubufs() caused by TCP tx zerocopy using hugepages, and skb length bigger than ~68 KB. skb_copy_ubufs() assumed it could copy all payload using up to MAX_SKB_FRAGS order-0 pages. This assumption broke when BIG TCP was able to put up to 512 KB per skb. We did not hit this bug at Google because we use CONFIG_MAX_SKB_FRAGS=45 and limit gso_max_size to 180000. A solution is to use higher order pages if needed. v2: add missing __GFP_COMP, or we leak memory. | ||||
CVE-2023-53671 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 5.5 Medium |
In the Linux kernel, the following vulnerability has been resolved: srcu: Delegate work to the boot cpu if using SRCU_SIZE_SMALL Commit 994f706872e6 ("srcu: Make Tree SRCU able to operate without snp_node array") assumes that cpu 0 is always online. However, there really are situations when some other CPU is the boot CPU, for example, when booting a kdump kernel with the maxcpus=1 boot parameter. On PowerPC, the kdump kernel can hang as follows: ... [ 1.740036] systemd[1]: Hostname set to <xyz.com> [ 243.686240] INFO: task systemd:1 blocked for more than 122 seconds. [ 243.686264] Not tainted 6.1.0-rc1 #1 [ 243.686272] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 243.686281] task:systemd state:D stack:0 pid:1 ppid:0 flags:0x00042000 [ 243.686296] Call Trace: [ 243.686301] [c000000016657640] [c000000016657670] 0xc000000016657670 (unreliable) [ 243.686317] [c000000016657830] [c00000001001dec0] __switch_to+0x130/0x220 [ 243.686333] [c000000016657890] [c000000010f607b8] __schedule+0x1f8/0x580 [ 243.686347] [c000000016657940] [c000000010f60bb4] schedule+0x74/0x140 [ 243.686361] [c0000000166579b0] [c000000010f699b8] schedule_timeout+0x168/0x1c0 [ 243.686374] [c000000016657a80] [c000000010f61de8] __wait_for_common+0x148/0x360 [ 243.686387] [c000000016657b20] [c000000010176bb0] __flush_work.isra.0+0x1c0/0x3d0 [ 243.686401] [c000000016657bb0] [c0000000105f2768] fsnotify_wait_marks_destroyed+0x28/0x40 [ 243.686415] [c000000016657bd0] [c0000000105f21b8] fsnotify_destroy_group+0x68/0x160 [ 243.686428] [c000000016657c40] [c0000000105f6500] inotify_release+0x30/0xa0 [ 243.686440] [c000000016657cb0] [c0000000105751a8] __fput+0xc8/0x350 [ 243.686452] [c000000016657d00] [c00000001017d524] task_work_run+0xe4/0x170 [ 243.686464] [c000000016657d50] [c000000010020e94] do_notify_resume+0x134/0x140 [ 243.686478] [c000000016657d80] [c00000001002eb18] interrupt_exit_user_prepare_main+0x198/0x270 [ 243.686493] [c000000016657de0] [c00000001002ec60] syscall_exit_prepare+0x70/0x180 [ 243.686505] [c000000016657e10] [c00000001000bf7c] system_call_vectored_common+0xfc/0x280 [ 243.686520] --- interrupt: 3000 at 0x7fffa47d5ba4 [ 243.686528] NIP: 00007fffa47d5ba4 LR: 0000000000000000 CTR: 0000000000000000 [ 243.686538] REGS: c000000016657e80 TRAP: 3000 Not tainted (6.1.0-rc1) [ 243.686548] MSR: 800000000000d033 <SF,EE,PR,ME,IR,DR,RI,LE> CR: 42044440 XER: 00000000 [ 243.686572] IRQMASK: 0 [ 243.686572] GPR00: 0000000000000006 00007ffffa606710 00007fffa48e7200 0000000000000000 [ 243.686572] GPR04: 0000000000000002 000000000000000a 0000000000000000 0000000000000001 [ 243.686572] GPR08: 000001000c172dd0 0000000000000000 0000000000000000 0000000000000000 [ 243.686572] GPR12: 0000000000000000 00007fffa4ff4bc0 0000000000000000 0000000000000000 [ 243.686572] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 243.686572] GPR20: 0000000132dfdc50 000000000000000e 0000000000189375 0000000000000000 [ 243.686572] GPR24: 00007ffffa606ae0 0000000000000005 000001000c185490 000001000c172570 [ 243.686572] GPR28: 000001000c172990 000001000c184850 000001000c172e00 00007fffa4fedd98 [ 243.686683] NIP [00007fffa47d5ba4] 0x7fffa47d5ba4 [ 243.686691] LR [0000000000000000] 0x0 [ 243.686698] --- interrupt: 3000 [ 243.686708] INFO: task kworker/u16:1:24 blocked for more than 122 seconds. [ 243.686717] Not tainted 6.1.0-rc1 #1 [ 243.686724] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 243.686733] task:kworker/u16:1 state:D stack:0 pid:24 ppid:2 flags:0x00000800 [ 243.686747] Workqueue: events_unbound fsnotify_mark_destroy_workfn [ 243.686758] Call Trace: [ 243.686762] [c0000000166736e0] [c00000004fd91000] 0xc00000004fd91000 (unreliable) [ 243.686775] [c0000000166738d0] [c00000001001dec0] __switch_to+0x130/0x220 [ 243.686788] [c000000016673930] [c000000010f607b8] __schedule+0x1f8/0x ---truncated--- | ||||
CVE-2023-53672 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 5.5 Medium |
In the Linux kernel, the following vulnerability has been resolved: btrfs: output extra debug info if we failed to find an inline backref [BUG] Syzbot reported several warning triggered inside lookup_inline_extent_backref(). [CAUSE] As usual, the reproducer doesn't reliably trigger locally here, but at least we know the WARN_ON() is triggered when an inline backref can not be found, and it can only be triggered when @insert is true. (I.e. inserting a new inline backref, which means the backref should already exist) [ENHANCEMENT] After the WARN_ON(), dump all the parameters and the extent tree leaf to help debug. | ||||
CVE-2023-53674 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 5.5 Medium |
In the Linux kernel, the following vulnerability has been resolved: clk: Fix memory leak in devm_clk_notifier_register() devm_clk_notifier_register() allocates a devres resource for clk notifier but didn't register that to the device, so the notifier didn't get unregistered on device detach and the allocated resource was leaked. Fix the issue by registering the resource through devres_add(). This issue was found with kmemleak on a Chromebook. | ||||
CVE-2023-53675 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 7.0 High |
In the Linux kernel, the following vulnerability has been resolved: scsi: ses: Fix possible desc_ptr out-of-bounds accesses Sanitize possible desc_ptr out-of-bounds accesses in ses_enclosure_data_process(). | ||||
CVE-2023-53676 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 7.0 High |
In the Linux kernel, the following vulnerability has been resolved: scsi: target: iscsi: Fix buffer overflow in lio_target_nacl_info_show() The function lio_target_nacl_info_show() uses sprintf() in a loop to print details for every iSCSI connection in a session without checking for the buffer length. With enough iSCSI connections it's possible to overflow the buffer provided by configfs and corrupt the memory. This patch replaces sprintf() with sysfs_emit_at() that checks for buffer boundries. | ||||
CVE-2023-53677 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 5.5 Medium |
In the Linux kernel, the following vulnerability has been resolved: drm/i915: Fix memory leaks in i915 selftests This patch fixes memory leaks on error escapes in function fake_get_pages (cherry picked from commit 8bfbdadce85c4c51689da10f39c805a7106d4567) | ||||
CVE-2023-53680 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 7.0 High |
In the Linux kernel, the following vulnerability has been resolved: NFSD: Avoid calling OPDESC() with ops->opnum == OP_ILLEGAL OPDESC() simply indexes into nfsd4_ops[] by the op's operation number, without range checking that value. It assumes callers are careful to avoid calling it with an out-of-bounds opnum value. nfsd4_decode_compound() is not so careful, and can invoke OPDESC() with opnum set to OP_ILLEGAL, which is 10044 -- well beyond the end of nfsd4_ops[]. | ||||
CVE-2023-53681 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 5.5 Medium |
In the Linux kernel, the following vulnerability has been resolved: bcache: Fix __bch_btree_node_alloc to make the failure behavior consistent In some specific situations, the return value of __bch_btree_node_alloc may be NULL. This may lead to a potential NULL pointer dereference in caller function like a calling chain : btree_split->bch_btree_node_alloc->__bch_btree_node_alloc. Fix it by initializing the return value in __bch_btree_node_alloc. | ||||
CVE-2023-53682 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 5.5 Medium |
In the Linux kernel, the following vulnerability has been resolved: hwmon: (xgene) Fix ioremap and memremap leak Smatch reports: drivers/hwmon/xgene-hwmon.c:757 xgene_hwmon_probe() warn: 'ctx->pcc_comm_addr' from ioremap() not released on line: 757. This is because in drivers/hwmon/xgene-hwmon.c:701 xgene_hwmon_probe(), ioremap and memremap is not released, which may cause a leak. To fix this, ioremap and memremap is modified to devm_ioremap and devm_memremap. [groeck: Fixed formatting and subject] | ||||
CVE-2023-53683 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 5.5 Medium |
In the Linux kernel, the following vulnerability has been resolved: fs: hfsplus: remove WARN_ON() from hfsplus_cat_{read,write}_inode() syzbot is hitting WARN_ON() in hfsplus_cat_{read,write}_inode(), for crafted filesystem image can contain bogus length. There conditions are not kernel bugs that can justify kernel to panic. | ||||
CVE-2023-53684 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 7.0 High |
In the Linux kernel, the following vulnerability has been resolved: xfrm: Zero padding when dumping algos and encap When copying data to user-space we should ensure that only valid data is copied over. Padding in structures may be filled with random (possibly sensitve) data and should never be given directly to user-space. This patch fixes the copying of xfrm algorithms and the encap template in xfrm_user so that padding is zeroed. | ||||
CVE-2023-53687 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 5.5 Medium |
In the Linux kernel, the following vulnerability has been resolved: tty: serial: samsung_tty: Fix a memory leak in s3c24xx_serial_getclk() when iterating clk When the best clk is searched, we iterate over all possible clk. If we find a better match, the previous one, if any, needs to be freed. If a better match has already been found, we still need to free the new one, otherwise it leaks. | ||||
CVE-2023-53648 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 7.0 High |
In the Linux kernel, the following vulnerability has been resolved: ALSA: ac97: Fix possible NULL dereference in snd_ac97_mixer smatch error: sound/pci/ac97/ac97_codec.c:2354 snd_ac97_mixer() error: we previously assumed 'rac97' could be null (see line 2072) remove redundant assignment, return error if rac97 is NULL. | ||||
CVE-2023-53644 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 5.5 Medium |
In the Linux kernel, the following vulnerability has been resolved: media: radio-shark: Add endpoint checks The syzbot fuzzer was able to provoke a WARNING from the radio-shark2 driver: ------------[ cut here ]------------ usb 1-1: BOGUS urb xfer, pipe 1 != type 3 WARNING: CPU: 0 PID: 3271 at drivers/usb/core/urb.c:504 usb_submit_urb+0xed2/0x1880 drivers/usb/core/urb.c:504 Modules linked in: CPU: 0 PID: 3271 Comm: kworker/0:3 Not tainted 6.1.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 Workqueue: usb_hub_wq hub_event RIP: 0010:usb_submit_urb+0xed2/0x1880 drivers/usb/core/urb.c:504 Code: 7c 24 18 e8 00 36 ea fb 48 8b 7c 24 18 e8 36 1c 02 ff 41 89 d8 44 89 e1 4c 89 ea 48 89 c6 48 c7 c7 a0 b6 90 8a e8 9a 29 b8 03 <0f> 0b e9 58 f8 ff ff e8 d2 35 ea fb 48 81 c5 c0 05 00 00 e9 84 f7 RSP: 0018:ffffc90003876dd0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000 RDX: ffff8880750b0040 RSI: ffffffff816152b8 RDI: fffff5200070edac RBP: ffff8880172d81e0 R08: 0000000000000005 R09: 0000000000000000 R10: 0000000080000000 R11: 0000000000000000 R12: 0000000000000001 R13: ffff8880285c5040 R14: 0000000000000002 R15: ffff888017158200 FS: 0000000000000000(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffe03235b90 CR3: 000000000bc8e000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> usb_start_wait_urb+0x101/0x4b0 drivers/usb/core/message.c:58 usb_bulk_msg+0x226/0x550 drivers/usb/core/message.c:387 shark_write_reg+0x1ff/0x2e0 drivers/media/radio/radio-shark2.c:88 ... The problem was caused by the fact that the driver does not check whether the endpoints it uses are actually present and have the appropriate types. This can be fixed by adding a simple check of these endpoints (and similarly for the radio-shark driver). | ||||
CVE-2023-53663 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 7.0 High |
In the Linux kernel, the following vulnerability has been resolved: KVM: nSVM: Check instead of asserting on nested TSC scaling support Check for nested TSC scaling support on nested SVM VMRUN instead of asserting that TSC scaling is exposed to L1 if L1's MSR_AMD64_TSC_RATIO has diverged from KVM's default. Userspace can trigger the WARN at will by writing the MSR and then updating guest CPUID to hide the feature (modifying guest CPUID is allowed anytime before KVM_RUN). E.g. hacking KVM's state_test selftest to do vcpu_set_msr(vcpu, MSR_AMD64_TSC_RATIO, 0); vcpu_clear_cpuid_feature(vcpu, X86_FEATURE_TSCRATEMSR); after restoring state in a new VM+vCPU yields an endless supply of: ------------[ cut here ]------------ WARNING: CPU: 164 PID: 62565 at arch/x86/kvm/svm/nested.c:699 nested_vmcb02_prepare_control+0x3d6/0x3f0 [kvm_amd] Call Trace: <TASK> enter_svm_guest_mode+0x114/0x560 [kvm_amd] nested_svm_vmrun+0x260/0x330 [kvm_amd] vmrun_interception+0x29/0x30 [kvm_amd] svm_invoke_exit_handler+0x35/0x100 [kvm_amd] svm_handle_exit+0xe7/0x180 [kvm_amd] kvm_arch_vcpu_ioctl_run+0x1eab/0x2570 [kvm] kvm_vcpu_ioctl+0x4c9/0x5b0 [kvm] __se_sys_ioctl+0x7a/0xc0 __x64_sys_ioctl+0x21/0x30 do_syscall_64+0x41/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x45ca1b Note, the nested #VMEXIT path has the same flaw, but needs a different fix and will be handled separately. | ||||
CVE-2023-53679 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 7.0 High |
In the Linux kernel, the following vulnerability has been resolved: wifi: mt7601u: fix an integer underflow Fix an integer underflow that leads to a null pointer dereference in 'mt7601u_rx_skb_from_seg()'. The variable 'dma_len' in the URB packet could be manipulated, which could trigger an integer underflow of 'seg_len' in 'mt7601u_rx_process_seg()'. This underflow subsequently causes the 'bad_frame' checks in 'mt7601u_rx_skb_from_seg()' to be bypassed, eventually leading to a dereference of the pointer 'p', which is a null pointer. Ensure that 'dma_len' is greater than 'min_seg_len'. Found by a modified version of syzkaller. KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 0 PID: 12 Comm: ksoftirqd/0 Tainted: G W O 5.14.0+ #139 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 RIP: 0010:skb_add_rx_frag+0x143/0x370 Code: e2 07 83 c2 03 38 ca 7c 08 84 c9 0f 85 86 01 00 00 4c 8d 7d 08 44 89 68 08 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 cd 01 00 00 48 8b 45 08 a8 01 0f 85 3d 01 00 00 RSP: 0018:ffffc900000cfc90 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff888115520dc0 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffff8881118430c0 RDI: ffff8881118430f8 RBP: 0000000000000000 R08: 0000000000000e09 R09: 0000000000000010 R10: ffff888111843017 R11: ffffed1022308602 R12: 0000000000000000 R13: 0000000000000e09 R14: 0000000000000010 R15: 0000000000000008 FS: 0000000000000000(0000) GS:ffff88811a800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000004035af40 CR3: 00000001157f2000 CR4: 0000000000750ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: mt7601u_rx_tasklet+0xc73/0x1270 ? mt7601u_submit_rx_buf.isra.0+0x510/0x510 ? tasklet_action_common.isra.0+0x79/0x2f0 tasklet_action_common.isra.0+0x206/0x2f0 __do_softirq+0x1b5/0x880 ? tasklet_unlock+0x30/0x30 run_ksoftirqd+0x26/0x50 smpboot_thread_fn+0x34f/0x7d0 ? smpboot_register_percpu_thread+0x370/0x370 kthread+0x3a1/0x480 ? set_kthread_struct+0x120/0x120 ret_from_fork+0x1f/0x30 Modules linked in: 88XXau(O) 88x2bu(O) ---[ end trace 57f34f93b4da0f9b ]--- RIP: 0010:skb_add_rx_frag+0x143/0x370 Code: e2 07 83 c2 03 38 ca 7c 08 84 c9 0f 85 86 01 00 00 4c 8d 7d 08 44 89 68 08 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 cd 01 00 00 48 8b 45 08 a8 01 0f 85 3d 01 00 00 RSP: 0018:ffffc900000cfc90 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff888115520dc0 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffff8881118430c0 RDI: ffff8881118430f8 RBP: 0000000000000000 R08: 0000000000000e09 R09: 0000000000000010 R10: ffff888111843017 R11: ffffed1022308602 R12: 0000000000000000 R13: 0000000000000e09 R14: 0000000000000010 R15: 0000000000000008 FS: 0000000000000000(0000) GS:ffff88811a800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000004035af40 CR3: 00000001157f2000 CR4: 0000000000750ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 | ||||
CVE-2023-53641 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 7.0 High |
In the Linux kernel, the following vulnerability has been resolved: wifi: ath9k: hif_usb: fix memory leak of remain_skbs hif_dev->remain_skb is allocated and used exclusively in ath9k_hif_usb_rx_stream(). It is implied that an allocated remain_skb is processed and subsequently freed (in error paths) only during the next call of ath9k_hif_usb_rx_stream(). So, if the urbs are deallocated between those two calls due to the device deinitialization or suspend, it is possible that ath9k_hif_usb_rx_stream() is not called next time and the allocated remain_skb is leaked. Our local Syzkaller instance was able to trigger that. remain_skb makes sense when receiving two consecutive urbs which are logically linked together, i.e. a specific data field from the first skb indicates a cached skb to be allocated, memcpy'd with some data and subsequently processed in the next call to ath9k_hif_usb_rx_stream(). Urbs deallocation supposedly makes that link irrelevant so we need to free the cached skb in those cases. Fix the leak by introducing a function to explicitly free remain_skb (if it is not NULL) when the rx urbs have been deallocated. remain_skb is NULL when it has not been allocated at all (hif_dev struct is kzalloced) or when it has been processed in next call to ath9k_hif_usb_rx_stream(). Found by Linux Verification Center (linuxtesting.org) with Syzkaller. | ||||
CVE-2023-53660 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 5.5 Medium |
In the Linux kernel, the following vulnerability has been resolved: bpf, cpumap: Handle skb as well when clean up ptr_ring The following warning was reported when running xdp_redirect_cpu with both skb-mode and stress-mode enabled: ------------[ cut here ]------------ Incorrect XDP memory type (-2128176192) usage WARNING: CPU: 7 PID: 1442 at net/core/xdp.c:405 Modules linked in: CPU: 7 PID: 1442 Comm: kworker/7:0 Tainted: G 6.5.0-rc2+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Workqueue: events __cpu_map_entry_free RIP: 0010:__xdp_return+0x1e4/0x4a0 ...... Call Trace: <TASK> ? show_regs+0x65/0x70 ? __warn+0xa5/0x240 ? __xdp_return+0x1e4/0x4a0 ...... xdp_return_frame+0x4d/0x150 __cpu_map_entry_free+0xf9/0x230 process_one_work+0x6b0/0xb80 worker_thread+0x96/0x720 kthread+0x1a5/0x1f0 ret_from_fork+0x3a/0x70 ret_from_fork_asm+0x1b/0x30 </TASK> The reason for the warning is twofold. One is due to the kthread cpu_map_kthread_run() is stopped prematurely. Another one is __cpu_map_ring_cleanup() doesn't handle skb mode and treats skbs in ptr_ring as XDP frames. Prematurely-stopped kthread will be fixed by the preceding patch and ptr_ring will be empty when __cpu_map_ring_cleanup() is called. But as the comments in __cpu_map_ring_cleanup() said, handling and freeing skbs in ptr_ring as well to "catch any broken behaviour gracefully". | ||||
CVE-2023-53668 | 1 Linux | 1 Linux Kernel | 2025-10-08 | 7.0 High |
In the Linux kernel, the following vulnerability has been resolved: ring-buffer: Fix deadloop issue on reading trace_pipe Soft lockup occurs when reading file 'trace_pipe': watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [cat:4488] [...] RIP: 0010:ring_buffer_empty_cpu+0xed/0x170 RSP: 0018:ffff88810dd6fc48 EFLAGS: 00000246 RAX: 0000000000000000 RBX: 0000000000000246 RCX: ffffffff93d1aaeb RDX: ffff88810a280040 RSI: 0000000000000008 RDI: ffff88811164b218 RBP: ffff88811164b218 R08: 0000000000000000 R09: ffff88815156600f R10: ffffed102a2acc01 R11: 0000000000000001 R12: 0000000051651901 R13: 0000000000000000 R14: ffff888115e49500 R15: 0000000000000000 [...] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8d853c2000 CR3: 000000010dcd8000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __find_next_entry+0x1a8/0x4b0 ? peek_next_entry+0x250/0x250 ? down_write+0xa5/0x120 ? down_write_killable+0x130/0x130 trace_find_next_entry_inc+0x3b/0x1d0 tracing_read_pipe+0x423/0xae0 ? tracing_splice_read_pipe+0xcb0/0xcb0 vfs_read+0x16b/0x490 ksys_read+0x105/0x210 ? __ia32_sys_pwrite64+0x200/0x200 ? switch_fpu_return+0x108/0x220 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x61/0xc6 Through the vmcore, I found it's because in tracing_read_pipe(), ring_buffer_empty_cpu() found some buffer is not empty but then it cannot read anything due to "rb_num_of_entries() == 0" always true, Then it infinitely loop the procedure due to user buffer not been filled, see following code path: tracing_read_pipe() { ... ... waitagain: tracing_wait_pipe() // 1. find non-empty buffer here trace_find_next_entry_inc() // 2. loop here try to find an entry __find_next_entry() ring_buffer_empty_cpu(); // 3. find non-empty buffer peek_next_entry() // 4. but peek always return NULL ring_buffer_peek() rb_buffer_peek() rb_get_reader_page() // 5. because rb_num_of_entries() == 0 always true here // then return NULL // 6. user buffer not been filled so goto 'waitgain' // and eventually leads to an deadloop in kernel!!! } By some analyzing, I found that when resetting ringbuffer, the 'entries' of its pages are not all cleared (see rb_reset_cpu()). Then when reducing the ringbuffer, and if some reduced pages exist dirty 'entries' data, they will be added into 'cpu_buffer->overrun' (see rb_remove_pages()), which cause wrong 'overrun' count and eventually cause the deadloop issue. To fix it, we need to clear every pages in rb_reset_cpu(). |