Changelog in Linux kernel 6.15.9

 
ALSA: hda/realtek - Add mute LED support for HP Pavilion 15-eg0xxx
Author: Dawid Rezler <dawidrezler.patches@gmail.com>
Date:   Sun Jul 20 17:49:08 2025 +0200

    ALSA: hda/realtek - Add mute LED support for HP Pavilion 15-eg0xxx
    
    commit 9744ede7099e8a69c04aa23fbea44c15bc390c04 upstream.
    
    The mute LED on the HP Pavilion Laptop 15-eg0xxx,
    which uses the ALC287 codec, didn't work.
    This patch fixes the issue by enabling the ALC287_FIXUP_HP_GPIO_LED quirk.
    
    Tested on a physical device; the LED now works as intended.
    
    Signed-off-by: Dawid Rezler <dawidrezler.patches@gmail.com>
    Cc: <stable@vger.kernel.org>
    Link: https://patch.msgid.link/20250720154907.80815-2-dawidrezler.patches@gmail.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
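
Quirks like this one are selected by matching the machine's subsystem IDs against a table. Below is a minimal user-space sketch of such a table lookup. The struct layout, the enum and the IDs are hypothetical stand-ins chosen for illustration; they are not the driver's real table entries, and only the name ALC287_FIXUP_HP_GPIO_LED comes from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixup identifiers used only by this sketch. */
enum fixup_id {
	FIXUP_NONE = 0,
	FIXUP_HP_GPIO_LED,	/* stands in for ALC287_FIXUP_HP_GPIO_LED */
};

struct quirk_entry {
	uint16_t subvendor;
	uint16_t subdevice;
	enum fixup_id fixup;
};

/* Placeholder IDs: NOT the real subsystem IDs of any laptop. */
const struct quirk_entry quirk_table[] = {
	{ 0x1234, 0x0001, FIXUP_HP_GPIO_LED },
	{ 0, 0, FIXUP_NONE },	/* terminator */
};

/* Return the fixup for a machine, or FIXUP_NONE when it has no entry. */
enum fixup_id lookup_quirk(const struct quirk_entry *table,
			   uint16_t subvendor, uint16_t subdevice)
{
	const struct quirk_entry *q;

	assert(table != NULL);
	for (q = table; q->subvendor; q++) {
		if (q->subvendor == subvendor && q->subdevice == subdevice)
			return q->fixup;
	}
	return FIXUP_NONE;
}
```

A machine without a table entry gets FIXUP_NONE, which is why a new laptop model needs its own line before the LED works.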

ALSA: hda/realtek - Add mute LED support for HP Victus 15-fa0xxx
Author: Edip Hazuri <edip@medip.dev>
Date:   Fri Jul 18 00:26:26 2025 +0300

    ALSA: hda/realtek - Add mute LED support for HP Victus 15-fa0xxx
    
    commit 21c8ed9047b7f44c1c49b889d4ba2f555d9ee17e upstream.
    
    The mute LED on this laptop uses the ALC245 codec but requires a quirk
    to work. This patch enables the existing quirk for the device.
    
    Tested on my Victus 15-fa0xxx Laptop. The LED behaviour works
    as intended.
    
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Edip Hazuri <edip@medip.dev>
    Link: https://patch.msgid.link/20250717212625.366026-2-edip@medip.dev
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda/realtek: Fix mute LED mask on HP OMEN 16 laptop
Author: SHARAN KUMAR M <sharweshraajan@gmail.com>
Date:   Tue Jul 22 22:52:24 2025 +0530

    ALSA: hda/realtek: Fix mute LED mask on HP OMEN 16 laptop
    
    [ Upstream commit 931837cd924048ab785eedb4cee5b276c90a2924 ]
    
    This patch fixes my previous commit e5182305a519. It corrects the
    coefficient mask value introduced in that commit, which was intended
    to enable the mute LED functionality. During testing, multiple values
    were evaluated, and an incorrect value was mistakenly included in the
    final commit. This update fixes that error by applying the correct
    mask value for proper mute LED behavior.
    
    Tested on 6.15.5-arch1-1
    
    Fixes: e5182305a519 ("ALSA: hda/realtek: Enable Mute LED on HP OMEN 16 Laptop xd000xx")
    Signed-off-by: SHARAN KUMAR M <sharweshraajan@gmail.com>
    Link: https://patch.msgid.link/20250722172224.15359-1-sharweshraajan@gmail.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
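
To see why the mask value matters, here is a minimal sketch of a masked read-modify-write on a 16-bit coefficient. The function name and all numeric values are assumptions for illustration; the commit text does not state the old or the new mask.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Update only the bits selected by mask; every other bit of the
 * coefficient keeps its previous value.
 */
uint16_t coef_update(uint16_t old, uint16_t mask, uint16_t bits)
{
	assert((bits & ~mask) == 0);	/* bits must lie inside the mask */
	return (uint16_t)((old & ~mask) | bits);
}
```

With a hypothetical LED bit of 0x0004, coef_update(0x00f0, 0x0004, 0x0004) yields 0x00f4 and coef_update(0x00f4, 0x0004, 0x0000) yields 0x00f0. If the mask selects a different bit than the one that drives the LED, the update touches the wrong bit and the LED state never changes.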

ALSA: hda/tegra: Add Tegra264 support
Author: Mohan Kumar D <mkumard@nvidia.com>
Date:   Mon May 12 06:42:58 2025 +0000

    ALSA: hda/tegra: Add Tegra264 support
    
    commit 1c4193917eb3279788968639f24d72ffeebdec6b upstream.
    
    Update HDA driver to support Tegra264 differences from legacy HDA,
    which includes: clocks/resets, always power on, and hardware-managed
    FPCI/IPFS initialization. The driver retrieves this chip-specific
    information from soc_data.
    
    Signed-off-by: Mohan Kumar D <mkumard@nvidia.com>
    Signed-off-by: Sheetal <sheetal@nvidia.com>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Link: https://patch.msgid.link/20250512064258.1028331-4-sheetal@nvidia.com
    Stable-dep-of: e0a911ac8685 ("ALSA: hda: Add missing NVIDIA HDA codec IDs")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda: Add missing NVIDIA HDA codec IDs
Author: Daniel Dadap <ddadap@nvidia.com>
Date:   Thu Jun 26 16:16:30 2025 -0500

    ALSA: hda: Add missing NVIDIA HDA codec IDs
    
    commit e0a911ac86857a73182edde9e50d9b4b949b7f01 upstream.
    
    Add codec IDs for several NVIDIA products with HDA controllers to the
    snd_hda_id_hdmi[] patch table.
    
    Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
    Cc: <stable@vger.kernel.org>
    Link: https://patch.msgid.link/aF24rqwMKFWoHu12@ddadap-lakeline.nvidia.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
arm64/entry: Mask DAIF in cpu_switch_to(), call_on_irq_stack()
Author: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Date:   Fri Jul 18 15:28:14 2025 +0100

    arm64/entry: Mask DAIF in cpu_switch_to(), call_on_irq_stack()
    
    commit d42e6c20de6192f8e4ab4cf10be8c694ef27e8cb upstream.
    
    `cpu_switch_to()` and `call_on_irq_stack()` manipulate SP to change
    to different stacks along with the Shadow Call Stack if it is enabled.
    Those two stack changes cannot be done atomically, and both functions
    can be interrupted by SErrors or Debug Exceptions which, though unlikely,
    is very much broken: if interrupted, we can end up with a mismatched
    stack and Shadow Call Stack, leading to clobbered stacks.
    
    In `cpu_switch_to()`, it can happen when SP_EL0 points to the new task,
    but x18 still points to the old task's SCS. When the interrupt handler
    tries to save the task's SCS pointer, it will save the old task
    SCS pointer (x18) into the new task struct (pointed to by SP_EL0),
    clobbering it.
    
    In `call_on_irq_stack()`, it can happen when switching from the task stack
    to the IRQ stack and when switching back. In both cases, we can be
    interrupted when the SCS pointer points to the IRQ SCS, but SP points to
    the task stack. The nested interrupt handler pushes its return addresses
    on the IRQ SCS. It then detects that SP points to the task stack,
    calls `call_on_irq_stack()` and clobbers the task SCS pointer with
    the IRQ SCS pointer, which it will also use!
    
    This leads to tasks returning to addresses on the wrong SCS,
    or even on the IRQ SCS, triggering kernel panics via CONFIG_VMAP_STACK
    or FPAC if enabled.
    
    This is possible on a default config, but unlikely.
    However, when enabling CONFIG_ARM64_PSEUDO_NMI, DAIF is unmasked and
    instead the GIC is responsible for filtering what interrupts the CPU
    should receive based on priority.
    Given the goal of emulating NMIs, pseudo-NMIs can be received by the CPU
    even in `cpu_switch_to()` and `call_on_irq_stack()`, possibly *very*
    frequently depending on the system configuration and workload, leading
    to unpredictable kernel panics.
    
    Completely mask DAIF in `cpu_switch_to()` and restore it when returning.
    Do the same in `call_on_irq_stack()`, but restore and mask around
    the branch.
    Mask DAIF even if CONFIG_SHADOW_CALL_STACK is not enabled for consistency
    of behaviour between all configurations.
    
    Introduce and use an assembly macro for saving and masking DAIF,
    as the existing one saves but only masks IF.
    
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
    Reported-by: Cristian Prundeanu <cpru@amazon.com>
    Fixes: 59b37fe52f49 ("arm64: Stash shadow stack pointer in the task struct on interrupt")
    Tested-by: Cristian Prundeanu <cpru@amazon.com>
    Acked-by: Will Deacon <will@kernel.org>
    Link: https://lore.kernel.org/r/20250718142814.133329-1-ada.coupriediaz@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
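
The cpu_switch_to() race can be modelled in plain C. This is only a simulation under stated assumptions: the structs, the field names and the exact ordering of the steps are invented for illustration, and the real code is arm64 assembly. The `masked` flag stands in for DAIF being fully masked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct task {
	uintptr_t scs_sp;	/* saved Shadow Call Stack pointer */
};

struct cpu {
	struct task *current;	/* models SP_EL0 */
	uintptr_t x18;		/* models the live SCS pointer */
	bool masked;		/* models DAIF fully masked */
};

/* Models the exception entry path, which stashes x18 in the current task. */
void interrupt(struct cpu *cpu)
{
	if (cpu->masked)
		return;		/* exception stays pending, nothing runs */
	cpu->current->scs_sp = cpu->x18;
}

/*
 * Models the switch: the two updates are separate steps, and an
 * exception arrives between them when fire_in_window is true.
 */
void switch_to(struct cpu *cpu, struct task *next, bool mask_daif,
	       bool fire_in_window)
{
	bool saved = cpu->masked;

	assert(next != NULL);
	if (mask_daif)
		cpu->masked = true;

	cpu->current->scs_sp = cpu->x18;	/* save the old task's SCS pointer */
	cpu->current = next;			/* step 1: SP_EL0 -> new task */
	if (fire_in_window)
		interrupt(cpu);			/* x18 still belongs to the old task */
	cpu->x18 = next->scs_sp;		/* step 2: load the new task's SCS pointer */

	cpu->masked = saved;			/* restore the mask state on return */
}
```

Without masking, the exception in the window overwrites the new task's saved SCS pointer with the old task's value, so the new task resumes on the wrong Shadow Call Stack. With masking, the exception cannot run until both steps are done.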

 
ARM: 9448/1: Use an absolute path to unified.h in KBUILD_AFLAGS
Author: Nathan Chancellor <nathan@kernel.org>
Date:   Fri Jun 20 19:08:09 2025 +0100

    ARM: 9448/1: Use an absolute path to unified.h in KBUILD_AFLAGS
    
    commit 87c4e1459e80bf65066f864c762ef4dc932fad4b upstream.
    
    After commit d5c8d6e0fa61 ("kbuild: Update assembler calls to use proper
    flags and language target"), which updated as-instr to use the
    'assembler-with-cpp' language option, the Kbuild version of as-instr
    always fails internally for arch/arm with
    
      <command-line>: fatal error: asm/unified.h: No such file or directory
      compilation terminated.
    
    because '-include' flags are now taken into account by the compiler
    driver and as-instr does not have '$(LINUXINCLUDE)', so unified.h is not
    found.
    
    This went unnoticed at the time of the Kbuild change because the last
    use of as-instr in Kbuild that arch/arm could reach was removed in 5.7
    by commit 541ad0150ca4 ("arm: Remove 32bit KVM host support") but a
    stable backport of the Kbuild change to before that point exposed this
    potential issue if one were to be reintroduced.
    
    Follow the general pattern of '-include' paths throughout the tree and
    make unified.h absolute using '$(srctree)' to ensure KBUILD_AFLAGS can
    be used independently.
    
    Closes: https://lore.kernel.org/CACo-S-1qbCX4WAVFA63dWfHtrRHZBTyyr2js8Lx=Az03XHTTHg@mail.gmail.com/
    
    Cc: stable@vger.kernel.org
    Fixes: d5c8d6e0fa61 ("kbuild: Update assembler calls to use proper flags and language target")
    Reported-by: KernelCI bot <bot@kernelci.org>
    Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
    Signed-off-by: Nathan Chancellor <nathan@kernel.org>
    Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: 9450/1: Fix allowing linker DCE with binutils < 2.36
Author: Nathan Chancellor <nathan@kernel.org>
Date:   Mon Jul 14 20:56:47 2025 +0100

    ARM: 9450/1: Fix allowing linker DCE with binutils < 2.36
    
    commit 53e7e1fb81cc8ba2da1cb31f8917ef397caafe91 upstream.
    
    Commit e7607f7d6d81 ("ARM: 9443/1: Require linker to support KEEP within
    OVERLAY for DCE") accidentally broke the binutils version restriction
    that was added in commit 0d437918fb64 ("ARM: 9414/1: Fix build issue
    with LD_DEAD_CODE_DATA_ELIMINATION"), reintroducing the segmentation
    fault addressed by that workaround.
    
    Restore the binutils version dependency by using
    CONFIG_LD_CAN_USE_KEEP_IN_OVERLAY as an additional condition to ensure
    that CONFIG_HAVE_LD_DEAD_CODE_DATA_ELIMINATION is only enabled with
    binutils >= 2.36 and ld.lld >= 21.0.0.
    
    Closes: https://lore.kernel.org/6739da7d-e555-407a-b5cb-e5681da71056@landley.net/
    Closes: https://lore.kernel.org/CAFERDQ0zPoya5ZQfpbeuKVZEo_fKsonLf6tJbp32QnSGAtbi+Q@mail.gmail.com/
    
    Cc: stable@vger.kernel.org
    Fixes: e7607f7d6d81 ("ARM: 9443/1: Require linker to support KEEP within OVERLAY for DCE")
    Reported-by: Rob Landley <rob@landley.net>
    Tested-by: Rob Landley <rob@landley.net>
    Reported-by: Martin Wetterwald <martin@wetterwald.eu>
    Signed-off-by: Nathan Chancellor <nathan@kernel.org>
    Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
ASoC: mediatek: common: fix device and OF node leak
Author: Johan Hovold <johan@kernel.org>
Date:   Tue Jul 22 11:25:42 2025 +0200

    ASoC: mediatek: common: fix device and OF node leak
    
    commit 696e123aa36bf0bc72bda98df96dd8f379a6e854 upstream.
    
    Make sure to drop the references to the accdet OF node and platform
    device taken by of_parse_phandle() and of_find_device_by_node() after
    looking up the sound component during probe.
    
    Fixes: cf536e2622e2 ("ASoC: mediatek: common: Handle mediatek,accdet property")
    Cc: stable@vger.kernel.org      # 6.15
    Cc: Nícolas F. R. A. Prado <nfraprado@collabora.com>
    Signed-off-by: Johan Hovold <johan@kernel.org>
    Link: https://patch.msgid.link/20250722092542.32754-1-johan@kernel.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ASoC: mediatek: mt8365-dai-i2s: pass correct size to mt8365_dai_set_priv
Author: Guoqing Jiang <guoqing.jiang@canonical.com>
Date:   Thu Jul 10 09:18:06 2025 +0800

    ASoC: mediatek: mt8365-dai-i2s: pass correct size to mt8365_dai_set_priv
    
    [ Upstream commit 6bea85979d05470e6416a2bb504a9bcd9178304c ]
    
    mt8365_dai_set_priv() allocates priv_size bytes and copies priv_data
    into them, so we should pass the size of mt8365_i2s_priv[i], i.e. of
    "struct mtk_afe_i2s_priv", instead of the size of afe_priv, which is a
    "struct mt8365_afe_private".
    
    Otherwise KASAN complains:
    
    [   59.389765] BUG: KASAN: global-out-of-bounds in mt8365_dai_set_priv+0xc8/0x168 [snd_soc_mt8365_pcm]
    ...
    [   59.394789] Call trace:
    [   59.395167]  dump_backtrace+0xa0/0x128
    [   59.395733]  show_stack+0x20/0x38
    [   59.396238]  dump_stack_lvl+0xe8/0x148
    [   59.396806]  print_report+0x37c/0x5e0
    [   59.397358]  kasan_report+0xac/0xf8
    [   59.397885]  kasan_check_range+0xe8/0x190
    [   59.398485]  asan_memcpy+0x3c/0x98
    [   59.399022]  mt8365_dai_set_priv+0xc8/0x168 [snd_soc_mt8365_pcm]
    [   59.399928]  mt8365_dai_i2s_register+0x1e8/0x2b0 [snd_soc_mt8365_pcm]
    [   59.400893]  mt8365_afe_pcm_dev_probe+0x4d0/0xdf0 [snd_soc_mt8365_pcm]
    [   59.401873]  platform_probe+0xcc/0x228
    [   59.402442]  really_probe+0x340/0x9e8
    [   59.402992]  driver_probe_device+0x16c/0x3f8
    [   59.403638]  driver_probe_device+0x64/0x1d8
    [   59.404256]  driver_attach+0x1dc/0x4c8
    [   59.404840]  bus_for_each_dev+0x100/0x190
    [   59.405442]  driver_attach+0x44/0x68
    [   59.405980]  bus_add_driver+0x23c/0x500
    [   59.406550]  driver_register+0xf8/0x3d0
    [   59.407122]  platform_driver_register+0x68/0x98
    [   59.407810]  mt8365_afe_pcm_driver_init+0x2c/0xff8 [snd_soc_mt8365_pcm]
    
    Fixes: 402bbb13a195 ("ASoC: mediatek: mt8365: Add I2S DAI support")
    Signed-off-by: Guoqing Jiang <guoqing.jiang@canonical.com>
    Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
    Link: https://patch.msgid.link/20250710011806.134507-1-guoqing.jiang@canonical.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
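
The size mismatch can be illustrated with a small user-space sketch. The struct contents and function names here are hypothetical stand-ins, not the driver's definitions; only the pattern (allocate priv_size bytes, then copy priv_size bytes from the source) follows the description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Small per-DAI data, standing in for struct mtk_afe_i2s_priv. */
struct i2s_priv {
	int rate;
	int mclk;
};

/* Large device-wide data, standing in for struct mt8365_afe_private. */
struct afe_private {
	struct i2s_priv i2s[4];
	char other_state[256];
};

/*
 * Models a set_priv() helper: allocate priv_size bytes and copy that many
 * bytes from priv_data. The caller must pass the size of the object that
 * priv_data really points to.
 */
void *dai_set_priv(const void *priv_data, size_t priv_size)
{
	void *copy;

	assert(priv_data != NULL && priv_size > 0);
	copy = malloc(priv_size);
	if (copy)
		memcpy(copy, priv_data, priv_size);
	return copy;
}

/* Correct call: the size matches the source object. */
struct i2s_priv *register_i2s(const struct i2s_priv *src)
{
	/*
	 * A buggy variant would pass sizeof(struct afe_private) here,
	 * making memcpy() read far past the end of *src.
	 */
	return dai_set_priv(src, sizeof(*src));
}
```

The out-of-bounds access is a read from the source, which matches the "global-out-of-bounds" report in the memcpy path of the trace above.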

 
bus: fsl-mc: Fix potential double device reference in fsl_mc_get_endpoint()
Author: Ma Ke <make24@iscas.ac.cn>
Date:   Thu Jul 17 10:23:07 2025 +0800

    bus: fsl-mc: Fix potential double device reference in fsl_mc_get_endpoint()
    
    commit bddbe13d36a02d5097b99cf02354d5752ad1ac60 upstream.
    
    The fsl_mc_get_endpoint() function may call fsl_mc_device_lookup()
    twice, which would increment the device's reference count twice if
    both lookups find a device. This could lead to a reference count leak.
    
    Found by code review.
    
    Cc: stable@vger.kernel.org
    Fixes: 1ac210d128ef ("bus: fsl-mc: add the fsl_mc_get_endpoint function")
    Signed-off-by: Ma Ke <make24@iscas.ac.cn>
    Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Fixes: 8567494cebe5 ("bus: fsl-mc: rescan devices if endpoint not found")
    Link: https://patch.msgid.link/20250717022309.3339976-1-make24@iscas.ac.cn
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
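
A simplified model of the double lookup follows. The commit text does not show the code, so the control flow below is an assumption about its general shape; the types and function names are invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct device {
	int refcount;
	bool discovered;	/* is the device visible to lookups? */
};

/* Models a lookup that returns a counted reference, or NULL. */
struct device *device_lookup(struct device *dev)
{
	assert(dev != NULL);
	if (!dev->discovered)
		return NULL;
	dev->refcount++;
	return dev;
}

/* Buggy shape: the retry runs unconditionally, so a hit is counted twice. */
struct device *get_endpoint_buggy(struct device *dev)
{
	struct device *endpoint = device_lookup(dev);

	if (!endpoint)
		dev->discovered = true;		/* models the bus rescan */
	endpoint = device_lookup(dev);
	return endpoint;
}

/* Fixed shape: retry only when the first lookup found nothing. */
struct device *get_endpoint_fixed(struct device *dev)
{
	struct device *endpoint = device_lookup(dev);

	if (!endpoint) {
		dev->discovered = true;		/* models the bus rescan */
		endpoint = device_lookup(dev);
	}
	return endpoint;
}
```

Starting from a refcount of 1 on an already discovered device, the buggy shape leaves it at 3 while the caller only knows about one reference to drop; the fixed shape leaves it at 2.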

 
can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode
Author: Marc Kleine-Budde <mkl@pengutronix.de>
Date:   Tue Jul 15 22:35:46 2025 +0200

    can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode
    
    [ Upstream commit c1f3f9797c1f44a762e6f5f72520b2e520537b52 ]
    
    Andrei Lalaev reported a NULL pointer deref when a CAN device is
    restarted from Bus Off and the driver does not implement the struct
    can_priv::do_set_mode callback.
    
    There are 2 code paths that call struct can_priv::do_set_mode:
    - directly by a manual restart from the user space, via
      can_changelink()
    - delayed automatic restart after bus off (deactivated by default)
    
    To prevent the NULL pointer dereference, refuse a manual restart or
    configuring the automatic restart delay in can_changelink() and report
    the error via extack to user space.
    
    As an additional safety measure let can_restart() return an error if
    can_priv::do_set_mode is not set instead of dereferencing it
    unchecked.
    
    Reported-by: Andrei Lalaev <andrey.lalaev@gmail.com>
    Closes: https://lore.kernel.org/all/20250714175520.307467-1-andrey.lalaev@gmail.com
    Fixes: 39549eef3587 ("can: CAN Network device driver and Netlink interface")
    Link: https://patch.msgid.link/20250718-fix-nullptr-deref-do_set_mode-v1-1-0b520097bb96@pengutronix.de
    Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
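
The guard added to the restart path can be sketched as follows. The `_model` names, the trimmed-down struct and the choice of -EOPNOTSUPP as the error code are assumptions made for this example; the commit text only says that an error is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum can_mode { CAN_MODE_STOP, CAN_MODE_START };

struct can_priv_model {
	/* Optional callback; a driver may leave it NULL. */
	int (*do_set_mode)(struct can_priv_model *priv, enum can_mode mode);
	int restarts;
};

/* Refuse the restart instead of calling through a NULL pointer. */
int can_restart_model(struct can_priv_model *priv)
{
	assert(priv != NULL);
	if (!priv->do_set_mode)
		return -EOPNOTSUPP;
	return priv->do_set_mode(priv, CAN_MODE_START);
}

/* Example driver callback that just counts restarts. */
int demo_set_mode(struct can_priv_model *priv, enum can_mode mode)
{
	if (mode == CAN_MODE_START)
		priv->restarts++;
	return 0;
}
```

A device whose driver sets no callback now gets an error back from the restart request, while a device with a callback restarts as before.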

 
dpaa2-eth: Fix device reference count leak in MAC endpoint handling
Author: Ma Ke <make24@iscas.ac.cn>
Date:   Thu Jul 17 10:23:08 2025 +0800

    dpaa2-eth: Fix device reference count leak in MAC endpoint handling
    
    commit ee9f3a81ab08dfe0538dbd1746f81fd4d5147fdc upstream.
    
    The fsl_mc_get_endpoint() function uses device_find_child() for
    localization, which implicitly calls get_device() to increment the
    device's reference count before returning the pointer. However, the
    caller dpaa2_eth_connect_mac() fails to properly release this
    reference in multiple scenarios. We should call put_device() to
    decrement reference count properly.
    
    As the comment on device_find_child() says, 'NOTE: you will need to
    drop the reference with put_device() after use'.
    
    Found by code review.
    
    Cc: stable@vger.kernel.org
    Fixes: 719479230893 ("dpaa2-eth: add MAC/PHY support through phylink")
    Signed-off-by: Ma Ke <make24@iscas.ac.cn>
    Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
    Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20250717022309.3339976-2-make24@iscas.ac.cn
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
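
One common way to release a reference on every early exit is a single cleanup label. The sketch below models that pattern in user space; the function names, the exit conditions and the decision to keep the reference on success are assumptions for illustration, not a copy of the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct device {
	int refcount;
	bool is_mac;
};

/* Models a lookup that takes a reference, like get_device() would. */
struct device *get_endpoint(struct device *dev)
{
	assert(dev != NULL);
	dev->refcount++;
	return dev;
}

void put_device_model(struct device *dev)
{
	assert(dev->refcount > 0);
	dev->refcount--;
}

/*
 * Every early exit after the lookup goes through out_put, which drops
 * the reference. On success the reference stays held for as long as the
 * connection exists.
 */
int connect_mac(struct device *dev, bool setup_fails)
{
	struct device *endpoint = get_endpoint(dev);
	int err;

	if (!endpoint->is_mac) {
		err = 0;		/* nothing to connect, not an error */
		goto out_put;
	}
	if (setup_fails) {
		err = -ENOMEM;
		goto out_put;
	}
	return 0;

out_put:
	put_device_model(endpoint);
	return err;
}
```

In this model the reference count returns to its starting value on both early exits and is one higher only when the connection succeeds.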

 
dpaa2-switch: Fix device reference count leak in MAC endpoint handling
Author: Ma Ke <make24@iscas.ac.cn>
Date:   Thu Jul 17 10:23:09 2025 +0800

    dpaa2-switch: Fix device reference count leak in MAC endpoint handling
    
    commit 96e056ffba912ef18a72177f71956a5b347b5177 upstream.
    
    The fsl_mc_get_endpoint() function uses device_find_child() for
    localization, which implicitly calls get_device() to increment the
    device's reference count before returning the pointer. However, the
    caller dpaa2_switch_port_connect_mac() fails to properly release this
    reference in multiple scenarios. We should call put_device() to
    decrement reference count properly.
    
    As the comment on device_find_child() says, 'NOTE: you will need to
    drop the reference with put_device() after use'.
    
    Found by code review.
    
    Cc: stable@vger.kernel.org
    Fixes: 84cba72956fd ("dpaa2-switch: integrate the MAC endpoint support")
    Signed-off-by: Ma Ke <make24@iscas.ac.cn>
    Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
    Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20250717022309.3339976-3-make24@iscas.ac.cn
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/amd/display: Don't allow OLED to go down to fully off
Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Thu Jun 19 09:29:13 2025 -0500

    drm/amd/display: Don't allow OLED to go down to fully off
    
    [ Upstream commit 39d81457ad3417a98ac826161f9ca0e642677661 ]
    
    [Why]
    OLED panels can be fully off, but this behavior is unexpected.
    
    [How]
    Ensure that minimum luminance is at least 1.
    
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4338
    Reviewed-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Ray Wu <ray.wu@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit 51496c7737d06a74b599d0aa7974c3d5a4b1162e)
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/amdgpu: Add the new sdma function pointers for amdgpu_sdma.h
Author: Jesse.zhang@amd.com <Jesse.zhang@amd.com>
Date:   Fri Apr 11 13:01:19 2025 +0800

    drm/amdgpu: Add the new sdma function pointers for amdgpu_sdma.h
    
    commit 29891842154d7ebca97a94b0d5aaae94e560f61c upstream.
    
    This patch introduces new function pointers in the amdgpu_sdma structure
    to handle queue stop, start and soft reset operations. These will replace
    the older callback mechanism.
    
    The new functions are:
    - stop_kernel_queue: Stops a specific SDMA queue
    - start_kernel_queue: Starts/Restores a specific SDMA queue
    - soft_reset_kernel_queue: Performs soft reset on a specific SDMA queue
    
    v2: Update stop_queue/start_queue function parameters to use ring pointer instead of device/instance (Christian)
    v3: move stop_queue/start_queue to struct amdgpu_sdma_instance and rename them. (Alex)
    v4: rework the ordering a bit (Alex)
    
    Suggested-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Stable-dep-of: 09b585592fa4 ("drm/amdgpu: Fix SDMA engine reset with logical instance ID")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amdgpu: Fix SDMA engine reset with logical instance ID
Author: Jesse Zhang <jesse.zhang@amd.com>
Date:   Wed Jun 11 15:02:09 2025 +0800

    drm/amdgpu: Fix SDMA engine reset with logical instance ID
    
    commit 09b585592fa481384597c81388733aed4a04dd05 upstream.
    
    This commit makes the following improvements to SDMA engine reset handling:
    
    1. Clarifies in the function documentation that instance_id refers to a logical ID
    2. Adds conversion from logical to physical instance ID before performing reset
       using GET_INST(SDMA0, instance_id) macro
    3. Improves error messaging to indicate when a logical instance reset fails
    4. Adds better code organization with blank lines for readability
    
    The change ensures proper SDMA engine reset by using the correct physical
    instance ID while maintaining the logical ID interface for callers.
    
    V2: Remove harvest_config check and convert directly to physical instance (Lijo)
    
    Suggested-by: Jonathan Kim <jonathan.kim@amd.com>
    Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit 5efa6217c239ed1ceec0f0414f9b6f6927387dfc)
    Cc: stable@vger.kernel.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
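
The difference between logical and physical instance IDs can be shown with a small table-driven sketch. The mapping table, the instance counts and the function names are hypothetical; in the driver the conversion is done by the GET_INST(SDMA0, instance_id) macro mentioned above.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_LOGICAL	6
#define NUM_PHYSICAL	8

/*
 * Hypothetical mapping: index = logical instance, value = physical
 * instance. Physical instances 1 and 3 are assumed to be unavailable.
 */
static const uint8_t logical_to_physical[NUM_LOGICAL] = { 0, 2, 4, 5, 6, 7 };

/* Counts how often each physical instance was reset. */
unsigned int reset_log[NUM_PHYSICAL];

unsigned int get_physical_inst(unsigned int logical)
{
	assert(logical < NUM_LOGICAL);
	return logical_to_physical[logical];
}

/* Callers pass a logical ID; the hardware reset needs the physical one. */
void reset_engine(unsigned int logical_id)
{
	unsigned int phys = get_physical_inst(logical_id);

	reset_log[phys]++;
}
```

In this model, resetting logical instance 1 must touch physical instance 2. Passing the logical ID straight to the hardware would have reset physical instance 1 instead, which is the kind of mismatch the conversion avoids.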

drm/amdgpu: Implement SDMA soft reset directly for v5.x
Author: Jesse.zhang@amd.com <Jesse.zhang@amd.com>
Date:   Fri Apr 11 15:26:18 2025 +0800

    drm/amdgpu: Implement SDMA soft reset directly for v5.x
    
    commit 5c3e7c49538e2ddad10296a318c225bbb3d37d20 upstream.
    
    This patch introduces a new function `amdgpu_sdma_soft_reset` to handle SDMA soft resets directly,
    rather than relying on the DPM interface.
    
    1. **New `amdgpu_sdma_soft_reset` Function**:
       - Implements a soft reset for SDMA engines by directly writing to the hardware registers.
       - Handles SDMA versions 4.x and 5.x separately:
         - For SDMA 4.x, the existing `amdgpu_dpm_reset_sdma` function is used for backward compatibility.
         - For SDMA 5.x, the driver directly manipulates the `GRBM_SOFT_RESET` register to reset the specified SDMA instance.
    
    2. **Integration into `amdgpu_sdma_reset_engine`**:
       - The `amdgpu_sdma_soft_reset` function is called during the SDMA reset process, replacing the previous call to `amdgpu_dpm_reset_sdma`.
    
    v2: r should default to an error (Alex)
    
    Suggested-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Stable-dep-of: 09b585592fa4 ("drm/amdgpu: Fix SDMA engine reset with logical instance ID")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amdgpu: Reset the clear flag in buddy during resume
Author: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Date:   Wed Jul 16 13:21:24 2025 +0530

    drm/amdgpu: Reset the clear flag in buddy during resume
    
    commit 95a16160ca1d75c66bf7a1c5e0bcaffb18e7c7fc upstream.
    
    - Added a handler in DRM buddy manager to reset the cleared
      flag for the blocks in the freelist.
    
    - This is necessary because, upon resuming, the VRAM becomes
      cluttered with BIOS data, yet the VRAM backend manager
      believes that everything has been cleared.
    
    v2:
      - Add lock before accessing drm_buddy_clear_reset_blocks()(Matthew Auld)
      - Force merge the two dirty blocks.(Matthew Auld)
      - Add a new unit test case for this issue.(Matthew Auld)
      - Having this function being able to flip the state either way would be
        good. (Matthew Brost)
    
    v3(Matthew Auld):
      - Do merge step first to avoid the use of extra reset flag.
    
    Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
    Suggested-by: Christian König <christian.koenig@amd.com>
    Acked-by: Christian König <christian.koenig@amd.com>
    Reviewed-by: Matthew Auld <matthew.auld@intel.com>
    Cc: stable@vger.kernel.org
    Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality")
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3812
    Signed-off-by: Christian König <christian.koenig@amd.com>
    Link: https://lore.kernel.org/r/20250716075125.240637-2-Arunpravin.PaneerSelvam@amd.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/bridge: ti-sn65dsi86: Remove extra semicolon in ti_sn_bridge_probe()
Author: Douglas Anderson <dianders@chromium.org>
Date:   Mon Jul 14 13:06:32 2025 -0700

    drm/bridge: ti-sn65dsi86: Remove extra semicolon in ti_sn_bridge_probe()
    
    [ Upstream commit 15a7ca747d9538c2ad8b0c81dd4c1261e0736c82 ]
    
    As reported by the kernel test robot, a recent patch introduced an
    unnecessary semicolon. Remove it.
    
    Fixes: 55e8ff842051 ("drm/bridge: ti-sn65dsi86: Add HPD for DisplayPort connector type")
    Reported-by: kernel test robot <lkp@intel.com>
    Closes: https://lore.kernel.org/oe-kbuild-all/202506301704.0SBj6ply-lkp@intel.com/
    Reviewed-by: Devarsh Thakkar <devarsht@ti.com>
    Signed-off-by: Douglas Anderson <dianders@chromium.org>
    Link: https://lore.kernel.org/r/20250714130631.1.I1cfae3222e344a3b3c770d079ee6b6f7f3b5d636@changeid
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/i915/dp: Fix 2.7 Gbps DP_LINK_BW value on g4x
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Thu Jul 10 23:17:12 2025 +0300

    drm/i915/dp: Fix 2.7 Gbps DP_LINK_BW value on g4x
    
    commit 9e0c433d0c05fde284025264b89eaa4ad59f0a3e upstream.
    
    On g4x we currently use the 96MHz non-SSC refclk, which can't actually
    generate an exact 2.7 Gbps link rate. In practice we end up with 2.688
    Gbps which seems to be close enough to actually work, but link training
    is currently failing due to miscalculating the DP_LINK_BW value (we
    calculate it directly from port_clock which reflects the actual PLL
    output frequency).
    
    Ideas how to fix this:
    - nudge port_clock back up to 270000 during PLL computation/readout
    - track port_clock and the nominal link rate separately so they might
      differ a bit
    - switch to the 100MHz refclk, but that one should be SSC so perhaps
      not something we want
    
    While we ponder a better solution, apply a band-aid to the
    immediate issue of the miscalculated DP_LINK_BW value. With this
    I can again use the 2.7 Gbps link rate on g4x.
    
    Cc: stable@vger.kernel.org
    Fixes: 665a7b04092c ("drm/i915: Feed the DPLL output freq back into crtc_state")
    Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20250710201718.25310-2-ville.syrjala@linux.intel.com
    Reviewed-by: Imre Deak <imre.deak@intel.com>
    (cherry picked from commit a8b874694db5cae7baaf522756f87acd956e6e66)
    Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
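
The miscalculation is plain integer arithmetic. DP_LINK_BW codes count in units of 0.27 Gbps, so with port_clock expressed such that 270000 means 2.7 Gbps, the code is port_clock / 27000. The sketch below assumes that the 2.688 Gbps PLL output shows up as a port_clock of 268800; the function names and the exact form of the special case are illustrative, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a link rate to a DP_LINK_BW code (units of 0.27 Gbps). */
uint8_t link_rate_to_bw_code(int link_rate)
{
	assert(link_rate > 0);
	return (uint8_t)(link_rate / 27000);
}

/* Band-aid: treat the not-quite-exact PLL output as the nominal rate. */
int nominal_rate(int port_clock)
{
	if (port_clock == 268800)	/* assumed value for 2.688 Gbps */
		return 270000;
	return port_clock;
}
```

Integer division turns 268800 into code 9, which is not a valid link rate code, whereas the nominal 270000 gives 0x0a, the code for 2.7 Gbps. Other rates, such as 162000 (code 0x06), pass through unchanged.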

 
drm/sched: Remove optimization that causes hang when killing dependent jobs
Author: Lin.Cao <lincao12@amd.com>
Date:   Thu Jul 17 16:44:53 2025 +0800

    drm/sched: Remove optimization that causes hang when killing dependent jobs
    
    commit 15f77764e90a713ee3916ca424757688e4f565b9 upstream.
    
    When application A submits jobs and application B submits a job with a
    dependency on A's fence, the normal flow wakes up the scheduler after
    processing each job. However, the optimization in
    drm_sched_entity_add_dependency_cb() uses a callback that only clears
    dependencies without waking up the scheduler.
    
    When application A is killed before its jobs can run, the callback gets
    triggered but only clears the dependency without waking up the scheduler,
    causing the scheduler to enter sleep state and application B to hang.
    
    Remove the optimization by deleting drm_sched_entity_clear_dep() and its
    usage, ensuring the scheduler is always woken up when dependencies are
    cleared.
    
    Fixes: 777dbd458c89 ("drm/amdgpu: drop a dummy wakeup scheduler")
    Cc: stable@vger.kernel.org # v4.6+
    Signed-off-by: Lin.Cao <lincao12@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Philipp Stanner <phasta@kernel.org>
    Link: https://lore.kernel.org/r/20250717084453.921097-1-lincao12@amd.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
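
The hang can be reduced to a toy model with two kinds of fence callback. All types and function names here are invented for the sketch; it only mirrors the behaviour described above, where one callback clears the dependency and the other also wakes the scheduler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sched_model {
	bool awake;		/* is the scheduler thread running? */
};

struct entity_model {
	struct sched_model *sched;
	bool has_dependency;	/* still waiting on another job's fence */
};

/* Optimized callback: clears the dependency but never wakes the scheduler. */
void clear_dep_only(struct entity_model *e)
{
	e->has_dependency = false;
}

/* Plain callback: clears the dependency and wakes the scheduler. */
void clear_dep_and_wakeup(struct entity_model *e)
{
	assert(e->sched != NULL);
	e->has_dependency = false;
	e->sched->awake = true;
}

/* A job can only be picked up by a scheduler that is awake. */
bool job_can_run(const struct entity_model *e)
{
	return !e->has_dependency && e->sched->awake;
}
```

If the scheduler is asleep when application A's fence signals, the optimized callback leaves application B's job runnable on paper but never picked up. The plain callback wakes the scheduler, so the job proceeds.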

 
drm/shmem-helper: Remove obsoleted is_iomem test
Author: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Date:   Sun Mar 23 00:26:04 2025 +0300

    drm/shmem-helper: Remove obsoleted is_iomem test
    
    commit eab10538073c3ff9e21c857bd462f79f2f6f7e00 upstream.
    
    Everything that uses the mapped buffer should be agnostic to is_iomem.
    The only reason for the is_iomem test is that we're setting shmem->vaddr
    to the returned map->vaddr. Now that the shmem->vaddr code is gone, remove
    the obsoleted is_iomem test to clean up the code.
    
    Acked-by: Maxime Ripard <mripard@kernel.org>
    Suggested-by: Thomas Zimmermann <tzimmermann@suse.de>
    Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
    Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
    Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20250322212608.40511-7-dmitry.osipenko@collabora.com
    Stable-dep-of: 6d496e956998 ("Revert "drm/gem-shmem: Use dma_buf from GEM object instance"")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/xe: Make WA BB part of LRC BO
Author: Matthew Brost <matthew.brost@intel.com>
Date:   Wed Jun 11 20:19:25 2025 -0700

    drm/xe: Make WA BB part of LRC BO
    
    commit afcad92411772a1f361339f22c49f855c6cc7d0f upstream.
    
    No idea why, but without this GuC context switches randomly fail when
    running IGTs in a loop. Need to follow up why this fixes the
    aforementioned issue but can live with a stable driver for now.
    
    Fixes: 617d824c5323 ("drm/xe: Add WA BB to capture active context utilization")
    Cc: stable@vger.kernel.org
    Signed-off-by: Matthew Brost <matthew.brost@intel.com>
    Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Tested-by: Shuicheng Lin <shuicheng.lin@intel.com>
    Link: https://lore.kernel.org/r/20250612031925.4009701-1-matthew.brost@intel.com
    (cherry picked from commit 3a1edef8f4b58b0ba826bc68bf4bce4bdf59ecf3)
    Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
    [ adapted xe_bo_create_pin_map() call ]
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
e1000e: disregard NVM checksum on tgp when valid checksum bit is not set [+ + +]
Author: Jacek Kowalski <jacek@jacekk.info>
Date:   Mon Jun 30 10:33:39 2025 +0200

    e1000e: disregard NVM checksum on tgp when valid checksum bit is not set
    
    commit 536fd741c7ac907d63166cdae1081b1febfab613 upstream.
    
    As described by Vitaly Lifshits:
    
    > Starting from Tiger Lake, LAN NVM is locked for writes by SW, so the
    > driver cannot perform checksum validation and correction. This means
    > that all NVM images must leave the factory with correct checksum and
    > checksum valid bit set. Since Tiger Lake devices were the first to have
    > this lock, some systems in the field did not meet this requirement.
    > Therefore, for these transitional devices we skip checksum update and
    > verification, if the valid bit is not set.
    
    Signed-off-by: Jacek Kowalski <jacek@jacekk.info>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
    Fixes: 4051f68318ca9 ("e1000e: Do not take care about recovery NVM checksum")
    Cc: stable@vger.kernel.org
    Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e1000e: ignore uninitialized checksum word on tgp [+ + +]
Author: Jacek Kowalski <jacek@jacekk.info>
Date:   Mon Jun 30 10:35:00 2025 +0200

    e1000e: ignore uninitialized checksum word on tgp
    
    commit 61114910a5f6a71d0b6ea3b95082dfe031b19dfe upstream.
    
    As described by Vitaly Lifshits:
    
    > Starting from Tiger Lake, LAN NVM is locked for writes by SW, so the
    > driver cannot perform checksum validation and correction. This means
    > that all NVM images must leave the factory with correct checksum and
    > checksum valid bit set.
    
    Unfortunately some systems have left the factory with an uninitialized
    value of 0xFFFF at register address 0x3F (checksum word location).
    So on Tiger Lake platform we ignore the computed checksum when such
    condition is encountered.
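    
    As a rough userspace illustration of this rule (not the driver code),
    the sketch below accepts an image on Tiger Lake when the checksum word
    at 0x3F is still 0xFFFF, and otherwise sums the words. The function
    name, the 0x40-word range and the expected sum 0xBABA are assumptions
    made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define NVM_WORDS        0x40u    /* words 0x00..0x3F (assumed range) */
#define NVM_CHECKSUM_REG 0x3Fu    /* checksum word location, per the text */
#define NVM_SUM          0xBABAu  /* assumed expected sum of all words */
#define NVM_UNINIT       0xFFFFu  /* uninitialized checksum word */

/* Hypothetical helper: returns true if the image should be accepted. */
bool nvm_checksum_acceptable(const uint16_t nvm[NVM_WORDS], bool is_tgp)
{
	uint16_t sum = 0;
	unsigned int i;

	/* On Tiger Lake, an uninitialized checksum word is ignored. */
	if (is_tgp && nvm[NVM_CHECKSUM_REG] == NVM_UNINIT)
		return true;

	/* Otherwise the 16-bit sum of all words must match. */
	for (i = 0; i < NVM_WORDS; i++)
		sum = (uint16_t)(sum + nvm[i]);

	return sum == NVM_SUM;
}
```

    With every other word zero, an image whose word 0x3F is 0xFFFF is
    accepted only when is_tgp is true.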
    
    Signed-off-by: Jacek Kowalski <jacek@jacekk.info>
    Tested-by: Vlad URSU <vlad@ursu.me>
    Fixes: 4051f68318ca9 ("e1000e: Do not take care about recovery NVM checksum")
    Cc: stable@vger.kernel.org
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
    Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
gve: Fix stuck TX queue for DQ queue format [+ + +]
Author: Praveen Kaligineedi <pkaligineedi@google.com>
Date:   Thu Jul 17 19:20:24 2025 +0000

    gve: Fix stuck TX queue for DQ queue format
    
    commit b03f15c0192b184078206760c839054ae6eb4eaa upstream.
    
    gve_tx_timeout was calculating missed completions in a way that is only
    relevant in the GQ queue format. Additionally, it was attempting to
    disable device interrupts, which is not needed in either GQ or DQ queue
    formats.
    
    As a result, TX timeouts with the DQ queue format likely would have
    triggered early resets without kicking the queue at all.
    
    This patch drops the check for pending work altogether and always kicks
    the queue after validating the queue has not seen a TX timeout too
    recently.
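    
    The decision described above can be modelled in userspace roughly as
    follows. This is only a sketch: the struct, the enum, the 1000 ms gap
    and the choice to fall back to a reset when a timeout was seen too
    recently are all assumptions, not the driver's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIN_KICK_GAP_MS 1000u  /* hypothetical minimum gap between kicks */

enum timeout_action { ACTION_KICK_QUEUE, ACTION_RESET_DEVICE };

struct txq_state {
	uint64_t last_kick_ms;  /* time of the previous kick */
	bool kicked_before;     /* false until the first kick */
};

/* Always kick, unless this queue already timed out too recently. */
enum timeout_action on_tx_timeout(struct txq_state *q, uint64_t now_ms)
{
	if (q->kicked_before && now_ms - q->last_kick_ms < MIN_KICK_GAP_MS)
		return ACTION_RESET_DEVICE;

	q->last_kick_ms = now_ms;
	q->kicked_before = true;
	return ACTION_KICK_QUEUE;
}
```

    Note that there is no "pending work" check anywhere in the sketch; the
    only input is how long ago the queue was last kicked.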
    
    Cc: stable@vger.kernel.org
    Fixes: 87a7f321bb6a ("gve: Recover from queue stall due to missed IRQ")
    Co-developed-by: Tim Hostetler <thostet@google.com>
    Signed-off-by: Tim Hostetler <thostet@google.com>
    Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
    Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
    Link: https://patch.msgid.link/20250717192024.1820931-1-hramamurthy@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
i2c: qup: jump out of the loop in case of timeout [+ + +]
Author: Yang Xiwen <forbidden405@outlook.com>
Date:   Mon Jun 16 00:01:10 2025 +0800

    i2c: qup: jump out of the loop in case of timeout
    
    commit a7982a14b3012527a9583d12525cd0dc9f8d8934 upstream.
    
    The original logic only sets the return value but does not break out of
    the loop if the bus is kept active by a client. This is not the intended
    behaviour: a malicious or buggy i2c client can hang the kernel this way.
    The hang was observed during a long-running test with a PCA953x GPIO
    extender.
    
    Fix it by changing the logic to not only set the return value, but also
    break out of the loop and return to the caller with -ETIMEDOUT.
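    
    A minimal userspace sketch of the corrected loop shape, assuming a
    poll counter in place of a real deadline and an array of samples in
    place of the status register (both hypothetical):

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * bus_active[i] is what poll number i would read from a (hypothetical)
 * status register; polls past the end of the array see a busy bus.
 */
int wait_for_bus_idle(const bool *bus_active, size_t nsamples, size_t max_polls)
{
	size_t polls = 0;

	for (;;) {
		bool active = polls < nsamples ? bus_active[polls] : true;

		if (!active)
			return 0;           /* bus released */
		if (++polls >= max_polls)
			return -ETIMEDOUT;  /* leave the loop instead of spinning */
	}
}
```

    A client that never releases the bus now yields -ETIMEDOUT after
    max_polls polls rather than an endless loop.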
    
    Fixes: fbfab1ab0658 ("i2c: qup: reorganization of driver code to remove polling for qup v1")
    Signed-off-by: Yang Xiwen <forbidden405@outlook.com>
    Cc: <stable@vger.kernel.org> # v4.17+
    Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
    Link: https://lore.kernel.org/r/20250616-qca-i2c-v1-1-2a8d37ee0a30@outlook.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i2c: tegra: Fix reset error handling with ACPI [+ + +]
Author: Akhil R <akhilrajeev@nvidia.com>
Date:   Thu Jul 10 18:42:04 2025 +0530

    i2c: tegra: Fix reset error handling with ACPI
    
    commit 56344e241c543f17e8102fa13466ad5c3e7dc9ff upstream.
    
    acpi_evaluate_object() returns an ACPI error code, not a Linux one.
    On some platforms err will hold a positive code, which may be
    interpreted incorrectly. Use device_reset() for reset control, which
    handles this correctly.
    
    Fixes: bd2fdedbf2ba ("i2c: tegra: Add the ACPI support")
    Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
    Cc: <stable@vger.kernel.org> # v5.17+
    Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
    Link: https://lore.kernel.org/r/20250710131206.2316-2-akhilrajeev@nvidia.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i2c: virtio: Avoid hang by using interruptible completion wait [+ + +]
Author: Viresh Kumar <viresh.kumar@linaro.org>
Date:   Thu Jul 3 17:01:02 2025 +0530

    i2c: virtio: Avoid hang by using interruptible completion wait
    
    commit a663b3c47ab10f66130818cf94eb59c971541c3f upstream.
    
    The current implementation uses wait_for_completion(), which can cause
    the caller to hang indefinitely if the transfer never completes.
    
    Switch to wait_for_completion_interruptible() so that the operation can
    be interrupted by signals.
    
    Fixes: 84e1d0bf1d71 ("i2c: virtio: disable timeout handling")
    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Cc: <stable@vger.kernel.org> # v5.16+
    Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
    Link: https://lore.kernel.org/r/b8944e9cab8eb959d888ae80add6f2a686159ba2.1751541962.git.viresh.kumar@linaro.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
i40e: report VF tx_dropped with tx_errors instead of tx_discards [+ + +]
Author: Dennis Chen <dechen@redhat.com>
Date:   Wed Jun 18 15:52:40 2025 -0400

    i40e: report VF tx_dropped with tx_errors instead of tx_discards
    
    [ Upstream commit 50b2af451597ca6eefe9d4543f8bbf8de8aa00e7 ]
    
    Currently the tx_dropped field in VF stats is not updated correctly
    when reading stats from the PF. This is because it reads from
    i40e_eth_stats.tx_discards which seems to be unused for per VSI stats,
    as it is not updated by i40e_update_eth_stats() and the corresponding
    register, GLV_TDPC, is not implemented[1].
    
    Use i40e_eth_stats.tx_errors instead, which is actually updated by
    i40e_update_eth_stats() by reading from GLV_TEPC.
    
    To test, create a VF and try to send bad packets through it:
    
    $ echo 1 > /sys/class/net/enp2s0f0/device/sriov_numvfs
    $ cat test.py
    from scapy.all import *
    
    vlan_pkt = Ether(dst="ff:ff:ff:ff:ff:ff") / Dot1Q(vlan=999) / IP(dst="192.168.0.1") / ICMP()
    ttl_pkt = IP(dst="8.8.8.8", ttl=0) / ICMP()
    
    print("Send packet with bad VLAN tag")
    sendp(vlan_pkt, iface="enp2s0f0v0")
    print("Send packet with TTL=0")
    sendp(ttl_pkt, iface="enp2s0f0v0")
    $ ip -s link show dev enp2s0f0
    16: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
        link/ether 3c:ec:ef:b7:e0:ac brd ff:ff:ff:ff:ff:ff
        RX:  bytes packets errors dropped  missed   mcast
                 0       0      0       0       0       0
        TX:  bytes packets errors dropped carrier collsns
                 0       0      0       0       0       0
        vf 0     link/ether e2:c6:fd:c1:1e:92 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
        RX: bytes  packets  mcast   bcast   dropped
                 0        0       0       0        0
        TX: bytes  packets   dropped
                 0        0        0
    $ python test.py
    Send packet with bad VLAN tag
    .
    Sent 1 packets.
    Send packet with TTL=0
    .
    Sent 1 packets.
    $ ip -s link show dev enp2s0f0
    16: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
        link/ether 3c:ec:ef:b7:e0:ac brd ff:ff:ff:ff:ff:ff
        RX:  bytes packets errors dropped  missed   mcast
                 0       0      0       0       0       0
        TX:  bytes packets errors dropped carrier collsns
                 0       0      0       0       0       0
        vf 0     link/ether e2:c6:fd:c1:1e:92 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
        RX: bytes  packets  mcast   bcast   dropped
                 0        0       0       0        0
        TX: bytes  packets   dropped
                 0        0        0
    
    A packet with non-existent VLAN tag and a packet with TTL = 0 are sent,
    but tx_dropped is not incremented.
    
    After patch:
    
    $ ip -s link show dev enp2s0f0
    19: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
        link/ether 3c:ec:ef:b7:e0:ac brd ff:ff:ff:ff:ff:ff
        RX:  bytes packets errors dropped  missed   mcast
                 0       0      0       0       0       0
        TX:  bytes packets errors dropped carrier collsns
                 0       0      0       0       0       0
        vf 0     link/ether 4a:b7:3d:37:f7:56 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
        RX: bytes  packets  mcast   bcast   dropped
                 0        0       0       0        0
        TX: bytes  packets   dropped
                 0        0        2
    
    Fixes: dc645daef9af5bcbd9c ("i40e: implement VF stats NDO")
    Signed-off-by: Dennis Chen <dechen@redhat.com>
    Link: https://www.intel.com/content/www/us/en/content-details/596333/intel-ethernet-controller-x710-tm4-at2-carlsville-datasheet.html [1]
    Reviewed-by: Simon Horman <horms@kernel.org>
    Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

i40e: When removing VF MAC filters, only check PF-set MAC [+ + +]
Author: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Date:   Wed Jun 25 09:29:18 2025 +1000

    i40e: When removing VF MAC filters, only check PF-set MAC
    
    [ Upstream commit 5a0df02999dbe838c3feed54b1d59e9445f68b89 ]
    
    When the PF is processing an Admin Queue message to delete a VF's MACs
    from the MAC filter, we currently check if the PF set the MAC and if
    the VF is trusted.
    
    This results in undesirable behaviour, where if a trusted VF with a
    PF-set MAC sets itself down (which sends an AQ message to delete the
    VF's MAC filters) then the VF MAC is erased from the interface.
    
    This results in the VF losing its PF-set MAC which should not happen.
    
    There is no need to check for trust at all, because an untrusted VF
    cannot change its own MAC. The only check needed is whether the PF set
    the MAC. If the PF set the MAC, then don't erase the MAC on link-down.
    
    Resolve this by making the deletion check test only whether the PF set
    the MAC.
    
    (the out-of-tree driver has also intentionally removed the check for VF
    trust here with OOT driver version 2.26.8, this changes the Linux kernel
    driver behaviour and comment to match the OOT driver behaviour)
    
    Fixes: ea2a1cfc3b201 ("i40e: Fix VF MAC filter removal")
    Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ice: Fix a null pointer dereference in ice_copy_and_init_pkg() [+ + +]
Author: Haoxiang Li <haoxiang_li2024@163.com>
Date:   Thu Jul 3 17:52:32 2025 +0800

    ice: Fix a null pointer dereference in ice_copy_and_init_pkg()
    
    commit 4ff12d82dac119b4b99b5a78b5af3bf2474c0a36 upstream.
    
    Add check for the return value of devm_kmemdup()
    to prevent potential null pointer dereference.
    
    Fixes: c76488109616 ("ice: Implement Dynamic Device Personalization (DDP) download")
    Cc: stable@vger.kernel.org
    Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com>
    Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
    Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
iio: adc: ad7949: use spi_is_bpw_supported() [+ + +]
Author: David Lechner <dlechner@baylibre.com>
Date:   Wed Jun 11 10:04:58 2025 -0500

    iio: adc: ad7949: use spi_is_bpw_supported()
    
    [ Upstream commit 7b86482632788acd48d7b9ee1867f5ad3a32ccbb ]
    
    Use spi_is_bpw_supported() instead of directly accessing spi->controller
    ->bits_per_word_mask. bits_per_word_mask may be 0, which implies that
    8-bits-per-word is supported. spi_is_bpw_supported() takes this into
    account while spi_ctrl_mask == SPI_BPW_MASK(8) does not.
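    
    The difference between the two checks can be shown with a small
    userspace model. It is not the kernel helper itself: BPW_MASK(), the
    function names and the "bit (bpw - 1) marks support" layout are
    assumptions for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: bit (bpw - 1) of the mask marks support for bpw bits. */
#define BPW_MASK(bpw) (UINT32_C(1) << ((bpw) - 1))

/* Direct comparison: misses the "mask == 0 implies 8-bit" convention. */
bool naive_only_8bit(uint32_t bpw_mask)
{
	return bpw_mask == BPW_MASK(8);
}

/* Model of the rule in the text: an empty mask still supports 8 bits. */
bool bpw_supported(uint32_t bpw_mask, unsigned int bpw)
{
	if (bpw == 8 && bpw_mask == 0)
		return true;
	if (bpw < 1 || bpw > 32)
		return false;
	return (bpw_mask & BPW_MASK(bpw)) != 0;
}
```

    For a controller that leaves the mask at 0, naive_only_8bit() reports
    false while bpw_supported(0, 8) reports true.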
    
    Fixes: 0b2a740b424e ("iio: adc: ad7949: enable use with non 14/16-bit controllers")
    Closes: https://lore.kernel.org/linux-spi/c8b8a963-6cef-4c9b-bfef-dab2b7bd0b0f@sirena.org.uk/
    Signed-off-by: David Lechner <dlechner@baylibre.com>
    Reviewed-by: Andy Shevchenko <andy@kernel.org>
    Link: https://patch.msgid.link/20250611-iio-adc-ad7949-use-spi_is_bpw_supported-v1-1-c4e15bfd326e@baylibre.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

iio: fix potential out-of-bound write [+ + +]
Author: Markus Burri <markus.burri@mt.com>
Date:   Thu May 8 15:06:09 2025 +0200

    iio: fix potential out-of-bound write
    
    [ Upstream commit 16285a0931869baa618b1f5d304e1e9d090470a8 ]
    
    The buffer is set to 20 characters. If a caller writes more characters,
    count is truncated to the max available space in "simple_write_to_buffer".
    To protect from OoB access, check that the input size fits into the buffer
    and add a zero terminator after the copied data.
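    
    A userspace sketch of the same pattern, with memcpy() standing in for
    simple_write_to_buffer(). The function name and the choice of -EINVAL
    for oversized input are assumptions for illustration.

```c
#include <errno.h>
#include <string.h>

#define READ_BUF_SIZE 20  /* buffer size mentioned in the text */

/* Copy count bytes into buf and terminate right after the copied data. */
int store_input(char buf[READ_BUF_SIZE], const char *src, size_t count)
{
	/* Reject input that leaves no room for the terminator. */
	if (count >= READ_BUF_SIZE)
		return -EINVAL;

	memcpy(buf, src, count);
	buf[count] = '\0';
	return (int)count;
}
```

    The longest accepted input is 19 bytes, so the terminator always lands
    inside the buffer.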
    
    Fixes: 6d5dd486c715 ("iio: core: make use of simple_write_to_buffer()")
    Signed-off-by: Markus Burri <markus.burri@mt.com>
    Link: https://patch.msgid.link/20250508130612.82270-4-markus.burri@mt.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
interconnect: icc-clk: destroy nodes in case of memory allocation failures [+ + +]
Author: Gabor Juhos <j4g8y7@gmail.com>
Date:   Wed Jun 25 19:32:35 2025 +0200

    interconnect: icc-clk: destroy nodes in case of memory allocation failures
    
    [ Upstream commit 618c810a7b2163517ab1875bd56b633ca3cb3328 ]
    
    When memory allocation fails during creating the name of the nodes in
    icc_clk_register(), the code continues on the error path and it calls
    icc_nodes_remove() to destroy the already created nodes. However that
    function only destroys the nodes which were already added to the provider
    and the newly created nodes are never destroyed in case of error.
    
    To avoid memory leaks, change the code to destroy the newly created
    nodes explicitly in case of memory allocation failures.
    
    Fixes: 44c5aa73ccd1 ("interconnect: icc-clk: check return values of devm_kasprintf()")
    Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
    Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Link: https://lore.kernel.org/r/20250625-icc-clk-memleak-fix-v1-1-4151484cd24f@gmail.com
    Signed-off-by: Georgi Djakov <djakov@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

interconnect: qcom: sc7280: Add missing num_links to xm_pcie3_1 node [+ + +]
Author: Xilin Wu <sophon@radxa.com>
Date:   Fri Jun 13 22:53:38 2025 +0800

    interconnect: qcom: sc7280: Add missing num_links to xm_pcie3_1 node
    
    [ Upstream commit 886a94f008dd1a1702ee66dd035c266f70fd9e90 ]
    
    This allows adding interconnect paths for PCIe 1 in device tree later.
    
    Fixes: 46bdcac533cc ("interconnect: qcom: Add SC7280 interconnect provider driver")
    Signed-off-by: Xilin Wu <sophon@radxa.com>
    Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
    Link: https://lore.kernel.org/r/20250613-sc7280-icc-pcie1-fix-v1-1-0b09813e3b09@radxa.com
    Signed-off-by: Georgi Djakov <djakov@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
kasan: use vmalloc_dump_obj() for vmalloc error reports [+ + +]
Author: Marco Elver <elver@google.com>
Date:   Wed Jul 16 17:23:28 2025 +0200

    kasan: use vmalloc_dump_obj() for vmalloc error reports
    
    commit 6ade153349c6bb990d170cecc3e8bdd8628119ab upstream.
    
    Since 6ee9b3d84775 ("kasan: remove kasan_find_vm_area() to prevent
    possible deadlock"), more detailed info about the vmalloc mapping and the
    origin was dropped due to potential deadlocks.
    
    While fixing the deadlock is necessary, that patch was too quick in
    killing an otherwise useful feature, and did no due-diligence in
    understanding if an alternative option is available.
    
    Restore printing more helpful vmalloc allocation info in KASAN reports
    with the help of vmalloc_dump_obj().  Example report:
    
    | BUG: KASAN: vmalloc-out-of-bounds in vmalloc_oob+0x4c9/0x610
    | Read of size 1 at addr ffffc900002fd7f3 by task kunit_try_catch/493
    |
    | CPU: [...]
    | Call Trace:
    |  <TASK>
    |  dump_stack_lvl+0xa8/0xf0
    |  print_report+0x17e/0x810
    |  kasan_report+0x155/0x190
    |  vmalloc_oob+0x4c9/0x610
    |  [...]
    |
    | The buggy address belongs to a 1-page vmalloc region starting at 0xffffc900002fd000 allocated at vmalloc_oob+0x36/0x610
    | The buggy address belongs to the physical page:
    | page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x126364
    | flags: 0x200000000000000(node=0|zone=2)
    | raw: 0200000000000000 0000000000000000 dead000000000122 0000000000000000
    | raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
    | page dumped because: kasan: bad access detected
    |
    | [..]
    
    Link: https://lkml.kernel.org/r/20250716152448.3877201-1-elver@google.com
    Fixes: 6ee9b3d84775 ("kasan: remove kasan_find_vm_area() to prevent possible deadlock")
    Signed-off-by: Marco Elver <elver@google.com>
    Suggested-by: Uladzislau Rezki <urezki@gmail.com>
    Acked-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Cc: Yeoreum Yun <yeoreum.yun@arm.com>
    Cc: Yunseong Kim <ysk@kzalloc.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
Linux: Linux 6.15.9 [+ + +]
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Fri Aug 1 09:51:29 2025 +0100

    Linux 6.15.9
    
    Link: https://lore.kernel.org/r/20250730093230.629234025@linuxfoundation.org
    Tested-by: Ronald Warsow <rwarsow@gmx.de>
    Tested-by: Christian Heusel <christian@heusel.eu>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: Achill Gilgenast <fossdd@pwned.life>
    Tested-by: Brett A C Sheffield <bacs@librecast.net>
    Tested-by: Peter Schneider <pschneider1968@googlemail.com>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
    Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com>
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Tested-by: Miguel Ojeda <ojeda@kernel.org>
    Tested-by: Hardik Garg <hargar@linux.microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mm/ksm: fix -Wsometimes-uninitialized from clang-21 in advisor_mode_show() [+ + +]
Author: Nathan Chancellor <nathan@kernel.org>
Date:   Tue Jul 15 12:56:16 2025 -0700

    mm/ksm: fix -Wsometimes-uninitialized from clang-21 in advisor_mode_show()
    
    commit 153ad566724fe6f57b14f66e9726d295d22e576d upstream.
    
    After a recent change in clang to expose uninitialized warnings from const
    variables [1], there is a false positive warning from the if statement in
    advisor_mode_show().
    
      mm/ksm.c:3687:11: error: variable 'output' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
       3687 |         else if (ksm_advisor == KSM_ADVISOR_SCAN_TIME)
            |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      mm/ksm.c:3690:33: note: uninitialized use occurs here
       3690 |         return sysfs_emit(buf, "%s\n", output);
            |                                        ^~~~~~
    
    Rewrite the if statement to implicitly make KSM_ADVISOR_NONE the else
    branch so that it is obvious to the compiler that ksm_advisor can only be
    KSM_ADVISOR_NONE or KSM_ADVISOR_SCAN_TIME due to the assignments in
    advisor_mode_store().
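    
    The shape of the rewrite can be sketched in plain C as below. The enum
    and variable names follow the text; the function name and the output
    strings are illustrative assumptions.

```c
enum ksm_advisor_type {
	KSM_ADVISOR_NONE,
	KSM_ADVISOR_SCAN_TIME,
};

/*
 * With "if (NONE) ... else if (SCAN_TIME) ..." the compiler sees a path
 * where output is never assigned. Making NONE the plain else branch
 * assigns output on every path.
 */
const char *advisor_mode_string(enum ksm_advisor_type ksm_advisor)
{
	const char *output;

	if (ksm_advisor == KSM_ADVISOR_SCAN_TIME)
		output = "none [scan-time]";
	else
		output = "[none] scan-time";

	return output;
}
```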
    
    Link: https://lkml.kernel.org/r/20250715-ksm-fix-clang-21-uninit-warning-v1-1-f443feb4bfc4@kernel.org
    Fixes: 66790e9a735b ("mm/ksm: add sysfs knobs for advisor")
    Signed-off-by: Nathan Chancellor <nathan@kernel.org>
    Closes: https://github.com/ClangBuiltLinux/linux/issues/2100
    Link: https://github.com/llvm/llvm-project/commit/2464313eef01c5b1edf0eccf57a32cdee01472c7 [1]
    Acked-by: David Hildenbrand <david@redhat.com>
    Cc: Chengming Zhou <chengming.zhou@linux.dev>
    Cc: Stefan Roesch <shr@devkernel.io>
    Cc: xu xin <xu.xin16@zte.com.cn>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mm/vmscan: fix hwpoisoned large folio handling in shrink_folio_list [+ + +]
Author: Jinjiang Tu <tujinjiang@huawei.com>
Date:   Fri Jun 27 20:57:46 2025 +0800

    mm/vmscan: fix hwpoisoned large folio handling in shrink_folio_list
    
    commit 9f1e8cd0b7c4c944e9921b52a6661b5eda2705ab upstream.
    
    In shrink_folio_list(), the hwpoisoned folio may be a large folio, which
    can't be handled by unmap_poisoned_folio().  For THP, try_to_unmap_one()
    must be passed TTU_SPLIT_HUGE_PMD to split the huge PMD first and then
    retry.  Without TTU_SPLIT_HUGE_PMD, we will trigger a null-ptr deref of
    pvmw.pte.  Even if we passed TTU_SPLIT_HUGE_PMD, we would trigger a
    WARN_ON_ONCE because the page isn't in the swapcache.
    
    Since UCEs are rare in the real world, and a race with reclaim is rarer
    still, just skipping the hwpoisoned large folio is enough.  memory_failure()
    will handle it if the UCE is triggered again.
    
    This happens when memory reclaim for large folio races with
    memory_failure(), and will lead to kernel panic.  The race is as
    follows:
    
    cpu0                              cpu1
     shrink_folio_list                memory_failure
                                       TestSetPageHWPoison
      unmap_poisoned_folio
      --> trigger BUG_ON due to
      unmap_poisoned_folio couldn't
       handle large folio
    
    [tujinjiang@huawei.com: add comment to unmap_poisoned_folio()]
      Link: https://lkml.kernel.org/r/69fd4e00-1b13-d5f7-1c82-705c7d977ea4@huawei.com
    Link: https://lkml.kernel.org/r/20250627125747.3094074-2-tujinjiang@huawei.com
    Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
    Fixes: 1b0449544c64 ("mm/vmscan: don't try to reclaim hwpoison folio")
    Reported-by: syzbot+3b220254df55d8ca8a61@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/all/68412d57.050a0220.2461cf.000e.GAE@google.com/
    Acked-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
    Acked-by: Zi Yan <ziy@nvidia.com>
    Reviewed-by: Oscar Salvador <osalvador@suse.de>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mm/zsmalloc: do not pass __GFP_MOVABLE if CONFIG_COMPACTION=n [+ + +]
Author: Harry Yoo <harry.yoo@oracle.com>
Date:   Fri Jul 4 19:30:53 2025 +0900

    mm/zsmalloc: do not pass __GFP_MOVABLE if CONFIG_COMPACTION=n
    
    commit 694d6b99923eb05a8fd188be44e26077d19f0e21 upstream.
    
    Commit 48b4800a1c6a ("zsmalloc: page migration support") added support for
    migrating zsmalloc pages using the movable_operations migration framework.
    However, the commit did not take into account that zsmalloc supports
    migration only when CONFIG_COMPACTION is enabled.  Tracing shows that
    zsmalloc was still passing the __GFP_MOVABLE flag even when compaction is
    not supported.
    
    This can result in unmovable pages being allocated from movable page
    blocks (even without stealing page blocks), ZONE_MOVABLE and CMA area.
    
    Possible user visible effects:
    - Some ZONE_MOVABLE memory may not actually be movable
    - CMA allocation can fail because of this
    - Increased memory fragmentation due to ignoring the page mobility
      grouping feature
    
    I'm not really sure who uses kernels without compaction support, though :(
    
    To fix this, clear the __GFP_MOVABLE flag when
    !IS_ENABLED(CONFIG_COMPACTION).
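    
    In userspace terms the fix amounts to masking one flag bit when
    compaction is unavailable. The sketch below assumes a made-up bit value
    and passes the config state as a runtime parameter instead of using
    IS_ENABLED(); both are simplifications for illustration.

```c
#include <stdbool.h>

#define SKETCH_GFP_MOVABLE 0x08u  /* hypothetical bit, not the kernel value */

/* Drop the movable hint when pages cannot actually be migrated. */
unsigned int effective_gfp(unsigned int gfp, bool compaction_enabled)
{
	if (!compaction_enabled)
		gfp &= ~SKETCH_GFP_MOVABLE;
	return gfp;
}
```

    All other flag bits pass through unchanged in both configurations.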
    
    Link: https://lkml.kernel.org/r/20250704103053.6913-1-harry.yoo@oracle.com
    Fixes: 48b4800a1c6a ("zsmalloc: page migration support")
    Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
net/mlx5: E-Switch, Fix peer miss rules to use peer eswitch [+ + +]
Author: Shahar Shitrit <shshitrit@nvidia.com>
Date:   Thu Jul 17 15:06:10 2025 +0300

    net/mlx5: E-Switch, Fix peer miss rules to use peer eswitch
    
    [ Upstream commit 5b4c56ad4da0aa00b258ab50b1f5775b7d3108c7 ]
    
    In the original design, it is assumed local and peer eswitches have the
    same number of vfs. However, in new firmware, local and peer eswitches
    can have different number of vfs configured by mlxconfig.  In such
    configuration, it is incorrect to derive the number of vfs from the
    local device's eswitch.
    
    Fix this by updating the peer miss rules add and delete functions to use
    the peer device's eswitch and vf count instead of the local device's
    information, ensuring correct behavior regardless of vf configuration
    differences.
    
    Fixes: ac004b832128 ("net/mlx5e: E-Switch, Add peer miss rules")
    Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
    Reviewed-by: Mark Bloch <mbloch@nvidia.com>
    Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/1752753970-261832-3-git-send-email-tariqt@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5: Fix memory leak in cmd_exec() [+ + +]
Author: Chiara Meiohas <cmeiohas@nvidia.com>
Date:   Thu Jul 17 15:06:09 2025 +0300

    net/mlx5: Fix memory leak in cmd_exec()
    
    [ Upstream commit 3afa3ae3db52e3c216d77bd5907a5a86833806cc ]
    
    If cmd_exec() is called with callback and mlx5_cmd_invoke() returns an
    error, resources allocated in cmd_exec() will not be freed.
    
    Fix the code to release the resources if mlx5_cmd_invoke() returns an
    error.
    
    Fixes: f086470122d5 ("net/mlx5: cmdif, Return value improvements")
    Reported-by: Alex Tereshkin <atereshkin@nvidia.com>
    Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
    Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
    Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
    Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/1752753970-261832-2-git-send-email-tariqt@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net/sched: sch_qfq: Avoid triggering might_sleep in atomic context in qfq_delete_class [+ + +]
Author: Xiang Mei <xmei5@asu.edu>
Date:   Thu Jul 17 16:01:28 2025 -0700

    net/sched: sch_qfq: Avoid triggering might_sleep in atomic context in qfq_delete_class
    
    [ Upstream commit cf074eca0065bc5142e6004ae236bb35a2687fdf ]
    
    might_sleep could be triggered in atomic context in qfq_delete_class.
    
    qfq_destroy_class was moved into atomic context locked
    by sch_tree_lock to avoid a race condition bug on
    qfq_aggregate. However, might_sleep could be triggered by
    qfq_destroy_class, which introduced sleeping in atomic context (path:
    qfq_destroy_class->qdisc_put->__qdisc_destroy->lockdep_unregister_key
    ->might_sleep).
    
    Since the race is on the qfq_aggregate objects, keeping
    qfq_rm_from_agg under the lock while moving the rest of the class
    teardown outside it solves this issue.
    
    Fixes: 5e28d5a3f774 ("net/sched: sch_qfq: Fix race condition on qfq_aggregate")
    Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
    Signed-off-by: Xiang Mei <xmei5@asu.edu>
    Link: https://patch.msgid.link/4a04e0cc-a64b-44e7-9213-2880ed641d77@sabinyo.mountain
    Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
    Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
    Link: https://patch.msgid.link/20250717230128.159766-1-xmei5@asu.edu
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
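
    Illustrative sketch (not part of the upstream commit): the idea of
    keeping only the aggregate update under the lock and running the
    possibly sleeping teardown afterwards can be modelled in userspace C.
    All names below (tree_lock, rm_from_agg, destroy_class, ...) are
    hypothetical stand-ins, and the "lock" is just a flag that records
    whether a sleeping call happened while it was held.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the qdisc tree lock and the qfq objects. */
static bool tree_locked;        /* true while the "atomic" section is held */
static int sleeps_in_atomic;    /* sleeping calls made while locked */

struct agg { int nclasses; };
struct cls { struct agg *agg; };

static void tree_lock(void)   { tree_locked = true; }
static void tree_unlock(void) { tree_locked = false; }

/* The racy part: it touches the shared aggregate, so it needs the lock. */
static void rm_from_agg(struct cls *c)
{
    c->agg->nclasses--;
    c->agg = NULL;
}

/* May sleep in the real code path; here we only record a violation. */
static void destroy_class(struct cls *c)
{
    if (tree_locked)
        sleeps_in_atomic++;
    free(c);
}

/* Ordering described as problematic: teardown runs under the lock. */
static void delete_class_before(struct cls *c)
{
    tree_lock();
    rm_from_agg(c);
    destroy_class(c);
    tree_unlock();
}

/* Ordering described as the fix: only the aggregate update is locked. */
static void delete_class_after(struct cls *c)
{
    tree_lock();
    rm_from_agg(c);
    tree_unlock();
    destroy_class(c);
}
```

    With two classes on one aggregate, delete_class_before() bumps
    sleeps_in_atomic once, while delete_class_after() leaves it unchanged
    and still updates the aggregate under the lock.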

 
net: appletalk: Fix use-after-free in AARP proxy probe [+ + +]
Author: Kito Xu (veritas501) <hxzene@gmail.com>
Date:   Thu Jul 17 01:28:43 2025 +0000

    net: appletalk: Fix use-after-free in AARP proxy probe
    
    [ Upstream commit 6c4a92d07b0850342d3becf2e608f805e972467c ]
    
    The AARP proxy‐probe routine (aarp_proxy_probe_network) sends a probe,
    releases the aarp_lock, sleeps, then re-acquires the lock.  During that
    window an expire timer thread (__aarp_expire_timer) can remove and
    kfree() the same entry, leading to a use-after-free.
    
    race condition:
    
             cpu 0                          |            cpu 1
        atalk_sendmsg()                     |   atif_proxy_probe_device()
        aarp_send_ddp()                     |   aarp_proxy_probe_network()
        mod_timer()                         |   lock(aarp_lock) // LOCK!!
        timeout around 200ms                |   alloc(aarp_entry)
        and then call                       |   proxies[hash] = aarp_entry
        aarp_expire_timeout()               |   aarp_send_probe()
                                            |   unlock(aarp_lock) // UNLOCK!!
        lock(aarp_lock) // LOCK!!           |   msleep(100);
        __aarp_expire_timer(&proxies[ct])   |
        free(aarp_entry)                    |
        unlock(aarp_lock) // UNLOCK!!       |
                                            |   lock(aarp_lock) // LOCK!!
                                            |   UAF aarp_entry !!
    
    ==================================================================
    BUG: KASAN: slab-use-after-free in aarp_proxy_probe_network+0x560/0x630 net/appletalk/aarp.c:493
    Read of size 4 at addr ffff8880123aa360 by task repro/13278
    
    CPU: 3 UID: 0 PID: 13278 Comm: repro Not tainted 6.15.2 #3 PREEMPT(full)
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:94 [inline]
     dump_stack_lvl+0x116/0x1b0 lib/dump_stack.c:120
     print_address_description mm/kasan/report.c:408 [inline]
     print_report+0xc1/0x630 mm/kasan/report.c:521
     kasan_report+0xca/0x100 mm/kasan/report.c:634
     aarp_proxy_probe_network+0x560/0x630 net/appletalk/aarp.c:493
     atif_proxy_probe_device net/appletalk/ddp.c:332 [inline]
     atif_ioctl+0xb58/0x16c0 net/appletalk/ddp.c:857
     atalk_ioctl+0x198/0x2f0 net/appletalk/ddp.c:1818
     sock_do_ioctl+0xdc/0x260 net/socket.c:1190
     sock_ioctl+0x239/0x6a0 net/socket.c:1311
     vfs_ioctl fs/ioctl.c:51 [inline]
     __do_sys_ioctl fs/ioctl.c:906 [inline]
     __se_sys_ioctl fs/ioctl.c:892 [inline]
     __x64_sys_ioctl+0x194/0x200 fs/ioctl.c:892
     do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
     do_syscall_64+0xcb/0x250 arch/x86/entry/syscall_64.c:94
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
     </TASK>
    
    Allocated:
     aarp_alloc net/appletalk/aarp.c:382 [inline]
     aarp_proxy_probe_network+0xd8/0x630 net/appletalk/aarp.c:468
     atif_proxy_probe_device net/appletalk/ddp.c:332 [inline]
     atif_ioctl+0xb58/0x16c0 net/appletalk/ddp.c:857
     atalk_ioctl+0x198/0x2f0 net/appletalk/ddp.c:1818
    
    Freed:
     kfree+0x148/0x4d0 mm/slub.c:4841
     __aarp_expire net/appletalk/aarp.c:90 [inline]
     __aarp_expire_timer net/appletalk/aarp.c:261 [inline]
     aarp_expire_timeout+0x480/0x6e0 net/appletalk/aarp.c:317
    
    The buggy address belongs to the object at ffff8880123aa300
     which belongs to the cache kmalloc-192 of size 192
    The buggy address is located 96 bytes inside of
     freed 192-byte region [ffff8880123aa300, ffff8880123aa3c0)
    
    Memory state around the buggy address:
     ffff8880123aa200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     ffff8880123aa280: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc
    >ffff8880123aa300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                           ^
     ffff8880123aa380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
     ffff8880123aa400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    ==================================================================
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kito Xu (veritas501) <hxzene@gmail.com>
    Link: https://patch.msgid.link/20250717012843.880423-1-hxzene@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hns3: default enable tx bounce buffer when smmu enabled [+ + +]
Author: Jijie Shao <shaojijie@huawei.com>
Date:   Tue Jul 22 20:54:23 2025 +0800

    net: hns3: default enable tx bounce buffer when smmu enabled
    
    [ Upstream commit 49ade8630f36e9dca2395592cfb0b7deeb07e746 ]
    
    The SMMU engine on the HIP09 chip has a hardware issue: the SMMU
    pagetable prefetch feature may prefetch and use an invalid PTE even
    if the PTE is valid at that time, which causes the device to trigger
    fake page faults. The solution is to avoid prefetching by adding a
    SYNC command when the SMMU maps an IOVA, but this causes a sharp drop
    in NIC performance. As a workaround, always enable the TX bounce
    buffer to avoid mapping/unmapping on the TX path.
    
    This issue only affects HNS3, so we always enable the TX bounce
    buffer when the SMMU is enabled to improve performance.
    
    Fixes: 295ba232a8c3 ("net: hns3: add device version to replace pci revision")
    Signed-off-by: Jian Shen <shenjian15@huawei.com>
    Signed-off-by: Jijie Shao <shaojijie@huawei.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20250722125423.1270673-5-shaojijie@huawei.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hns3: disable interrupt when ptp init failed [+ + +]
Author: Yonglong Liu <liuyonglong@huawei.com>
Date:   Tue Jul 22 20:54:21 2025 +0800

    net: hns3: disable interrupt when ptp init failed
    
    [ Upstream commit cde304655f25d94a996c45b0f9956e7dcc2bc4c0 ]
    
    When PTP init fails, disable the interrupt and clear the
    flag to avoid an early interrupt report at the next probe.
    
    Fixes: 0bf5eb788512 ("net: hns3: add support for PTP")
    Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
    Signed-off-by: Jijie Shao <shaojijie@huawei.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20250722125423.1270673-3-shaojijie@huawei.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hns3: fix concurrent setting vlan filter issue [+ + +]
Author: Jian Shen <shenjian15@huawei.com>
Date:   Tue Jul 22 20:54:20 2025 +0800

    net: hns3: fix concurrent setting vlan filter issue
    
    [ Upstream commit 4555f8f8b6aa46940f55feb6a07704c2935b6d6e ]
    
    vport->req_vlan_fltr_en may be changed concurrently by
    hclge_sync_vlan_fltr_state(), called from the periodic work task, and
    hclge_enable_vport_vlan_filter(), called on user configuration.
    This can leave the user configuration without effect. Fix it by
    protecting vport->req_vlan_fltr_en with vport_lock.
    
    Fixes: 2ba306627f59 ("net: hns3: add support for modify VLAN filter state")
    Signed-off-by: Jian Shen <shenjian15@huawei.com>
    Signed-off-by: Jijie Shao <shaojijie@huawei.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20250722125423.1270673-2-shaojijie@huawei.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hns3: fixed vf get max channels bug [+ + +]
Author: Jian Shen <shenjian15@huawei.com>
Date:   Tue Jul 22 20:54:22 2025 +0800

    net: hns3: fixed vf get max channels bug
    
    [ Upstream commit b3e75c0bcc53f647311960bc1b0970b9b480ca5a ]
    
    Currently, the queried maximum of vf channels is the maximum of channels
    supported by each TC. However, the actual maximum of channels is
    the maximum of channels supported by the device.
    
    Fixes: 849e46077689 ("net: hns3: add ethtool_ops.get_channels support for VF")
    Signed-off-by: Jian Shen <shenjian15@huawei.com>
    Signed-off-by: Hao Lan <lanhao@huawei.com>
    Signed-off-by: Jijie Shao <shaojijie@huawei.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20250722125423.1270673-4-shaojijie@huawei.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: ti: icssg-prueth: Fix buffer allocation for ICSSG [+ + +]
Author: Himanshu Mittal <h-mittal1@ti.com>
Date:   Thu Jul 17 15:12:20 2025 +0530

    net: ti: icssg-prueth: Fix buffer allocation for ICSSG
    
    [ Upstream commit 6e86fb73de0fe3ec5cdcd5873ad1d6005f295b64 ]
    
    Fixes overlapping buffer allocation for ICSSG peripheral
    used for storing packets to be received/transmitted.
    There are 3 buffers:
    1. Buffer for Locally Injected Packets
    2. Buffer for Forwarding Packets
    3. Buffer for Host Egress Packets
    
    In the existing allocation, buffers 2 and 3 overlap, causing
    packet corruption.
    
    Packet corruption observations:
    During TCP iperf testing, the overlapping buffers let a received ACK
    packet overwrite the packet to be transmitted. As a result, packets
    appear on the wire with the ACK packet content inside the content of
    the next TCP packet from the sender device.
    
    Details for AM64x switch mode:
    -> Allocation by existing driver:
    +---------+-------------------------------------------------------------+
    |         |          SLICE 0             |          SLICE 1             |
    |         +------+--------------+--------+------+--------------+--------+
    |         | Slot | Base Address | Size   | Slot | Base Address | Size   |
    |---------+------+--------------+--------+------+--------------+--------+
    |         | 0    | 70000000     | 0x2000 | 0    | 70010000     | 0x2000 |
    |         | 1    | 70002000     | 0x2000 | 1    | 70012000     | 0x2000 |
    |         | 2    | 70004000     | 0x2000 | 2    | 70014000     | 0x2000 |
    | FWD     | 3    | 70006000     | 0x2000 | 3    | 70016000     | 0x2000 |
    | Buffers | 4    | 70008000     | 0x2000 | 4    | 70018000     | 0x2000 |
    |         | 5    | 7000A000     | 0x2000 | 5    | 7001A000     | 0x2000 |
    |         | 6    | 7000C000     | 0x2000 | 6    | 7001C000     | 0x2000 |
    |         | 7    | 7000E000     | 0x2000 | 7    | 7001E000     | 0x2000 |
    +---------+------+--------------+--------+------+--------------+--------+
    |         | 8    | 70020000     | 0x1000 | 8    | 70028000     | 0x1000 |
    |         | 9    | 70021000     | 0x1000 | 9    | 70029000     | 0x1000 |
    |         | 10   | 70022000     | 0x1000 | 10   | 7002A000     | 0x1000 |
    | Our     | 11   | 70023000     | 0x1000 | 11   | 7002B000     | 0x1000 |
    | LI      | 12   | 00000000     | 0x0    | 12   | 00000000     | 0x0    |
    | Buffers | 13   | 00000000     | 0x0    | 13   | 00000000     | 0x0    |
    |         | 14   | 00000000     | 0x0    | 14   | 00000000     | 0x0    |
    |         | 15   | 00000000     | 0x0    | 15   | 00000000     | 0x0    |
    +---------+------+--------------+--------+------+--------------+--------+
    |         | 16   | 70024000     | 0x1000 | 16   | 7002C000     | 0x1000 |
    |         | 17   | 70025000     | 0x1000 | 17   | 7002D000     | 0x1000 |
    |         | 18   | 70026000     | 0x1000 | 18   | 7002E000     | 0x1000 |
    | Their   | 19   | 70027000     | 0x1000 | 19   | 7002F000     | 0x1000 |
    | LI      | 20   | 00000000     | 0x0    | 20   | 00000000     | 0x0    |
    | Buffers | 21   | 00000000     | 0x0    | 21   | 00000000     | 0x0    |
    |         | 22   | 00000000     | 0x0    | 22   | 00000000     | 0x0    |
    |         | 23   | 00000000     | 0x0    | 23   | 00000000     | 0x0    |
    +---------+------+--------------+--------+------+--------------+--------+
    --> slots 16, 17, 18, 19 overlap with the express buffer below
    
    +-----+-----------------------------------------------+
    |     |       SLICE 0       |        SLICE 1          |
    |     +------------+----------+------------+----------+
    |     | Start addr | End addr | Start addr | End addr |
    +-----+------------+----------+------------+----------+
    | EXP | 70024000   | 70028000 | 7002C000   | 70030000 | <-- Overlapping
    | PRE | 70030000   | 70033800 | 70034000   | 70037800 |
    +-----+------------+----------+------------+----------+
    
    +---------------------+----------+----------+
    |                     | SLICE 0  |  SLICE 1 |
    +---------------------+----------+----------+
    | Default Drop Offset | 00000000 | 00000000 |     <-- Field not configured
    +---------------------+----------+----------+
    
    -> Allocation this patch brings:
    +---------+-------------------------------------------------------------+
    |         |          SLICE 0             |          SLICE 1             |
    |         +------+--------------+--------+------+--------------+--------+
    |         | Slot | Base Address | Size   | Slot | Base Address | Size   |
    |---------+------+--------------+--------+------+--------------+--------+
    |         | 0    | 70000000     | 0x2000 | 0    | 70040000     | 0x2000 |
    |         | 1    | 70002000     | 0x2000 | 1    | 70042000     | 0x2000 |
    |         | 2    | 70004000     | 0x2000 | 2    | 70044000     | 0x2000 |
    | FWD     | 3    | 70006000     | 0x2000 | 3    | 70046000     | 0x2000 |
    | Buffers | 4    | 70008000     | 0x2000 | 4    | 70048000     | 0x2000 |
    |         | 5    | 7000A000     | 0x2000 | 5    | 7004A000     | 0x2000 |
    |         | 6    | 7000C000     | 0x2000 | 6    | 7004C000     | 0x2000 |
    |         | 7    | 7000E000     | 0x2000 | 7    | 7004E000     | 0x2000 |
    +---------+------+--------------+--------+------+--------------+--------+
    |         | 8    | 70010000     | 0x1000 | 8    | 70050000     | 0x1000 |
    |         | 9    | 70011000     | 0x1000 | 9    | 70051000     | 0x1000 |
    |         | 10   | 70012000     | 0x1000 | 10   | 70052000     | 0x1000 |
    | Our     | 11   | 70013000     | 0x1000 | 11   | 70053000     | 0x1000 |
    | LI      | 12   | 00000000     | 0x0    | 12   | 00000000     | 0x0    |
    | Buffers | 13   | 00000000     | 0x0    | 13   | 00000000     | 0x0    |
    |         | 14   | 00000000     | 0x0    | 14   | 00000000     | 0x0    |
    |         | 15   | 00000000     | 0x0    | 15   | 00000000     | 0x0    |
    +---------+------+--------------+--------+------+--------------+--------+
    |         | 16   | 70014000     | 0x1000 | 16   | 70054000     | 0x1000 |
    |         | 17   | 70015000     | 0x1000 | 17   | 70055000     | 0x1000 |
    |         | 18   | 70016000     | 0x1000 | 18   | 70056000     | 0x1000 |
    | Their   | 19   | 70017000     | 0x1000 | 19   | 70057000     | 0x1000 |
    | LI      | 20   | 00000000     | 0x0    | 20   | 00000000     | 0x0    |
    | Buffers | 21   | 00000000     | 0x0    | 21   | 00000000     | 0x0    |
    |         | 22   | 00000000     | 0x0    | 22   | 00000000     | 0x0    |
    |         | 23   | 00000000     | 0x0    | 23   | 00000000     | 0x0    |
    +---------+------+--------------+--------+------+--------------+--------+
    
    +-----+-----------------------------------------------+
    |     |       SLICE 0       |        SLICE 1          |
    |     +------------+----------+------------+----------+
    |     | Start addr | End addr | Start addr | End addr |
    +-----+------------+----------+------------+----------+
    | EXP | 70018000   | 7001C000 | 70058000   | 7005C000 |
    | PRE | 7001C000   | 7001F800 | 7005C000   | 7005F800 |
    +-----+------------+----------+------------+----------+
    
    +---------------------+----------+----------+
    |                     | SLICE 0  |  SLICE 1 |
    +---------------------+----------+----------+
    | Default Drop Offset | 7001F800 | 7005F800 |
    +---------------------+----------+----------+
    
    Root cause: missing buffer configuration for Express frames in
    function: prueth_fw_offload_buffer_setup()
    
    Details:
    The driver implements two distinct buffer configuration functions that
    are invoked based on the driver state and ICSSG firmware:
    - prueth_fw_offload_buffer_setup()
    - prueth_emac_buffer_setup()
    
    During initialization, driver creates standard network interfaces
    (netdevs) and configures buffers via prueth_emac_buffer_setup().
    This function properly allocates and configures all required memory
    regions including:
    - LI buffers
    - Express packet buffers
    - Preemptible packet buffers
    
    However, when the driver transitions to an offload mode (switch/HSR/PRP),
    buffer reconfiguration is handled by prueth_fw_offload_buffer_setup().
    This function does not reconfigure the buffer regions required for
    Express packets, leading to incorrect buffer allocation.
    
    Fixes: abd5576b9c57 ("net: ti: icssg-prueth: Add support for ICSSG switch firmware")
    Signed-off-by: Himanshu Mittal <h-mittal1@ti.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20250717094220.546388-1-h-mittal1@ti.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
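
    Illustrative sketch (not part of the upstream commit): the overlap
    shown in the tables can be checked with plain address arithmetic. The
    helper names are hypothetical; the addresses and sizes are the SLICE 0
    values from the tables above (slots 16-19 and the EXP region), and the
    ranges are treated as half-open [start, end).

```c
#include <stdbool.h>
#include <stdint.h>

/* Half-open address range [start, end). */
struct range {
    uint32_t start;
    uint32_t end;
};

/* Two half-open ranges overlap when each starts before the other ends. */
static bool ranges_overlap(struct range a, struct range b)
{
    return a.start < b.end && b.start < a.end;
}

/* Range covered by n consecutive slots of slot_size bytes from base. */
static struct range slots_range(uint32_t base, uint32_t slot_size,
                                unsigned int n)
{
    struct range r = { base, base + slot_size * n };
    return r;
}

/* SLICE 0, existing driver: slots 16-19 against the EXP region. */
static bool old_layout_overlaps(void)
{
    struct range slots = slots_range(0x70024000u, 0x1000u, 4);
    struct range exp = { 0x70024000u, 0x70028000u };
    return ranges_overlap(slots, exp);
}

/* SLICE 0, layout from this patch: slots 16-19 against the EXP region. */
static bool new_layout_overlaps(void)
{
    struct range slots = slots_range(0x70014000u, 0x1000u, 4);
    struct range exp = { 0x70018000u, 0x7001C000u };
    return ranges_overlap(slots, exp);
}
```

    Under these values old_layout_overlaps() is true (both ranges are
    70024000..70028000), while new_layout_overlaps() is false because the
    slots end at 70018000, exactly where the EXP region begins.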

 
nilfs2: reject invalid file types when reading inodes [+ + +]
Author: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Date:   Thu Jul 10 22:49:08 2025 +0900

    nilfs2: reject invalid file types when reading inodes
    
    commit 4aead50caf67e01020c8be1945c3201e8a972a27 upstream.
    
    To prevent inodes with invalid file types from tripping through the vfs
    and causing malfunctions or assertion failures, add a missing sanity check
    when reading an inode from a block device.  If the file type is not valid,
    treat it as a filesystem error.
    
    Link: https://lkml.kernel.org/r/20250710134952.29862-1-konishi.ryusuke@gmail.com
    Fixes: 05fe58fdc10d ("nilfs2: inode operations")
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
    Reported-by: syzbot+895c23f6917da440ed0d@syzkaller.appspotmail.com
    Link: https://syzkaller.appspot.com/bug?extid=895c23f6917da440ed0d
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
PCI/pwrctrl: Create pwrctrl devices only when CONFIG_PCI_PWRCTRL is enabled [+ + +]
Author: Manivannan Sadhasivam <mani@kernel.org>
Date:   Tue Jul 1 12:17:31 2025 +0530

    PCI/pwrctrl: Create pwrctrl devices only when CONFIG_PCI_PWRCTRL is enabled
    
    commit 8c493cc91f3a1102ad2f8c75ae0cf80f0a057488 upstream.
    
    If devicetree describes power supplies related to a PCI device, we
    unnecessarily created a pwrctrl device even if CONFIG_PCI_PWRCTRL was not
    enabled.
    
    We only need pci_pwrctrl_create_device() when CONFIG_PCI_PWRCTRL is
    enabled.  Compile it out when CONFIG_PCI_PWRCTRL is not enabled.
    
    When pci_pwrctrl_create_device() creates and returns a pwrctrl device,
    pci_scan_device() doesn't enumerate the PCI device. It assumes the pwrctrl
    core will rescan the bus after turning on the power. However, if
    CONFIG_PCI_PWRCTRL is not enabled, the rescan never happens, which breaks
    PCI enumeration on any system that describes power supplies in devicetree
    but does not use pwrctrl.
    
    Jim reported that some brcmstb platforms break this way.  The brcmstb
    driver is still broken if CONFIG_PCI_PWRCTRL is enabled, but this commit at
    least allows brcmstb to work when it's NOT enabled.
    
    Fixes: 957f40d039a9 ("PCI/pwrctrl: Move creation of pwrctrl devices to pci_scan_device()")
    Reported-by: Jim Quinlan <james.quinlan@broadcom.com>
    Link: https://lore.kernel.org/r/CA+-6iNwgaByXEYD3j=-+H_PKAxXRU78svPMRHDKKci8AGXAUPg@mail.gmail.com
    Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    [bhelgaas: commit log]
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Reviewed-by: Lukas Wunner <lukas@wunner.de>
    Cc: stable@vger.kernel.org      # v6.15
    Link: https://patch.msgid.link/20250701064731.52901-1-manivannan.sadhasivam@linaro.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
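
    Illustrative sketch (not part of the upstream commit): the reasoning
    above, that a stubbed-out creation helper lets the scan path enumerate
    the device itself, can be modelled in userspace C. DEMO_PWRCTRL_ENABLED,
    demo_pwrctrl_create_device() and demo_scan_device() are hypothetical
    names standing in for the config option and the kernel functions.

```c
#include <stddef.h>

struct demo_dev { int id; };

enum scan_result {
    SCAN_DEFERRED,      /* wait for a later rescan by the power control core */
    SCAN_ENUMERATED     /* device enumerated directly by the scan path */
};

#ifdef DEMO_PWRCTRL_ENABLED
/* Option enabled: a power control device is created and returned. */
static struct demo_dev *demo_pwrctrl_create_device(void)
{
    static struct demo_dev pwrctrl = { 1 };
    return &pwrctrl;
}
#else
/* Option disabled: compiled out to a stub that never creates a device. */
static struct demo_dev *demo_pwrctrl_create_device(void)
{
    return NULL;
}
#endif

static enum scan_result demo_scan_device(void)
{
    /* A created power control device means enumeration is deferred. */
    if (demo_pwrctrl_create_device())
        return SCAN_DEFERRED;
    return SCAN_ENUMERATED;
}
```

    Built without DEMO_PWRCTRL_ENABLED, demo_scan_device() returns
    SCAN_ENUMERATED, so nothing is left waiting for a rescan that would
    never come.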

 
platform/mellanox: mlxbf-pmc: Remove newline char from event name input [+ + +]
Author: Shravan Kumar Ramani <shravankr@nvidia.com>
Date:   Wed Jul 2 06:09:01 2025 -0400

    platform/mellanox: mlxbf-pmc: Remove newline char from event name input
    
    [ Upstream commit 44e6ca8faeeed12206f3e7189c5ac618b810bb9c ]
    
    Since the input string passed via the command line has a newline
    character appended, the newline needs to be removed before comparison
    with the event_list.
    
    Fixes: 1a218d312e65 ("platform/mellanox: mlxbf-pmc: Add Mellanox BlueField PMC driver")
    Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com>
    Reviewed-by: David Thompson <davthompson@nvidia.com>
    Link: https://lore.kernel.org/r/4978c18e33313b48fa2ae7f3aa6dbcfce40877e4.1751380187.git.shravankr@nvidia.com
    Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
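
    Illustrative sketch (not part of the upstream commit): one way to drop
    a trailing newline before a name lookup, written as userspace C. The
    table contents and the function name demo_lookup_event() are
    hypothetical, and this is not claimed to be how the driver implements
    it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical event names; the real event_list lives in the driver. */
static const char *const demo_event_list[] = {
    "DISABLE",
    "CYCLES",
    "READ_REQ",
};

/* Returns the index of the matching event name, or -1 if none matches. */
static int demo_lookup_event(const char *input)
{
    char name[64];
    size_t len = strcspn(input, "\n");  /* length up to the first newline */
    size_t i;

    if (len >= sizeof(name))
        return -1;
    memcpy(name, input, len);
    name[len] = '\0';

    for (i = 0; i < sizeof(demo_event_list) / sizeof(demo_event_list[0]); i++)
        if (strcmp(name, demo_event_list[i]) == 0)
            return (int)i;
    return -1;
}
```

    Without the strcspn() step, an input such as "CYCLES\n" (what a shell
    echo into a sysfs file typically delivers) would not compare equal to
    "CYCLES".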

platform/mellanox: mlxbf-pmc: Use kstrtobool() to check 0/1 input [+ + +]
Author: Shravan Kumar Ramani <shravankr@nvidia.com>
Date:   Wed Jul 2 06:09:02 2025 -0400

    platform/mellanox: mlxbf-pmc: Use kstrtobool() to check 0/1 input
    
    [ Upstream commit 0e2cebd72321caeef84b6ba7084e85be0287fb4b ]
    
    For setting the enable value, the input should be 0 or 1 only. Use
    kstrtobool() in place of kstrtoint() in mlxbf_pmc_enable_store() to
    accept only valid input.
    
    Fixes: 423c3361855c ("platform/mellanox: mlxbf-pmc: Add support for BlueField-3")
    Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com>
    Reviewed-by: David Thompson <davthompson@nvidia.com>
    Link: https://lore.kernel.org/r/2ee618c59976bcf1379d5ddce2fc60ab5014b3a9.1751380187.git.shravankr@nvidia.com
    [ij: split kstrtobool() change to own commit.]
    Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

platform/mellanox: mlxbf-pmc: Validate event/enable input [+ + +]
Author: Shravan Kumar Ramani <shravankr@nvidia.com>
Date:   Wed Jul 2 06:09:02 2025 -0400

    platform/mellanox: mlxbf-pmc: Validate event/enable input
    
    [ Upstream commit f8c1311769d3b2c82688b294b4ae03e94f1c326d ]
    
    Before programming the event info, validate the event number received as input
    by checking if it exists in the event_list. Also fix a typo in the comment for
    mlxbf_pmc_get_event_name() to correctly mention that it returns the event name
    when taking the event number as input, and not the other way round.
    
    Fixes: 423c3361855c ("platform/mellanox: mlxbf-pmc: Add support for BlueField-3")
    Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com>
    Reviewed-by: David Thompson <davthompson@nvidia.com>
    Link: https://lore.kernel.org/r/2ee618c59976bcf1379d5ddce2fc60ab5014b3a9.1751380187.git.shravankr@nvidia.com
    [ij: split kstrtobool() change to own commit.]
    Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
platform/x86: alienware-wmi-wmax: Fix `dmi_system_id` array [+ + +]
Author: Kurt Borja <kuurtb@gmail.com>
Date:   Mon Jul 7 03:24:05 2025 -0300

    platform/x86: alienware-wmi-wmax: Fix `dmi_system_id` array
    
    commit 8346c6af27f1c1410eb314f4be5875fdf1579a10 upstream.
    
    Add missing empty member to `awcc_dmi_table`.
    
    Cc: stable@vger.kernel.org
    Fixes: 6d7f1b1a5db6 ("platform/x86: alienware-wmi: Split DMI table")
    Signed-off-by: Kurt Borja <kuurtb@gmail.com>
    Reviewed-by: Hans de Goede <hansg@kernel.org>
    Link: https://lore.kernel.org/r/20250707-dmi-fix-v1-1-6730835d824d@gmail.com
    Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
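
    Illustrative sketch (not part of the upstream commit): why a match
    table needs an empty terminating member. The struct and names below
    are simplified, hypothetical stand-ins for struct dmi_system_id and
    the real table; the walker stops only when it reaches the empty entry.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a DMI match entry. */
struct demo_sys_id {
    const char *ident;
};

static const struct demo_sys_id demo_table[] = {
    { "Model A" },
    { "Model B" },
    { NULL },   /* empty terminating member: marks the end of the table */
};

/* Walks the table until the empty entry; returns 1 on a match, else 0. */
static int demo_table_matches(const struct demo_sys_id *t, const char *name)
{
    for (; t->ident; t++)
        if (strcmp(t->ident, name) == 0)
            return 1;
    return 0;
}
```

    If the last member were missing, a lookup for a name that is not in
    the table would keep reading past the end of the array.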

platform/x86: asus-nb-wmi: add DMI quirk for ASUS Zenbook Duo UX8406CA [+ + +]
Author: Rahul Chandra <rahul@chandra.net>
Date:   Tue Jun 24 03:33:01 2025 -0400

    platform/x86: asus-nb-wmi: add DMI quirk for ASUS Zenbook Duo UX8406CA
    
    [ Upstream commit 7dc6b2d3b5503bcafebbeaf9818112bf367107b4 ]
    
    Add a DMI quirk entry for the ASUS Zenbook Duo UX8406CA 2025 model to use
    the existing zenbook duo keyboard quirk.
    
    Signed-off-by: Rahul Chandra <rahul@chandra.net>
    Link: https://lore.kernel.org/r/20250624073301.602070-1-rahul@chandra.net
    Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

platform/x86: Fix initialization order for firmware_attributes_class [+ + +]
Author: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Date:   Fri Jul 11 12:32:54 2025 +0200

    platform/x86: Fix initialization order for firmware_attributes_class
    
    [ Upstream commit 2bfe3ae1aa45f8b61cb0dc462114fd0c9636ad32 ]
    
    The think-lmi driver uses the firmware_attributes_class. But this class
    is registered after think-lmi, causing the "think-lmi" directory in
    "/sys/class/firmware-attributes" to be missing when the driver is
    compiled as builtin.
    
    Fixes: 55922403807a ("platform/x86: think-lmi: Directly use firmware_attributes_class")
    Signed-off-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
    Link: https://lore.kernel.org/r/7dce5f7f-c348-4350-ac53-d14a8e1e8034@secunet.com
    Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

platform/x86: ideapad-laptop: Fix FnLock not remembered among boots [+ + +]
Author: Rong Zhang <i@rong.moe>
Date:   Tue Jul 8 00:38:06 2025 +0800

    platform/x86: ideapad-laptop: Fix FnLock not remembered among boots
    
    commit 9533b789df7e8d273543a5991aec92447be043d7 upstream.
    
    On devices supported by ideapad-laptop, the HW/FW can remember the
    FnLock state among boots. However, since the introduction of the FnLock
    LED class device, it is turned off while shutting down, as a side effect
    of the LED class device unregistering sequence.
    
    Many users always turn on FnLock because they use function keys much
    more frequently than multimedia keys. The behavior change is
    inconvenient for them. Thus, set LED_RETAIN_AT_SHUTDOWN on the LED class
    device so that the FnLock state gets remembered, which also aligns with
    the behavior of manufacturer utilities on Windows.
    
    Fixes: 07f48f668fac ("platform/x86: ideapad-laptop: add FnLock LED class device")
    Cc: stable@vger.kernel.org
    Signed-off-by: Rong Zhang <i@rong.moe>
    Reviewed-by: Hans de Goede <hansg@kernel.org>
    Link: https://lore.kernel.org/r/20250707163808.155876-2-i@rong.moe
    Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

platform/x86: ideapad-laptop: Fix kbd backlight not remembered among boots [+ + +]
Author: Rong Zhang <i@rong.moe>
Date:   Tue Jul 8 00:38:07 2025 +0800

    platform/x86: ideapad-laptop: Fix kbd backlight not remembered among boots
    
    commit e10981075adce203eac0be866389309eeb8ef11e upstream.
    
    On some models supported by ideapad-laptop, the HW/FW can remember the
    state of keyboard backlight among boots. However, it is always turned
    off while shutting down, as a side effect of the LED class device
    unregistering sequence.
    
    This is inconvenient for users who always prefer turning on the
    keyboard backlight. Thus, set LED_RETAIN_AT_SHUTDOWN on the LED class
    device so that the state of keyboard backlight gets remembered, which
    also aligns with the behavior of manufacturer utilities on Windows.
    
    Fixes: 503325f84bc0 ("platform/x86: ideapad-laptop: add keyboard backlight control support")
    Cc: stable@vger.kernel.org
    Signed-off-by: Rong Zhang <i@rong.moe>
    Reviewed-by: Hans de Goede <hansg@kernel.org>
    Link: https://lore.kernel.org/r/20250707163808.155876-3-i@rong.moe
    Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
RDMA/core: Rate limit GID cache warning messages [+ + +]
Author: Maor Gottlieb <maorg@nvidia.com>
Date:   Mon Jun 16 11:26:21 2025 +0300

    RDMA/core: Rate limit GID cache warning messages
    
    [ Upstream commit 333e4d79316c9ed5877d7aac8b8ed22efc74e96d ]
    
    The GID cache warning messages can flood the kernel log when there are
    multiple failed attempts to add GIDs. This can happen when creating many
    virtual interfaces without having enough space for their GIDs in the GID
    table.
    
    Change pr_warn to pr_warn_ratelimited to prevent log flooding while still
    maintaining visibility of the issue.
    
    Link: https://patch.msgid.link/r/fd45ed4a1078e743f498b234c3ae816610ba1b18.1750062357.git.leon@kernel.org
    Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
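
    Illustrative sketch (not part of the upstream commit): the effect of
    rate limiting a warning can be modelled with a simple interval/burst
    counter. This is a simplified userspace model with hypothetical names,
    not the kernel's pr_warn_ratelimited() implementation; time is passed
    in explicitly so the behaviour is deterministic.

```c
#include <stdbool.h>

struct demo_ratelimit {
    long interval;  /* window length, in arbitrary time units */
    int burst;      /* messages allowed per window */
    long begin;     /* start of the current window */
    int printed;    /* messages emitted in the current window */
    int missed;     /* messages suppressed so far */
};

/* Returns true if a message may be emitted at time now. */
static bool demo_ratelimit_ok(struct demo_ratelimit *rl, long now)
{
    if (now - rl->begin >= rl->interval) {
        /* A new window starts: reset the per-window counter. */
        rl->begin = now;
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    rl->missed++;
    return false;
}
```

    With interval 5 and burst 2, calls at times 0 and 1 pass, a call at
    time 2 is suppressed, and a call at time 5 passes again because a new
    window has started.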

 
regmap: fix potential memory leak of regmap_bus [+ + +]
Author: Abdun Nihaal <abdun.nihaal@gmail.com>
Date:   Thu Jun 26 22:58:21 2025 +0530

    regmap: fix potential memory leak of regmap_bus
    
    [ Upstream commit c871c199accb39d0f4cb941ad0dccabfc21e9214 ]
    
    When __regmap_init() is called from __regmap_init_i2c() and
    __regmap_init_spi() (and their devm versions), the bus argument
    obtained from regmap_get_i2c_bus() and regmap_get_spi_bus(), may be
    allocated using kmemdup() to support quirks. In those cases, the
    bus->free_on_exit field is set to true.
    
    However, inside __regmap_init(), bus is not freed on any error path.
    This could lead to a memory leak of regmap_bus when __regmap_init()
    fails. Fix that by freeing bus on error path when free_on_exit is set.
    
    Fixes: ea030ca68819 ("regmap-i2c: Set regmap max raw r/w from quirks")
    Signed-off-by: Abdun Nihaal <abdun.nihaal@gmail.com>
    Link: https://patch.msgid.link/20250626172823.18725-1-abdun.nihaal@gmail.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
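
    Illustrative sketch (not part of the upstream commit): the pattern of
    freeing a duplicated bus descriptor on the error path when its
    free_on_exit flag is set, modelled in userspace C. All names are
    hypothetical stand-ins, and live_buses exists only to make the leak
    observable.

```c
#include <stdbool.h>
#include <stdlib.h>

struct demo_bus {
    bool free_on_exit;  /* set when the descriptor was heap-duplicated */
};

static int live_buses;  /* number of duplicated descriptors still allocated */

/* Stand-in for a bus getter that duplicates the descriptor for quirks. */
static struct demo_bus *demo_bus_dup(void)
{
    struct demo_bus *bus = malloc(sizeof(*bus));

    if (!bus)
        return NULL;
    bus->free_on_exit = true;
    live_buses++;
    return bus;
}

static void demo_bus_free(struct demo_bus *bus)
{
    free(bus);
    live_buses--;
}

/* Stand-in for the init function; fail selects the error path. */
static int demo_map_init(struct demo_bus *bus, bool fail)
{
    if (fail)
        goto err;
    return 0;   /* success: the map keeps using bus until exit */

err:
    /* Release the duplicated descriptor instead of leaking it. */
    if (bus && bus->free_on_exit)
        demo_bus_free(bus);
    return -1;
}
```

    After a failing demo_map_init() call live_buses drops back to zero; on
    success the descriptor stays allocated for the lifetime of the map.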

 
regulator: core: fix NULL dereference on unbind due to stale coupling data [+ + +]
Author: Alessandro Carminati <acarmina@redhat.com>
Date:   Thu Jun 26 08:38:09 2025 +0000

    regulator: core: fix NULL dereference on unbind due to stale coupling data
    
    [ Upstream commit ca46946a482238b0cdea459fb82fc837fb36260e ]
    
    Failing to reset coupling_desc.n_coupled after freeing coupled_rdevs can
    lead to NULL pointer dereference when regulators are accessed post-unbind.
    
    This can happen during runtime PM or other regulator operations that rely
    on coupling metadata.
    
    For example, on ridesx4, unbinding the 'reg-dummy' platform device triggers
    a panic in regulator_lock_recursive() due to stale coupling state.
    
    Ensure n_coupled is set to 0 to prevent access to invalid pointers.
    
    Signed-off-by: Alessandro Carminati <acarmina@redhat.com>
    Link: https://patch.msgid.link/20250626083809.314842-1-acarmina@redhat.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
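
Why the stale count matters can be shown with a small user-space model. The names below (`struct demo_coupling`, `demo_coupling_free()`, `demo_visit_coupled()`) are hypothetical and only mirror the idea in the commit text: any later walker trusts the count, so the count has to be reset together with the array.

```c
#include <stdlib.h>

/* Hypothetical model of the coupling descriptor. */
struct demo_coupling {
    void **coupled;   /* array of coupled device pointers */
    int n_coupled;    /* number of valid entries in coupled[] */
};

int demo_coupling_alloc(struct demo_coupling *c, int n)
{
    c->coupled = calloc((size_t)n, sizeof(*c->coupled));
    if (!c->coupled) {
        c->n_coupled = 0;
        return -1;
    }
    c->n_coupled = n;
    return 0;
}

void demo_coupling_free(struct demo_coupling *c)
{
    free(c->coupled);
    c->coupled = NULL;
    c->n_coupled = 0;   /* the reset the fix adds; without it the count is stale */
}

/* A walker that trusts n_coupled, as later users of the descriptor do. */
int demo_visit_coupled(const struct demo_coupling *c)
{
    int visited = 0;

    for (int i = 0; i < c->n_coupled; i++) {
        void *dev = c->coupled[i];   /* would fault if coupled were NULL */
        (void)dev;
        visited++;
    }
    return visited;
}
```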

 
resource: fix false warning in __request_region() [+ + +]
Author: Akinobu Mita <akinobu.mita@gmail.com>
Date:   Sat Jul 19 20:26:04 2025 +0900

    resource: fix false warning in __request_region()
    
    commit 91a229bb7ba86b2592c3f18c54b7b2c5e6fe0f95 upstream.
    
    A warning is raised when __request_region() detects a conflict with a
    resource whose resource.desc is IORES_DESC_DEVICE_PRIVATE_MEMORY.
    
    But this warning is only valid for iomem_resource.
    The hmem device resource uses resource.desc as the numa node id, which can
    cause spurious warnings.
    
    This warning appeared on a machine with multiple cxl memory expanders.
    One of the NUMA node ids is 6, which is the same as the value of
    IORES_DESC_DEVICE_PRIVATE_MEMORY.
    
    In this environment it was just a spurious warning, but when I saw the
    warning I suspected a real problem so it's better to fix it.
    
    This change fixes this by restricting the warning to only iomem_resource.
    This also adds a missing new line to the warning message.
    
    Link: https://lkml.kernel.org/r/20250719112604.25500-1-akinobu.mita@gmail.com
    Fixes: 7dab174e2e27 ("dax/hmem: Move hmem device registration to dax_hmem.ko")
    Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
    Reviewed-by: Dan Williams <dan.j.williams@intel.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
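
The collision described above comes down to one number meaning two things depending on the resource tree. A minimal sketch, assuming a hypothetical `enum demo_tree` to tell the trees apart (the kernel compares against the tree root instead) and using the value 6 quoted in the commit text:

```c
#include <stdbool.h>

/* Value taken from the commit text: the descriptor equals 6. */
#define DEMO_DESC_DEVICE_PRIVATE_MEMORY 6UL

/* Hypothetical tag for the tree a request is made against. */
enum demo_tree {
    DEMO_TREE_IOMEM,   /* desc holds a descriptor value */
    DEMO_TREE_HMEM     /* desc holds a NUMA node id */
};

/* Warn only when the conflict is in the iomem tree, where desc is a descriptor. */
bool demo_should_warn(enum demo_tree parent, unsigned long conflict_desc)
{
    return parent == DEMO_TREE_IOMEM &&
           conflict_desc == DEMO_DESC_DEVICE_PRIVATE_MEMORY;
}
```

With this check, a conflict on NUMA node 6 in the hmem tree no longer triggers the warning, while a genuine device-private-memory conflict in the iomem tree still does.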

 
Revert "drm/gem-dma: Use dma_buf from GEM object instance" [+ + +]
Author: Thomas Zimmermann <tzimmermann@suse.de>
Date:   Tue Jul 15 17:58:17 2025 +0200

    Revert "drm/gem-dma: Use dma_buf from GEM object instance"
    
    commit 1918e79be908b8a2c8757640289bc196c14d928a upstream.
    
    This reverts commit e8afa1557f4f963c9a511bd2c6074a941c308685.
    
    The dma_buf field in struct drm_gem_object is not stable over the
    object instance's lifetime. The field becomes NULL when user space
    releases the final GEM handle on the buffer object. This resulted
    in a NULL-pointer deref.
    
    Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
    GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
    Acquire internal references on GEM handles") only solved the problem
    partially. They especially don't work for buffer objects without a DRM
    framebuffer associated.
    
    Hence, this revert goes back to using .import_attach->dmabuf.
    
    v3:
    - cc stable
    
    Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
    Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
    Acked-by: Christian König <christian.koenig@amd.com>
    Acked-by: Zack Rusin <zack.rusin@broadcom.com>
    Cc: <stable@vger.kernel.org> # v6.15+
    Link: https://lore.kernel.org/r/20250715155934.150656-8-tzimmermann@suse.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
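
The lifetime problem shared by this group of reverts can be modelled in a few lines. The structures below are hypothetical simplifications, not the DRM definitions. The model assumes, as the commit text states, that the dma_buf field is cleared when the last handle is released, and it further assumes that the import attachment keeps its pointer for as long as the object exists.

```c
#include <stddef.h>

struct demo_dmabuf { int id; };

struct demo_attach { struct demo_dmabuf *dmabuf; };

/* Hypothetical, reduced model of a GEM object. */
struct demo_gem_object {
    struct demo_dmabuf *dma_buf;        /* cleared when the last handle goes away */
    struct demo_attach *import_attach;  /* assumed stable while the object exists */
    int handle_count;
};

/* User space drops one handle; the final release clears dma_buf. */
void demo_handle_release(struct demo_gem_object *obj)
{
    if (obj->handle_count > 0 && --obj->handle_count == 0)
        obj->dma_buf = NULL;
}

/* The access path the reverts go back to: read through the attachment. */
struct demo_dmabuf *demo_imported_dmabuf(const struct demo_gem_object *obj)
{
    return obj->import_attach ? obj->import_attach->dmabuf : NULL;
}
```

After the final handle is released, code that reads `obj->dma_buf` sees NULL, while the lookup through the attachment still returns the buffer.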

 
Revert "drm/gem-framebuffer: Use dma_buf from GEM object instance" [+ + +]
Author: Thomas Zimmermann <tzimmermann@suse.de>
Date:   Tue Jul 15 17:58:15 2025 +0200

    Revert "drm/gem-framebuffer: Use dma_buf from GEM object instance"
    
    commit 2712ca878b688682ac2ce02aefc413fc76019cd9 upstream.
    
    This reverts commit cce16fcd7446dcff7480cd9d2b6417075ed81065.
    
    The dma_buf field in struct drm_gem_object is not stable over the
    object instance's lifetime. The field becomes NULL when user space
    releases the final GEM handle on the buffer object. This resulted
    in a NULL-pointer deref.
    
    Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
    GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
    Acquire internal references on GEM handles") only solved the problem
    partially. They especially don't work for buffer objects without a DRM
    framebuffer associated.
    
    Hence, this revert goes back to using .import_attach->dmabuf.
    
    v3:
    - cc stable
    
    Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
    Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
    Acked-by: Christian König <christian.koenig@amd.com>
    Acked-by: Zack Rusin <zack.rusin@broadcom.com>
    Cc: <stable@vger.kernel.org> # v6.15+
    Link: https://lore.kernel.org/r/20250715155934.150656-6-tzimmermann@suse.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
Revert "drm/gem-shmem: Use dma_buf from GEM object instance" [+ + +]
Author: Thomas Zimmermann <tzimmermann@suse.de>
Date:   Tue Jul 15 17:58:16 2025 +0200

    Revert "drm/gem-shmem: Use dma_buf from GEM object instance"
    
    commit 6d496e9569983a0d7a05be6661126d0702cf94f7 upstream.
    
    This reverts commit 1a148af06000e545e714fe3210af3d77ff903c11.
    
    The dma_buf field in struct drm_gem_object is not stable over the
    object instance's lifetime. The field becomes NULL when user space
    releases the final GEM handle on the buffer object. This resulted
    in a NULL-pointer deref.
    
    Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
    GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
    Acquire internal references on GEM handles") only solved the problem
    partially. They especially don't work for buffer objects without a DRM
    framebuffer associated.
    
    Hence, this revert goes back to using .import_attach->dmabuf.
    
    v3:
    - cc stable
    
    Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
    Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
    Acked-by: Christian König <christian.koenig@amd.com>
    Acked-by: Zack Rusin <zack.rusin@broadcom.com>
    Cc: <stable@vger.kernel.org> # v6.15+
    Link: https://lore.kernel.org/r/20250715155934.150656-7-tzimmermann@suse.de
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
Revert "drm/prime: Use dma_buf from GEM object instance" [+ + +]
Author: Thomas Zimmermann <tzimmermann@suse.de>
Date:   Tue Jul 15 17:58:14 2025 +0200

    Revert "drm/prime: Use dma_buf from GEM object instance"
    
    commit fb4ef4a52b79a22ad382bfe77332642d02aef773 upstream.
    
    This reverts commit f83a9b8c7fd0557b0c50784bfdc1bbe9140c9bf8.
    
    The dma_buf field in struct drm_gem_object is not stable over the
    object instance's lifetime. The field becomes NULL when user space
    releases the final GEM handle on the buffer object. This resulted
    in a NULL-pointer deref.
    
    Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
    GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
    Acquire internal references on GEM handles") only solved the problem
    partially. They especially don't work for buffer objects without a DRM
    framebuffer associated.
    
    Hence, this revert goes back to using .import_attach->dmabuf.
    
    v3:
    - cc stable
    
    Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
    Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
    Acked-by: Christian König <christian.koenig@amd.com>
    Acked-by: Zack Rusin <zack.rusin@broadcom.com>
    Cc: <stable@vger.kernel.org> # v6.15+
    Link: https://lore.kernel.org/r/20250715155934.150656-5-tzimmermann@suse.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
s390/ism: fix concurrency management in ism_cmd() [+ + +]
Author: Halil Pasic <pasic@linux.ibm.com>
Date:   Tue Jul 22 18:18:17 2025 +0200

    s390/ism: fix concurrency management in ism_cmd()
    
    [ Upstream commit 897e8601b9cff1d054cdd53047f568b0e1995726 ]
    
    The s390x ISM device data sheet clearly states that only one
    request-response sequence is allowable per ISM function at any point in
    time.  Unfortunately as of today the s390/ism driver in Linux does not
    honor that requirement. This patch aims to rectify that.
    
    This problem was discovered based on Aliaksei's bug report which states
    that for certain workloads the ISM functions end up entering error state
    (with PEC 2 as seen from the logs) after a while and as a consequence
    connections handled by the respective function break, and for future
    connection requests the ISM device is not considered -- given it is in a
    dysfunctional state. During further debugging PEC 3A was observed as
    well.
    
    A kernel message like
    [ 1211.244319] zpci: 061a:00:00.0: Event 0x2 reports an error for PCI function 0x61a
    is a reliable indicator of the stated function entering error state
    with PEC 2. Let me also point out that a kernel message like
    [ 1211.244325] zpci: 061a:00:00.0: The ism driver bound to the device does not support error recovery
    is a reliable indicator that the ISM function won't be auto-recovered
    because the ISM driver currently lacks support for it.
    
    On a technical level, without this synchronization, commands (inputs to
    the FW) may be partially or fully overwritten (corrupted) by another CPU
    trying to issue commands on the same function. There is hard evidence that
    this can lead to DMB token values being used as DMB IOVAs, leading to
    PEC 2 PCI events indicating invalid DMA. But this is only one of the
    failure modes imaginable. In theory even completely losing one command
    and executing another one twice and then trying to interpret the outputs
    as if the command we intended to execute was actually executed and not
    the other one is also possible.  Frankly, I don't feel confident about
    providing an exhaustive list of possible consequences.
    
    Fixes: 684b89bc39ce ("s390/ism: add device driver for internal shared memory")
    Reported-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com>
    Tested-by: Mahanta Jambigi <mjambigi@linux.ibm.com>
    Tested-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com>
    Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
    Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
    Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20250722161817.1298473-1-wintera@linux.ibm.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
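
The one-request-at-a-time rule can be sketched as a lock held across the whole request-response sequence. Everything here is an assumption for illustration: the text above does not say which locking primitive the driver uses, so a POSIX mutex stands in, and `struct demo_ism`, `demo_ism_cmd()` and the echoing "device" are hypothetical.

```c
#include <pthread.h>
#include <string.h>

#define DEMO_CMD_SIZE 16

/* Hypothetical per-function state. */
struct demo_ism {
    pthread_mutex_t cmd_lock;               /* serializes request-response sequences */
    unsigned char ctl_block[DEMO_CMD_SIZE]; /* shared command area */
    unsigned int issued;
};

void demo_ism_init(struct demo_ism *ism)
{
    pthread_mutex_init(&ism->cmd_lock, NULL);
    memset(ism->ctl_block, 0, sizeof(ism->ctl_block));
    ism->issued = 0;
}

/* Stand-in for the device: echoes the first request byte as the response. */
static unsigned char demo_device_execute(const unsigned char *blk)
{
    return blk[0];
}

/*
 * Write the request, run it, and read the response under one lock, so a
 * second caller cannot overwrite the command area in the middle.
 */
int demo_ism_cmd(struct demo_ism *ism, const unsigned char *req,
                 unsigned char *resp)
{
    pthread_mutex_lock(&ism->cmd_lock);
    memcpy(ism->ctl_block, req, DEMO_CMD_SIZE);
    *resp = demo_device_execute(ism->ctl_block);
    ism->issued++;
    pthread_mutex_unlock(&ism->cmd_lock);
    return 0;
}
```

The point of the sketch is the extent of the critical section: it covers the write of the command, its execution, and the read of the result, not just one of those steps.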

 
selftests/bpf: Add tests with stack ptr register in conditional jmp [+ + +]
Author: Yonghong Song <yonghong.song@linux.dev>
Date:   Fri May 23 21:13:40 2025 -0700

    selftests/bpf: Add tests with stack ptr register in conditional jmp
    
    commit 5ffb537e416ee22dbfb3d552102e50da33fec7f6 upstream.
    
    Add two tests:
      - one test has 'rX <op> r10' where rX is not r10, and
      - another test has 'rX <op> rY' where rX and rY are not r10
        but there is an early insn 'rX = r10'.
    
    Without previous verifier change, both tests will fail.
    
    Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20250524041340.4046304-1-yonghong.song@linux.dev
    [ shung-hsi.yu: contains additional hunks for kernel/bpf/verifier.c that
      should be part of the previous patch in the series, commit
      e2d2115e56c4 "bpf: Do not include stack ptr register in precision
      backtracking bookkeeping", which is already incorporated. ]
    Link: https://lore.kernel.org/all/9b41f9f5-396f-47e0-9a12-46c52087df6c@linux.dev/
    Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
selftests/mm: fix split_huge_page_test for folio_split() tests [+ + +]
Author: Zi Yan <ziy@nvidia.com>
Date:   Tue Jul 8 21:27:59 2025 -0400

    selftests/mm: fix split_huge_page_test for folio_split() tests
    
    commit 7563fcbfd484b347b776aeed4d7dac78b30884aa upstream.
    
    PID_FMT does not have an offset field, so folio_split() tests are not
    performed.  Add PID_FMT_OFFSET with an offset field and use it to perform
    folio_split() tests.
    
    Link: https://lkml.kernel.org/r/20250709012800.3225727-1-ziy@nvidia.com
    Fixes: 80a5c494c89f ("selftests/mm: add tests for folio_split(), buddy allocator like split")
    Signed-off-by: Zi Yan <ziy@nvidia.com>
    Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
    Reviewed-by: Donet Tom <donettom@linux.ibm.com>
    Tested-by: Donet Tom <donettom@linux.ibm.com>
    Cc: Barry Song <baohua@kernel.org>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Dev Jain <dev.jain@arm.com>
    Cc: Liam Howlett <liam.howlett@oracle.com>
    Cc: Mariano Pache <npache@redhat.com>
    Cc: Ryan Roberts <ryan.roberts@arm.com>
    Cc: Shuah Khan <shuah@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
selftests: drv-net: wait for iperf client to stop sending [+ + +]
Author: Nimrod Oren <noren@nvidia.com>
Date:   Tue Jul 22 15:26:55 2025 +0300

    selftests: drv-net: wait for iperf client to stop sending
    
    [ Upstream commit 86941382508850d58c11bdafe0fec646dfd31b09 ]
    
    A few packets may still be sent out during the termination of iperf
    processes. These late packets cause failures in rss_ctx.py when they
    arrive on queues expected to be empty.
    
    Example failure observed:
    
      Check failed 2 != 0 traffic on inactive queues (context 1):
        [0, 0, 1, 1, 386385, 397196, 0, 0, 0, 0, ...]
    
      Check failed 4 != 0 traffic on inactive queues (context 2):
        [0, 0, 0, 0, 2, 2, 247152, 253013, 0, 0, ...]
    
      Check failed 2 != 0 traffic on inactive queues (context 3):
        [0, 0, 0, 0, 0, 0, 1, 1, 282434, 283070, ...]
    
    To avoid such failures, wait until all client sockets for the requested
    port are either closed or in the TIME_WAIT state.
    
    Fixes: 847aa551fa78 ("selftests: drv-net: rss_ctx: factor out send traffic and check")
    Signed-off-by: Nimrod Oren <noren@nvidia.com>
    Reviewed-by: Gal Pressman <gal@nvidia.com>
    Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://patch.msgid.link/20250722122655.3194442-1-noren@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: mptcp: connect: also cover alt modes [+ + +]
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date:   Tue Jul 15 20:43:28 2025 +0200

    selftests: mptcp: connect: also cover alt modes
    
    commit 37848a456fc38c191aedfe41f662cc24db8c23d9 upstream.
    
    The "mmap" and "sendfile" alternate modes for mptcp_connect.sh/.c are
    available from the beginning, but only tested when mptcp_connect.sh is
    manually launched with "-m mmap" or "-m sendfile", not via the
    kselftests helpers.
    
    The MPTCP CI was manually running "mptcp_connect.sh -m mmap", but not
    "-m sendfile". Plus other CIs, especially the ones validating the stable
    releases, were not validating these alternate modes.
    
    To make sure these modes are validated by these CIs, add two new test
    programs executing mptcp_connect.sh with the alternate modes.
    
    Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp")
    Cc: stable@vger.kernel.org
    Reviewed-by: Geliang Tang <geliang@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/20250715-net-mptcp-sft-connect-alt-v2-1-8230ddd82454@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests: mptcp: connect: also cover checksum [+ + +]
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date:   Tue Jul 15 20:43:29 2025 +0200

    selftests: mptcp: connect: also cover checksum
    
    commit fdf0f60a2bb02ba581d9e71d583e69dd0714a521 upstream.
    
    The checksum mode was added a while ago, but it is only validated
    when manually launching mptcp_connect.sh with "-C".
    
    The different CIs were then not validating these MPTCP Connect tests
    with checksum enabled. To make sure they do, add a new test program
    executing mptcp_connect.sh with the checksum mode.
    
    Fixes: 94d66ba1d8e4 ("selftests: mptcp: enable checksum in mptcp_connect.sh")
    Cc: stable@vger.kernel.org
    Reviewed-by: Geliang Tang <geliang@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/20250715-net-mptcp-sft-connect-alt-v2-2-8230ddd82454@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
spi: cadence-quadspi: fix cleanup of rx_chan on failure paths [+ + +]
Author: Khairul Anuar Romli <khairul.anuar.romli@altera.com>
Date:   Mon Jun 30 17:11:56 2025 +0800

    spi: cadence-quadspi: fix cleanup of rx_chan on failure paths
    
    commit 04a8ff1bc3514808481ddebd454342ad902a3f60 upstream.
    
    Remove incorrect checks on cqspi->rx_chan that cause driver breakage
    during failure cleanup. Ensure proper resource freeing on the success
    path when operating in cqspi->use_direct_mode, preventing leaks and
    improving stability.
    
    Signed-off-by: Khairul Anuar Romli <khairul.anuar.romli@altera.com>
    Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
    Link: https://patch.msgid.link/89765a2b94f047ded4f14babaefb7ef92ba07cb2.1751274389.git.khairul.anuar.romli@altera.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
Linux: sprintf.h requires stdarg.h [+ + +]
Author: Stephen Rothwell <sfr@canb.auug.org.au>
Date:   Mon Jul 21 16:15:57 2025 +1000

    sprintf.h requires stdarg.h
    
    commit 0dec7201788b9152f06321d0dab46eed93834cda upstream.
    
    In file included from drivers/crypto/intel/qat/qat_common/adf_pm_dbgfs_utils.c:4:
    include/linux/sprintf.h:11:54: error: unknown type name 'va_list'
       11 | __printf(2, 0) int vsprintf(char *buf, const char *, va_list);
          |                                                      ^~~~~~~
    include/linux/sprintf.h:1:1: note: 'va_list' is defined in header '<stdarg.h>'; this is probably fixable by adding '#include <stdarg.h>'
    
    Link: https://lkml.kernel.org/r/20250721173754.42865913@canb.auug.org.au
    Fixes: 39ced19b9e60 ("lib/vsprintf: split out sprintf() and friends")
    Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Cc: Andriy Shevchenko <andriy.shevchenko@linux.intel.com>
    Cc: Herbert Xu <herbert@gondor.apana.org.au>
    Cc: Petr Mladek <pmladek@suse.com>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
    Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
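
The build error quoted above is the general rule that any declaration mentioning `va_list` needs `<stdarg.h>` in scope first. A user-space sketch with hypothetical helper names (`demo_vformat()`, `demo_format()`), using the standard `vsnprintf()` rather than the kernel's own implementation:

```c
#include <stdarg.h>   /* provides va_list; the prototype below needs it */
#include <stdio.h>

/* A prototype taking va_list only compiles once <stdarg.h> has been included. */
int demo_vformat(char *buf, size_t size, const char *fmt, va_list args)
{
    return vsnprintf(buf, size, fmt, args);
}

/* Variadic front end that forwards its arguments to the va_list variant. */
int demo_format(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    va_start(args, fmt);
    n = demo_vformat(buf, size, fmt, args);
    va_end(args);
    return n;
}
```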

 
staging: vchiq_arm: Make vchiq_shutdown never fail [+ + +]
Author: Stefan Wahren <wahrenst@gmx.net>
Date:   Tue Jul 15 18:11:08 2025 +0200

    staging: vchiq_arm: Make vchiq_shutdown never fail
    
    [ Upstream commit f2b8ebfb867011ddbefbdf7b04ad62626cbc2afd ]
    
    Most of the users of vchiq_shutdown ignore the return value,
    which is bad because this could lead to resource leaks.
    So instead of changing all calls to vchiq_shutdown, it's easier
    to make vchiq_shutdown never fail.
    
    Fixes: 71bad7f08641 ("staging: add bcm2708 vchiq driver")
    Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
    Link: https://lore.kernel.org/r/20250715161108.3411-4-wahrenst@gmx.net
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
timekeeping: Zero initialize system_counterval when querying time from phc drivers [+ + +]
Author: Markus Blöchl <markus@blochl.de>
Date:   Sun Jul 20 15:54:51 2025 +0200

    timekeeping: Zero initialize system_counterval when querying time from phc drivers
    
    commit 67c632b4a7fbd6b76a08b86f4950f0f84de93439 upstream.
    
    Most drivers only populate the fields cycles and cs_id of system_counterval
    in their get_time_fn() callback for get_device_system_crosststamp(), unless
    they explicitly provide nanosecond values.
    
    When the use_nsecs field was added to struct system_counterval, most
    drivers did not care.  Clock sources other than CSID_GENERIC could then get
    converted in convert_base_to_cs() based on an uninitialized use_nsecs field,
    which usually results in -EINVAL during the following range check.
    
    Pass in a fully zero initialized system_counterval_t to cure that.
    
    Fixes: 6b2e29977518 ("timekeeping: Provide infrastructure for converting to/from a base clock")
    Signed-off-by: Markus Blöchl <markus@blochl.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Acked-by: John Stultz <jstultz@google.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/all/20250720-timekeeping_uninit_crossts-v2-1-f513c885b7c2@blochl.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
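
The effect of zero initialization can be seen in a reduced model. `struct demo_counterval` and the two functions are hypothetical; they only mirror the pattern in the commit text, where a callback fills some fields and a later consumer reads a field the callback never wrote.

```c
#include <stdbool.h>

/* Hypothetical, reduced version of the counter value structure. */
struct demo_counterval {
    unsigned long long cycles;
    int cs_id;
    bool use_nsecs;   /* added later; many callbacks never write it */
};

/* A typical callback: fills only cycles and cs_id. */
static int demo_get_time(struct demo_counterval *cv)
{
    cv->cycles = 1000;
    cv->cs_id = 1;
    return 0;
}

/* The caller zeroes every field before handing the struct to the callback. */
int demo_query(struct demo_counterval *out)
{
    struct demo_counterval cv = { 0 };
    int ret = demo_get_time(&cv);

    if (ret)
        return ret;
    *out = cv;
    return 0;
}
```

Without the `= { 0 }` initializer, `use_nsecs` would hold whatever was on the stack, and the later conversion would depend on it.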

 
tools/hv: fcopy: Fix incorrect file path conversion [+ + +]
Author: Yasumasa Suenaga <yasuenag@gmail.com>
Date:   Sat Jun 28 11:22:17 2025 +0900

    tools/hv: fcopy: Fix incorrect file path conversion
    
    [ Upstream commit 0d86a8d65c1e69610bfe1a7a774f71ff111ed8c1 ]
    
    The hv_fcopy_uio_daemon fails to correctly handle file copy requests
    from Windows hosts (e.g. via Copy-VMFile) due to wchar_t size
    differences between Windows and Linux. On Linux, wchar_t is 32 bit,
    whereas Windows uses 16 bit wide characters.
    
    Fix this by ensuring that file transfers from host to Linux guest
    succeed with correctly decoded file names and paths.
    
    - Treats file name and path as __u16 arrays, not wchar_t*.
    - Allocates fixed-size buffers (W_MAX_PATH) for converted strings
      instead of using malloc.
    - Adds a check for target path length to prevent snprintf() buffer
      overflow.
    
    Fixes: 82b0945ce2c2 ("tools: hv: Add new fcopy application based on uio driver")
    Signed-off-by: Yasumasa Suenaga <yasuenag@gmail.com>
    Reviewed-by: Naman Jain <namjain@linux.microsoft.com>
    Link: https://lore.kernel.org/r/20250628022217.1514-2-yasuenag@gmail.com
    Signed-off-by: Wei Liu <wei.liu@kernel.org>
    Message-ID: <20250628022217.1514-2-yasuenag@gmail.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
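
Treating the path as an array of 16-bit units, with a bounded output buffer, can be sketched as follows. This is not the daemon's code: `demo_utf16_to_utf8()` is a hypothetical helper written from the UTF-16 and UTF-8 encoding rules, and it assumes the input is NUL-terminated and in host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Convert a NUL-terminated UTF-16 string to UTF-8.
 * Returns the number of bytes written, excluding the terminator, or -1 if
 * the output buffer is too small or a surrogate is unpaired.
 */
int demo_utf16_to_utf8(const uint16_t *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    if (dst_size == 0)
        return -1;

    while (*src) {
        uint32_t cp = *src++;
        size_t need;

        if (cp >= 0xD800 && cp <= 0xDBFF) {          /* high surrogate */
            uint16_t lo = *src;

            if (lo < 0xDC00 || lo > 0xDFFF)
                return -1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            src++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {   /* stray low surrogate */
            return -1;
        }

        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (out + need >= dst_size)                  /* keep room for the NUL */
            return -1;

        if (need == 1) {
            dst[out++] = (char)cp;
        } else if (need == 2) {
            dst[out++] = (char)(0xC0 | (cp >> 6));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[out++] = (char)(0xE0 | (cp >> 12));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[out++] = (char)(0xF0 | (cp >> 18));
            dst[out++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        }
    }

    dst[out] = '\0';
    return (int)out;
}
```

Reading the same bytes through a 32-bit wchar_t pointer would merge two 16-bit units into one character, which is the mismatch the commit describes.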

 
usb: typec: tcpm: allow switching to mode accessory to mux properly [+ + +]
Author: Michael Grzeschik <m.grzeschik@pengutronix.de>
Date:   Fri Apr 4 00:43:06 2025 +0200

    usb: typec: tcpm: allow switching to mode accessory to mux properly
    
    commit 8a50da849151e7e12b43c1d8fe7ad302223aef6b upstream.
    
    The function tcpm_acc_attach is not setting the proper state when
    calling tcpm_set_role. The function tcpm_set_role is currently only
    handling TYPEC_STATE_USB. For the tcpm_acc_attach to switch into other
    modal states tcpm_set_role needs to be extended by an extra state
    parameter. This patch is handling the proper state change when calling
    tcpm_acc_attach.
    
    Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
    Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
    Link: https://lore.kernel.org/r/20250404-ml-topic-tcpm-v1-3-b99f44badce8@pengutronix.de
    Stable-dep-of: bec15191d523 ("usb: typec: tcpm: apply vbus before data bringup in tcpm_src_attach")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: typec: tcpm: allow to use sink in accessory mode [+ + +]
Author: Michael Grzeschik <m.grzeschik@pengutronix.de>
Date:   Fri Apr 4 00:43:04 2025 +0200

    usb: typec: tcpm: allow to use sink in accessory mode
    
    commit 64843d0ba96d3eae297025562111d57585273366 upstream.
    
    Since the function tcpm_acc_attach is not setting the data and role for
    the sink case we extend it to check for it first.
    
    Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
    Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
    Link: https://lore.kernel.org/r/20250404-ml-topic-tcpm-v1-1-b99f44badce8@pengutronix.de
    Stable-dep-of: bec15191d523 ("usb: typec: tcpm: apply vbus before data bringup in tcpm_src_attach")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: typec: tcpm: apply vbus before data bringup in tcpm_src_attach [+ + +]
Author: RD Babiera <rdbabiera@google.com>
Date:   Wed Jun 18 23:06:04 2025 +0000

    usb: typec: tcpm: apply vbus before data bringup in tcpm_src_attach
    
    commit bec15191d52300defa282e3fd83820f69e447116 upstream.
    
    This patch fixes Type-C compliance test TD 4.7.6 - Try.SNK DRP Connect
    SNKAS.
    
    tVbusON has a limit of 275ms when entering SRC_ATTACHED. Compliance
    testers can interpret the TryWait.Src to Attached.Src transition after
    Try.Snk as being in Attached.Src the entire time, so ~170ms is lost
    to the debounce timer.
    
    Setting the data role can be a costly operation in host mode, and when
    completed after 100ms can cause Type-C compliance test check TD 4.7.5.V.4
    to fail.
    
    Turn VBUS on before tcpm_set_roles to meet timing requirement.
    
    Fixes: f0690a25a140 ("staging: typec: USB Type-C Port Manager (tcpm)")
    Cc: stable <stable@kernel.org>
    Signed-off-by: RD Babiera <rdbabiera@google.com>
    Reviewed-by: Badhri Jagan Sridharan <badhri@google.com>
    Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
    Link: https://lore.kernel.org/r/20250618230606.3272497-2-rdbabiera@google.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
virtio_net: Enforce minimum TX ring size for reliability [+ + +]
Author: Laurent Vivier <lvivier@redhat.com>
Date:   Wed May 21 11:22:36 2025 +0200

    virtio_net: Enforce minimum TX ring size for reliability
    
    [ Upstream commit 24b2f5df86aaebbe7bac40304eaf5a146c02367c ]
    
    The `tx_may_stop()` logic stops TX queues if free descriptors
    (`sq->vq->num_free`) fall below the threshold of (`MAX_SKB_FRAGS` + 2).
    If the total ring size (`ring_num`) is not strictly greater than this
    value, queues can become persistently stopped or stop after minimal
    use, severely degrading performance.
    
    A single sk_buff transmission typically requires descriptors for:
    - The virtio_net_hdr (1 descriptor)
    - The sk_buff's linear data (head) (1 descriptor)
    - Paged fragments (up to MAX_SKB_FRAGS descriptors)
    
    This patch enforces that the TX ring size ('ring_num') must be strictly
    greater than (MAX_SKB_FRAGS + 2). This ensures that the ring is
    always large enough to hold at least one maximally-fragmented packet
    plus at least one additional slot.
    
    Reported-by: Lei Yang <leiyang@redhat.com>
    Signed-off-by: Laurent Vivier <lvivier@redhat.com>
    Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Link: https://patch.msgid.link/20250521092236.661410-4-lvivier@redhat.com
    Tested-by: Lei Yang <leiyang@redhat.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
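
The arithmetic behind the minimum size can be checked with a small model. `DEMO_MAX_SKB_FRAGS` is set to 17 as an assumption (the real value is a kernel configuration option), and both function names are hypothetical.

```c
#include <stdbool.h>

/* Assumption: 17 fragments; the real limit depends on the kernel config. */
#define DEMO_MAX_SKB_FRAGS 17u

/* Stop threshold from the commit text: header + linear data + all fragments. */
#define DEMO_STOP_THRESHOLD (DEMO_MAX_SKB_FRAGS + 2u)

/* The queue is stopped when fewer free descriptors than the threshold remain. */
bool demo_tx_would_stop(unsigned int num_free)
{
    return num_free < DEMO_STOP_THRESHOLD;
}

/* The enforced rule: the ring must be strictly larger than the threshold. */
bool demo_tx_ring_size_ok(unsigned int ring_num)
{
    return ring_num > DEMO_STOP_THRESHOLD;
}
```

Under this assumption a ring of exactly 19 descriptors stops as soon as one descriptor is in use (18 free is below the threshold), which is why equality is rejected and 20 is the smallest accepted size.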

 
virtio_ring: Fix error reporting in virtqueue_resize [+ + +]
Author: Laurent Vivier <lvivier@redhat.com>
Date:   Wed May 21 11:22:34 2025 +0200

    virtio_ring: Fix error reporting in virtqueue_resize
    
    [ Upstream commit 45ebc7e6c125ce93d2ddf82cd5bea20121bb0258 ]
    
    The virtqueue_resize() function was not correctly propagating error codes
    from its internal resize helper functions, specifically
    virtqueue_resize_packed() and virtqueue_resize_split(). If these helpers
    returned an error, but the subsequent call to virtqueue_enable_after_reset()
    succeeded, the original error from the resize operation would be masked.
    Consequently, virtqueue_resize() could incorrectly report success to its
    caller despite an underlying resize failure.
    
    This change restores the original code behavior:
    
           if (vdev->config->enable_vq_after_reset(_vq))
                   return -EBUSY;
    
           return err;
    
    Fixes: ad48d53b5b3f ("virtio_ring: separate the logic of reset/enable from virtqueue_resize")
    Cc: xuanzhuo@linux.alibaba.com
    Signed-off-by: Laurent Vivier <lvivier@redhat.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Link: https://patch.msgid.link/20250521092236.661410-2-lvivier@redhat.com
    Tested-by: Lei Yang <leiyang@redhat.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
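
The return-value logic quoted in the commit text can be exercised in isolation. In this sketch `demo_resize()` is a hypothetical function that takes the two intermediate results as parameters instead of calling the real helpers.

```c
#include <errno.h>

/*
 * resize_err: result of the internal resize helper (0 or a negative errno).
 * enable_err: result of re-enabling the queue after reset (0 or non-zero).
 */
int demo_resize(int resize_err, int enable_err)
{
    int err = resize_err;

    if (enable_err)        /* re-enabling failed: report that first */
        return -EBUSY;

    return err;            /* otherwise report the resize result, not 0 */
}
```

The case the fix restores is the middle one: the resize helper failed but re-enabling succeeded, and the caller must still see the resize error.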

 
x86/hyperv: Fix usage of cpu_online_mask to get valid cpu [+ + +]
Author: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Date:   Thu Jul 3 15:44:34 2025 -0700

    x86/hyperv: Fix usage of cpu_online_mask to get valid cpu
    
    [ Upstream commit bb169f80ed5a156ec3405e0e49c6b8e9ae264718 ]
    
    Accessing cpu_online_mask here is problematic because the cpus read lock
    is not held in this context.
    
    However, cpu_online_mask isn't needed here since the effective affinity
    mask is guaranteed to be valid in this callback. So, just use
    cpumask_first() to get the cpu instead of ANDing it with cpu_online_mask
    unnecessarily.
    
    Fixes: e39397d1fd68 ("x86/hyperv: implement an MSI domain for root partition")
    Reported-by: Michael Kelley <mhklinux@outlook.com>
    Closes: https://lore.kernel.org/linux-hyperv/SN6PR02MB4157639630F8AD2D8FD8F52FD475A@SN6PR02MB4157.namprd02.prod.outlook.com/
    Suggested-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
    Reviewed-by: Michael Kelley <mhklinux@outlook.com>
    Link: https://lore.kernel.org/r/1751582677-30930-4-git-send-email-nunodasneves@linux.microsoft.com
    Signed-off-by: Wei Liu <wei.liu@kernel.org>
    Message-ID: <1751582677-30930-4-git-send-email-nunodasneves@linux.microsoft.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
x86/traps: Initialize DR7 by writing its architectural reset value [+ + +]
Author: Xin Li (Intel) <xin@zytor.com>
Date:   Fri Jun 20 16:15:04 2025 -0700

    x86/traps: Initialize DR7 by writing its architectural reset value
    
    commit fa7d0f83c5c4223a01598876352473cb3d3bd4d7 upstream.
    
    Initialize DR7 by writing its architectural reset value to always set
    bit 10, which is reserved to '1', when "clearing" DR7 so as not to
    trigger unanticipated behavior if said bit is ever unreserved, e.g. as
    a feature enabling flag with inverted polarity.
    
    Signed-off-by: Xin Li (Intel) <xin@zytor.com>
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com>
    Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: Sean Christopherson <seanjc@google.com>
    Tested-by: Sohil Mehta <sohil.mehta@intel.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/all/20250620231504.2676902-3-xin%40zytor.com
    [ context adjusted: no KVM_DEBUGREG_AUTO_SWITCH flag test ]
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
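    
    A minimal sketch of the idea, written as stand-alone C rather than the
    actual kernel change: the value used to "clear" DR7 keeps bit 10 set
    while leaving every breakpoint enable bit off. The EXAMPLE_* macros and
    example_* helpers are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not copied from the kernel tree. */
#define EXAMPLE_DR7_FIXED_1  UINT64_C(0x400) /* bit 10, reserved to '1' */
#define EXAMPLE_DR7_ENABLES  UINT64_C(0xff)  /* L0/G0 .. L3/G3 enable bits */

/* Value to write when "clearing" DR7: no breakpoints, bit 10 still set. */
static inline uint64_t example_dr7_reset_value(void)
{
	return EXAMPLE_DR7_FIXED_1;
}

/* Returns nonzero when no breakpoint enable bit is set in dr7. */
static inline int example_dr7_breakpoints_disabled(uint64_t dr7)
{
	return (dr7 & EXAMPLE_DR7_ENABLES) == 0;
}
```

    Writing this value instead of 0 still disables all four breakpoints,
    which is the behaviour a caller expects from "clearing" the register.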
 
xfrm: always initialize offload path [+ + +]
Author: Leon Romanovsky <leon@kernel.org>
Date:   Sun Jun 8 10:42:53 2025 +0300

    xfrm: always initialize offload path
    
    [ Upstream commit c0f21029f123d1b15f8eddc8e3976bf0c8781c43 ]
    
    Offload path is used for GRO with SW IPsec, and not just for HW
    offload. So initialize it anyway.
    
    Fixes: 585b64f5a620 ("xfrm: delay initialization of offload path till its actually requested")
    Reported-by: Sabrina Dubroca <sd@queasysnail.net>
    Closes: https://lore.kernel.org/all/aEGW_5HfPqU1rFjl@krikkit
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

xfrm: interface: fix use-after-free after changing collect_md xfrm interface [+ + +]
Author: Eyal Birger <eyal.birger@gmail.com>
Date:   Thu Jul 3 10:02:58 2025 -0700

    xfrm: interface: fix use-after-free after changing collect_md xfrm interface
    
    [ Upstream commit a90b2a1aaacbcf0f91d7e4868ad6c51c5dee814b ]
    
    collect_md property on xfrm interfaces can only be set on device creation,
    thus xfrmi_changelink() should fail when called on such interfaces.
    
    The check to enforce this was done only in the case where the xi was
    returned from xfrmi_locate() which doesn't look for the collect_md
    interface, and thus the validation was never reached.
    
    Calling changelink would thus erroneously place the special interface xi
    in the xfrmi_net->xfrmi hash, but since it also exists in the
    xfrmi_net->collect_md_xfrmi pointer it would lead to a double free when
    the net namespace was taken down [1].
    
    Change the check to use the xi from netdev_priv which is available earlier
    in the function to prevent changes in xfrm collect_md interfaces.
    
    [1] resulting oops:
    [    8.516540] kernel BUG at net/core/dev.c:12029!
    [    8.516552] Oops: invalid opcode: 0000 [#1] SMP NOPTI
    [    8.516559] CPU: 0 UID: 0 PID: 12 Comm: kworker/u80:0 Not tainted 6.15.0-virtme #5 PREEMPT(voluntary)
    [    8.516565] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
    [    8.516569] Workqueue: netns cleanup_net
    [    8.516579] RIP: 0010:unregister_netdevice_many_notify+0x101/0xab0
    [    8.516590] Code: 90 0f 0b 90 48 8b b0 78 01 00 00 48 8b 90 80 01 00 00 48 89 56 08 48 89 32 4c 89 80 78 01 00 00 48 89 b8 80 01 00 00 eb ac 90 <0f> 0b 48 8b 45 00 4c 8d a0 88 fe ff ff 48 39 c5 74 5c 41 80 bc 24
    [    8.516593] RSP: 0018:ffffa93b8006bd30 EFLAGS: 00010206
    [    8.516598] RAX: ffff98fe4226e000 RBX: ffffa93b8006bd58 RCX: ffffa93b8006bc60
    [    8.516601] RDX: 0000000000000004 RSI: 0000000000000000 RDI: dead000000000122
    [    8.516603] RBP: ffffa93b8006bdd8 R08: dead000000000100 R09: ffff98fe4133c100
    [    8.516605] R10: 0000000000000000 R11: 00000000000003d2 R12: ffffa93b8006be00
    [    8.516608] R13: ffffffff96c1a510 R14: ffffffff96c1a510 R15: ffffa93b8006be00
    [    8.516615] FS:  0000000000000000(0000) GS:ffff98fee73b7000(0000) knlGS:0000000000000000
    [    8.516619] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [    8.516622] CR2: 00007fcd2abd0700 CR3: 000000003aa40000 CR4: 0000000000752ef0
    [    8.516625] PKRU: 55555554
    [    8.516627] Call Trace:
    [    8.516632]  <TASK>
    [    8.516635]  ? rtnl_is_locked+0x15/0x20
    [    8.516641]  ? unregister_netdevice_queue+0x29/0xf0
    [    8.516650]  ops_undo_list+0x1f2/0x220
    [    8.516659]  cleanup_net+0x1ad/0x2e0
    [    8.516664]  process_one_work+0x160/0x380
    [    8.516673]  worker_thread+0x2aa/0x3c0
    [    8.516679]  ? __pfx_worker_thread+0x10/0x10
    [    8.516686]  kthread+0xfb/0x200
    [    8.516690]  ? __pfx_kthread+0x10/0x10
    [    8.516693]  ? __pfx_kthread+0x10/0x10
    [    8.516697]  ret_from_fork+0x82/0xf0
    [    8.516705]  ? __pfx_kthread+0x10/0x10
    [    8.516709]  ret_from_fork_asm+0x1a/0x30
    [    8.516718]  </TASK>
    
    Fixes: abc340b38ba2 ("xfrm: interface: support collect metadata mode")
    Reported-by: Lonial Con <kongln9170@gmail.com>
    Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
    Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
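    
    The shape of the problem can be sketched with a toy model, assuming a
    lookup that never returns the collect_md interface. All example_*
    names are hypothetical and the real xfrmi_changelink() does
    considerably more than this.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the per-device private data (hypothetical). */
struct example_xi {
	bool collect_md;
};

/* Stand-in for a lookup that never returns the collect_md interface. */
static inline struct example_xi *example_locate(struct example_xi *candidate)
{
	return (candidate && !candidate->collect_md) ? candidate : NULL;
}

/* Old shape: the collect_md test is only reached via the lookup result. */
static inline int example_changelink_old(struct example_xi *dev_priv)
{
	struct example_xi *xi = example_locate(dev_priv);

	if (xi && xi->collect_md)
		return -EINVAL; /* never reached for collect_md devices */
	return 0;
}

/* Fixed shape: test the device's own private data before anything else. */
static inline int example_changelink_new(struct example_xi *dev_priv)
{
	if (dev_priv->collect_md)
		return -EINVAL;
	return 0;
}
```

    In this model the old variant accepts a change on a collect_md device,
    while the new variant rejects it with -EINVAL before any lookup runs.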

xfrm: ipcomp: adjust transport header after decompressing [+ + +]
Author: Fernando Fernandez Mancera <fmancera@suse.de>
Date:   Tue Jun 24 15:11:15 2025 +0200

    xfrm: ipcomp: adjust transport header after decompressing
    
    [ Upstream commit 2ca58d87ebae20906cf808ef813d747db0177a18 ]
    
    The skb transport header pointer needs to be adjusted by network header
    pointer plus the size of the ipcomp header.
    
    This shows up when running traffic over ipcomp using transport mode.
    After being reinjected, packets are dropped because the header isn't
    adjusted properly and some checks can be triggered. E.g. the skb is
    mistakenly considered as an IP fragmented packet and later dropped.
    
    kworker/30:1-mm     443 [030]   102.055250:     skb:kfree_skb:skbaddr=0xffff8f104aa3ce00 rx_sk=(
            ffffffff8419f1f4 sk_skb_reason_drop+0x94 ([kernel.kallsyms])
            ffffffff8419f1f4 sk_skb_reason_drop+0x94 ([kernel.kallsyms])
            ffffffff84281420 ip_defrag+0x4b0 ([kernel.kallsyms])
            ffffffff8428006e ip_local_deliver+0x4e ([kernel.kallsyms])
            ffffffff8432afb1 xfrm_trans_reinject+0xe1 ([kernel.kallsyms])
            ffffffff83758230 process_one_work+0x190 ([kernel.kallsyms])
            ffffffff83758f37 worker_thread+0x2d7 ([kernel.kallsyms])
            ffffffff83761cc9 kthread+0xf9 ([kernel.kallsyms])
            ffffffff836c3437 ret_from_fork+0x197 ([kernel.kallsyms])
            ffffffff836718da ret_from_fork_asm+0x1a ([kernel.kallsyms])
    
    Fixes: eb2953d26971 ("xfrm: ipcomp: Use crypto_acomp interface")
    Link: https://bugzilla.suse.com/1244532
    Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
    Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
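    
    The offset arithmetic described above can be sketched as follows. This
    is a user-space model, not the kernel patch: struct example_pkt is a
    hypothetical stand-in for the header offsets kept in an skb, and the
    header layout follows the 4-byte IPComp header of RFC 3173.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IPComp header layout per RFC 3173: 4 bytes in total. */
struct example_ipcomp_hdr {
	uint8_t  nexthdr;
	uint8_t  flags;
	uint16_t cpi;
};

/* Toy stand-in for the header offsets tracked in an skb (hypothetical). */
struct example_pkt {
	size_t network_header;
	size_t transport_header;
};

/* Transport header = network header offset + size of the IPComp header. */
static inline void example_adjust_transport_header(struct example_pkt *pkt)
{
	pkt->transport_header =
		pkt->network_header + sizeof(struct example_ipcomp_hdr);
}
```

    For a packet whose network header offset is 14, the transport header
    offset becomes 18 after the adjustment.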

xfrm: Set transport header to fix UDP GRO handling [+ + +]
Author: Tobias Brunner <tobias@strongswan.org>
Date:   Tue Jun 24 14:47:20 2025 +0200

    xfrm: Set transport header to fix UDP GRO handling
    
    [ Upstream commit 3ac9e29211fa2df5539ba0d742c8fe9fe95fdc79 ]
    
    The referenced commit replaced a call to __xfrm4|6_udp_encap_rcv() with
    a custom check for non-ESP markers.  But what the called function also
    did was setting the transport header to the ESP header.  The function
    that follows, esp4|6_gro_receive(), relies on that being set when it calls
    xfrm_parse_spi().  We have to set the full offset as the skb's head was
    not moved yet so adding just the UDP header length won't work.
    
    Fixes: e3fd05777685 ("xfrm: Fix UDP GRO handling for some corner cases")
    Signed-off-by: Tobias Brunner <tobias@strongswan.org>
    Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

xfrm: state: initialize state_ptrs earlier in xfrm_state_find [+ + +]
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Fri May 23 17:11:17 2025 +0200

    xfrm: state: initialize state_ptrs earlier in xfrm_state_find
    
    [ Upstream commit 94d077c331730510d5611b438640a292097341f0 ]
    
    In case of preemption, xfrm_state_look_at will find a different
    pcpu_id and look up states for that other CPU. If we matched a state
    for CPU2 in the state_cache while the lookup started on CPU1, we will
    jump to "found", but the "best" state that we got will be ignored and
    we will enter the "acquire" block. This block uses state_ptrs, which
    isn't initialized at this point.
    
    Let's initialize state_ptrs just after taking rcu_read_lock. This will
    also prevent a possible misuse in the future, if someone adjusts this
    function.
    
    Reported-by: syzbot+7ed9d47e15e88581dc5b@syzkaller.appspotmail.com
    Fixes: e952837f3ddb ("xfrm: state: fix out-of-bounds read during lookup")
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Reviewed-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
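    
    A toy illustration of the pattern, assuming a function with an early
    jump to a label whose later code depends on a local structure. Only
    the fixed ordering is shown; the names are hypothetical and unrelated
    to the real xfrm_state_find() internals.

```c
#include <assert.h>

/* Hypothetical stand-in for the table pointers used by the lookup. */
struct example_ptrs {
	const int *table;
};

static inline int example_find_with_ptrs(const int *table, int cache_hit)
{
	struct example_ptrs ptrs;
	int result = -1;

	/* Initialized up front, before any path can jump to "found". */
	ptrs.table = table;

	if (cache_hit)
		goto found; /* models a cache match that ends up ignored */

	result = table[0]; /* models the regular lookup */
found:
	if (result < 0)
		result = ptrs.table[1]; /* "acquire" path relies on ptrs */
	return result;
}
```

    Because ptrs is set before the first goto, the fallback path reads a
    valid pointer no matter which branch reached the label.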

xfrm: state: use a consistent pcpu_id in xfrm_state_find [+ + +]
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Fri May 23 17:11:18 2025 +0200

    xfrm: state: use a consistent pcpu_id in xfrm_state_find
    
    [ Upstream commit 7eb11c0ab70777b9e5145a5ba1c0a2312c3980b2 ]
    
    If we get preempted during xfrm_state_find, we could run
    xfrm_state_look_at using a different pcpu_id than the one
    xfrm_state_find saw. This could lead to ignoring states that should
    have matched, and triggering acquires on a CPU that already has a pcpu
    state.
    
        xfrm_state_find starts on CPU1
        pcpu_id = 1
        lookup starts
        <preemption, we're now on CPU2>
        xfrm_state_look_at pcpu_id = 2
           finds a state
    found:
        best->pcpu_num != pcpu_id (2 != 1)
        if (!x && !error && !acquire_in_progress) {
            ...
            xfrm_state_alloc
            xfrm_init_tempstate
            ...
    
    This can be avoided by passing the original pcpu_id down to all
    xfrm_state_look_at() calls.
    
    Also switch to raw_smp_processor_id(), since disabling preemption just
    to re-enable it immediately doesn't really make sense.
    
    Fixes: 1ddf9916ac09 ("xfrm: Add support for per cpu xfrm state handling.")
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Reviewed-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
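    
    The scenario above can be modelled in a few lines of user-space C. This
    is a simplified sketch under stated assumptions: a global variable
    stands in for the running CPU, migration is simulated by assignment,
    and all example_* names are hypothetical.

```c
#include <assert.h>

enum example_result {
	EXAMPLE_NO_MATCH,
	EXAMPLE_USE_STATE,
	EXAMPLE_SPURIOUS_ACQUIRE,
};

struct example_state {
	int pcpu_num;
};

static int example_current_cpu; /* stand-in for the running CPU */

/*
 * pass_id_down == 0 models the old behaviour (the helper re-reads the
 * current CPU); pass_id_down == 1 models passing the sampled id down.
 */
static inline enum example_result
example_state_find_model(const struct example_state *x, int start_cpu,
			 int cpu_after_preempt, int pass_id_down)
{
	int pcpu_id;
	int seen;

	example_current_cpu = start_cpu;
	pcpu_id = example_current_cpu;           /* sampled once */
	example_current_cpu = cpu_after_preempt; /* simulated migration */

	seen = pass_id_down ? pcpu_id : example_current_cpu;
	if (x->pcpu_num != seen)
		return EXAMPLE_NO_MATCH;

	/* "found:" re-checks against the id sampled at the start. */
	return x->pcpu_num == pcpu_id ? EXAMPLE_USE_STATE
				      : EXAMPLE_SPURIOUS_ACQUIRE;
}
```

    In this model, a state for CPU2 found after migrating from CPU1 leads
    to a spurious acquire under the old behaviour, and a state for CPU1 is
    ignored. With the id passed down, the CPU1 state is used and the CPU2
    state is simply not matched.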