r than a feature.

- Dependencies and compatibility:
  - It uses `adev->smuio.funcs->get_socket_id` if available, otherwise
    falls back to 0, preserving prior behavior on ASICs without socket ID
    support. This same pattern is already used elsewhere in this file for
    `record_id` and FRU text
    (drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c:73-81, 123-131), so there is
    no new dependency risk.
  - No API/ABI changes; no headers or structures changed; no architectural
    changes.

- Risk assessment:
  - Minimal risk: pure data-field fix inside a CPER payload builder; no
    control flow or subsystem behavior changes.
  - Side effects are limited to CPER contents produced when bad page
    threshold is exceeded (trigger path in
    drivers/gpu/drm/amd/pm/amdgpu_dpm.c:764-778).

- Stable backport criteria:
  - Fixes a real (though non-crashing) bug affecting users of RAS/CPER
    reporting in multi-GPU or multi-socket environments.
  - Small, localized change with clear intent and low regression risk.
  - No new features or architectural changes; adheres to stable rules.

- Practical note for backporting:
  - Backport to stable trees that already contain CPER generation for bad
    page threshold and the `smuio.get_socket_id` plumbing. Where
    `get_socket_id` is absent, the fallback keeps behavior identical to
    pre-fix.

 drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c
index 25252231a68a9..6c266f18c5981 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c
@@ -206,6 +206,7 @@ int amdgpu_cper_entry_fill_bad_page_threshold_section(struct amdgpu_device *adev
 {
 	struct cper_sec_desc *section_desc;
 	struct cper_sec_nonstd_err *section;
+	uint32_t socket_id;
 
 	section_desc = (struct cper_sec_desc *)((uint8_t *)hdr + SEC_DESC_OFFSET(idx));
 	section = (struct cper_sec_nonstd_err *)((uint8_t *)hdr +
@@ -224,6 +225,9 @@ int amdgpu_cper_entry_fill_bad_page_threshold_section(struct amdgpu_device *adev
 	section->ctx.reg_arr_size = sizeof(section->ctx.reg_dump);
 
 	/* Hardcoded Reg dump for bad page threshold CPER */
+	socket_id = (adev->smuio.funcs && adev->smuio.funcs->get_socket_id) ?
+				adev->smuio.funcs->get_socket_id(adev) :
+				0;
 	section->ctx.reg_dump[CPER_ACA_REG_CTL_LO] = 0x1;
 	section->ctx.reg_dump[CPER_ACA_REG_CTL_HI] = 0x0;
 	section->ctx.reg_dump[CPER_ACA_REG_STATUS_LO] = 0x137;
@@ -234,8 +238,8 @@ int amdgpu_cper_entry_fill_bad_page_threshold_section(struct amdgpu_device *adev
 	section->ctx.reg_dump[CPER_ACA_REG_MISC0_HI] = 0x0;
 	section->ctx.reg_dump[CPER_ACA_REG_CONFIG_LO] = 0x2;
 	section->ctx.reg_dump[CPER_ACA_REG_CONFIG_HI] = 0x1ff;
-	section->ctx.reg_dump[CPER_ACA_REG_IPID_LO] = 0x0;
-	section->ctx.reg_dump[CPER_ACA_REG_IPID_HI] = 0x96;
+	section->ctx.reg_dump[CPER_ACA_REG_IPID_LO] = (socket_id / 4) & 0x01;
+	section->ctx.reg_dump[CPER_ACA_REG_IPID_HI] = 0x096 | (((socket_id % 4) & 0x3) << 12);
 	section->ctx.reg_dump[CPER_ACA_REG_SYND_LO] = 0x0;
 	section->ctx.reg_dump[CPER_ACA_REG_SYND_HI] = 0x0;
 
-- 
2.51.0

[PATCH AUTOSEL 6.17] drm/amdgpu: Update IPID value for bad page threshold CPER
Sasha Levin
patches@lists.linux.dev, stable@vger.kernel.org
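
For readers who want to check the new IPID arithmetic by hand, the following is a minimal userspace sketch of the two expressions the patch introduces. It is not driver code: `struct ipid_words`, `encode_ipid()`, `get_socket_id_fn` and `socket_id_or_zero()` are hypothetical names invented here for illustration, and the callback type only imitates the "use the hook if present, else 0" fallback rather than the real `smuio` interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the two 32-bit IPID register-dump words. */
struct ipid_words {
	uint32_t lo;
	uint32_t hi;
};

/* Hypothetical stand-in for an optional get_socket_id style callback. */
typedef uint32_t (*get_socket_id_fn)(void *dev);

/* Fallback pattern: call the hook when it exists, otherwise report 0. */
static inline uint32_t socket_id_or_zero(get_socket_id_fn fn, void *dev)
{
	return fn ? fn(dev) : 0;
}

/*
 * Same expressions as the patched lines:
 *   IPID_LO = (socket_id / 4) & 0x01
 *   IPID_HI = 0x096 | (((socket_id % 4) & 0x3) << 12)
 */
static inline struct ipid_words encode_ipid(uint32_t socket_id)
{
	struct ipid_words w;

	w.lo = (socket_id / 4) & 0x01;
	w.hi = 0x096 | (((socket_id % 4) & 0x3) << 12);
	return w;
}
```

With these expressions, socket ID 0 reproduces the previously hardcoded values (`IPID_LO = 0x0`, `IPID_HI = 0x96`), which is why the fallback path is behavior-preserving. Socket ID 5, for example, yields `IPID_LO = 0x1` and `IPID_HI = 0x1096`. Only three bits of the ID are encoded in total, so in this sketch IDs 8 and above alias onto 0 through 7.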