he same physical function by mlx5_core's pci_error_handlers. On s390 this can be injected with > zpcictl --reset-fw Tests with this patch failed to reproduce that second deadlock situation, the devlink command is rejected with "kernel answers: Permission denied" - and we get a kernel log message of: Oct 14 13:32:39 b46lp03.lnxne.boe kernel: mlx5_core 1ed0:00:00.1: mlx5_crdump_collect:50:(pid 254382): crdump: failed to lock vsc gw err -5 because the config read of VSC_SEMAPHORE is rejected by the underlying hardware. Two prior attempts to address this issue have been discussed and ultimately rejected [see link], with the primary argument that s390's implementation of PCI error recovery is imposing restrictions that neither powerpc's EEH nor PCI AER handling need. Tests show that PCI error recovery on s390 is running to completion even without blocking access to PCI config space. Link: https://lore.kernel.org/all/20251007144826.2825134-1-gbayer@linux.ibm.com/ Fixes: 4cdf2f4e24ff ("s390/pci: implement minimal PCI error recovery") Signed-off-by: Gerd Bayer --- All, sorry for the immediate v2, but I had to rebase this to a current upstream commit since v1 didn't apply cleanly, as Niklas pointed out in private. The following assessment from v1 is still valid, though: Hi Niklas, Shay, Jason, by now I believe fixing this in s390/pci is the right way to go, since the other PCI error recovery implementations apparently don't require this strict blocking of accesses to the PCI config space. Hi Alexander, Vasily, Heiko, while I sent this to netdev since prior versions were discussed there, I assume this patch will go through the s390 tree, right? Thanks, Gerd --- arch/s390/pci/pci_event.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c index b95376041501f479eee20705d45fb8c68553da71..27db1e72c623f8a289cae457e87f0a9896ed241d 100644 --- a/arch/s390/pci/pci_event.c +++ b/arch/s390/pci/pci_event.c @@ -188,7 +188,7 @@ static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev) * is unbound or probed and that userspace can't access its * configuration space while we perform recovery. */ - pci_dev_lock(pdev); + device_lock(&pdev->dev); if (pdev->error_state == pci_channel_io_perm_failure) { ers_res = PCI_ERS_RESULT_DISCONNECT; goto out_unlock; @@ -257,7 +257,7 @@ static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev) driver->err_handler->resume(pdev); pci_uevent_ers(pdev, PCI_ERS_RESULT_RECOVERED); out_unlock: - pci_dev_unlock(pdev); + device_unlock(&pdev->dev); zpci_report_status(zdev, "recovery", status_str); return ers_res; --- base-commit: 9b332cece987ee1790b2ed4c989e28162fa47860 change-id: 20251015-fix_pcirecov_master-55fe3705c6c6 Best regards, -- Gerd Bayer [PATCH v2] s390/pci: Avoid deadlock between PCI error recovery and mlx5 crdumpGerd Bayer undefinedNiklas Schnelle , Gerald Schaefer , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Shay Drori , Jason Gunthorpe undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefinedU3