00000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process qib/0 (pid: 1350, threadinfo ffff8803cdc6e000, task ffff88042728a100) Stack: 53544c5553455201 0000000100000005 0000000000000000 ffff8803d84ba000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000001 ffff8803cdc6fd30 ffffffffa0165d7a Call Trace: [] qib_make_rc_req+0x36a/0xe80 [ib_qib] [] ? qib_make_rc_req+0x0/0xe80 [ib_qib] [] qib_do_send+0xf3/0xb60 [ib_qib] [] ? thread_return+0x4e/0x777 [] ? qib_do_send+0x0/0xb60 [ib_qib] [] worker_thread+0x170/0x2a0 [] ? autoremove_wake_function+0x0/0x40 [] ? worker_thread+0x0/0x2a0 [] kthread+0x96/0xa0 [] child_rip+0xa/0x20 [] ? kthread+0x0/0xa0 [] ? child_rip+0x0/0x20 RIP [] qib_send_complete+0x3b/0x190 [ib_qib] The RC error state flush logic in qib_make_rc_req() could return all of the acked wqes and potentially have emptied the queue. It would then unconditionally try return a flush completion via qib_send_complete() for an invalid wqe, or worse a valid one that is not queued. The panic results when the completion code tries to maintain an MR reference count for a NULL MR. This fix modifies logic to only send one completion per qib_make_rc_req() call and changing the completion status from IB_WC_SUCCESS to IB_WC_WR_FLUSH_ERR as the completions progress. The outer loop will call as many times as necessary to flush the queue. Reviewed-by: Ram Vepa Signed-off-by: Mike Marciniszyn Signed-off-by: Roland Dreier ºAd@ž'x