000000000000 DR2: 0000000000000000
> [ 11.284048] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 11.284337] Process swapper (pid: 0, threadinfo ffff88027f1f2000, task ffff88027f1f0640)
> [ 11.284936] Stack: ffffffff80250963 0000000000000212 0000000000ee8c78 0000000000ee8a66
> [ 11.285802] ffff88027e571550 ffff88027f1f7fa8 ffffffff8021adb5 ffff88027f1f3e40
> [ 11.286599] ffffffff8020bdd6 ffff88027f1f3e40 ffff88027f1f3ef8 0000000000000000
> [ 11.287120] Call Trace:
> [ 11.287768] [] ? generic_smp_call_function_interrupt+0x61/0x12c
> [ 11.288354] [] smp_call_function_interrupt+0x17/0x27
> [ 11.288744] [] call_function_interrupt+0x66/0x70
> [ 11.289030] [] ? clockevents_notify+0x19/0x73
> [ 11.289380] [] ? acpi_idle_enter_simple+0x18b/0x1fa
> [ 11.289760] [] ? acpi_idle_enter_simple+0x181/0x1fa
> [ 11.290051] [] ? cpuidle_idle_call+0x70/0xa2
> [ 11.290338] [] ? cpu_idle+0x5f/0x7d
> [ 11.290723] [] ? start_secondary+0x14d/0x152
> [ 11.291010]
> [ 11.291287]
> [ 11.291654] Code: Bad RIP value.
> [ 11.292041] RIP [] 0xffff8802ffffffff
> [ 11.292380] RSP
> [ 11.292741] CR2: ffff8802ffffffff
> [ 11.310951] ---[ end trace 137c54d525305f1c ]---
>
> The problem is with the following sequence of events:
>
> - CPU A calls smp_call_function_mask() for CPU B with wait parameter
> - CPU A sets up the call_function_data on the stack and does an rcu add to
>   call_function_queue
> - CPU A waits until the WAIT flag is cleared
> - CPU B gets the call function interrupt and starts going through the
>   call_function_queue
> - CPU C also gets some other call function interrupt and starts going through
>   the call_function_queue
> - CPU C, which is also going through the call_function_queue, starts referencing
>   CPU A's stack, as that element is still in call_function_queue
> - CPU B finishes the function call that CPU A set up and, as there are no other
>   references to it, rcu deletes the call_function_data (which was from CPU A's
>   stack)
> - CPU B sees the wait flag and just clears the flag (no call_rcu to free)
> - CPU A, which was waiting on the flag, continues executing and the stack
>   contents change
>
> - CPU C, still in its rcu_read section and accessing CPU A's stack, sees
>   inconsistent call_function_data and can try to execute the
>   function with some random pointer, causing stack corruption for A
>   (by clearing the bits in the mask field) and an oops.

Nice debugging work.

I'd suggest something like the attached (boot tested) patch as the simple fix
for now.

I expect the benefits from the less synchronized, multiple-in-flight-data
global queue will still outweigh the costs of dynamic allocations. But if
worst comes to worst then we just go back to a globally synchronous
one-at-a-time implementation, which would be pretty sad!

Signed-off-by: Ingo Molnar
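
For illustration only, here is a minimal userspace sketch of why dynamically allocated call data avoids the race described above. It is not the attached patch and it is not kernel code: `struct call_data`, `call_data_alloc()`, `call_data_get()`, `call_data_put()` and `replay_race()` are hypothetical names, and a plain reference count stands in for the RCU grace period. It replays the CPU A / B / C interleaving in a single thread and checks that the element CPU C is still reading stays valid after CPU A has stopped waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's call_function_data. */
struct call_data {
    void (*func)(void *info);
    void *info;
    int  users;  /* outstanding references (stands in for RCU readers) */
    bool wait;   /* the sender blocks until this is cleared */
};

static int freed_count;  /* how many elements have been freed so far */

/* Allocate on the heap, not on the sender's stack; the sender owns one ref. */
static struct call_data *call_data_alloc(void (*func)(void *), void *info,
                                         bool wait)
{
    struct call_data *d = malloc(sizeof(*d));

    if (!d)
        return NULL;
    d->func = func;
    d->info = info;
    d->users = 1;
    d->wait = wait;
    return d;
}

static void call_data_get(struct call_data *d)
{
    d->users++;
}

/* Drop one reference; free only when the last user is gone. */
static bool call_data_put(struct call_data *d)
{
    assert(d->users > 0);
    if (--d->users > 0)
        return false;
    free(d);
    freed_count++;
    return true;
}

static void bump(void *info)
{
    ++*(int *)info;
}

/* Replays the interleaving from the report; returns how often func ran,
 * or -1 if the element was freed too early or allocation failed. */
static int replay_race(void)
{
    int calls = 0;
    struct call_data *d = call_data_alloc(bump, &calls, true); /* CPU A */

    if (!d)
        return -1;
    call_data_get(d);                /* CPU A: the queue holds a reference */
    call_data_get(d);                /* CPU C: starts walking the queue */

    d->func(d->info);                /* CPU B: runs the requested function */
    d->wait = false;                 /* CPU B: releases the waiting sender */
    bool early1 = call_data_put(d);  /* CPU B: drops the queue's reference */

    bool early2 = call_data_put(d);  /* CPU A: done waiting, returns */

    bool valid = (d->func == bump);  /* CPU C: element is still intact */
    bool last = call_data_put(d);    /* CPU C: last reader frees it */

    return (!early1 && !early2 && valid && last) ? calls : -1;
}
```

Calling `replay_race()` from a small `main()` exercises the scenario. The point of the design is that the element's lifetime is tied to its last reader rather than to CPU A's stack frame, at the cost of one allocation per call.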