gh-148820: Fix _PyRawMutex use-after-free on spurious semaphore wakeup#148852
gh-148820: Fix _PyRawMutex use-after-free on spurious semaphore wakeup#148852colesbury merged 1 commit intopython:mainfrom
Conversation
… wakeup _PyRawMutex_UnlockSlow CAS-removes the waiter from the list and then calls _PySemaphore_Wakeup, with no handshake. If _PySemaphore_Wait returns Py_PARK_INTR, the waiter can destroy its stack-allocated semaphore before the unlocker's Wakeup runs, causing a fatal error from ReleaseSemaphore / sem_post. Loop in _PyRawMutex_LockSlow until _PySemaphore_Wait returns Py_PARK_OK, which is only signalled when a matching Wakeup has been observed. Also include GetLastError() and the handle in the Windows fatal messages in _PySemaphore_Init, _PySemaphore_Wait, and _PySemaphore_Wakeup to make similar races easier to diagnose in the future.
|
I played around with adding a regression test for this, but I think it's trick to get right. You have to raise a SIGINT at just the right point. However, if you raise SIGINT too frequently you can cause a stack overflow in the main thread due to Python signal handlers being called within Python signal handlers... |
|
Thanks @colesbury for the PR 🌮🎉.. I'm working now to backport this PR to: 3.13, 3.14. |
|
Sorry, @colesbury, I could not cleanly backport this to |
|
Sorry, @colesbury, I could not cleanly backport this to |
|
GH-148884 is a backport of this pull request to the 3.14 branch. |
|
GH-148885 is a backport of this pull request to the 3.13 branch. |
…e wakeup (gh-148852) (#148885) _PyRawMutex_UnlockSlow CAS-removes the waiter from the list and then calls _PySemaphore_Wakeup, with no handshake. If _PySemaphore_Wait returns Py_PARK_INTR, the waiter can destroy its stack-allocated semaphore before the unlocker's Wakeup runs, causing a fatal error from ReleaseSemaphore / sem_post. Loop in _PyRawMutex_LockSlow until _PySemaphore_Wait returns Py_PARK_OK, which is only signalled when a matching Wakeup has been observed. Also include GetLastError() and the handle in the Windows fatal messages in _PySemaphore_Init, _PySemaphore_Wait, and _PySemaphore_Wakeup to make similar races easier to diagnose in the future. (cherry picked from commit ad3c5b7)
…e wakeup (gh-148852) (#148884) _PyRawMutex_UnlockSlow CAS-removes the waiter from the list and then calls _PySemaphore_Wakeup, with no handshake. If _PySemaphore_Wait returns Py_PARK_INTR, the waiter can destroy its stack-allocated semaphore before the unlocker's Wakeup runs, causing a fatal error from ReleaseSemaphore / sem_post. Loop in _PyRawMutex_LockSlow until _PySemaphore_Wait returns Py_PARK_OK, which is only signalled when a matching Wakeup has been observed. Also include GetLastError() and the handle in the Windows fatal messages in _PySemaphore_Init, _PySemaphore_Wait, and _PySemaphore_Wakeup to make similar races easier to diagnose in the future. (cherry picked from commit ad3c5b7)
_PyRawMutex_UnlockSlow CAS-removes the waiter from the list and then calls _PySemaphore_Wakeup, with no handshake. If _PySemaphore_Wait returns Py_PARK_INTR, the waiter can destroy its stack-allocated semaphore before the unlocker's Wakeup runs, causing a fatal error from ReleaseSemaphore / sem_post.
Loop in _PyRawMutex_LockSlow until _PySemaphore_Wait returns Py_PARK_OK, which is only signalled when a matching Wakeup has been observed.
Also include GetLastError() and the handle in the Windows fatal messages in _PySemaphore_Init, _PySemaphore_Wait, and _PySemaphore_Wakeup to make similar races easier to diagnose in the future.