Content-Length: 349589 | pFad | https://github.com/python/cpython/pull/127051

35 gh-127049: Fix `asyncio.subprocess.Process` race condition killing an unrelated process on Unix by gschaffner · Pull Request #127051 · python/cpython · GitHub
Skip to content

gh-127049: Fix asyncio.subprocess.Process race condition killing an unrelated process on Unix#127051

Draft
gschaffner wants to merge 2 commits into
python:mainfrom
gschaffner:fix-asyncio-kill-race
Draft

gh-127049: Fix asyncio.subprocess.Process race condition killing an unrelated process on Unix#127051
gschaffner wants to merge 2 commits into
python:mainfrom
gschaffner:fix-asyncio-kill-race

Conversation

@gschaffner

Copy link
Copy Markdown

Proposal that fixes GH-127049. The idea is to revert GH-121126 and re-fix GH-87744 in a way that doesn't introduce GH-127049. The proposed approach here is to change child watchers to behave closer to their analog on Windows (_WindowsSubprocessTransport), i.e. let subprocess handle the PID lifetime management instead of having subprocess do the allocation and asyncio do the deallocation, i.e. follow the guidance of #86724 (comment).

In the _ThreadedChildWatcher case this borrows the waitid + WNOWAIT idea from Trio. (See also #82811 (comment).) Note that in the _ThreadedChildWatcher case, while it would ideally be safe to just call Popen.wait in the thread instead of involving waitid + WNOWAIT, Popen currently has some thread-unsafeties, so running Popen.wait or Popen.poll in the thread in practice resulted in unsafe kill calls and broken transport._process_exited(returncode=None) calls. See the longer commit message.

I have marked this as a draft because it does not yet have regression tests. Both cases (_PidfdChildWatcher and _ThreadedChildWatcher) can have (nearly-)deterministic (but consistent regardless) regression tests, but I am not sure how folks would want them implemented, as the reproducers I put together for the report involve monkeypatching os.waitpid and os.kill (which is nontrivial in part because subprocess holds strong references to os.waitpid, and which will miss any os.waitid calls made without WNOWAIT unless we patch that too).

Anyway, if this PR is pursued, I'd appreciate some help with adding tests. This is also my first PR proposed to CPython so any feedback/pointers are appreciated. :)

…_signal in asyncio (python#121126)"

This reverts the non-test changes of commit bd473aa.
Note that we call Popen.poll/wait in the event loop thread to avoid
Popen's thread-unsafety. Without this workaround, when testing
_ThreadedChildWatcher with the reproducer shared in the linked issue on my
machine:
* Case 1 of the known race condition ignored in pythonGH-20010 is met (and an
  unsafe kill call is issued) about 1 in 10 times.
* The race condition pythonGH-127050 is met (causing _process_exited's assert
  returncode is not None to raise) about 1 in 30 times.
@ghost

ghost commented Nov 20, 2024

Copy link
Copy Markdown

All commit authors signed the Contributor License Agreement.
CLA signed

@github-actions

Copy link
Copy Markdown

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions Bot added the stale Stale PR or inactive for long period of time. label Apr 19, 2026
@gschaffner

Copy link
Copy Markdown
Author

The bug is still present on main (5673748), the reproducer still reproduces the bug, and this patchset still works. I would not consider this to be stale, just low-priority.

@github-actions github-actions Bot removed the stale Stale PR or inactive for long period of time. label May 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants









ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: https://github.com/python/cpython/pull/127051

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy