Bug report
Bug description:
cpython/Lib/multiprocessing/popen_spawn_posix.py, lines 51 to 72 in a705c1e:

    parent_r = child_w = child_r = parent_w = None
    try:
        parent_r, child_w = os.pipe()
        child_r, parent_w = os.pipe()
        cmd = spawn.get_command_line(tracker_fd=tracker_fd,
                                     pipe_handle=child_r)
        self._fds.extend([child_r, child_w])
        self.pid = util.spawnv_passfds(spawn.get_executable(),
                                       cmd, self._fds)
        self.sentinel = parent_r
        with open(parent_w, 'wb', closefd=False) as f:
            f.write(fp.getbuffer())
    finally:
        fds_to_close = []
        for fd in (parent_r, parent_w):
            if fd is not None:
                fds_to_close.append(fd)
        self.finalizer = util.Finalize(self, util.close_fds, fds_to_close)

        for fd in (child_r, child_w):
            if fd is not None:
                os.close(fd)

What happens here:
- Two pipes are opened.
- The child process is spawned, and the child fds (child_r, child_w) are passed to it.
- The parent writes into the parent fd (parent_w).
- Only then, in the finally block, are the child fds closed in the parent.
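This order is wrong and can lead to hangs when the child process crashes in between for whatever reason. Specifically, when the child crashes, the parent hangs while trying to write into the parent fd, at the line `f.write(fp.getbuffer())` (line 62 of cpython/Lib/multiprocessing/popen_spawn_posix.py at a705c1e). The parent still holds child_r open at that point, so the pipe keeps a reader even after the child is gone; the write does not fail with a broken pipe and instead blocks once the pipe buffer is full.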
(I have this case because of some unpickling error happening in the child, but that's not really the point of the issue here.)
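The blocking behaviour can be illustrated in isolation. The following is only a minimal sketch, not the multiprocessing code: the function name `write_until_stuck` is made up for illustration, and the write end is switched to non-blocking mode so that the point where a blocking write would hang shows up as an exception instead.

```python
import os


def write_until_stuck() -> str:
    """Fill a pipe whose read end is still open in this process."""
    r, w = os.pipe()
    # Non-blocking mode turns "would hang forever" into BlockingIOError,
    # so the sketch terminates. The real code uses a blocking write.
    os.set_blocking(w, False)
    try:
        chunk = b"\0" * 4096
        while True:
            os.write(w, chunk)
    except BlockingIOError:
        # The pipe buffer is full and nobody reads from it.
        return "would block"
    except BrokenPipeError:
        # Not reached here: a reader (our own copy of r) still exists.
        return "broken pipe"
    finally:
        os.close(r)
        os.close(w)
```

As long as this process keeps `r` open, the kernel sees a live reader, so the writer never gets a broken-pipe error. That is the situation the parent is in while it still holds child_r.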
The fix should be easy: Add extra code to close the child fds, right after the spawn. This is also the standard pattern for this kind of code.
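A rough sketch of that pattern, under stated assumptions: `subprocess.Popen` with `pass_fds` stands in for `util.spawnv_passfds`, the child simply exits with an error to simulate a crash, and the function name `write_to_crashed_child` is hypothetical. It is not the actual patch, only an illustration of closing the child end right after the spawn.

```python
import os
import subprocess
import sys


def write_to_crashed_child(payload: bytes) -> str:
    """Spawn a child that dies without reading, then write to it."""
    child_r, parent_w = os.pipe()
    try:
        try:
            # Stand-in for util.spawnv_passfds: the child inherits
            # child_r and exits at once, simulating a startup crash.
            proc = subprocess.Popen(
                [sys.executable, "-c", "import sys; sys.exit(1)"],
                pass_fds=[child_r],
            )
        finally:
            # Close the parent's copy of the child end right after the
            # spawn, whether or not the spawn succeeded.
            os.close(child_r)
        # Waiting here only makes the sketch deterministic.
        proc.wait()
        with open(parent_w, "wb", closefd=False) as f:
            f.write(payload)
        return "written"
    except BrokenPipeError:
        # No reader is left, so the write fails instead of hanging.
        return "broken pipe"
    finally:
        os.close(parent_w)
```

Called with a payload larger than the pipe buffer, for example `write_to_crashed_child(b"x" * (1 << 20))`, the write fails with `BrokenPipeError`, which the caller can handle. If the `os.close(child_r)` were moved after the write, as in the current ordering, the same call would block.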
Also reported here: rwth-i6/returnn#1514
CPython versions tested on:
CPython main branch, CPython 3.11
Operating systems tested on:
Linux, Other