Bump UDS `listen()` backlog 1 -> 128 for multi-actor unreg
A backlog of 1 caused `ECONNREFUSED` when multiple sub-actors
simultaneously connect to deregister from a remote-daemon registrar.
Now matches the TCP transport's default backlog (~128).

Also,
- add cross-ref comments between `_uds.close_listener()` and
  `async_main()`'s `parent_is_reg` deregistration path explaining the
  UDS socket-file lifecycle

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

subint_spawner_backend
parent
cd287c7e93
commit
70dc60a199
@@ -300,7 +300,23 @@ async def start_listener(
     ):
         await sock.bind(str(bindpath))
 
-    sock.listen(1)
+    # NOTE, the backlog must be large enough to handle
+    # concurrent connection attempts during actor teardown.
+    # Previously this was `listen(1)` which caused
+    # deregistration failures in the remote-daemon registrar
+    # case: when multiple sub-actors simultaneously try to
+    # connect to deregister, a backlog of 1 overflows and
+    # connections get ECONNREFUSED. This matches the TCP
+    # transport which uses `trio.open_tcp_listeners()` with
+    # a default backlog of ~128.
+    #
+    # For details see the `close_listener()` below which
+    # `os.unlink()`s the socket file on teardown — meaning
+    # any NEW connection attempts after that point will fail
+    # with `FileNotFoundError` regardless of backlog size.
+    # The backlog only matters while the listener is alive
+    # and accepting.
+    sock.listen(128)
     log.info(
         f'Listening on UDS socket\n'
         f'[>\n'
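The effect of the backlog can be sketched with the stdlib `socket` module alone. This is an illustration, not the project's code: the function name `connect_without_accept` and the socket path are made up for the example, and the clients are set non-blocking on the assumption that this is closest to how an async runtime drives its sockets. The server never calls `accept()`, so every successful `connect()` has to be held in the listen queue.

```python
import os
import socket
import tempfile


def connect_without_accept(n_clients: int, backlog: int) -> int:
    """Return how many clients connect while the server never accepts."""
    connected = 0
    clients = []
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.sock")  # hypothetical socket path
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            server.bind(path)
            server.listen(backlog)
            for _ in range(n_clients):
                client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
                client.setblocking(False)
                clients.append(client)
                try:
                    # Succeeds immediately only if the listen queue has room.
                    client.connect(path)
                    connected += 1
                except OSError:
                    # Queue overflow; the exact errno is platform dependent.
                    pass
        finally:
            for client in clients:
                client.close()
            server.close()
            os.unlink(path)
    return connected


if __name__ == "__main__":
    print("backlog=128:", connect_without_accept(8, 128))
    print("backlog=1:  ", connect_without_accept(8, 1))
```

With a backlog of 128, all eight clients should be queued. With a backlog of 1, some of them are expected to be rejected, but the exact count and error code depend on the platform, so the sketch does not state one.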
@@ -316,6 +332,16 @@ def close_listener(
     '''
     Close and remove the listening unix socket's path.
 
+    NOTE, the `os.unlink()` here removes the socket file from
+    the filesystem immediately, which means any subsequent
+    connection attempts (e.g. sub-actors trying to deregister
+    with a registrar whose listener is tearing down) will fail
+    with `FileNotFoundError`. For the local-registrar case
+    (parent IS the registrar), `_runtime.async_main()` works
+    around this by reusing the existing `_parent_chan` instead
+    of opening a new connection; see the `parent_is_reg` logic
+    in the deregistration path.
+
     '''
     lstnr.socket.close()
     os.unlink(addr.sockpath)
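The socket-file lifecycle described in the docstring can be reproduced with plain stdlib sockets. This is a minimal sketch, not the project's code: `unlink_demo` and the socket path are hypothetical names, and unlike `close_listener()` it keeps the listening socket open while unlinking, to isolate the effect of removing the path. It checks two things: a new `connect()` by path fails with `FileNotFoundError`, and a connection established before the unlink still carries data.

```python
import os
import socket
import tempfile


def unlink_demo() -> tuple[bool, bytes]:
    """Return (new_connect_failed, data_received_over_old_connection)."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.sock")  # hypothetical socket path
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(128)

            # A connection established BEFORE the socket file is removed.
            early = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            early.connect(path)
            peer, _ = server.accept()

            # Remove the filesystem entry, as teardown does.
            os.unlink(path)

            # Any NEW connection by path now has nothing to resolve.
            late = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            try:
                late.connect(path)
                new_connect_failed = False
            except FileNotFoundError:
                new_connect_failed = True
            finally:
                late.close()

            # The pre-existing connection is unaffected by the unlink.
            early.sendall(b"unregister")
            data = peer.recv(1024)

            early.close()
            peer.close()
    return new_connect_failed, data


if __name__ == "__main__":
    print(unlink_demo())
```

This is why reusing an already-open channel sidesteps the problem: the failure is in resolving the path, not in the established connection.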
@@ -1848,11 +1848,17 @@ async def async_main(
     failed_unreg: bool = False
     rent_chan: Channel|None = actor._parent_chan
 
-    # XXX check if the parent IS the registrar so we can
-    # reuse the existing channel (avoids opening a new
-    # connection which fails when the listener socket is
-    # already closed, e.g. UDS transport unlinks the socket
-    # file during teardown).
+    # XXX check if the parent IS the registrar so we
+    # can reuse the existing `_parent_chan` (avoids
+    # opening a new connection which fails when the
+    # listener socket is already closed, e.g. UDS
+    # transport `os.unlink()`s the socket file during
+    # teardown).
+    #
+    # See `ipc._uds.close_listener()` for details on
+    # the UDS socket-file lifecycle and why this
+    # optimization is necessary for the local-registrar
+    # case.
     parent_is_reg: bool = False
     if (
         rent_chan is not None
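The channel-reuse idea in the comment above can be sketched in isolation. Everything here is hypothetical: `FakeChan`, `pick_unreg_chan`, and the address comparison are stand-ins, since the hunk is cut off before the real condition and the actual `parent_is_reg` check may differ. The point is only the shape of the decision: reuse the parent channel when the parent is the registrar, otherwise open a new connection.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class FakeChan:
    """Hypothetical stand-in for an IPC channel (not the real `Channel`)."""
    peer_addr: str


def pick_unreg_chan(
    parent_chan: Optional[FakeChan],
    registrar_addrs: Sequence[str],
    open_new: Callable[[str], FakeChan],
) -> tuple[FakeChan, bool]:
    """Return (channel_to_use, parent_is_reg).

    ASSUMPTION: "parent is the registrar" is modelled here as the
    parent's peer address appearing in the registrar address list.
    """
    parent_is_reg = (
        parent_chan is not None
        and parent_chan.peer_addr in registrar_addrs
    )
    if parent_is_reg:
        # Reuse the live channel; no new connect(), so an already
        # unlinked socket file cannot cause a failure here.
        return parent_chan, True
    # Remote registrar: a fresh connection is still required.
    return open_new(registrar_addrs[0]), False


if __name__ == "__main__":
    opened: list[str] = []

    def open_new(addr: str) -> FakeChan:
        opened.append(addr)
        return FakeChan(addr)

    print(pick_unreg_chan(FakeChan("/tmp/reg.sock"), ["/tmp/reg.sock"], open_new))
    print(pick_unreg_chan(FakeChan("/tmp/rent.sock"), ["/tmp/reg.sock"], open_new))
    print(opened)
```

In the remote-registrar branch a new connection is still opened, which is the case the larger `listen()` backlog is meant to cover.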