rad:z3gqcJUoA1n9HaHKufZs5FCSGazv5 heartwood d513158566da0f4d942206ea6503c0e317f23612
{
"request": "trigger",
"version": 1,
"event_type": "patch",
"repository": {
"id": "rad:z3gqcJUoA1n9HaHKufZs5FCSGazv5",
"name": "heartwood",
"description": "Radicle Heartwood Protocol & Stack",
"private": false,
"default_branch": "master",
"delegates": [
"did:key:z6MksFqXN3Yhqk8pTJdUGLwATkRfQvwZXPqR2qMEhbS9wzpT",
"did:key:z6MktaNvN1KVFMkSRAiN4qK5yvX1zuEEaseeX5sffhzPZRZW",
"did:key:z6MkireRatUThvd3qzfKht1S44wpm4FEWSSa4PRMTSQZ3voM",
"did:key:z6MkgFq6z5fkF2hioLLSNu1zP2qEL1aHXHZzGH1FLFGAnBGz",
"did:key:z6MkkPvBfjP4bQmco5Dm7UGsX2ruDBieEHi8n9DVJWX5sTEz"
]
},
"action": "Updated",
"patch": {
"id": "9e557d1ad972dab998c920a82122de8e8b921c73",
"author": {
"id": "did:key:z6MktwkohCx8aHZ1QCjVZUiLmX92oPZFxRiFZkbq32Tk5Tkm",
"alias": "2color"
},
"title": "node/wire: Always report fetch results to the service",
"state": {
"status": "open",
"conflicts": []
},
"before": "07f748475beacd41463ee5ebc0d7a93539ab8f55",
"after": "d513158566da0f4d942206ea6503c0e317f23612",
"commits": [
"d513158566da0f4d942206ea6503c0e317f23612",
"2ad6cc40f14f79b3769dc00b8e8840ce2de1a264",
"1083e2f0dd02a08e644d6779878b9a887aff38fe",
"ff9f9f6bee3caca9b514255e32b25ea1440ad372",
"e86272264ff5aa728b411a4bdb2ae5a277077b23",
"2ad59beb7306ee77ba82952171dffa0323d04563"
],
"target": "07f748475beacd41463ee5ebc0d7a93539ab8f55",
"labels": [],
"assignees": [],
"revisions": [
{
"id": "9e557d1ad972dab998c920a82122de8e8b921c73",
"author": {
"id": "did:key:z6MktwkohCx8aHZ1QCjVZUiLmX92oPZFxRiFZkbq32Tk5Tkm",
"alias": "2color"
},
"description": "A fetch result was discarded whenever the peer was no longer `Connected`\nby the time the worker reported, and queued fetches were dropped silently\nwhen the peer disconnected before the `Io::Fetch` was processed. In both\ncases the fetcher's `active` entry for the repo was never cleared. Since\nthat entry is keyed by repo, it blocked the repository from being fetched\nfrom any node until the process restarted.\n\n- Report the result in `worker_result` even when the peer is no longer\n connected, instead of returning early\n- Report a failed fetch when an `Io::Fetch` is dropped for a disconnected\n peer, so the active entry is cleared",
"base": "07f748475beacd41463ee5ebc0d7a93539ab8f55",
"oid": "2ad6cc40f14f79b3769dc00b8e8840ce2de1a264",
"timestamp": 1781902413
},
{
"id": "b3845fd2dcbbe292ae5ec764521c225414f37676",
"author": {
"id": "did:key:z6MktwkohCx8aHZ1QCjVZUiLmX92oPZFxRiFZkbq32Tk5Tkm",
"alias": "2color"
},
"description": "fetcher: Guard fetched against stale node results\n\n- Clear active[rid] only when the result's node matches the node that\n started the fetch; mismatched results now report NotFound\n- Prevent a late completion from a disconnected peer clearing a newer\n fetch for the same repo started by a different node\n- Add test covering the stale-from mismatch path",
"base": "07f748475beacd41463ee5ebc0d7a93539ab8f55",
"oid": "44e7cbded222bc0bd45ba246eff793ba2188d152",
"timestamp": 1781904439
},
{
"id": "4baa4f9815c32935c5f08cae1c4737da8783947c",
"author": {
"id": "did:key:z6MktwkohCx8aHZ1QCjVZUiLmX92oPZFxRiFZkbq32Tk5Tkm",
"alias": "2color"
},
"description": "",
"base": "07f748475beacd41463ee5ebc0d7a93539ab8f55",
"oid": "d513158566da0f4d942206ea6503c0e317f23612",
"timestamp": 1781904594
}
]
}
}
{
"response": "triggered",
"run_id": {
"id": "f28af52f-2ef0-4870-86bb-0e96682ed663"
},
"info_url": "https://cci.rad.levitte.org//f28af52f-2ef0-4870-86bb-0e96682ed663.html"
}
Started at: 2026-06-19 23:31:13.727761+02:00
Commands:
$ rad clone rad:z3gqcJUoA1n9HaHKufZs5FCSGazv5 .
✓ Creating checkout in ./...
✓ Remote cloudhead@z6MksFqXN3Yhqk8pTJdUGLwATkRfQvwZXPqR2qMEhbS9wzpT added
✓ Remote-tracking branch cloudhead@z6MksFqXN3Yhqk8pTJdUGLwATkRfQvwZXPqR2qMEhbS9wzpT/master created for z6MksFqXN3Yhqk8pTJdUGLwATkRfQvwZXPqR2qMEhbS9wzpT
✓ Remote cloudhead@z6MktaNvN1KVFMkSRAiN4qK5yvX1zuEEaseeX5sffhzPZRZW added
✓ Remote-tracking branch cloudhead@z6MktaNvN1KVFMkSRAiN4qK5yvX1zuEEaseeX5sffhzPZRZW/master created for z6MktaNvN1KVFMkSRAiN4qK5yvX1zuEEaseeX5sffhzPZRZW
✓ Remote fintohaps@z6MkireRatUThvd3qzfKht1S44wpm4FEWSSa4PRMTSQZ3voM added
✓ Remote-tracking branch fintohaps@z6MkireRatUThvd3qzfKht1S44wpm4FEWSSa4PRMTSQZ3voM/master created for z6MkireRatUThvd3qzfKht1S44wpm4FEWSSa4PRMTSQZ3voM
✓ Remote erikli@z6MkgFq6z5fkF2hioLLSNu1zP2qEL1aHXHZzGH1FLFGAnBGz added
✓ Remote-tracking branch erikli@z6MkgFq6z5fkF2hioLLSNu1zP2qEL1aHXHZzGH1FLFGAnBGz/master created for z6MkgFq6z5fkF2hioLLSNu1zP2qEL1aHXHZzGH1FLFGAnBGz
✓ Remote lorenz@z6MkkPvBfjP4bQmco5Dm7UGsX2ruDBieEHi8n9DVJWX5sTEz added
✓ Remote-tracking branch lorenz@z6MkkPvBfjP4bQmco5Dm7UGsX2ruDBieEHi8n9DVJWX5sTEz/master created for z6MkkPvBfjP4bQmco5Dm7UGsX2ruDBieEHi8n9DVJWX5sTEz
✓ Repository successfully cloned under /opt/radcis/ci.rad.levitte.org/cci/state/f28af52f-2ef0-4870-86bb-0e96682ed663/w/
╭────────────────────────────────────╮
│ heartwood │
│ Radicle Heartwood Protocol & Stack │
│ 186 issues · 41 patches │
╰────────────────────────────────────╯
Run `cd ./.` to go to the repository directory.
Exit code: 0
$ rad patch checkout 9e557d1ad972dab998c920a82122de8e8b921c73
✓ Switched to branch patch/9e557d1 at revision 4baa4f9
✓ Branch patch/9e557d1 setup to track rad/patches/9e557d1ad972dab998c920a82122de8e8b921c73
Exit code: 0
$ git config advice.detachedHead false
Exit code: 0
$ git checkout d513158566da0f4d942206ea6503c0e317f23612
HEAD is now at d5131585 fetcher: Guard fetched against stale node results
Exit code: 0
$ rad patch show 9e557d1ad972dab998c920a82122de8e8b921c73 -p
╭────────────────────────────────────────────────────────────────────────────────╮
│ Title node/wire: Always report fetch results to the service │
│ Patch 9e557d1ad972dab998c920a82122de8e8b921c73 │
│ Author 2color z6Mktwk…2Tk5Tkm │
│ Head d513158566da0f4d942206ea6503c0e317f23612 │
│ Base 07f748475beacd41463ee5ebc0d7a93539ab8f55 │
│ Branches patch/9e557d1 │
│ Commits ahead 6, behind 0 │
│ Status open │
│ │
│ A fetch result was discarded whenever the peer was no longer `Connected` │
│ by the time the worker reported, and queued fetches were dropped silently │
│ when the peer disconnected before the `Io::Fetch` was processed. In both │
│ cases the fetcher's `active` entry for the repo was never cleared. Since │
│ that entry is keyed by repo, it blocked the repository from being fetched │
│ from any node until the process restarted. │
│ │
│ - Report the result in `worker_result` even when the peer is no longer │
│ connected, instead of returning early │
│ - Report a failed fetch when an `Io::Fetch` is dropped for a disconnected │
│ peer, so the active entry is cleared │
├────────────────────────────────────────────────────────────────────────────────┤
│ d513158 fetcher: Guard fetched against stale node results │
│ 2ad6cc4 node/wire: Always report fetch results to the service │
│ 1083e2f test: Add connected contrast for worker_result │
│ ff9f9f6 test: Prove worker_result discards fetch on disconnect │
│ e862722 test: Reproduce orphaned fetch at service layer │
│ 2ad59be test: Add fetcher orphaned-active invariant │
├────────────────────────────────────────────────────────────────────────────────┤
│ ● Revision 9e557d1 @ 07f7484..2ad6cc4 by 2color z6Mktwk…2Tk5Tkm 37 minutes ago │
│ ↑ Revision b3845fd @ 07f7484..44e7cbd by 2color z6Mktwk…2Tk5Tkm 3 minutes ago │
│ ↑ Revision 4baa4f9 @ 07f7484..d513158 by 2color z6Mktwk…2Tk5Tkm 1 minute ago │
╰────────────────────────────────────────────────────────────────────────────────╯
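Note on the failure mode described in the patch summary above: the blocking follows from the `active` map being keyed by repository. The following is a minimal sketch of that behaviour, not the real `FetcherState`. `ToyFetcher`, `FetchEvent`, and the integer `RepoId`/`NodeId` aliases are hypothetical stand-ins, and the single shared queue simplifies the real fetcher's queueing.

```rust
use std::collections::{HashMap, VecDeque};

// Hypothetical stand-ins for the real `RepoId` and `NodeId` types.
type RepoId = u32;
type NodeId = u32;

#[derive(Debug, PartialEq)]
enum FetchEvent {
    Started,
    Queued,
}

#[derive(Default)]
struct ToyFetcher {
    // At most one in-flight fetch per repository, keyed by repo.
    active: HashMap<RepoId, NodeId>,
    // Fetches waiting for their repository to become free.
    queue: VecDeque<(RepoId, NodeId)>,
}

impl ToyFetcher {
    // Start a fetch, or queue it if the repository already has an active entry.
    fn fetch(&mut self, rid: RepoId, from: NodeId) -> FetchEvent {
        if self.active.contains_key(&rid) {
            self.queue.push_back((rid, from));
            FetchEvent::Queued
        } else {
            self.active.insert(rid, from);
            FetchEvent::Started
        }
    }

    // Report a result (success or failure). Returns whether an entry was cleared.
    fn fetched(&mut self, rid: RepoId) -> bool {
        self.active.remove(&rid).is_some()
    }

    // Start the first queued fetch whose repository has no active entry.
    fn dequeue(&mut self) -> Option<(RepoId, NodeId)> {
        let pos = self
            .queue
            .iter()
            .position(|(rid, _)| !self.active.contains_key(rid))?;
        let (rid, from) = self.queue.remove(pos)?;
        self.active.insert(rid, from);
        Some((rid, from))
    }
}
```

In this model, if `fetched` is never called for a repository, every later `fetch` for it is queued and `dequeue` never returns it, whichever node offers it.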
commit d513158566da0f4d942206ea6503c0e317f23612
Author: Daniel Norman <daniel@norman.life>
Date: Fri Jun 19 23:26:39 2026 +0200
fetcher: Guard fetched against stale node results
- Clear active[rid] only when the result's node matches the node that
started the fetch; mismatched results now report NotFound
- Prevent a late completion from a disconnected peer clearing a newer
fetch for the same repo started by a different node
- Add test covering the stale-from mismatch path
diff --git a/crates/radicle-protocol/src/fetcher/state.rs b/crates/radicle-protocol/src/fetcher/state.rs
index cc9ad2b26..071612ff0 100644
--- a/crates/radicle-protocol/src/fetcher/state.rs
+++ b/crates/radicle-protocol/src/fetcher/state.rs
@@ -210,9 +210,20 @@ impl FetcherState {
///
/// [`Fetched`]: command::Fetched
pub fn fetched(&mut self, command::Fetched { from, rid }: command::Fetched) -> event::Fetched {
- match self.active.remove(&rid) {
- None => event::Fetched::NotFound { from, rid },
- Some(ActiveFetch { from, refs }) => event::Fetched::Completed { from, rid, refs },
+ // Only the node that started the active fetch may complete it. A result from any
+ // other node is stale — e.g. a late completion delivered after the peer disconnected
+ // and the repo was re-fetched from elsewhere. Clearing the entry on a mismatch would
+ // drop the wrong (newer) fetch, so we leave it and report `NotFound`.
+ match self.active.get(&rid) {
+ Some(ActiveFetch { from: active_from, .. }) if *active_from == from => {
+ match self.active.remove(&rid) {
+ Some(ActiveFetch { from, refs }) => {
+ event::Fetched::Completed { from, rid, refs }
+ }
+ None => event::Fetched::NotFound { from, rid },
+ }
+ }
+ _ => event::Fetched::NotFound { from, rid },
}
}
diff --git a/crates/radicle-protocol/src/fetcher/test/state/command/fetched.rs b/crates/radicle-protocol/src/fetcher/test/state/command/fetched.rs
index 1c8883c90..80839a7e6 100644
--- a/crates/radicle-protocol/src/fetcher/test/state/command/fetched.rs
+++ b/crates/radicle-protocol/src/fetcher/test/state/command/fetched.rs
@@ -121,6 +121,45 @@ fn complete_one_of_multiple() {
assert!(state.get_active_fetch(&repo_3).is_some());
}
+// A result from a node other than the one that started the active fetch is stale
+// (e.g. a late completion from a disconnected peer after the repo was re-fetched
+// from elsewhere). It must not clear the newer entry.
+#[test]
+fn stale_from_does_not_clear_active() {
+ let mut state = FetcherState::new(helpers::config(1, 10));
+ let node_a: NodeId = arbitrary::r#gen(1);
+ let node_b: NodeId = arbitrary::r#gen(1);
+ let repo_1: RepoId = arbitrary::r#gen(1);
+ let config = FetchConfig::default();
+
+ // node_a holds the active fetch for repo_1.
+ state.fetch(command::Fetch {
+ from: node_a,
+ rid: repo_1,
+ refs: helpers::gen_refs(1),
+ config,
+ });
+
+ // A stale result arrives attributed to node_b for the same repo.
+ let event = state.fetched(command::Fetched {
+ from: node_b,
+ rid: repo_1,
+ });
+
+ assert_eq!(
+ event,
+ event::Fetched::NotFound {
+ from: node_b,
+ rid: repo_1,
+ }
+ );
+ // node_a's active fetch is untouched.
+ assert_eq!(
+ state.get_active_fetch(&repo_1).map(|f| f.from),
+ Some(node_a)
+ );
+}
+
#[test]
fn non_existent_returns_not_found() {
let mut state = FetcherState::new(helpers::config(1, 10));
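Note on commit d513158: the guard it adds can be written as a single comparison before removal. This is a simplified sketch under assumed types, not the code in `state.rs`. The free function `fetched`, the `Fetched` enum without `refs`, and the integer id aliases are hypothetical.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real `RepoId` and `NodeId` types.
type RepoId = u32;
type NodeId = u32;

#[derive(Debug, PartialEq)]
enum Fetched {
    Completed { from: NodeId, rid: RepoId },
    NotFound { from: NodeId, rid: RepoId },
}

// Clear `active[rid]` only when the reporting node is the one that started
// the fetch. A result from any other node is treated as stale.
fn fetched(active: &mut HashMap<RepoId, NodeId>, rid: RepoId, from: NodeId) -> Fetched {
    if active.get(&rid) == Some(&from) {
        active.remove(&rid);
        Fetched::Completed { from, rid }
    } else {
        // Either no entry exists, or it belongs to a different node.
        Fetched::NotFound { from, rid }
    }
}
```

Checking the owner before removing means the entry is looked up twice on the matching path, but it avoids having to re-insert the entry after a mismatch.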
commit 2ad6cc40f14f79b3769dc00b8e8840ce2de1a264
Author: Daniel Norman <daniel@norman.life>
Date: Fri Jun 19 22:06:25 2026 +0200
node/wire: Always report fetch results to the service
A fetch result was discarded whenever the peer was no longer `Connected`
by the time the worker reported, and queued fetches were dropped silently
when the peer disconnected before the `Io::Fetch` was processed. In both
cases the fetcher's `active` entry for the repo was never cleared. Since
that entry is keyed by repo, it blocked the repository from being fetched
from any node until the process restarted.
- Report the result in `worker_result` even when the peer is no longer
connected, instead of returning early
- Report a failed fetch when an `Io::Fetch` is dropped for a disconnected
peer, so the active entry is cleared
- Flip the wire regression test to assert the entry is now cleared on
disconnect
diff --git a/crates/radicle-node/src/wire.rs b/crates/radicle-node/src/wire.rs
index f2f0a56b9..72884a0d2 100644
--- a/crates/radicle-node/src/wire.rs
+++ b/crates/radicle-node/src/wire.rs
@@ -414,10 +414,10 @@ where
.push_back(Action::Send(fd, frame.encode_to_vec()));
}
} else {
- // If the peer disconnected, we'll get here, but we still want to let the service know
- // about the fetch result, so we don't return here.
- log::debug!(target: "wire", "Peer {nid} is not connected; ignoring fetch result");
- return;
+ // If the peer disconnected, we still let the service know about the fetch result.
+ // Otherwise the fetcher's `active` entry for this repo is never cleared, which blocks
+ // the repository from being fetched from any node until the node restarts.
+ log::debug!(target: "wire", "Peer {nid} is not connected; reporting fetch result anyway");
};
// Only call into the service if we initiated this fetch.
@@ -1025,9 +1025,18 @@ where
else {
// Nb. It's possible that a peer is disconnected while an `Io::Fetch`
// is in the service's i/o buffer. Since the service may not purge the
- // buffer on disconnect, we should just ignore i/o actions that don't
- // have a connected peer.
+ // buffer on disconnect, we drop fetches that don't have a connected peer.
+ // We must still report the failure so the fetcher clears its `active`
+ // entry; otherwise the repository can no longer be fetched from any node.
log::debug!(target: "wire", "Peer {remote} is not connected: dropping fetch");
+ self.service.fetched(
+ rid,
+ remote,
+ Err(crate::worker::FetchError::Io(io::Error::new(
+ io::ErrorKind::NotConnected,
+ "peer disconnected before fetch could start",
+ ))),
+ );
continue;
};
let (stream, channels) = streams.open(
@@ -1464,12 +1473,11 @@ mod test {
}
}
- // Reproduces the orphaning bug directly at the wire layer: when the worker
- // reports a fetch result for a peer that is no longer `Connected`,
- // `worker_result` discards it without calling `service.fetched`, so the
- // `active[rid]` entry is never cleared — wedging the repository.
+ // Regression test: a fetch result for a non-`Connected` peer must still reach
+ // `service.fetched` so `active[rid]` is cleared; otherwise the repo can no
+ // longer be fetched from any node until the process restarts.
#[test]
- fn worker_result_orphans_active_when_peer_disconnecting() {
+ fn worker_result_clears_active_when_peer_disconnecting() {
let (mut wire, rid, bob_id, _addr) = wire_with_active_fetch();
// Bob is mid-disconnect at the wire layer: present, but not `Connected`.
@@ -1485,8 +1493,8 @@ mod test {
wire.worker_result(timed_out_fetch_result(rid, bob_id));
assert!(
- wire.service.fetcher().active_fetches().contains_key(&rid),
- "fetch result was discarded; active[rid] is orphaned"
+ !wire.service.fetcher().active_fetches().contains_key(&rid),
+ "fetch result must be reported even when the peer disconnected"
);
}
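Note on commit 2ad6cc4: the second change (reporting a failure when a queued fetch is dropped for a disconnected peer) follows the pattern sketched below. This is an illustration only, not the `Wire` implementation. `ToyService`, `process_fetches`, and the integer id aliases are hypothetical; only the `NotConnected` error kind and its message are taken from the diff.

```rust
use std::collections::{HashMap, HashSet};
use std::io;

// Hypothetical stand-ins for the real `RepoId` and `NodeId` types.
type RepoId = u32;
type NodeId = u32;

// Hypothetical stand-in for the service: it owns the active-fetch map and
// is told about every fetch result.
#[derive(Default)]
struct ToyService {
    active: HashMap<RepoId, NodeId>,
}

impl ToyService {
    fn fetched(&mut self, rid: RepoId, from: NodeId, _result: Result<(), io::Error>) {
        // Clear the entry whether the fetch succeeded or failed.
        if self.active.get(&rid) == Some(&from) {
            self.active.remove(&rid);
        }
    }
}

// Process queued fetch requests and return the ones that were started.
fn process_fetches(
    service: &mut ToyService,
    connected: &HashSet<NodeId>,
    requests: Vec<(RepoId, NodeId)>,
) -> Vec<(RepoId, NodeId)> {
    let mut started = Vec::new();
    for (rid, remote) in requests {
        if !connected.contains(&remote) {
            // The peer is gone: drop the fetch, but still report the failure
            // so the service can clear its active entry.
            let err = io::Error::new(
                io::ErrorKind::NotConnected,
                "peer disconnected before fetch could start",
            );
            service.fetched(rid, remote, Err(err));
            continue;
        }
        started.push((rid, remote));
    }
    started
}
```

The point of the pattern is that every exit from the loop body either starts the fetch or reports a result, so no active entry is left without an owner.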
commit 1083e2f0dd02a08e644d6779878b9a887aff38fe
Author: Daniel Norman <daniel@norman.life>
Date: Fri Jun 19 21:57:43 2026 +0200
test: Add connected contrast for worker_result
- Assert that with the peer still `Connected`, `worker_result` clears
`active[rid]` as expected
- Confirm the disconnect test is exercising the discard path rather than
passing vacuously
diff --git a/crates/radicle-node/src/wire.rs b/crates/radicle-node/src/wire.rs
index 3a3db06a4..f2f0a56b9 100644
--- a/crates/radicle-node/src/wire.rs
+++ b/crates/radicle-node/src/wire.rs
@@ -1490,4 +1490,21 @@ mod test {
);
}
+ // Contrast: with the peer still `Connected`, the active entry clears too,
+ // confirming the test above exercises the disconnect path, not a vacuous
+ // assertion.
+ #[test]
+ fn worker_result_clears_active_when_peer_connected() {
+ let (mut wire, rid, bob_id, addr) = wire_with_active_fetch();
+
+ wire.peers
+ .insert(Token(1), Peer::connected(bob_id, addr, Link::Outbound));
+
+ wire.worker_result(timed_out_fetch_result(rid, bob_id));
+
+ assert!(
+ !wire.service.fetcher().active_fetches().contains_key(&rid),
+ "fetch result should have cleared active[rid]"
+ );
+ }
}
commit ff9f9f6bee3caca9b514255e32b25ea1440ad372
Author: Daniel Norman <daniel@norman.life>
Date: Fri Jun 19 21:57:12 2026 +0200
test: Prove worker_result discards fetch on disconnect
- Drive `worker_result` through the real wire layer with the peer in
`Disconnecting` state, mirroring a fetch that ends mid-disconnect
- Assert `active[rid]` survives, proving the result is discarded without
calling `service.fetched`, which is what orphans the entry
- Add helpers to wrap a service with an active fetch in a `Wire`
diff --git a/crates/radicle-node/src/wire.rs b/crates/radicle-node/src/wire.rs
index bebca6750..3a3db06a4 100644
--- a/crates/radicle-node/src/wire.rs
+++ b/crates/radicle-node/src/wire.rs
@@ -784,7 +784,7 @@ where
})) => {
if let Some(s) = streams.get_mut(&stream) {
metrics.received_git_bytes += data.len();
-
+ // Send via channel to the worker thread
if s.channels.send(ChannelEvent::Data(data)).is_err() {
log::warn!(target: "wire", "Worker is disconnected; cannot send data");
}
@@ -1279,7 +1279,11 @@ mod logger {
#[cfg(test)]
mod test {
use super::*;
+ use crate::crypto::test::signer::MockSigner;
+ use crate::identity::RepoId;
+ use crate::node;
use crate::service::{Message, ZeroBytes};
+ use crate::test::storage::MockStorage;
use crate::wire;
use crate::wire::varint;
@@ -1408,4 +1412,82 @@ mod test {
assert!(de.is_empty());
}
}
+
+ // Builds a service holding an active fetch from `bob`, wrapped in a `Wire`.
+ // Returns the wire, the repo id, and bob's id/address.
+ #[allow(clippy::type_complexity)]
+ fn wire_with_active_fetch() -> (
+ Wire<crate::node::Database, MockStorage, MockSigner>,
+ RepoId,
+ NodeId,
+ NetAddr<HostName>,
+ ) {
+ use crate::test::peer::Peer as TestPeer;
+
+ let storage = crate::test::arbitrary::nonempty_storage(1);
+ let rid = *storage.repos.keys().next().unwrap();
+ let mut alice = TestPeer::with_storage("alice", [7, 7, 7, 7], storage);
+ let bob = TestPeer::new("bob", [8, 8, 8, 8]);
+ let bob_id = bob.id;
+ let bob_addr = NetAddr {
+ host: HostName::Ip(net::IpAddr::from([8, 8, 8, 8])),
+ port: node::DEFAULT_PORT,
+ };
+
+ // Start a fetch from Bob so the service holds `active[rid] = { from: bob }`.
+ alice.connect_to(&bob);
+ let (cmd, _recv) =
+ crate::service::Command::fetch(rid, bob_id, radicle::node::DEFAULT_TIMEOUT, None);
+ alice.command(cmd);
+ assert!(
+ alice.fetches().any(|(r, _)| r == rid),
+ "fetch should be initiated"
+ );
+ assert!(alice.fetcher().active_fetches().contains_key(&rid));
+
+ let (worker_tx, _worker_rx) = chan::unbounded::<Task>();
+ let wire = Wire::new(alice.service, worker_tx, Device::mock());
+
+ (wire, rid, bob_id, bob_addr)
+ }
+
+ fn timed_out_fetch_result(rid: RepoId, remote: NodeId) -> TaskResult {
+ TaskResult {
+ remote,
+ stream: StreamId::git(Link::Outbound).nth(1).unwrap(),
+ result: FetchResult::Initiator {
+ rid,
+ result: Err(crate::worker::FetchError::Io(std::io::Error::from(
+ std::io::ErrorKind::TimedOut,
+ ))),
+ },
+ }
+ }
+
+ // Reproduces the orphaning bug directly at the wire layer: when the worker
+ // reports a fetch result for a peer that is no longer `Connected`,
+ // `worker_result` discards it without calling `service.fetched`, so the
+ // `active[rid]` entry is never cleared — wedging the repository.
+ #[test]
+ fn worker_result_orphans_active_when_peer_disconnecting() {
+ let (mut wire, rid, bob_id, _addr) = wire_with_active_fetch();
+
+ // Bob is mid-disconnect at the wire layer: present, but not `Connected`.
+ wire.peers.insert(
+ Token(1),
+ Peer::Disconnecting {
+ link: Link::Outbound,
+ nid: Some(bob_id),
+ reason: DisconnectReason::connection(),
+ },
+ );
+
+ wire.worker_result(timed_out_fetch_result(rid, bob_id));
+
+ assert!(
+ wire.service.fetcher().active_fetches().contains_key(&rid),
+ "fetch result was discarded; active[rid] is orphaned"
+ );
+ }
+
}
commit e86272264ff5aa728b411a4bdb2ae5a277077b23
Author: Daniel Norman <daniel@norman.life>
Date: Fri Jun 19 21:56:31 2026 +0200
test: Reproduce orphaned fetch at service layer
- Drive the real cancel-skip path: a connection conflict makes
`disconnected` early-return on a link mismatch, leaving the fetch
uncancelled
- Combined with an undelivered fetch result, the `active[rid]` entry is
orphaned and the repo can no longer be fetched from any node
- Confirm delivering the missing result clears the entry and unblocks
the queued fetch, showing what the fix must guarantee
diff --git a/crates/radicle-node/src/tests.rs b/crates/radicle-node/src/tests.rs
index 2a9142ec2..10452f3c7 100644
--- a/crates/radicle-node/src/tests.rs
+++ b/crates/radicle-node/src/tests.rs
@@ -1570,6 +1570,61 @@ fn test_queued_fetch_max_capacity() {
assert_eq!(alice.fetches().next(), Some((rid3, bob.id)));
}
+// Reproduces the orphaned-fetch failure mode: a fetch is started, but on
+// disconnect its result is never delivered and the disconnect skips `cancel`,
+// so the `active[rid]` entry is never cleared. The repo can then no longer be
+// fetched from any node even though the node keeps running. See
+// `wire::Wire::worker_result` (discards the result when the peer isn't
+// `Connected`) and `Service::disconnected` (skips `cancel` on a link mismatch).
+#[test]
+fn test_orphaned_fetch_blocks_repo_from_all_nodes() {
+ let storage = arbitrary::nonempty_storage(1);
+ let rid = *storage.repos.keys().next().unwrap();
+ let doc = storage.repos.get(&rid).unwrap().doc.clone();
+ let mut alice = Peer::with_storage("alice", [7, 7, 7, 7], storage);
+ let bob = Peer::new("bob", [8, 8, 8, 8]);
+ let eve = Peer::new("eve", [9, 9, 9, 9]);
+
+ // Alice dials Bob (outbound) and starts fetching the repo, occupying
+ // `active[rid]`.
+ alice.connect_to(&bob);
+ let (cmd, _recv) = Command::fetch(rid, bob.id, DEFAULT_TIMEOUT, None);
+ alice.command(cmd);
+ assert_matches!(alice.fetches().next(), Some((r, n)) if r == rid && n == bob.id);
+
+ // Bob dials Alice (inbound) while the outbound session is still up: a
+ // connection conflict. The service overwrites the session's link to inbound.
+ alice.connect_from(&bob);
+
+ // The outbound transport, the one the fetch is running over, drops. Because
+ // the session's link is now inbound, `Service::disconnected` early-returns
+ // without cancelling the fetch, so `active[rid]` survives.
+ alice.disconnected(
+ bob.id,
+ Link::Outbound,
+ &DisconnectReason::Session(session::Error::Timeout),
+ );
+
+ // Meanwhile the fetch result is never delivered (the `worker_result` discard).
+ // The entry is now orphaned: nothing will ever clear it.
+ assert!(
+ alice.fetcher().active_fetches().contains_key(&rid),
+ "active fetch should be orphaned"
+ );
+
+ // A different seed offers the same repo. It must not be fetched: the
+ // orphaned entry blocks the repo from every node.
+ alice.connect_to(&eve);
+ let (cmd, _recv) = Command::fetch(rid, eve.id, DEFAULT_TIMEOUT, None);
+ alice.command(cmd);
+ assert_matches!(alice.fetches().next(), None);
+
+ // Delivering the missing result (what the fix guarantees) clears the entry
+ // and the queued fetch from the other node proceeds.
+ alice.fetched(rid, bob.id, Ok(fetch::FetchResult::new(doc)));
+ assert_eq!(alice.fetches().next(), Some((rid, eve.id)));
+}
+
#[test]
fn test_queued_fetch_from_ann_same_rid() {
let storage = arbitrary::nonempty_storage(1); // We're testing both public and private repos.
commit 2ad59beb7306ee77ba82952171dffa0323d04563
Author: Daniel Norman <daniel@norman.life>
Date: Fri Jun 19 21:56:24 2026 +0200
test: Add fetcher orphaned-active invariant
- Verify an uncleared `active[rid]` entry blocks the repo from being
fetched from any node, since both `fetch` and `dequeue` refuse to
start while the entry remains
- Document the state-machine trap behind repos that silently stop
being fetched after a disconnect
diff --git a/crates/radicle-protocol/src/fetcher/test/state/invariant.rs b/crates/radicle-protocol/src/fetcher/test/state/invariant.rs
index f79cc2cd9..0305af19f 100644
--- a/crates/radicle-protocol/src/fetcher/test/state/invariant.rs
+++ b/crates/radicle-protocol/src/fetcher/test/state/invariant.rs
@@ -1,7 +1,7 @@
use radicle::test::arbitrary;
use radicle_core::{NodeId, RepoId};
-use crate::fetcher::state::command;
+use crate::fetcher::state::{command, event};
use crate::fetcher::test::state::helpers;
use crate::fetcher::{FetchConfig, FetcherState};
@@ -49,3 +49,57 @@ fn queue_integrity_after_merge() {
let second = state.dequeue(&node_a);
assert!(second.is_none());
}
+
+// If a fetch is started but its completion is never delivered (neither
+// `fetched` nor `cancel`), the `active` entry is orphaned. Since `active` is
+// keyed by `RepoId`, both `fetch` and `dequeue` refuse to start while it
+// remains, blocking the repo from every node. This is the state-machine trap
+// behind the "repository silently stops being fetched" bug; the wire layer can
+// leave such an entry by discarding a fetch result on disconnect.
+#[test]
+fn orphaned_active_blocks_repo_from_all_nodes() {
+ let mut state = FetcherState::new(helpers::config(1, 10));
+ let node_a: NodeId = arbitrary::r#gen(1);
+ let node_b: NodeId = arbitrary::r#gen(1);
+ let repo: RepoId = arbitrary::r#gen(1);
+ let config = FetchConfig::default();
+
+ // A fetch from node_a is started, occupying `active[repo]`...
+ state.fetch(command::Fetch {
+ from: node_a,
+ rid: repo,
+ refs: helpers::gen_refs(1),
+ config,
+ });
+ assert!(state.get_active_fetch(&repo).is_some());
+
+ // ...but its completion is never delivered (no `fetched`, no `cancel`),
+ // simulating the wire layer discarding the result on disconnect.
+
+ // A different node now offers the same repo. It must not start: the orphaned
+ // entry forces it onto node_b's queue instead.
+ assert_eq!(
+ state.fetch(command::Fetch {
+ from: node_b,
+ rid: repo,
+ refs: helpers::gen_refs(1),
+ config,
+ }),
+ event::Fetch::Queued {
+ rid: repo,
+ from: node_b,
+ }
+ );
+
+ // And it can never be dequeued, because `dequeue` skips repos already in
+ // `active`. The repository is wedged until the entry is cleared.
+ assert!(state.dequeue(&node_b).is_none());
+
+ // Clearing the orphaned entry (as a fix, or a restart, would) unblocks it.
+ state.fetched(command::Fetched {
+ from: node_a,
+ rid: repo,
+ });
+ let dequeued = state.dequeue(&node_b).expect("repo is fetchable again");
+ assert_eq!(dequeued.rid, repo);
+}
Exit code: 0
shell: |
  export RUSTDOCFLAGS='-D warnings'
  cargo --version
  rustc --version
  cargo fmt --check
  cargo clippy --all-targets --workspace -- --deny warnings
  cargo build --all-targets --workspace
  cargo doc --workspace --no-deps --all-features
  cargo test --workspace --no-fail-fast
Commands:
$ podman run --name f28af52f-2ef0-4870-86bb-0e96682ed663 -v /opt/radcis/ci.rad.levitte.org/cci/state/f28af52f-2ef0-4870-86bb-0e96682ed663/s:/f28af52f-2ef0-4870-86bb-0e96682ed663/s:ro -v /opt/radcis/ci.rad.levitte.org/cci/state/f28af52f-2ef0-4870-86bb-0e96682ed663/w:/f28af52f-2ef0-4870-86bb-0e96682ed663/w -w /f28af52f-2ef0-4870-86bb-0e96682ed663/w -v /opt/radcis/ci.rad.levitte.org/.radicle:/${id}/.radicle:ro -e RAD_HOME=/${id}/.radicle rust:trixie bash /f28af52f-2ef0-4870-86bb-0e96682ed663/s/script.sh
+ export 'RUSTDOCFLAGS=-D warnings'
+ RUSTDOCFLAGS='-D warnings'
+ cargo --version
info: syncing channel updates for '1.95-x86_64-unknown-linux-gnu'
info: latest update on 2026-04-16, rust version 1.95.0 (59807616e 2026-04-14)
info: downloading component 'cargo'
info: downloading component 'clippy'
info: downloading component 'rust-docs'
info: downloading component 'rust-src'
info: downloading component 'rust-std'
info: downloading component 'rustc'
info: downloading component 'rustfmt'
info: installing component 'cargo'
info: installing component 'clippy'
info: installing component 'rust-docs'
info: installing component 'rust-src'
info: installing component 'rust-std'
info: installing component 'rustc'
info: installing component 'rustfmt'
cargo 1.95.0 (f2d3ce0bd 2026-03-21)
+ rustc --version
rustc 1.95.0 (59807616e 2026-04-14)
+ cargo fmt --check
Diff in /f28af52f-2ef0-4870-86bb-0e96682ed663/w/crates/radicle-protocol/src/fetcher/state.rs:215:
// and the repo was re-fetched from elsewhere. Clearing the entry on a mismatch would
// drop the wrong (newer) fetch, so we leave it and report `NotFound`.
match self.active.get(&rid) {
- Some(ActiveFetch { from: active_from, .. }) if *active_from == from => {
- match self.active.remove(&rid) {
- Some(ActiveFetch { from, refs }) => {
- event::Fetched::Completed { from, rid, refs }
- }
- None => event::Fetched::NotFound { from, rid },
- }
- }
+ Some(ActiveFetch {
+ from: active_from, ..
+ }) if *active_from == from => match self.active.remove(&rid) {
+ Some(ActiveFetch { from, refs }) => event::Fetched::Completed { from, rid, refs },
+ None => event::Fetched::NotFound { from, rid },
+ },
_ => event::Fetched::NotFound { from, rid },
}
}
Exit code: 1
{
"response": "finished",
"result": "failure"
}