Recovery and upgrades
Daemon recovery and upgrade guide for Coven operators: stale sockets, orphaned PTYs, session ledger recovery, live-session checks, and safe upgrade flow.
The daemon is responsible for recovering local state without pretending every session is still live. Treat crash recovery and upgrades as state transitions: inspect first, then restart, then reattach or replay.
Stale socket
If clients cannot connect but $COVEN_HOME/coven.sock exists, the socket may be stale.
First inspect status:
coven daemon status
Then try a normal restart:
coven daemon restart
Prefer daemon commands over manual file deletion. Manual cleanup can remove evidence needed to understand why the daemon stopped.
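If the status output is inconclusive, a read-only probe can help tell a stale socket file from a live one before you restart. The sketch below is one possible approach, not part of Coven: it assumes the daemon listens on a Unix stream socket at $COVEN_HOME/coven.sock, and probe_socket is a hypothetical helper name. It only attempts a connection and never deletes the file.

```python
import os
import socket
import stat
from pathlib import Path


def probe_socket(path: Path, timeout: float = 1.0) -> str:
    """Classify a socket path as missing, not-a-socket, stale, unresponsive, or live.

    Read-only: this never deletes or modifies the socket file.
    """
    try:
        mode = os.lstat(path).st_mode
    except FileNotFoundError:
        return "missing"
    if not stat.S_ISSOCK(mode):
        return "not-a-socket"

    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        client.connect(str(path))
    except (ConnectionRefusedError, FileNotFoundError):
        # The file exists but no process is accepting connections on it.
        return "stale"
    except socket.timeout:
        # Something may own the socket but did not accept in time.
        return "unresponsive"
    finally:
        client.close()
    return "live"


if __name__ == "__main__":
    # Assumption: COVEN_HOME is exported in the operator's environment.
    home = os.environ.get("COVEN_HOME")
    if home:
        print(probe_socket(Path(home) / "coven.sock"))
```

A "stale" result is a reason to run coven daemon restart, not to remove the file by hand.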
Orphaned sessions
An orphaned session is a session record whose harness PTY is no longer attached to a live daemon process. The ledger may still be useful even when the process is gone.
Clients should treat orphaned or non-live sessions as replay surfaces:
- Show recorded events.
- Preserve the session record.
- Offer a new launch from the same project root.
- Avoid forwarding input to a session that is not live.
- Avoid presenting kill as a meaningful action after the process is gone.
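The rules above can be sketched as a small gating function on the client side. This is an illustration under assumptions: the status string "live", the SessionView type, and the action names are all hypothetical and not taken from the Coven API.

```python
from dataclasses import dataclass

# Hypothetical status value; the real daemon's vocabulary may differ.
LIVE = "live"


@dataclass(frozen=True)
class SessionView:
    session_id: str
    status: str
    project_root: str


def allowed_actions(session: SessionView) -> list:
    """Return the actions a client should offer for this session."""
    if session.status == LIVE:
        return ["view_events", "send_input", "kill"]
    # Orphaned or otherwise non-live: a replay surface only.
    return ["view_events", "relaunch_from_project_root"]


def may_forward_input(session: SessionView) -> bool:
    """Input is forwarded only to sessions the daemon reports as live."""
    return session.status == LIVE
```

Deciding from a single status check keeps the default safe: any status the client does not recognize is treated as non-live.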
Store recovery
The SQLite ledger is the durable source for sessions and events. On restart, the daemon should reopen the store, preserve completed records, and avoid fabricating live status for processes it does not own.
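To make the restart rule concrete, here is a minimal sketch of a reconciliation pass. It is not Coven's implementation: the sessions table, its columns, and the status values "running" and "orphaned" are assumptions chosen for illustration. The idea is that a freshly started daemon owns no PTYs yet, so anything a previous process left marked as running cannot be live.

```python
import sqlite3


def reconcile_on_restart(conn: sqlite3.Connection) -> int:
    """Mark sessions left 'running' by a previous daemon as 'orphaned'.

    Returns the number of rows changed. Rows in any other state,
    such as completed or failed, are left untouched.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE sessions SET status = 'orphaned' WHERE status = 'running'"
        )
    return cur.rowcount
```

Running the pass a second time changes nothing, which makes it safe to repeat after another crash.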
If the store cannot open:
- Stop the daemon.
- Check file ownership and disk space.
- Preserve a copy of $COVEN_HOME before risky repair work.
- Restart and verify /api/v1/health.
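The ownership, disk space, and backup steps can be scripted. The following is a sketch under assumptions: preflight and preserve_home are hypothetical helper names, the backup directory naming is arbitrary, and nothing here is specific to how Coven lays out its files. Sockets and FIFOs are skipped because they cannot be copied as regular files.

```python
import os
import shutil
import stat
import time
from pathlib import Path


def preflight(home: Path) -> dict:
    """Report ownership and free space for the state directory."""
    return {
        "owned_by_current_user": home.stat().st_uid == os.getuid(),
        "free_bytes": shutil.disk_usage(home).free,
    }


def _skip_special(directory: str, names: list) -> list:
    """Tell copytree to skip sockets and FIFOs."""
    skipped = []
    for name in names:
        mode = os.lstat(os.path.join(directory, name)).st_mode
        if stat.S_ISSOCK(mode) or stat.S_ISFIFO(mode):
            skipped.append(name)
    return skipped


def preserve_home(home: Path, backup_parent: Path) -> Path:
    """Copy the state directory to a timestamped directory and return its path."""
    stamp = time.strftime("%Y%m%dT%H%M%S")
    target = backup_parent / f"{home.name}-backup-{stamp}"
    # copytree refuses to overwrite an existing target, so a backup is never clobbered.
    shutil.copytree(home, target, ignore=_skip_special, symlinks=True)
    return target
```

Take the copy only after the daemon is stopped, so the ledger is not being written while it is copied.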
Upgrade flow
Use this flow for low-drama upgrades:
- Finish or pause important live sessions.
- Run coven daemon status.
- Install the new Coven version.
- Run coven daemon restart.
- Call GET /api/v1/health.
- Confirm apiVersion is still the contract your clients support.
- Check GET /api/v1/capabilities before using optional features.
If a breaking change to the daemon API contract ships in the future, clients should fail closed with an update message instead of assuming older response shapes.
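A fail-closed client check might look like the sketch below. Only the apiVersion field name comes from this guide. The version value "v1", the "features" list inside the capabilities response, and all function names are assumptions for illustration. The functions take already-parsed JSON responses, so the transport is left out.

```python
class IncompatibleDaemon(Exception):
    """Raised when the daemon's API contract is not one this client supports."""


# Assumption: the contract versions this client was built against.
SUPPORTED_API_VERSIONS = frozenset({"v1"})


def check_contract(health: dict, supported=SUPPORTED_API_VERSIONS) -> str:
    """Return the daemon's apiVersion, or raise if it is missing or unsupported."""
    version = health.get("apiVersion")
    if version not in supported:
        raise IncompatibleDaemon(
            f"Daemon reports apiVersion={version!r}; "
            f"this client supports {sorted(supported)}. Please update the client."
        )
    return version


def feature_enabled(capabilities: dict, name: str) -> bool:
    """Treat anything not explicitly advertised as unavailable."""
    features = capabilities.get("features")  # hypothetical response shape
    return isinstance(features, list) and name in features
```

A missing apiVersion is rejected the same way as an unknown one, so an unexpected response shape never passes as compatible.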
When to escalate
Escalate with a diagnostics packet when:
- Restart cannot bind the socket.
- Health fails after restart.
- The ledger cannot open.
- Sessions repeatedly move to failed or orphaned state.
- A client receives structured errors it does not understand.
Include the evidence from Observability, especially status, health, capabilities, and the structured error code.
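One way to bundle that evidence is a single JSON document. The sketch below is an assumption about what a packet could look like, not a format Coven defines: build_packet and every field name are hypothetical. Evidence that could not be collected is recorded as null so its absence stays visible.

```python
import json
import time


def build_packet(status_output, health, capabilities, error_code):
    """Bundle status, health, capabilities, and error code into one JSON string.

    Pass None for anything that could not be collected.
    """
    packet = {
        "collected_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "status": status_output,
        "health": health,
        "capabilities": capabilities,
        "error_code": error_code,
    }
    return json.dumps(packet, indent=2, sort_keys=True)
```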