Runbooks turn architecture into repeatable maintenance. This page focuses on the service classes that most directly determine whether a community node can still authenticate users, serve public docs, relay social activity, and recover after outages without improvising under stress.
Not every process deserves the same urgency. The first runbooks should cover the services that keep identity legible, local documentation reachable, public coordination possible, and recovery material fresh enough to matter.
The current TheEtherNet implementation uses username-based login, short-lived access tokens, refresh-token rotation stored in PostgreSQL, and a client interceptor that retries once on a 401 by posting to `/ethernet-api/auth/refresh`. The auth runbook must therefore verify both the server-side token store and the client-side refresh behavior.
Symptoms: 401 bursts after a previously valid session, invalid refresh-token responses, guest-only failures, or sockets rejecting as unauthorized while normal page loads still work.
Degraded mode: keep read-only or already-authenticated local use alive where possible, but avoid making risky trust changes until refresh integrity and time sync are verified.
Escalate when: refresh tokens appear invalid across multiple users, guest creation fails unexpectedly, or any fix requires rotating secrets, changing trust material, or break-glass recovery.
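The retry-once refresh behavior described above can be modeled in a few lines, which helps when checking whether a 401 burst comes from the client or from the token store. This is a minimal sketch, not the project's client code: only the refresh path comes from the text, while the `Response` and `Transport` types and the `accessToken` and `refreshToken` field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional

REFRESH_PATH = "/ethernet-api/auth/refresh"  # path named in the runbook text


@dataclass
class Response:
    status: int
    body: dict


# Hypothetical transport: (method, path, access_token, json_body) -> Response
Transport = Callable[[str, str, Optional[str], Optional[dict]], Response]


class RefreshingClient:
    """Retries a request exactly once after a 401, refreshing tokens first."""

    def __init__(self, transport: Transport, access_token: str, refresh_token: str):
        self.transport = transport
        self.access_token = access_token
        self.refresh_token = refresh_token

    def _refresh(self) -> bool:
        # Field names are assumptions; with rotation, the old refresh token
        # is replaced by the new one returned from the server.
        resp = self.transport(
            "POST", REFRESH_PATH, None, {"refreshToken": self.refresh_token}
        )
        if resp.status != 200:
            return False
        self.access_token = resp.body["accessToken"]
        self.refresh_token = resp.body["refreshToken"]
        return True

    def request(self, method: str, path: str, body: Optional[dict] = None) -> Response:
        resp = self.transport(method, path, self.access_token, body)
        if resp.status != 401:
            return resp
        if not self._refresh():
            return resp  # refresh failed: surface the original 401
        # Single retry with the new access token; no loop, so no retry storm.
        return self.transport(method, path, self.access_token, body)
```

Injecting the transport keeps the sketch testable without a network: a fake transport that rejects a stale access token but accepts the refresh should yield three calls in total (request, refresh, retry), and a rejected refresh should yield two.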
Mirrors and static artifact lanes are often treated as secondary because they do not feel interactive. In practice they are the fastest path back to competence during an outage. If the community loses the docs mirror, it also loses diagrams, restore steps, and public notices.
Symptoms: 404s on public docs, stale BOMs, missing diagrams, broken asset paths, or a mirror that loads locally but not through the regional or public route.
Likely causes: build artifact drift, a bad copy job, storage pressure, a stale sync job, or a remote publication failure, rather than a total loss of the local service.
Escalate when: the only current copy is at risk, checksum or signature confidence is gone, or a mirror mismatch could mislead operators during a real repair event.
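One way to restore checksum confidence in a mirror is to compare its files against a manifest of known digests. The sketch below assumes a hypothetical manifest that maps relative paths to SHA-256 hex digests; the manifest format and the function names are illustrative and not taken from the project.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_mirror(root: Path, manifest: dict) -> dict:
    """Report manifest entries that are missing or whose content has drifted."""
    missing, mismatched = [], []
    for rel_path, expected in sorted(manifest.items()):
        target = root / rel_path
        if not target.is_file():
            missing.append(rel_path)
        elif sha256_of(target) != expected.lower():
            mismatched.append(rel_path)
    return {"missing": missing, "mismatched": mismatched}
```

An empty result for both lists suggests the mirror matches the manifest. Any entry under `mismatched` is the case where a mirror could mislead operators, so it is worth treating as an escalation signal rather than a cosmetic fault.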
TheEtherNet’s social layer is split across authenticated HTTP routes and Socket.IO connections. The site proxy sends REST calls through `/ethernet-api` and websocket traffic through `/ethernet-socket.io`, so a relay incident can be partial: page loads may work while live messaging fails, or auth may work for sockets but not for post creation.
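Because the two lanes can fail independently, it helps to record probe results separately and name the resulting state before acting. The following is a minimal sketch under assumptions: the `RelayProbe` fields and the state labels are hypothetical, and how each probe is actually performed is left to the operator.

```python
from typing import NamedTuple


class RelayProbe(NamedTuple):
    page_loads: bool  # the site itself responds
    rest_ok: bool     # an authenticated call through /ethernet-api succeeds
    socket_ok: bool   # a connection through /ethernet-socket.io succeeds


def classify_relay(probe: RelayProbe) -> str:
    """Map independent probe results to a coarse incident label."""
    if not probe.page_loads:
        return "site-down"
    if probe.rest_ok and probe.socket_ok:
        return "healthy"
    if probe.rest_ok:
        return "live-messaging-degraded"  # pages and posts work, sockets fail
    if probe.socket_ok:
        return "rest-degraded"            # sockets work, HTTP routes fail
    return "relay-down"
```

For example, `classify_relay(RelayProbe(True, True, False))` yields `"live-messaging-degraded"`, which corresponds to the case where page loads work while live messaging fails.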
The backup lane is where service continuity becomes real. Snapshots, exports, and peer copies should be recent enough, verifiable enough, and documented enough that a damaged node can come back without guessing which file matters or who is allowed to unlock it.
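The three tests named above (recent enough, verifiable enough, documented enough) can be expressed as a simple per-backup check. This is a sketch under assumptions: the `BackupRecord` fields and the age threshold are hypothetical, and a real inventory would come from whatever the node uses to track snapshots, exports, and peer copies.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class BackupRecord:
    name: str
    taken_at: datetime           # when the snapshot or export was made
    verified: bool               # checksum or test restore has passed
    restore_doc: Optional[str]   # pointer to restore steps, if any


def backup_problems(record: BackupRecord, now: datetime, max_age: timedelta) -> List[str]:
    """Return the reasons a backup should not be trusted for recovery."""
    problems = []
    if now - record.taken_at > max_age:
        problems.append("stale")
    if not record.verified:
        problems.append("unverified")
    if not record.restore_doc:
        problems.append("undocumented")
    return problems
```

An empty list means the record passes all three checks. Any non-empty list is a prompt to fix the backup lane before an outage forces the question.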
The handbook defines who can authorize rotation, restore, peer suspension, and custody actions when these runbooks cross into governance.
Open Operator Handbook
The broader runbook covers cadence, drills, spares, and incident logging. This page gives the service-specific decision path.
Open Operations Runbook
Use the identity guide when a restore depends on secret material, revoked peers, or recovery authority that must be re-established cleanly.
Open Identity & Trust Guide
The service matrix helps decide whether a degraded mode should stay local, mirror outward, or wait for steward approval.
Open Service Matrix
Backups and mirrors matter most when they can restore a site across failure domains instead of only inside one cabinet.
Open Federation Guide
These runbooks exist to keep the social layer locally survivable and regionally recoverable without hiding authority boundaries.
Open TheEtherNet