Internal reference · backend/recruiter

Ten agents behind Mira & Atlas.

The Hitch backend is a multi-agent system: a conversational voice, a tool-using operator, two end-of-turn planner lanes (brief + identity), a deep-research crew, a personalization pass, a PDF art-director, and a two-part concierge that negotiates with talents. This page maps who each agent is, what context it reads, what it produces, and — most of all — how they are wired to one another.

Engine · OpenAI gpt-5.4-mini (every agent)
Runs in · the worker, never the web request
Push · Pusher channels only
Last updated · 2026-06-10

01 The mental model

Read these five sentences and you have the whole system. Everything after is detail.

One model, ten hats.

Every agent here is the same model — OpenAI gpt-5.4-mini. They diverge only in their system prompt, their tool allow-list, the context injected per call, and decoding knobs (reasoning_effort, temperature, forced output schemas). The four settings keys — agent_model, voice_conversational_model, research_subagent_model, voice_filler_model — all resolve to the same default. So "which agent" is really "which prompt + which tools + which context."
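As a rough illustration of "one model, ten hats", the sketch below models an agent as nothing more than a prompt, a tool allow-list, and decoding knobs layered on a shared model. `AgentSpec`, the `SETTINGS` dict, and the placeholder prompts are assumptions made for this example; only the four settings keys, the model name, and the two tool names come from the text.

```python
from dataclasses import dataclass, field

DEFAULT_MODEL = "gpt-5.4-mini"

# The four settings keys named in the text; each resolves to the same default.
SETTINGS = {
    "agent_model": DEFAULT_MODEL,
    "voice_conversational_model": DEFAULT_MODEL,
    "research_subagent_model": DEFAULT_MODEL,
    "voice_filler_model": DEFAULT_MODEL,
}


@dataclass
class AgentSpec:
    """Hypothetical container: what makes one 'hat' differ from another."""
    name: str
    model_key: str                                 # settings key that selects the model
    system_prompt: str
    tools: tuple = ()                              # tool allow-list (empty = no tools)
    decoding: dict = field(default_factory=dict)   # e.g. reasoning_effort

    @property
    def model(self) -> str:
        return SETTINGS[self.model_key]


# Placeholder prompts; the real prompts live in the codebase.
MIRA = AgentSpec("MIRA", "voice_conversational_model", "<voice prompt>",
                 decoding={"reasoning_effort": "none"})
ATLAS = AgentSpec("ATLAS", "agent_model", "<operator prompt>",
                  tools=("deep_buyer_research", "enable_search"))
```

Both specs resolve to the same model string; only the prompt, tools, and knobs differ, which is the point the paragraph above makes.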

The ten run across three theaters plus one on-demand job:

  • Theater I — the live turn. One user message fans out into four parallel lanes. MIRA is the only voice the user hears. ATLAS thinks and acts silently with ~66 tools. At end-of-turn two planner lanes compose in parallel, each emitting one structured plan: MASON the brief, IRIS the "About You" identity file. (ATLAS authors neither — his edit_identity tool is gone.)
  • Theater II — background research. ATLAS fires SAGE and forgets it. SAGE investigates the client, then a silent chain runs: IRIS dedups the findings into the dossier, LENS (vision) places harvested images, and STYLO re-skins the workspace.
  • Theater III — the concierge, between turns. After the client settles on a shortlist, SCOUT finds talents most likely to respond and STERLING negotiates with each one, scores their offers, and reveals finalists — all behind a client-approval gate and an anonymity wall.
  • On demand — HUE art-directs a bespoke CSS stylesheet when the client clicks "Design & export PDF."

Agents never call each other over HTTP.

They communicate through five quieter channels: two end-of-turn planner lanes (MASON + IRIS, each one structured plan), a fire-and-forget task (ATLAS → SAGE), one tagged notes channel read a turn later (ATLAS · MASON · IRIS → MIRA), shared workspace state (MASON's checklist; the activity board), and Pusher pushes to the browser. Section 03 lays these out.

02 System map

Where the agents live relative to the runtime. The browser only ever does short REST; the web tier authenticates and enqueues; all agent work happens in the worker; results return to the browser as Pusher events and durable state lands in Postgres.

[Figure: system map. Browser (client + talent SPAs) → short REST → Web (FastAPI: auth, validate, enqueue, return fast) → job queue → WORKER, where every agent runs: Theater I, the live turn (run_agent_dual: MIRA, ATLAS, end-of-turn planners MASON + IRIS); Theater II, the background research chain (SAGE, IRIS, LENS, STYLO); Theater III, the concierge between turns (SCOUT, STERLING); on demand, PDF export (HUE). The worker persists to Postgres (workspace · brief · offers) and pushes events to the browser over Pusher channels (client · talent, push only). Legend: in-process / sync · queued / async · agent or service · datastore · user-visible push.]
The browser does short REST; web enqueues a job and returns; the worker runs the agents; state persists to Postgres and surfaces to the browser via Pusher. The theaters are entered by different jobs: agent_turn (Theater I, which spawns Theater II), the concierge jobs (Theater III), and identity_design (HUE).

03 How agents talk to each other

The most important thing to understand about this system is its communication. No agent makes a network call to another agent. Instead there are exactly these channels — each with different timing and durability. When you read an agent card below, the "Invoked by", "Produces", and "Triggers" rows always map onto one of these.

| Channel | Who → who | Timing | What actually moves |
| --- | --- | --- | --- |
| Planner lanes | dual_lane → MASON · IRIS | end-of-turn, awaited | Two composer lanes run once ATLAS's tools commit: MASON emits one BriefPlan, IRIS one IdentityPlan. The runner validates and persists each (persist_architect_plan / persist_identity_plan); the slow lane waits for both (bounded ≤28s) so their writes land this turn. |
| Fire-and-forget | ATLAS → SAGE (→ chain) | async, detached | deep_buyer_research does asyncio.create_task(...) and returns "STARTED" immediately. The research chain outlives the turn and persists silently. |
| Tagged notes | ATLAS · MASON · IRIS → MIRA | verbatim, +1 turn | ONE channel for the whole team: each lane's note is stored as a source-tagged event message (metadata.source is atlas, mason, or iris). MIRA reads them next turn as [ATLAS NOTE] / [MASON NOTE] / [IRIS NOTE] lines; the client chat filters event messages out. No compression (the old compressor is gone). |
| Shared state | MASON → MIRA · ATLAS · UI · gate | same turn | MASON writes per-item checklist verdicts + a note_for_mira to checklist_state. That single record drives the enable_search gate, the right-rail UI, and MIRA's next question. Both planners also attach a one-line client-facing rationale to their architect-applied / identity-applied events: the transient "Brief" / "Profile" chips under MIRA's reply. |
| Activity board | SAGE/STYLO → MIRA | async | Background workers write started/done rows to agent_activity; MIRA reads the board so she can say "I'm still researching" on the next turn. |
| In-process chain | SAGE → IRIS → LENS → STYLO | awaited, ordered | One background coroutine (_run_deep_research_background) awaits each stage in turn, threading block-ids forward so LENS attaches images to the blocks IRIS just minted. |
| Job queue | Web → Worker | durable, retried | Every theater is entered by an enqueued job (agent_turn, concierge_discover, concierge, seller_turn, concierge_answer_ingest, identity_design). Idempotent & keyed so a retry can't double-act. |
| Pusher | Worker → Browser | push only | Client channel (agent stream, concierge.progress, personalization-applied, profile.design.ready) and talent channel (chat + thread nudges). Oversized payloads are slimmed with resync:true → the client re-reads GET /api/workspace. |
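The tagged-notes channel is the least obvious of these, so here is a minimal sketch of how one stored message list could yield two different views. The `Message` class and both function names are hypothetical; only the source tags and the `[ATLAS NOTE]`-style line format come from the table.

```python
from dataclasses import dataclass
from typing import Optional

NOTE_SOURCES = {"atlas", "mason", "iris"}


@dataclass
class Message:
    role: str                      # "user", "assistant", or "event"
    text: str
    source: Optional[str] = None   # stands in for metadata.source


def history_for_mira(messages):
    """Render team notes as tagged lines; pass ordinary turns through."""
    lines = []
    for m in messages:
        if m.role == "event" and m.source in NOTE_SOURCES:
            lines.append(f"[{m.source.upper()} NOTE] {m.text}")  # verbatim, no compression
        elif m.role != "event":
            lines.append(f"{m.role}: {m.text}")
    return lines


def history_for_client(messages):
    """The client chat never shows event messages."""
    return [m for m in messages if m.role != "event"]
```

The same rows serve both readers: MIRA sees the notes one turn later as tagged lines, while the client view simply filters them out.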

04 Agent index

All ten at a glance. "Tools" counts function-tools the agent itself may call; agents with a forced/structured output (MASON, IRIS, LENS, STYLO) emit one schema'd object instead.

| Codename | Theater | Role | Tools | File |
| --- | --- | --- | --- | --- |
| MIRA | live turn | The voice. The only agent the user hears; acks, asks one question, paraphrases ATLAS's note. | none | agent/conversational.py |
| ATLAS | live turn | The thinker. Tool-using operator: research, search prefs, facts, the task graph. Silent; hands a tagged note to MIRA. | ~66, stage-gated | agent/runner.py |
| MASON | live turn | The brief composer. Designs the brief as a printable doc at end-of-turn; converges in place. | none · BriefPlan JSON | agent/architect_subagent.py |
| IRIS | live turn + research | The identity planner, MASON's twin for "About You". End-of-turn IdentityPlan; add/update only, never delete; also reconciles research findings. | none · IdentityPlan JSON | agent/identity_editor_subagent.py |
| SAGE | research | Deep client research. Fire-and-forget loop that builds the client file from real data + the web. | 13 (SEO·web·browser) | agent/research_subagent.py |
| LENS | research | Vision image placement. Per harvested image: skip / reference / body. Skips liberally. | 1 · place_images forced | agent/image_placement_subagent.py |
| STYLO | research | Personalization. One write that re-skins the workspace (accent, tone, density, chips, locale). | none · StyloPayload JSON | agent/personalization_subagent.py |
| SCOUT | concierge | Talent scout. Seeds from Fiverr, then LLM-picks 4–6 talents most likely to respond. | none · Fiverr seed + pick | concierge/search_agent.py |
| STERLING | concierge | Negotiator. Deterministic state machine + LLM brain: writes outreach, judges replies, scores offers. | 4 function tools · required | concierge/runner.py · sterling.py |
| HUE | on demand | PDF art-director. Art-direct → design → refine, emitting one sanitized CSS stylesheet. | none · emits CSS | agent/brief_design_subagent.py |

Deterministic, non-LLM glue (not agents, but load-bearing — see §09): the intent classifier intent.py, the checklist/gate renderer brief_checklist.py, the dual-lane orchestrator + tagged-notes handoff dual_lane.py, the shared conversation projector conversation.py, the SSRF-guarded image harvester site_image.py, the concierge pure-function tools, and the thin LiveSellerAdapter. The lineage's voice-filler and partner-note compressor are gone; ATLAS's edit_identity tool is gone too (identity belongs to the IRIS lane); STYLO is not a per-turn lane (it runs in the research chain).

05 Theater I: the live turn, lane by lane

One user message enters run_agent_dual, which spawns four asyncio lanes that share a single event queue, merged into one ordered stream. The fast lane (MIRA) is unlocked so she replies instantly; the slow lane (ATLAS) holds a per-project lock; the two planner lanes — MASON (brief) and IRIS (identity) — wait for ATLAS's tools to commit, then each composes one structured plan. ATLAS spawns SAGE within his turn; the notes all three thinkers leave become MIRA's tagged reading next turn. Each lane runs in its own agent <name> OTel span, so a turn's trace breaks out per agent.

[Figure: one user turn. The user message enters dual_lane.run_agent_dual and fans out into four parallel lanes over a shared event queue, merged into one stream. MIRA (fast · unlocked · no tools) streams the reply the user sees. ATLAS (slow · locked · ~66 tools) silently runs the tool loop, receives the classify_intent result in his prompt, and spawns SAGE → chain (Theater II, async, fire-and-forget). MASON (brief planner) and IRIS (identity planner) wait for ATLAS, then compose a BriefPlan (+ checklist_state, which feeds the gate, the UI, and MIRA) and an IdentityPlan at end-of-turn. Postgres holds brief · identity · checklist. The Pusher client channel carries assistant deltas (user-visible) and architect-applied · identity-applied events (+ "Brief"/"Profile" rationale chips). Tagged notes (atlas · mason · iris) reach MIRA verbatim, +1 turn.]
MIRA and ATLAS each read their own workspace snapshot and run in parallel; ATLAS spawns SAGE (async) within his turn. Once ATLAS's tools commit, the two planner lanes compose — MASON a BriefPlan (+ the checklist that gates search), IRIS an IdentityPlan over "About You" — and the orchestrator persists both in one combined save, together with the team's tagged notes ([ATLAS NOTE] / [MASON NOTE] / [IRIS NOTE]) MIRA reads next turn. The 1-turn lag is bridged by that shared checklist + the notes, not by intra-turn messaging.
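The lane timing described above can be sketched with plain asyncio primitives. This is a toy model under stated assumptions, not the real run_agent_dual: the lane bodies are stubs, the function names are invented, and only the shape is taken from the text (a shared queue, planners gated on ATLAS's tool commit, and a bounded wait of at most 28 seconds).

```python
import asyncio

PLANNER_WAIT_S = 28  # the bound quoted in the text


async def mira(queue):
    await queue.put(("mira", "assistant-delta"))      # fast lane, no lock


async def atlas(queue, tools_done):
    await queue.put(("atlas", "tool"))
    tools_done.set()                                  # tools committed; planners may compose


async def planner(name, queue, tools_done):
    await tools_done.wait()                           # never compose before ATLAS's writes
    await queue.put((name, "plan"))
    return {"lane": name, "ops": []}


async def run_turn(planner_wait_s=PLANNER_WAIT_S):
    queue = asyncio.Queue()
    tools_done = asyncio.Event()
    planners = [asyncio.create_task(planner(n, queue, tools_done))
                for n in ("mason", "iris")]
    await asyncio.gather(mira(queue), atlas(queue, tools_done))
    # Bounded wait: a planner still pending here would land its writes next turn.
    done, pending = await asyncio.wait(planners, timeout=planner_wait_s)
    plans = [task.result() for task in done]
    events = []
    while not queue.empty():
        events.append(queue.get_nowait())
    return events, plans, len(pending)
```

Calling `asyncio.run(run_turn())` returns the merged event list, the plans that made it inside the bound, and a count of planners that did not.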

MIRA conversational / voice · the fast lane

The only voice the user hears. Acknowledges intent, asks exactly one question, and paraphrases whatever the team did last turn — in one or two spoken sentences.

File agent/conversational.py Model voice_conversational_model · reasoning_effort:none Tools none
Invoked by
The fast lane of run_agent_dual, unlocked, in parallel with ATLAS. Two entry points: stream_conversational_reply (voice, token deltas) and generate_conversational_reply (text routes, may return one widget).
Consumes
Her own workspace read; a slim persona; the live stage + search_enabled gate; the industry tag; the brief checklist verdicts; the activity board; and chat history — where the team's notes arrive as source-tagged [ATLAS NOTE] / [MASON NOTE] / [IRIS NOTE] lines (hidden from the client) and her own past questions appear as assistant turns. Her prompt knows the full team and what each tag means.
Produces
Text deltas the orchestrator wraps as assistant-start/assistant-delta, or one ConversationalReply (text + tappable suggestion chips, or a single widget). Persisted early so a fast follow-up turn can't make her re-ask.
Triggers
Nothing — she is a terminal leaf. She only consumes the team's tagged notes and MASON's checklist.
Hard rules
  • Never claim a tool action. "I updated your brief," "here are five talents" — that's ATLAS's work, not hers.
  • One question, ever. 1–2 short sentences, no markdown or lists (it's read aloud), always ending on a single open question.
  • The transcript is truth. Never re-ask something already answered, even if the (lagging) checklist or note shows it missing.
  • Paraphrase notes, never recite — and surface genuinely new findings, or an honest heads-up when a note begins FAILED:.

ATLAS tools / thinking · the slow lane

The tool-using operator. Runs research, manages search preferences, captures facts, shapes the task graph — silently. His output is never spoken; it becomes his tagged note for MIRA.

File agent/runner.py + prompts/{base,overlays,context}.py Model agent_model Tools ~66, stage-gated
Invoked by
The slow lane of run_agent_dual, under a per-project lock. (A legacy single-lane mode exists where ATLAS also speaks and asks; production always runs dual-lane.)
Consumes
His own workspace read; the deterministic classify_intent result; and a 4-part composed prompt — BASE_SYSTEM + per-stage overlay + a live context block (brief snapshot with block-ids, identity, search prefs, task graph, checklist) + TURN_RULES, plus a dual-mode overlay. Chat history and attached images are appended.
Produces
In dual-mode he suppresses his own assistant bubble and emits a voice_thinking_text sentinel → stored verbatim as his tagged note to MIRA. Streams tool and mid-turn block-*/stage-change events as tools run.
Triggers
SAGE via deep_buyer_research (fire-and-forget). He authors neither the brief (MASON's lane) nor the identity file (IRIS's lane — his edit_identity tool was removed), and enable_search only flips a gate; the user clicks Run.
Hard rules
  • Never invent talents, numbers, or ratings — every fact traces to a tool output or the snapshot.
  • Never trigger search; only enable_search (hard-gated on the checklist). Never claim "I ran the search."
  • In dual-mode, don't speak and don't ask (ask_buyer is stripped): state what's needed and why in ≤150 words, lead with what he did — MIRA phrases the question.
  • Tools are stage-scoped. Each call is re-checked against the current stage's allow-list (entry · brief · search · results · project_map + the concierge-flow stages, concierge_dashboard through fiverr_handoff) and rejected if out of stage.
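The stage-scoping rule amounts to a lookup before every tool call. A minimal sketch follows; the stage names `entry`, `brief`, and `search` and the tools `deep_buyer_research` and `enable_search` appear in this page, while `capture_fact`, `set_search_prefs`, the allow-list contents, and the function itself are hypothetical.

```python
class ToolRejected(Exception):
    """Raised when a tool call falls outside the current stage's allow-list."""


# Illustrative allow-lists only; the real mapping lives in the codebase.
STAGE_TOOLS = {
    "entry": {"capture_fact"},
    "brief": {"capture_fact", "deep_buyer_research", "enable_search"},
    "search": {"set_search_prefs"},
}


def check_tool_call(stage, tool):
    """Re-check one call against the stage it is being made in."""
    allowed = STAGE_TOOLS.get(stage, set())   # unknown stage allows nothing
    if tool not in allowed:
        raise ToolRejected(f"{tool!r} is not allowed in stage {stage!r}")
    return tool
```

Checking at call time, rather than only when the prompt is built, means a stage change mid-turn cannot leave a stale tool callable.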

MASON brief structure · the composition lane (was ARCHITECT)

Composes the entire brief as a printable document — section spine, ordering, typed blocks, plus the per-item checklist — in one structured call. It replaced ATLAS as the brief-structure author.

File agent/architect_subagent.py Model voice_filler_model → agent_model · reasoning_effort:none Output forced BriefPlan JSON
Invoked by
One of the two end-of-turn planner lanes (architect_producer) — gated behind slow_tools_done_event so it sees ATLAS's research/fact writes. The slow lane waits ≤28s for both planners; otherwise their writes appear next turn. Also reused by the concierge's concierge_answer_ingest job, folding each answered clarification into the brief.
Consumes
One signal payload: recent conversation, industry, client name, identity + brief summaries, existing sections & editable blocks (so it can converge), the do-not-touch user-edited block-ids, and the checklist items.
Produces
A BriefPlan: section ops, block add/update/delete ops, one checklist verdict per item, a note_for_mira (delivered on the tagged-notes channel), a client-facing rationale (the "Brief" chip), confidence. The runner validates it, then persists. ai_image ops materialize via a server-side gpt-image-2 call. Emits architect-status/architect-applied.
Triggers
Converges in place (not a chain). Its checklist drives the enable_search gate, the right-rail UI, and MIRA's next question — persisted every turn, even when the plan is empty.
Hard rules
  • Rule zero — never fabricate. No stated facts → empty plans, confidence ≈ 0.
  • Converge, don't accumulate. Prefer update/noop over add; re-stating an existing idea as a new block is the #1 failure.
  • Never touch user-edited blocks; ops only target real block UUIDs (also enforced in validation).
  • Silent. Never speaks to the user; never touches identity, search, or tasks. On failure, falls back to an industry-keyed section skeleton.
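The validation step behind "ops only target real block UUIDs" and "never touch user-edited blocks" might look roughly like this. The op shape (`op`, `block_id`) and the function name are assumptions for illustration; the actual BriefPlan schema is not shown on this page.

```python
def validate_block_ops(ops, existing_ids, user_edited_ids):
    """Split planner ops into those safe to persist and those to drop."""
    kept, dropped = [], []
    for op in ops:
        kind, target = op.get("op"), op.get("block_id")
        if kind == "add":
            kept.append(op)        # an add creates a new block, so it has no target
        elif target in existing_ids and target not in user_edited_ids:
            kept.append(op)        # update/delete of a real, untouched block
        else:
            dropped.append(op)     # invented id, or a block the user edited
    return kept, dropped
```

Enforcing the rule in code as well as in the prompt means a model slip degrades to a dropped op instead of a lost user edit.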

IRIS identity planner · the brief lane's twin for "About You"

The "About You" identity planner: reads the full conversation at end-of-turn and emits one structured IdentityPlan — add genuinely-new blocks, update stale ones in place, never delete. The one agent that works in all three theaters.

File agent/identity_editor_subagent.py Model agent_model Output forced IdentityPlan JSON
Invoked by
Three callers. (1) The identity planner lane of run_agent_dual at end-of-turn — MASON's twin, applied by repo.persist_identity_plan; emits identity-applied (+ the client-facing "Profile" chip rationale), and the frontend refetches the dossier. (2) The research chain, reconciling SAGE's findings as candidate_blocks. (3) The concierge's answer-ingest job, folding a client's clarification answer into the dossier (alongside MASON).
Consumes
The full-fidelity identity snapshot (every block's whole payload + section + id) plus the full conversation (the shared, uncapped conversation.py projector MASON also reads) or the researched batch.
Produces
One IdentityPlan of add/update ops + a note_for_mira (tagged-notes channel). The runner validates and persists it. In the research path it mints the block-ids that LENS later attaches images to; if IRIS can't run, a direct-persist fallback keeps the findings.
Output
Add / update ops only — the plan has no delete op; dedup-aware by design.
Hard rules
  • No duplicates — the single most important job; scan the whole file before adding.
  • Never delete or merge (no delete op exists); an empty plan is the correct answer when everything's captured.
  • File into the right section — professional / business / voice / notes.
  • Paraphrase, never paste raw user words.

06 Theater II: the background research chain

When ATLAS calls deep_buyer_research, it spawns a detached task and returns instantly. That task — _run_deep_research_background — runs a strict, ordered chain inside the worker. It emits no Pusher events; results land in Postgres and surface on the user's next workspace read. The order matters: IRIS must persist blocks before LENS so the vision pass has real block-ids to attach to.

[Figure: _run_deep_research_background, an awaited, ordered chain that persists silently (no push). ATLAS create_task (fire-and-forget spawn) → SAGE (deep research loop; 13 tools · ≤24 iters; SEO · web · browser) → IRIS (dedup reconcile of candidate_blocks; mints block-ids) → site_image harvest (deterministic helper, SSRF-safe; og · icon · screenshot) → LENS (vision placement; place_images forced; skip · ref · body) → STYLO (re-skin workspace; StyloPayload: accent · tone · locale). Stages persist to Postgres with no push; the agent_activity board records started/done rows that MIRA reads next turn.]
SAGE returns validated blocks; IRIS reconciles them into "About You" and mints block-ids; site_image harvests the client's own-site imagery (behind an SSRF guard); LENS decides placement against those block-ids; STYLO re-skins. Every stage after SAGE is best-effort and swallows its own errors so the worker loop never crashes. SAGE and STYLO log to the activity board MIRA reads next turn.
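A compact way to picture the chain is a detached task that awaits each stage in order and wraps every stage after SAGE in a best-effort guard. The sketch below assumes the stages are injected as coroutines; `best_effort`, `run_chain`, and the `background` set are invented names, and the real `_run_deep_research_background` will differ in detail.

```python
import asyncio
import logging

log = logging.getLogger("research_chain")


async def best_effort(name, stage, *args):
    """Run one stage; log and swallow any error so the chain keeps going."""
    try:
        return await stage(*args)
    except Exception:
        log.exception("stage %s failed; continuing", name)
        return None


async def run_chain(sage, iris, harvest, lens, stylo):
    report = await sage()                                    # SAGE itself may raise
    block_ids = await best_effort("iris", iris, report) or []
    images = await best_effort("harvest", harvest) or []
    await best_effort("lens", lens, block_ids, images)       # needs IRIS's block-ids
    await best_effort("stylo", stylo)
    return block_ids


def deep_buyer_research(background, *stages):
    """Spawn the chain and return at once (call from inside a running loop)."""
    task = asyncio.create_task(run_chain(*stages))
    background.add(task)                  # hold a reference so the task is not collected
    task.add_done_callback(background.discard)
    return "STARTED"
```

Holding the task in a set matters: asyncio keeps only weak references to tasks, so a truly forgotten task can be garbage-collected before it finishes.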

SAGE deep client research · fire-and-forget

A second, self-contained tool loop that investigates the client step-by-step — DataForSEO, the web, and a headless browser — and returns a validated "client file" of identity blocks plus a personalization payload.

File agent/research_subagent.py Model research_subagent_model → agent_model Tools 13 · ≤24 iters
Invoked by
ATLAS's deep_buyer_research tool → asyncio.create_task → returns "STARTED IN BACKGROUND." Fire-and-forget; idempotent per (user, project) with an in-flight guard.
Consumes
The client handle, website URL, industry, optional focus hints, and injected callables for web search, fetch, and the four browser tools.
Produces
A report of validated blocks (text/finding/kpi/quote/bullet/callout, each tagged to an About-You section) + a personalization dict + tool traces + cost. It persists nothing itself; it raises on budget-exceeded or malformed finish (no partial success).
Triggers
Returns to the orchestrator, which runs IRIS → harvest → LENS → STYLO.
Hard rules
  • Every number traces to a tool result — no estimating.
  • Every finding needs a real source_url seen this run; invented URLs fail validation.
  • Competitors come from real tool results, never inferred from name similarity.
  • Forbidden from emitting image URLs (it hallucinates them) — imagery is captured deterministically instead.

LENS vision image placement

A single-shot vision call that looks at every harvested client image alongside the persisted blocks and decides, per image: skip, attach as a reference thumbnail, or insert as a body image — and to which block.

File agent/image_placement_subagent.py Model agent_model (vision · detail:low) Tools 1 · forced
Invoked by
The image pipeline inside the research chain, after IRIS persists blocks so it can target real block-ids. Synchronous within the background task; degrades to "no placements" on failure.
Consumes
The persisted blocks + a catalog of harvested images (id, kind, source page, alt, downscaled thumbnail). A "matches finding" hint is computed when an image's host matches a finding's source.
Produces
Validated placement decisions → reference ops (append a thumbnail to a block) or body ops (insert an ai_image block). Only images it actually places get stored as assets — skipped ones never hit the Files panel.
Tools
place_images, forced. It picks an integer image-id from the catalog, never a URL; unknown ids/blocks are dropped.
Hard rules
  • Skip liberally — default is skip; icons, glyphs, and tiny files are chrome.
  • The homepage screenshot is the best body candidate — the client's front door.
  • One image per idea; de-dup near-duplicates.
  • Only real ids — reference/body must target a listed block; no inventing.

STYLO workspace personalization · the chain's last step

One structured-JSON call that re-skins the workspace — accent colors, tone, density, corner radius, suggestion chips, copy overrides, currency/locale — from whatever signals exist. Touches nothing in the brief, search, or identity.

File agent/personalization_subagent.py Model voice_filler_model → agent_model · 12s Output forced StyloPayload
Invoked by
The research chain, as its final, durable step (it used to be a flaky per-turn lane; that lane has since been removed). Best-effort.
Consumes
The freshly-written workspace: industry, client name, identity summary, brief summary, and existing personalization (only honored when ≥3 non-default fields are set).
Produces
A personalization payload the caller persists. The worker path emits no event — the re-skin surfaces on the next workspace sync. A deterministic industry-keyed palette table backs it when the model returns weak output.
Triggers
Terminal — the last step of the chain. Nothing fires after it.
Hard rules
  • Never leave the profile neutral when there's any signal; derive ≥3 chips + tone/density.
  • Agent names & hero motif are off-limits — excluded from the schema entirely.
  • Don't undo user or prior confirmations; currency + locale must match the market.
  • Empty fields + confidence 0 is the correct answer for no-signal input.

07 Theater III: the concierge, between turns

After the client settles on what they want, the concierge takes over between turns and talks to talents on their behalf. It is event-driven, not polled: every reactive step is a job enqueued from the web layer (a talent reply, a client answer, an approval). Two invariants dominate — a client-approval gate before anything reaches a talent, and an anonymity wall that keeps the client's view anonymized until the reveal. Two newer wires: a run auto-finalizes the moment its last thread resolves with ≥1 offer (finalize_if_complete — no manual reveal click), and every answered client clarification is also folded back into the brief + About You by MASON + IRIS (concierge_answer_ingest) — the concierge feeds Theater I's planners.

[Figure: between turns; discovery is optional, the approval gate is not. The client either shortlists and calls /start (skips SCOUT) or calls /discover, which runs SCOUT (Fiverr seed + LLM pick; /discover only · default OFF). Both reach runner.start, which drafts outreach with threads queued (unsent). After the approval gate (approve_and_send), STERLING sends opening messages to talents (talent SPA + DB inbox): runner.py is the deterministic state machine (start · send · tick · finalize) and sterling.py the LLM that writes each message and judges each reply. Each reply returns as a seller_turn job. Offers pass through the offer scorer (LLM) + rank_offers (deterministic); auto-finalize → reveal fires when the last thread lands (≥1 offer), showing the top 3 ranked, then the client picks. An anonymity wall keeps the client's view anonymized until reveal. Legend: concierge action · approval gate · reveal to client · anonymity wall.]
Two entry points converge on runner.start, which only drafts. Nothing reaches a talent until the client approves. STERLING's runner is deterministic (start/send/tick/finalize); its brain (sterling.py) is the LLM that writes each message and judges each reply. Talent replies arrive as seller_turn jobs; offers are scored by a separate LLM call and ranked deterministically; finalize reveals the top three.
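The two invariants that matter most here, drafting without sending and auto-finalizing only when every thread is resolved, fit in a few lines. This is a sketch under assumptions: the `Run` class, the thread state names, and the `score` field are invented for illustration; `start`, `approve_and_send`, and `finalize_if_complete` borrow their names from the text but not their real signatures.

```python
from dataclasses import dataclass, field


class ApprovalRequired(Exception):
    """Raised when a send is attempted without a recorded client approval."""


@dataclass
class Run:
    threads: dict = field(default_factory=dict)   # seller -> thread state
    offers: list = field(default_factory=list)    # dicts with seller + score
    approved: bool = False


def start(sellers):
    """Draft outreach only: every thread begins queued and unsent."""
    return Run(threads={s: "queued" for s in sellers})


def approve_and_send(run):
    if not run.approved:
        raise ApprovalRequired("client approval has not been recorded")
    for seller, state in run.threads.items():
        if state == "queued":
            run.threads[seller] = "sent"


RESOLVED = {"offer", "declined"}   # assumed terminal thread states


def finalize_if_complete(run, top_n=3):
    """Return ranked finalists once all threads are resolved and >=1 offer exists."""
    if any(state not in RESOLVED for state in run.threads.values()):
        return None
    if not run.offers:
        return None
    return sorted(run.offers, key=lambda o: o["score"], reverse=True)[:top_n]
```

Because the gate lives inside `approve_and_send` itself, no caller can reach a talent by skipping a separate check.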

SCOUT concierge talent-scout

Discovers candidate talents from Fiverr and LLM-picks 4–6 most likely to respond favorably to a custom-brief DM at the brief's budget tier — a responsiveness bet, distinct from the public-gig fit funnel.

File concierge/search_agent.py Model agent_model · temp:0.3 Tools none (Fiverr seed + pick)
Invoked by
Only the no-shortlist path: POST /api/concierge/discover → ConciergeDiscoverJob → worker → discover_and_start. The explicit-shortlist /start path skips SCOUT entirely.
Consumes
The brief sections, industry, and a Fiverr client. It seeds candidates (search fan-out over ≤3 keywords → dedupe → top-12 → profile enrichment), then reasons over the full brief + enriched profiles in one pick.
Produces
A list of picks — each {seller, brief-relative rationale, UI signal tags, concierge_fit_score} — handed to runner.start (carrying the rationale so finalist cards can show why each talent was picked).
Triggers
STERLING's runner.start, which drafts outreach (and sends nothing yet).
Hard rules
  • Config-gated, default OFF (fiverr_mcp_enabled) — degrades gracefully, never a zero-seller run.
  • No invented talents — a pick's username must match a real candidate exactly.
  • fit_score is responsiveness, not quality.
  • Anonymity holds — discovery exposes only progress phases over Pusher, never raw talent PII pre-run.
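SCOUT's seeding funnel (fan out over at most three keywords, dedupe, keep the top twelve) and its exact-username rule can be sketched as two small functions. The `search` callable, the `rating` sort key, and both function names are assumptions; the page does not say how the top twelve are ranked.

```python
def seed_candidates(keywords, search, limit=12, max_keywords=3):
    """Fan out over a few keywords, dedupe by username, keep the top `limit`."""
    seen, merged = set(), []
    for keyword in keywords[:max_keywords]:
        for candidate in search(keyword):
            if candidate["username"] not in seen:
                seen.add(candidate["username"])
                merged.append(candidate)
    # Ranking key is an assumption for this sketch.
    merged.sort(key=lambda c: c.get("rating", 0), reverse=True)
    return merged[:limit]


def keep_real_picks(picks, candidates):
    """Drop any pick whose username does not exactly match a seeded candidate."""
    real = {c["username"] for c in candidates}
    return [p for p in picks if p["username"] in real]
```

The exact-match filter is case-sensitive on purpose: a near-miss username is treated as an invented talent and dropped.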

STERLING concierge negotiator · state machine + brain

Two parts. The runner is the deterministic state machine (start / approve-and-send / tick / finalize). The brain is the pure LLM that writes each talent-facing message, judges each reply, and flags offers — acting only through four function tools.

File concierge/runner.py + sterling.py Model agent_model (no reasoning_effort: the request returns HTTP 400 when it is combined with tools=) Tools 4 · tool_choice:required
Invoked by
Runner entrypoints driven by worker jobs (start on-cycle; send/tick/finalize/extend enqueued by web routes). Conversational turns run in the separate seller_turn handler, enqueued when a talent replies or the client answers a clarification.
Consumes
A context: the frozen brief + client-card snapshots, the trigger (opening / talent message / client answer), the full thread history, the latest inbound — plus the client↔assistant chat + the full IRIS dossier (load_client_knowledge) and known_answers: every answer the client has given on the run, across all threads, so he never re-asks what another talent already got answered.
Produces
His tool calls — send_reply / escalate_to_client / register_offer / register_decline — fold into a SterlingOutput: reply text, escalation question, offer_detected, decline_reason, next thread state, guardrail flags. LLM: the prose, the judgment, offer detection (with an LLM fallback extraction when the $-price/timeline regex can't parse a flagged offer). Deterministic: all persistence, push, and offer scoring/ranking.
Triggers
Talent messaging (the adapter's send is a no-op — the talent SPA + DB are the inbox); escalations → a client clarification, whose answer relays to each asking thread and enqueues one concierge_answer_ingest job (MASON + IRIS fold it into brief + About You); finalize_if_complete auto-reveals ranked offers + a recommended winner when the last thread lands.
Hard rules
  • Client-approval gate. start only drafts; approve_and_send refuses without a recorded approval. No path bypasses it.
  • Pure brain, no I/O — and tools only. One LLM call, zero side effects, tool_choice:required; a turn with no usable tool call degrades (ok=False), never guesses. The handler owns all persistence and push.
  • Never invent an offer — if neither the regex nor the LLM fallback can extract one, the thread stays open; no fabricated scorecard.
  • Guardrails: stay on-platform, decline out-of-scope, sanity-check price/timeline vs budget. (Full client disclosure to talents is allowed — a deliberate decision, separate from the client-side anonymity wall.)

08 On demand: HUE, the PDF art-director

HUE is not part of any turn. It runs when the client clicks "Design & export PDF." The button only records a turn and enqueues a job (doctrine: no slow work on the web cycle); the worker then runs HUE through three phases and stores a bespoke stylesheet for the export.

HUE PDF art-direction · CSS generator

Looks at the brief, designs a custom CSS stylesheet for the export, then critiques and rewrites it. CSS-on-a-fixed-DOM, so the model can never drop content or emit broken markup.

File agent/brief_design_subagent.py Model agent_model · 75s Output sanitized CSS (no tools)
Invoked by
POST /api/profile/identity/design → IdentityDesignJob → the identity_design worker handler. Never in a web request.
Consumes
A token-cheap digest of the brief: project + summary, industry, brand accent hex, theme, tone, block-type counts, section titles.
Produces
A result (css, base template, rationale, passes). The handler persists the CSS only when it's valid, then publishes one profile.design.ready Pusher event; the HTML endpoint layers the stored CSS.
Phases
Art-direct (JSON: template ∈ editorial/modern/classic, mood, palette, type) → design (full CSS) → refine (critique-and-rewrite, 0–3 passes).
Hard rules
  • CSS is sanitized before inlining — it's a <style> injection sink: drops every <, strips @import/url()/expression()/javascript:, caps at 60KB.
  • Output only CSS — no markup, no markdown. Scope every rule under .doc--ai.
  • Never hide text-bearing content; stay print-safe (avoid fixed positioning, keep the cover one page).
  • Avoid the generic-AI look — tune to the brand, industry, and mood.
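The sanitizing rule above (drop every `<`, strip `@import`, `url()`, `expression()`, and `javascript:`, cap at 60KB) can be approximated with a regex pass. This is a minimal sketch, not the production sanitizer: the function name and patterns are assumptions, and a hardened version would also need to handle CSS escape sequences and re-scan after stripping.

```python
import re

MAX_CSS_BYTES = 60 * 1024   # the 60KB cap from the rule above

_STRIP = re.compile(
    r"@import[^;]*;?"            # whole @import statements
    r"|url\s*\([^)]*\)"          # url(...) references
    r"|expression\s*\([^)]*\)"   # legacy IE expression(...)
    r"|javascript\s*:",          # javascript: scheme
    re.IGNORECASE,
)


def sanitize_css(css: str) -> str:
    css = css.replace("<", "")                      # nothing can close the <style> tag
    css = _STRIP.sub("", css)
    clipped = css.encode("utf-8")[:MAX_CSS_BYTES]   # cap by bytes, not characters
    return clipped.decode("utf-8", errors="ignore") # drop a split trailing character
```

Removing `<` first is the key step for a `<style>` sink, since without it no `</style>` can be formed to break out of the element.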

09 The deterministic glue (not agents, but load-bearing)

Several non-LLM components do the unglamorous work that lets the agents stay focused. They have no prompt and make no model call — but the system breaks without them.

intent.py — intent classifier

Pure keyword/stage rules that label the turn (run_search, analyze_file, small_talk…). Feeds a HINT: line into ATLAS's prompt; the model may override it.
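A keyword-and-stage classifier of this kind needs no model at all. In the sketch below, the three intent labels and the `HINT:` prefix come from the text; the signature, the trigger phrases, and the fallback label are hypothetical.

```python
def classify_intent(message, stage, has_attachment=False):
    """Label a turn from surface cues only; the model may later override it."""
    text = message.lower()
    if has_attachment:
        return "analyze_file"
    search_phrases = ("run the search", "find talents")   # illustrative phrases
    if stage in {"brief", "search"} and any(p in text for p in search_phrases):
        return "run_search"
    return "small_talk"   # fallback chosen for this sketch


def hint_line(intent):
    """Format the label the way it is fed into ATLAS's prompt."""
    return f"HINT: {intent}"
```

Because the result is only a hint, a wrong label costs little: the prompt still carries the full context and the model can disregard it.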

brief_checklist.py — gate & render

Joins the checklist template with MASON's persisted verdicts into the rendered block read by both lanes. The enable_search gate refuses while required items are missing.
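The join-then-gate behavior could be sketched as follows. The item shape (`id`, `required`), the verdict values `met` and `missing`, and the return format are assumptions; only the idea that the gate refuses while required items are missing comes from the text.

```python
def render_checklist(template, verdicts):
    """Join each template item with its persisted verdict (default: missing)."""
    return [{**item, "verdict": verdicts.get(item["id"], "missing")}
            for item in template]


def enable_search(template, verdicts):
    """Open the gate only when every required item has been met."""
    rendered = render_checklist(template, verdicts)
    missing = [r["id"] for r in rendered
               if r["required"] and r["verdict"] != "met"]
    return {"search_enabled": not missing, "missing": missing}
```

Returning the missing ids alongside the refusal gives the caller something concrete to ask about next.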

dual_lane.py — orchestrator + handoff

Spawns the four lanes over one queue, holds the per-project lock, persists the fast reply early, and writes the team's tagged notes (atlas|mason|iris) for MIRA — verbatim, no compressor.

conversation.py — shared projector

One uncapped conversation projector both MASON and IRIS read, so the two planner lanes compose from the same full transcript.

site_image.py — image harvester

Fans out over the client's own pages for og/twitter/icon/inline imagery (and a homepage screenshot for JS SPAs), behind an SSRF guard, deduped by hash. Feeds LENS.

concierge tools.py — pure functions

The deterministic outreach template, the regex offer-extractor, and rank_offers. The LLM offer-scorer is a separate call layered on top; ranking itself has no model.
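To make the regex extractor and model-free ranking concrete, here is a small sketch. The patterns, the returned field names, and the tie-break on price are all assumptions; the page states only that a $-price/timeline regex exists and that ranking has no model.

```python
import re

_PRICE = re.compile(r"\$\s*(\d[\d,]*(?:\.\d+)?)")
_TIMELINE = re.compile(r"(\d+)\s*(day|week)s?", re.IGNORECASE)


def extract_offer(text):
    """Return price and timeline if both parse; otherwise None (never guess)."""
    price, timeline = _PRICE.search(text), _TIMELINE.search(text)
    if not (price and timeline):
        return None
    per_unit = 7 if timeline.group(2).lower() == "week" else 1
    return {
        "price": float(price.group(1).replace(",", "")),
        "days": int(timeline.group(1)) * per_unit,
    }


def rank_offers(offers):
    """Deterministic order: higher score first, lower price breaks ties (assumed)."""
    return sorted(offers, key=lambda o: (-o["score"], o["price"]))
```

Returning `None` on a partial match mirrors the "never invent an offer" rule: a message with a price but no timeline leaves the thread open rather than producing a half-filled scorecard.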

LiveSellerAdapter — thin send

The only messaging adapter; its send_message is intentionally a no-op. The talent test-platform SPA and the seller_messages table are the inbox; the mock adapter/scenarios are gone.

agent_identity — canonical names

Stdlib-only telemetry-label → display-agent map (runner → atlas, …): the single source the debug dashboard, the agent-trace store, and the per-lane OTel spans (gen_ai.agent.name) all name agents by.

Not in the active set

STYLO does not run as a per-turn lane; its durable trigger is the research chain (the old inert no-op lane was removed). The legacy lineage also carried a voice filler and a partner-note compressor; the compressor is gone (notes are verbatim) and the filler is not part of the active set. ATLAS's edit_identity tool is gone as well — identity authoring moved into the IRIS planner lane.

10 Rules every agent obeys

Beyond each agent's own prompt, the platform doctrine binds all of them. These are why the design looks the way it does.

  • No slow work on the web request. Every agent runs in the worker; the web tier authenticates, enqueues, and returns in milliseconds.
  • No streaming HTTP, no polling. Server→client is Pusher channels only; the agent package itself just yields events, transport-agnostic, and the worker republishes them.
  • Async is durable and idempotent. Jobs carry stable ids; externally-visible effects are keyed (uuid5 + ON CONFLICT DO NOTHING) so a retry can't double-send or double-write.
  • Degrade, don't crash. Every background stage swallows its own errors; LLM failures fall back to deterministic skeletons, palettes, or templates rather than an empty result.
  • Never invent. Across ATLAS, MASON, SAGE, IRIS, and SCOUT the same line recurs: every fact, number, source, and talent must trace to real evidence.
  • Mind the ~10KB Pusher cap. A full workspace is never pushed; oversized payloads set resync:true and the client re-reads GET /api/workspace.
  • Every call is visible. Each LLM call logs tokens, latency, and dollar cost; each lane runs in its own agent <name> OTel span (gen_ai.agent.name, named via agent_identity) nested under the worker's job span — a turn's distributed trace breaks out per agent.
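The "durable and idempotent" rule pairs a deterministic key with an insert that ignores conflicts. The sketch below uses the standard library's sqlite3 as a stand-in for Postgres (both accept `ON CONFLICT ... DO NOTHING`); the table, the namespace string, and the function names are hypothetical.

```python
import sqlite3
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "hitch/effects")


def open_store():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE effects (id TEXT PRIMARY KEY, payload TEXT)")
    return conn


def effect_id(job_id, kind):
    """Same job + same effect kind always yields the same key."""
    return str(uuid.uuid5(NAMESPACE, f"{job_id}:{kind}"))


def record_effect(conn, job_id, kind, payload):
    """Insert once; a retry hits the conflict and writes nothing."""
    cur = conn.execute(
        "INSERT INTO effects (id, payload) VALUES (?, ?) "
        "ON CONFLICT (id) DO NOTHING",
        (effect_id(job_id, kind), payload),
    )
    return cur.rowcount == 1   # True only for the first, effective write
```

A caller can use the boolean to decide whether to perform the external side effect, so a retried job re-runs safely without double-sending.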