Architecture Tour

Key flows, patterns, and design choices — without the wall of boxes.

AURA is a tool-using, multi-agent system for industrial facilities.
The product is durable work + grounded evidence, not “chat”.
Context

Industrial AI needs infrastructure

The model is the least reliable component in the system. The job of the architecture is to make its output grounded, durable, and safe.

Grounded

Answers cite data, documents, or explicit tool results. If we can’t point to evidence, we treat it as a hypothesis.

Durable

Runs survive refreshes, retries, and worker restarts. The UI is a subscriber, not the source of truth.

Safe

Write operations require explicit confirmation and produce an audit trail. “No invisible side effects.”

System Map

A simplified topology

We’ll keep one diagram and reuse it. Each flow just highlights different edges.

Flow

Chat turn: request → run → stream

  • UI posts a message, gets back a run_id, and attaches to the SSE stream for that run.
  • front starts the run (or resumes), enforces a single in-flight turn per session, and streams events (see the sketch after this list).
  • chat-agent executes a durable workflow; the model output is streamed, not buffered.
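
A minimal sketch of that turn-start path, with hypothetical names (handleTurn, activeRun, startRun are stand-ins, not the real front code):

package front

import (
	"encoding/json"
	"net/http"
)

type turnRequest struct {
	SessionID string `json:"session_id"`
	Message   string `json:"message"`
}

func handleTurn(w http.ResponseWriter, r *http.Request) {
	var req turnRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	// One in-flight turn per session: reject rather than queue.
	if activeRun(req.SessionID) != "" {
		http.Error(w, `{"error":"turn_in_progress"}`, http.StatusConflict)
		return
	}
	// Start (or resume) the durable run; the workflow streams events as it goes.
	runID := startRun(req.SessionID, req.Message)
	// The UI takes this run_id and attaches to /chat/runs/{run_id}/events.
	json.NewEncoder(w).Encode(map[string]string{"run_id": runID})
}

// activeRun and startRun stand in for session-state and workflow-start calls.
func activeRun(sessionID string) string        { return "" }
func startRun(sessionID, message string) string { return "run-0001" }
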
Pattern

Pulse → SSE: the UI tails a log

  • Stream identity is out-of-band: the SSE URL uses /chat/runs/{run_id}/events.
  • Resume is a first-class feature: from_event_id lets the UI reconnect without guessing.
  • Typed events: thinking, tool_call, tool_result, artifacts — debug is a trace, not a vibe check.
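
A minimal sketch of a resumable tail, assuming the stream carries JSON event payloads; the host and field names are illustrative:

package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

type runEvent struct {
	ID   string          `json:"id"`
	Type string          `json:"type"` // thinking | tool_call | tool_result | artifacts
	Data json.RawMessage `json:"data"`
}

func tail(runID, fromEventID string) error {
	url := fmt.Sprintf("https://front.example/chat/runs/%s/events", runID)
	if fromEventID != "" {
		url += "?from_event_id=" + fromEventID // resume without guessing
	}
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		if !strings.HasPrefix(line, "data: ") {
			continue // skip keep-alives and non-data lines in this sketch
		}
		var ev runEvent
		if err := json.Unmarshal([]byte(strings.TrimPrefix(line, "data: ")), &ev); err != nil {
			continue
		}
		fmt.Printf("[%s] %s\n", ev.Type, ev.ID) // remember ev.ID to resume later
	}
	return scanner.Err()
}

func main() {
	if err := tail("run-0001", ""); err != nil {
		fmt.Println("stream ended:", err)
	}
}
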
Flow

Tool calls: one interface, many providers

  • Tool IDs are canonical: atlas.read.get_time_series, todos.list, gdrive.search.
  • tool-registry routes registry-backed tools to the right service (schema + endpoint at runtime).
  • goa-ai gives us typed schemas/codecs and a consistent execution contract across providers.
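
A sketch of the execution contract those bullets imply: canonical IDs, published schemas, one call shape. The interface and registry below are illustrative, not goa-ai's actual API:

package tools

import (
	"context"
	"encoding/json"
	"fmt"
)

type Tool interface {
	ID() string                   // e.g. "atlas.read.get_time_series"
	InputSchema() json.RawMessage // published to the model
	Execute(ctx context.Context, input json.RawMessage) (json.RawMessage, error)
}

// Registry routes canonical IDs to whichever service actually implements them.
type Registry struct {
	byID map[string]Tool
}

func NewRegistry(ts ...Tool) *Registry {
	r := &Registry{byID: make(map[string]Tool)}
	for _, t := range ts {
		r.byID[t.ID()] = t
	}
	return r
}

func (r *Registry) Call(ctx context.Context, id string, input json.RawMessage) (json.RawMessage, error) {
	t, ok := r.byID[id]
	if !ok {
		return nil, fmt.Errorf("unknown tool: %s", id)
	}
	return t.Execute(ctx, input)
}
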
Pattern

Bounded results: don’t hand the model a firehose

Tool results have to fit inside a context window. We treat “too much data” as an interface bug. The fix is structural: bounded results + refinement hints + UI-only artifacts.

LLM-facing

Small, deterministic summaries and capped lists. If truncated, the response carries a refinement hint.

{
  "returned": 120,
  "total": 1842,
  "truncated": true,
  "refinement_hint": "Try filtering by compressor_1 and narrowing to last 6h"
}

UI-facing

Artifacts (“sidecars”) for charts and structured displays. Artifacts are not sent to the model.

Learning moment: when ATLAS Data returned unbounded lists, the model spent tokens summarizing the dump instead of reasoning. Bounded results turned “please summarize” into “please decide”.
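
A sketch of how a tool could produce that split, with illustrative types (BoundedResult, Artifact, and the Bound helper are assumptions):

package tools

type BoundedResult struct {
	Returned       int      `json:"returned"`
	Total          int      `json:"total"`
	Truncated      bool     `json:"truncated"`
	RefinementHint string   `json:"refinement_hint,omitempty"`
	Rows           []string `json:"rows"` // small, deterministic summary lines
}

type Artifact struct {
	Kind string `json:"kind"` // e.g. "time_series_chart"
	Data any    `json:"data"` // full payload for the UI, never sent to the model
}

// Bound caps what the model sees and attaches a hint when it had to truncate.
func Bound(rows []string, limit int, hint string) (BoundedResult, Artifact) {
	res := BoundedResult{Total: len(rows), Rows: rows}
	if len(rows) > limit {
		res.Rows = rows[:limit]
		res.Truncated = true
		res.RefinementHint = hint
	}
	res.Returned = len(res.Rows)
	return res, Artifact{Kind: "table", Data: rows}
}
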
Pattern

Session: a ledger, not chat history

  • session_id is the logical container; conversation_id is a thread; run_id is a single execution.
  • Turn locking: if a run is active, front rejects a second turn (turn_in_progress).
  • Auditability: tool calls, results, and artifacts are stored so we can answer: “what did the model see?”
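
A sketch of the ledger shape those IDs imply; the field set is illustrative:

package session

import "time"

// LedgerEntry records enough to answer "what did the model see?" later.
type LedgerEntry struct {
	SessionID      string    `json:"session_id"`      // logical container
	ConversationID string    `json:"conversation_id"` // thread
	RunID          string    `json:"run_id"`          // single execution
	At             time.Time `json:"at"`
	Kind           string    `json:"kind"` // message | tool_call | tool_result | artifact
	Payload        []byte    `json:"payload"`
}
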
Pattern

Agent-as-tool: specialists without spaghetti

  • When it’s an agent: it needs its own prompt + policies + tool access + a multi-step reasoning loop.
  • Invocation model: chat-agent calls specialists as tools, creating child runs with linked streams.
  • Learning moment: Ada used to take an extra model turn after every tool completion. Now: return tool results directly unless synthesis is required.
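
A sketch of the invocation shape, with hypothetical names (AgentTool, startChildRun) rather than the actual chat-agent code:

package agents

import (
	"context"
	"encoding/json"
)

type SpecialistAgent struct {
	Name string // e.g. "ada"
	Run  func(ctx context.Context, runID string, input json.RawMessage) (json.RawMessage, error)
}

// AgentTool exposes a specialist behind the same tool contract the orchestrator already uses.
type AgentTool struct {
	Agent       SpecialistAgent
	ParentRunID string
}

func (t AgentTool) ID() string { return "agents." + t.Agent.Name }

func (t AgentTool) Execute(ctx context.Context, input json.RawMessage) (json.RawMessage, error) {
	childRunID := startChildRun(t.ParentRunID, t.Agent.Name) // linked stream in Pulse
	// Return the specialist's result directly; the orchestrator only adds a model
	// turn when synthesis is actually required.
	return t.Agent.Run(ctx, childRunID, input)
}

// startChildRun stands in for the workflow call that creates and links the child run.
func startChildRun(parentRunID, agent string) string { return parentRunID + "/" + agent }
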
Flow

Attachments: vision vs RAG

  • Vision path (pixels): photos/screenshots/nameplates. We inline images into the model input (subject to limits) and keep evidence linkable.
  • RAG path (chunks + citations): manuals/SOPs. We retrieve scoped excerpts through documents and return citations.
  • Task runs: images can be inlined at run start; documents are retrieved on-demand via scoped tools.
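
A sketch of the routing decision, assuming file-extension dispatch purely for illustration:

package attachments

import "strings"

type RoutedAttachment struct {
	Name      string
	InlineImg bool // vision path: send pixels to the model
	ViaRAG    bool // RAG path: scoped excerpts + citations via documents
}

func Route(names []string) []RoutedAttachment {
	out := make([]RoutedAttachment, 0, len(names))
	for _, n := range names {
		if hasAnySuffix(n, ".png", ".jpg", ".jpeg") {
			// Vision path: inline the image into the model input (subject to limits).
			out = append(out, RoutedAttachment{Name: n, InlineImg: true})
			continue
		}
		// RAG path: manuals/SOPs are retrieved as scoped excerpts with citations.
		out = append(out, RoutedAttachment{Name: n, ViaRAG: true})
	}
	return out
}

func hasAnySuffix(s string, suffixes ...string) bool {
	for _, suf := range suffixes {
		if strings.HasSuffix(strings.ToLower(s), suf) {
			return true
		}
	}
	return false
}
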
Flow

Storyboard: draft transformation

  • User edits the draft in the UI. The model proposes changes; it does not “own” the task definition.
  • tasks-designer-agent returns structured edits (think patch/diff), so we can apply, review, and keep intent explicit.
  • tasks persists versions and run history — authoring and execution are separate concerns.
  • Smart rule: if the user can’t see what changed, we shouldn’t do it.
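
A sketch of what a structured edit could look like; the operation names and paths are illustrative:

package drafts

// Edit is a single explicit operation against the draft, so the UI can show
// exactly what would change before anything is applied.
type Edit struct {
	Op    string `json:"op"`    // add_step | update_step | remove_step
	Path  string `json:"path"`  // e.g. "steps/3/description"
	Value string `json:"value"` // new content; empty for removals
}

type Proposal struct {
	TaskID  string `json:"task_id"`
	Edits   []Edit `json:"edits"`
	Summary string `json:"summary"` // human-readable "what changed and why"
}
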
Flow

Task run: a new thread with its own run_id

  • front starts a run via POST /tasks/{task_id}/run and returns {conversation_id, run_id}.
  • chat-agent loads/compiles the task definition at run time and executes as a standard tool-using run.
  • Progress is streamed: step status updates are events — the UI is not polling.
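
A minimal sketch of starting a run against that route; the host, task_id, and request body are placeholders:

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// POST /tasks/{task_id}/run — run options omitted in this sketch.
	resp, err := http.Post("https://front.example/tasks/task-42/run", "application/json", nil)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out struct {
		ConversationID string `json:"conversation_id"`
		RunID          string `json:"run_id"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	// Step status updates arrive as events on the run's stream, not via polling.
	fmt.Println("follow events for run:", out.RunID)
}
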
Pattern

Write tools: confirmation is an architectural feature

  • Explicit approval: the model proposes; the human commits.
  • Separation of concerns: atlas-commands owns write semantics and auth boundaries.
  • Ledger trail: before/after is recorded alongside the conversation.
  • Learning moment: the “building shutdown freakout” happened because the model lacked operational context. Fix: capture site facts + treat ambiguous states as questions, not emergencies.
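
A sketch of the propose/confirm split, with illustrative types (PendingWrite, AuditRecord, Commit are assumptions, not atlas-commands’ API):

package writes

import (
	"errors"
	"time"
)

// The model can only create a PendingWrite; a human approval commits it.
type PendingWrite struct {
	ID     string
	Tool   string // canonical write-tool ID (illustrative)
	Before string // captured current state
	After  string // proposed new state
	Reason string // model's stated justification
}

type AuditRecord struct {
	WriteID    string
	ApprovedBy string
	At         time.Time
	Before     string
	After      string
}

var ErrNotApproved = errors.New("write requires explicit confirmation")

// Commit refuses to run without an approver and returns the before/after trail.
func Commit(w PendingWrite, approvedBy string) (AuditRecord, error) {
	if approvedBy == "" {
		return AuditRecord{}, ErrNotApproved
	}
	// The actual state change would happen here, inside atlas-commands' auth boundary.
	return AuditRecord{
		WriteID:    w.ID,
		ApprovedBy: approvedBy,
		At:         time.Now(),
		Before:     w.Before,
		After:      w.After,
	}, nil
}
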
Flow

Long-term memory: extract facts from runs

  • Knowledge Agent runs background workflows to extract stable “Site Facts” from sessions.
  • Facts are treated as data: they’re stored, cited, and re-used — not re-invented each turn.
  • Compounding effect: future runs start with better context and fewer clarifying turns.
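
A sketch of what a stored Site Fact could look like; the fields are illustrative:

package memory

import "time"

// SiteFact is treated as data: stored, cited back to its source run, and reused.
type SiteFact struct {
	Fact        string    `json:"fact"`         // e.g. "compressor_1 is offline for planned maintenance"
	SourceRun   string    `json:"source_run"`   // run_id the fact was extracted from
	ExtractedAt time.Time `json:"extracted_at"`
	Confidence  string    `json:"confidence"`
}
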
Principles

Where we draw boundaries

Tool vs Agent

Tools are deterministic interfaces (fetch/act) with typed schemas. Agents are planners with policies, memory, and tool access.

Orchestrator hierarchy

chat-agent orchestrates. Specialists are invoked as tools to keep responsibilities crisp and runs traceable.

Contracts everywhere

Goa types + generated schemas enforce the boundary. We prefer loud failures over silent drift.

A usable heuristic: if it needs its own prompt + policies + multi-step loop, it’s an agent. Otherwise it’s a tool.
Demos

Demos that land with engineers

1) “Show me the stream”

Run a chat turn, then reconnect SSE from a prior from_event_id. Proves durability and observability in one move.

2) Agent-as-tool run tree

Trigger Ada from a chat turn and show child-run events in Pulse. Great moment to talk about boundaries and policies.

3) Attachments vs RAG

Upload an image + a PDF. Then show vision grounding vs citation-backed retrieval.

4) Task run

Start a seeded task run and watch step progress events stream. It feels like “chat”, but behaves like a workflow.

Q&A