Core ECP

Discovery Architecture

Draft → Validate → Commit

A flexible system for populating devices and I/O from any source

The Problem

Onboarding is Painful

⏱️ 1-3 Weeks Per Site

Manual data entry, spreadsheet wrangling, back-and-forth with site contacts.

🧩 Ad-Hoc Process

Ops stitches together P&IDs, tag exports, PLC code, and tribal knowledge. Different every time.

No Visibility

"What's done? What's missing?" No systematic way to track completeness.

🎯 The Goal

Multiple ingest paths → unified draft → validated commit. Clear picture of gaps at every step.

Architecture

The Three Layers

Discovery Sources

P&ID scan, tag import, network sniff, operator chat, PLC export, photos, manual entry

Draft Layer

I/O Draft, Device Draft — partial, inconsistent, tracks provenance and gaps

Commit

Validates against templates → Creates real I/O + Devices in Blueprint
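
A minimal sketch of the commit gate described above, assuming a template can be modeled as a mapping from field name to expected type and Blueprint as a plain dictionary. The function names `validate` and `commit`, the template shape, and the example values are all hypothetical illustrations, not the Core ECP API.

```python
# Hypothetical sketch: names and shapes are assumptions, not the real API.

def validate(draft: dict, template: dict) -> list[str]:
    """Return a list of problems; an empty list means the draft can be committed."""
    problems = []
    for field_name, expected_type in template.items():
        value = draft.get(field_name)
        if value is None:
            problems.append(f"{field_name}: missing")
        elif not isinstance(value, expected_type):
            problems.append(f"{field_name}: expected {expected_type.__name__}")
    return problems


def commit(name: str, draft: dict, template: dict, blueprint: dict) -> None:
    """Write the draft into the blueprint only if validation passes."""
    problems = validate(draft, template)
    if problems:
        raise ValueError(f"{name} is not ready to commit: {problems}")
    blueprint[name] = dict(draft)


# Illustrative two-field template; a real template would carry more fields.
condenser_template = {"FanCount": int, "Location": str}
blueprint = {}

draft = {"FanCount": 4, "Location": None}      # Location is still a gap
print(validate(draft, condenser_template))

draft["Location"] = "Roof"                      # gap filled, e.g. by operator chat
commit("condenser_1", draft, condenser_template, blueprint)
print(blueprint)
```

Keeping validation separate from the write means the same check can drive the gap report long before anyone attempts a commit.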

Why This Matters

The draft layer is the key insight:

Partial — not all fields populated yet

Inconsistent — conflicts from different sources OK

Explicit gaps — "SuctionPressure: ???"

Provenance — "came from P&ID" vs "operator said"

Mergeable — multiple sources contribute
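
One possible shape for a draft with these five properties, offered as a sketch rather than a settled design. The class names `Claim` and `DeviceDraft`, the source labels, and the alias values are assumptions made for illustration. Each field holds a list of claims, so a missing field is an explicit gap, two differing claims are a visible conflict, and every value keeps its source.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Claim:
    """One value for one field, tagged with where it came from."""
    value: str
    source: str  # hypothetical labels, e.g. "tag_import", "operator_chat"


@dataclass
class DeviceDraft:
    name: str
    template: str
    claims: dict[str, list[Claim]] = field(default_factory=dict)

    def add(self, field_name: str, value: str, source: str) -> None:
        """Record a claim; identical claims from the same source are not duplicated."""
        existing = self.claims.setdefault(field_name, [])
        claim = Claim(value, source)
        if claim not in existing:
            existing.append(claim)

    def merge(self, other: "DeviceDraft") -> None:
        """Fold another source's draft into this one without discarding anything."""
        for field_name, claims in other.claims.items():
            for claim in claims:
                self.add(field_name, claim.value, claim.source)

    def gaps(self, required: list[str]) -> list[str]:
        """Required fields that no source has populated yet."""
        return [f for f in required if not self.claims.get(f)]

    def conflicts(self) -> list[str]:
        """Fields where sources disagree on the value."""
        return [f for f, cs in self.claims.items() if len({c.value for c in cs}) > 1]


from_tags = DeviceDraft("compressor_1", "FrickCompressor")
from_tags.add("SuctionPressure", "PT_101", "tag_import")

from_operator = DeviceDraft("compressor_1", "FrickCompressor")
from_operator.add("SuctionPressure", "PT_101A", "operator_chat")

from_tags.merge(from_operator)
print(from_tags.conflicts())
print(from_tags.gaps(["SuctionPressure", "OilPressure"]))
```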

Ingest Flows

Discovery Sources

Not all sources apply to all sites. That's the point.

Source        | What It Gives Us                  | Limitations
Tag Import    | I/O points, aliases, device hints | No connections, no metadata
P&ID Scan     | Devices, connections, topology    | No I/O mapping, may be outdated
Network Sniff | Controllers, IPs, protocols       | No semantic meaning
PLC Code      | Control logic hints, setpoints    | Messy, not always available
Operator Chat | Fill gaps, validate, metadata     | Slow, requires site access
Photos        | Equipment specs, nameplates       | Requires LLM extraction

Reality Check

Every Site is Different

The system must handle variability gracefully.

Has PLCs?

→ PLC export possible

Current P&IDs?

→ P&ID scan valuable

Friendly site contact?

→ Operator chat viable

Network access?

→ Network sniff possible

No single discovery flow will work everywhere.
The draft layer unifies them all.
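
The four site questions above can be sketched as a simple lookup from site conditions to usable discovery sources. `SiteProfile`, its field names, and the source labels are hypothetical. Treating manual entry as always available is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class SiteProfile:
    """Answers to the four site questions (field names are illustrative)."""
    has_plcs: bool
    current_pids: bool
    friendly_contact: bool
    network_access: bool


def applicable_sources(site: SiteProfile) -> list[str]:
    """List the discovery sources worth attempting at this site."""
    sources = ["manual_entry"]  # assumption: always possible as a fallback
    if site.has_plcs:
        sources.append("plc_export")
    if site.current_pids:
        sources.append("pid_scan")
    if site.friendly_contact:
        sources.append("operator_chat")
    if site.network_access:
        sources.append("network_sniff")
    return sources


site = SiteProfile(has_plcs=True, current_pids=False,
                   friendly_contact=True, network_access=False)
print(applicable_sources(site))
```

Whatever subset comes back, every source writes into the same draft, so the downstream validate and commit steps do not change from site to site.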

Visibility

The Completeness View

Ops sees exactly what's done, what's missing, and where conflicts exist.

compressor_1

Template: FrickCompressor

Populated: 8 / 12 fields

Missing: SlideValvePosition_2, OilPressure, Model, Tonnage

Conflict: SuctionPressure has 2 different aliases

condenser_1

Template: Condenser

Populated: 6 / 8 fields

Missing: FanCount, Location

✓ No conflicts
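
A rough sketch of how a completeness card like the ones above could be computed, assuming each field carries a list of candidate values gathered from different sources. The function name `completeness`, the four-field template, and the alias and model values are hypothetical. A field with conflicting values still counts as populated, which matches the compressor_1 card (8 populated + 4 missing = 12).

```python
def completeness(template_fields: list[str], candidates: dict[str, list[str]]) -> dict:
    """Summarize populated, missing, and conflicting fields for one device."""
    missing = [f for f in template_fields if not candidates.get(f)]
    conflicts = [f for f in template_fields
                 if len(set(candidates.get(f, []))) > 1]
    return {
        "populated": len(template_fields) - len(missing),
        "total": len(template_fields),
        "missing": missing,
        "conflicts": conflicts,
    }


# Illustrative subset of a compressor template, not the full field list.
fields = ["SuctionPressure", "OilPressure", "Model", "Tonnage"]
candidates = {
    "SuctionPressure": ["PT_101", "PT_101A"],  # two sources disagree on the alias
    "Model": ["example-model"],
}

report = completeness(fields, candidates)
print(f"Populated: {report['populated']} / {report['total']} fields")
print(f"Missing: {', '.join(report['missing'])}")
print(f"Conflicts: {', '.join(report['conflicts']) or 'none'}")
```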

AI Assist

Where LLMs Add Value

📄 P&ID Parsing

"Here's a PDF — extract devices, connections, and topology"

🏷️ Tag Inference

"Here's a tag list — infer device types and map to templates"

📸 Photo Extraction

"Here's a nameplate photo — extract manufacturer, model, specs"

💬 Operator Chat

"Ask the operator questions to fill remaining gaps"

🔍 Validation Hints

"This value looks wrong based on similar devices"

🧠 Gap Filling

"Based on device type, suggest likely missing values"

LLMs help fill the last gaps — there will always be some manual work.

Broader Vision

Beyond Just Control

Device templates capture more than what the control system needs.

🎛️ Control Points

I/O for real-time monitoring and control. The core use case.

⚡ EnergyAI Inputs

Setpoints, schedules, operating parameters for optimization apps.

🤖 AURA Context

Names, descriptions, locations, relationships — so AURA can reason about the facility.

📋 Equipment Specs

Manufacturer, model, tonnage, install date — for reporting and maintenance.

DeviceTemplates enforce validation across all these domains.
One source of truth for the entire platform.

Next: Brainstorm

Questions for the Team

1. Other Discovery Sources?

What else could we tap that we haven't considered?

2. Draft Data Structure?

What should the intermediate representation actually look like?

3. Conflict Resolution?

When two sources disagree, how do we handle it?

4. LLM Sweet Spots?

Where can AI add the most value with the least risk?

5. What's the MVP?

Minimum viable version we could ship and iterate on?

Draft → Validate → Commit

Flexible ingest. Unified representation. Clear visibility.

Let's brainstorm.
