Data infrastructure that leverages human context across all models and AI modalities
When we talk about "AI for analytics," we're actually talking about two completely different problems that happen to share some vocabulary. Getting them confused is why most solutions feel wrong for at least one use case.
Exploratory mode: grad students helping experts
Constrained mode: translator for a curated view
When you're working closely with the model, you want exploration and novelty. You want discovery of new facts, which inherently comes with risk of hallucination and incorrect conclusions. In many other workflows, you just want answers.
These two modes imply different trust boundaries, not just different prompts. The exploratory agent's ceiling is human judgment. The constrained agent's ceiling is the dashboard itself.
The AI Data Stack provides four categories of infrastructure that any agent can consume. Think of these as the primitives: the things an agent needs to know to make good decisions about data, or to have its decisions audited.
The stack sits beneath the model. Different users get different experiences, but the infrastructure is the same. The model consumes the stack; the user's context determines how much of it surfaces.
Not all data is created equal. A governed metric with clear ownership and lineage is fundamentally different from a raw log table someone created for debugging. The agent needs to know the difference.
This is where most semantic layers fail. They tell you what things are, but not what combinations are valid. An agent that doesn't know composition rules will happily join a sampled table to an unsampled table and give you nonsense with confidence.
Composition rules encode things like:

- Which tables can be joined, and on which keys
- Whether a metric is additive, or whether summing it across a dimension like region produces nonsense
- Whether sampled and unsampled data can be combined at all
This is the core insight: governed metrics must compile down to inspectable SQL. Not "generate SQL probabilistically." Not "approximate with LLM reasoning." Compile: deterministically, repeatably, cacheably.
Why does this matter? Because if you can't show the SQL, you can't trust the number. And if the number can't be trusted, neither can the agent's reasoning built on top of it.
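To make "inspectable SQL" concrete, here is a minimal sketch of a governed metric bound to a SQL template, where compilation is a deterministic transformation: the same metric always yields the same SQL and the same cache key. All names here are hypothetical.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class GovernedMetric:
    """A hypothetical governed metric: a named definition bound to one SQL template."""
    name: str
    sql_template: str

# Illustrative metric; the table and column names are made up.
weekly_active_users = GovernedMetric(
    name="weekly_active_users",
    sql_template=(
        "SELECT DATE_TRUNC('week', event_time) AS week,\n"
        "       COUNT(DISTINCT user_id) AS wau\n"
        "FROM governed.events\n"
        "GROUP BY 1"
    ),
)

def compile_metric(metric: GovernedMetric) -> tuple[str, str]:
    """Deterministic compilation: the same metric always yields the same SQL and cache key."""
    sql = metric.sql_template
    cache_key = hashlib.sha256(sql.encode()).hexdigest()
    return sql, cache_key

sql, key = compile_metric(weekly_active_users)
print(sql)  # the exact SQL a human or agent can inspect
print(key)  # stable cache key, because the compilation is repeatable
```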
Every piece of data has a story: where it came from, what transformations it went through, who owns it, when it was last refreshed. Lineage makes that story machine-readable.
For exploratory agents, lineage is about explanation: the agent can tell you why it trusts (or doesn't trust) a particular number. For constrained agents, lineage is about validation: the agent can verify that its answer is grounded in the same sources as the dashboard.
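As a rough illustration of what machine-readable lineage might look like, here is a small record per asset. The field names are assumptions, not a standard schema; the point is that the story is structured data an agent can read, not prose in a wiki.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class LineageRecord:
    """Hypothetical lineage entry: enough for an agent to explain or verify a number."""
    asset: str                  # the metric or table this record describes
    sources: list[str]          # upstream tables it was derived from
    transformations: list[str]  # steps applied along the way
    owner: str                  # who is accountable for it
    last_refreshed: datetime    # when it was last rebuilt

weekly_revenue_lineage = LineageRecord(
    asset="metrics.weekly_revenue",
    sources=["raw.orders", "raw.refunds"],
    transformations=["deduplicate orders", "subtract refunds", "convert to USD"],
    owner="finance-data",
    last_refreshed=datetime(2024, 1, 8, 6, 0),
)
```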
| Tool | What It Gives the Agent | Why It Matters |
|---|---|---|
| Trust Ratings | "How much should I believe this?" | Agent can prefer governed metrics, explain when it can't |
| Composition Rules | "Can I join these two things?" | Prevents semantically invalid combinations |
| SQL Compilation | "Show me exactly what this means" | Auditability, caching, no hidden magic |
| Lineage | "Where did this come from?" | Agent can explain its reasoning, human can verify |
Same four tools, completely different usage patterns. The AI Data Stack doesn't change, but the agent's posture does.
Notice the difference: the exploratory agent went beyond the governed metrics into raw logs, and said so explicitly. The constrained agent stayed within the dashboard's scope, and stopped when it hit the boundary.
Exploratory agents can do clever things that would be reckless in constrained mode: mining old queries from experts to discover how tables join, inferring relationships from historical usage patterns, trying novel combinations to surface insights. These are features, not bugs, when an expert is reviewing the work. Constrained agents need to be much more careful. They're not discovering, they're translating.
Both agents used the same infrastructure. The difference is posture.
When agents don't have access to trust ratings, composition rules, and deterministic compilation, you get predictable failure modes. These aren't edge cases, they're the default.
| Without... | What Breaks | What It Looks Like |
|---|---|---|
| Trust Ratings | Agents can't prioritize sources | Agent treats a debug log table the same as a governed metric. Gives you "answers" from unreliable data without flagging the risk. |
| Composition Rules | Semantically invalid joins | Agent joins a 10% sampled table with unsampled data, or sums a non-additive metric across regions. The number is precise but meaningless. |
| SQL Compilation | Hidden magic, no audit trail | The semantic layer "figures out" the join path with LLM reasoning. Sometimes right, sometimes wrong. You can't tell which without running queries and eyeballing results. |
| Lineage | Unexplainable answers | Agent joins three tables and gives you a number. You ask "where did that come from?" It generates a plausible-sounding but wrong explanation. No verifiable trail. |
| Dashboard Scope | Constrained agents that drift | You want the agent to only use dashboard data, but "what's in this dashboard" isn't machine-readable. Agent hallucinates metrics that sound right but don't match the view. |
Without trust ratings, agents can't prioritize. Without composition rules, they can't avoid invalid combinations. Without deterministic compilation, you can't audit their work. Without lineage, you can't verify their reasoning. Every failure mode traces back to missing infrastructure.
The AI Data Stack isn't a product you buy. It's a set of capabilities you build into your data infrastructure. Here's what each piece looks like in practice.
Every table, column, and metric gets a trust rating stored in your metadata catalog. This isn't just documentation, it's machine-readable configuration that agents consume at query time.
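A minimal sketch of what that configuration might look like, using credit-style grades (the rollout advice below talks about getting key metrics "to AAA"). The schema and names are assumptions, not a reference to any particular catalog.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrustRating:
    """Hypothetical machine-readable trust entry an agent can read at query time."""
    asset: str   # table, column, or metric
    grade: str   # e.g. "AAA" for governed assets, "C" for debug/scratch data
    owner: str   # who vouches for it
    reason: str  # why it earned this grade

CATALOG = {
    "metrics.weekly_revenue": TrustRating(
        asset="metrics.weekly_revenue", grade="AAA",
        owner="finance-data", reason="governed metric with lineage and tests"),
    "scratch.debug_event_log": TrustRating(
        asset="scratch.debug_event_log", grade="C",
        owner="unknown", reason="ad hoc debugging table, no refresh schedule"),
}

def trust_of(asset: str) -> str:
    """Agents prefer higher-graded sources and flag lower ones."""
    rating = CATALOG.get(asset)
    return rating.grade if rating else "UNRATED"
```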
Encode which combinations are valid, and which aren't. This is where you capture tribal knowledge that currently lives in people's heads.
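One way to encode that tribal knowledge is as explicit, checkable rules. The two rules below mirror the failure modes in the table above (mixing sampled with unsampled data, summing a non-additive metric across regions); the structure and names are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TableProfile:
    name: str
    sample_rate: float  # 1.0 = unsampled, 0.1 = a 10% sample

@dataclass(frozen=True)
class MetricProfile:
    name: str
    additive: bool      # can it be summed across dimensions like region?

def can_join(a: TableProfile, b: TableProfile) -> tuple[bool, str]:
    """Composition rule: never mix sampled and unsampled data."""
    if a.sample_rate != b.sample_rate:
        return False, f"{a.name} and {b.name} have different sample rates"
    return True, "ok"

def can_sum_across(metric: MetricProfile, dimension: str) -> tuple[bool, str]:
    """Composition rule: non-additive metrics cannot be summed across a dimension."""
    if not metric.additive:
        return False, f"{metric.name} is non-additive; summing across {dimension} is invalid"
    return True, "ok"
```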
This is the heart of it. A governed metric is a high-level representation that compiles to SQL. Not probabilistically generated, but deterministically derived from the definition.
The compilation step is where you enforce composition rules, inject trust ratings, and produce auditable output. If the metric can't compile cleanly, the agent knows something is wrong before it runs anything.
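A sketch of that compile step, under the same assumptions as above: run composition checks first, and only emit SQL (plus a cache key) if every check passes. If a check fails, the error surfaces before anything touches the warehouse.

```python
import hashlib
from typing import Callable, Optional

def compile_governed_metric(name: str, sql: str,
                            checks: list[Callable[[], Optional[str]]]) -> tuple[str, str]:
    """Run composition checks, then emit inspectable SQL plus a cache key.

    Each check returns None when the composition is valid, or an error message when it isn't.
    Failures surface here, before anything runs against the warehouse."""
    errors = [msg for check in checks if (msg := check()) is not None]
    if errors:
        raise ValueError(f"{name} failed to compile: " + "; ".join(errors))
    return sql, hashlib.sha256(sql.encode()).hexdigest()

# Illustrative check: reject joining a 10% sample to unsampled data.
def sampled_join_check() -> Optional[str]:
    left_sample_rate, right_sample_rate = 0.1, 1.0
    if left_sample_rate != right_sample_rate:
        return "joins a sampled table to an unsampled table"
    return None

try:
    compile_governed_metric("revenue_by_region", "SELECT ...", [sampled_join_check])
except ValueError as err:
    print(err)  # the agent sees this instead of a precise-but-meaningless number
```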
Finally, expose all of this through an interface agents can consume: look up a trust rating, validate a proposed composition, compile a governed metric to SQL, and fetch the lineage behind any result.
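A minimal sketch of that contract as a Python protocol. The method names and shapes are assumptions, not any particular product's API; the point is that all four primitives sit behind one machine-readable surface.

```python
from typing import Protocol

class AIDataStack(Protocol):
    """The contract an agent consumes; an implementation sits on top of the warehouse."""

    def trust_rating(self, asset: str) -> str:
        """Return a machine-readable trust grade (e.g. "AAA") for a table, column, or metric."""
        ...

    def validate_composition(self, assets: list[str]) -> list[str]:
        """Return rule violations for a proposed combination; empty means it is valid."""
        ...

    def compile_metric(self, metric: str, filters: dict[str, str]) -> str:
        """Deterministically compile a governed metric to inspectable SQL."""
        ...

    def lineage(self, asset: str) -> dict:
        """Return sources, transformations, owner, and freshness for an asset."""
        ...
```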
The agent doesn't need to understand your data warehouse. It needs to understand the contracts your data warehouse exposes. That's what the AI Data Stack provides.
You don't need to catalog everything on day one. Start with your most-used metrics, the ones that show up in executive dashboards. Get those to AAA. Add composition rules as you discover invalid joins in the wild. The stack grows organically from the metrics that matter most.