
Resolve.ai

Horizontal AI
5 risks

Resolve.ai is positioning itself as a Series A horizontal AI infrastructure play, building foundational capabilities around knowledge graphs.

resolveai.one
Series A • GenAI: core • San Francisco, United States
$40.0M raised
15KB analyzed • 17 quotes • Updated May 1, 2026
Why This Matters Now

As agentic architectures emerge as the dominant build pattern, Resolve.ai is positioned to benefit from enterprise demand for autonomous workflow solutions. The timing aligns with broader market readiness for AI systems that can execute multi-step tasks without human intervention.

Resolve.ai provides an AI platform that automates software production operations and incident management for engineering teams.

Core Advantage

The combination of domain credibility (OpenTelemetry co‑creators and ex‑Splunk observability leadership), deep tool + telemetry + code integrations, a multi‑agent reasoning/execution layer, and a privacy‑first architecture that allows enterprises to safely grant agents operational access without data leakage or cross‑tenant model use.

Build Signals

Knowledge Graphs

4 quotes
medium

Resolve appears to maintain structured, permission-aware representations of production entities and their relationships (services, infra, runbooks, telemetry, chats). This is likely implemented as an index or graph-like knowledge store to support scoped queries, entity linking, and causal timeline construction.

What This Enables

Emerging pattern with potential to unlock new application categories.

Time Horizon: 12-24 months
Primary Risk: Limited data on long-term viability in this context.
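The permission-aware, graph-like store described above can be sketched in miniature. This is an illustrative sketch only, not Resolve's implementation; the `ProductionGraph` class, the entity names, and the team-based ACL scheme are all assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    kind: str                              # "service", "runbook", "telemetry", ...
    name: str
    acl: set = field(default_factory=set)  # teams allowed to read this node

class ProductionGraph:
    """Toy permission-aware knowledge store: entities plus typed edges."""
    def __init__(self):
        self.entities = {}   # name -> Entity
        self.edges = []      # (src, relation, dst)

    def add(self, entity):
        self.entities[entity.name] = entity

    def link(self, src, relation, dst):
        self.edges.append((src, relation, dst))

    def neighbors(self, name, team):
        """Scoped query: return linked entities the caller's team may read."""
        out = []
        for src, rel, dst in self.edges:
            if src == name and team in self.entities[dst].acl:
                out.append((rel, dst))
        return out

g = ProductionGraph()
g.add(Entity("service", "checkout", {"payments"}))
g.add(Entity("runbook", "checkout-oncall", {"payments"}))
g.add(Entity("telemetry", "checkout-latency", {"sre"}))
g.link("checkout", "documented_by", "checkout-oncall")
g.link("checkout", "monitored_by", "checkout-latency")

# A payments engineer sees the runbook but not the SRE-only telemetry stream.
visible = g.neighbors("checkout", "payments")
```

Entity linking and causal-timeline construction would sit on top of such a store; the point here is only that scoping happens at the node level, before any content is retrieved.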

Natural-Language-to-Code

3 quotes
high

They convert freeform user intent into executable artifacts (kubectl, PRs, scripts). Likely implemented with LLM-driven prompt-to-command/code generation plus environment-aware templating and safety checks to create working code and infra commands.

What This Enables

Emerging pattern with potential to unlock new application categories.

Time Horizon: 12-24 months
Primary Risk: Limited data on long-term viability in this context.

Guardrail-as-LLM

3 quotes
emerging

There are strong operational and policy guardrails described (access controls, no data mixing, no write access). However, the content does not explicitly describe a secondary LLM layer that checks or filters outputs. The guardrails appear emphasized at infra/policy level; model-level safety validators are plausible but not explicit.

What This Enables

Accelerates AI deployment in compliance-heavy industries. Creates new category of AI safety tooling.

Time Horizon: 0-12 months
Primary Risk: Adds latency and cost to inference. May become integrated into foundation model providers.
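If a model-level validator were layered on top of the infra/policy guardrails (the content does not confirm one exists), it could take a two-pass shape like this sketch; `primary_model` and `guardrail_model` are rule-based stand-ins for what would be LLM calls:

```python
def primary_model(prompt):
    # Stand-in for the main agent; proposes an action string.
    return "kubectl rollout restart deployment/checkout -n prod"

def guardrail_model(action):
    # Stand-in for a secondary validator LLM: here a rule-based stub
    # that rejects destructive verbs in the proposed action.
    risky = any(tok in action for tok in ("delete", "drop", "--force"))
    return {"allowed": not risky, "reason": "destructive verb" if risky else "ok"}

def guarded_call(prompt):
    """Every primary-model output passes through the guardrail before release."""
    action = primary_model(prompt)
    verdict = guardrail_model(action)
    if not verdict["allowed"]:
        return {"status": "blocked", "reason": verdict["reason"]}
    return {"status": "approved", "action": action}

result = guarded_call("checkout pods are crash-looping, restart them")
```

This is also where the stated primary risk becomes concrete: the second pass is an extra inference on every action, which is exactly the latency and cost it adds.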

Micro-model Meshes

4 quotes
high

Resolve describes an orchestration of specialized agents/models that handle different subtasks (investigation, telemetry parsing, remediation). This implies a router/orchestrator selecting among task-specific or fine-tuned models (an explicit micro-model mesh or ensemble of expert models).

What This Enables

Cost-effective AI deployment for mid-market. Creates opportunity for specialized model providers.

Time Horizon: 12-24 months
Primary Risk: Orchestration complexity may outweigh benefits. Larger models may absorb capabilities.
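The router/orchestrator pattern implied above can be reduced to a dispatch table. This is a sketch under assumptions: the task types, the `ROUTES` mapping, and the specialist functions (which would be fine-tuned models in practice) are invented for illustration:

```python
def telemetry_expert(task):   return f"telemetry-model handled: {task}"
def code_expert(task):        return f"code-model handled: {task}"
def remediation_expert(task): return f"remediation-model handled: {task}"

# Router: each subtask type maps to a specialized model/agent.
ROUTES = {
    "parse_metrics": telemetry_expert,
    "diff_analysis": code_expert,
    "draft_fix":     remediation_expert,
}

def route(task_type, payload):
    """Orchestrator picks the specialist for each subtask."""
    handler = ROUTES.get(task_type)
    if handler is None:
        raise KeyError(f"no specialist for {task_type}")
    return handler(payload)

out = route("parse_metrics", "p99 latency spike on checkout")
```

The stated risk maps directly onto this structure: every entry in `ROUTES` is a component to version, evaluate, and keep consistent, which is the orchestration complexity that a single larger model might absorb.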
Technical Foundation

Resolve.ai builds on Claude Sonnet 4.6 and Anthropic Opus 4.6, leveraging Anthropic infrastructure. The technical approach emphasizes a hybrid architecture.

Model Architecture
Primary Models
Anthropic Opus 4.6 • Claude Sonnet 4.6
Fine-tuning

Claims of exclusive per-customer fine-tuning/adaptation are made; no low-level technique (LoRA, delta tuning) is specified. The training signal appears to be customer-specific operational data (runbooks, chats, incidents, telemetry-derived summaries), as implied by "Builds a living model...captures tribal knowledge" and "Your Data Trains Only Your Models".

Compound AI System

Controller/orchestrator that spawns specialized agents which operate tools and query systems in parallel to test multiple hypotheses and build causal timelines; agents are tool-aware (can run kubectl, update JIRA, generate PRs).
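The parallel, hypothesis-driven structure described above can be sketched with a thread pool. The agents below are stubs returning canned findings; in a real system each would query live telemetry, infra events, or version control. All names and findings are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical agents, each testing one hypothesis against one data source.
def check_recent_deploys(incident):
    return ("deploy", True, "checkout v2.3 rolled out 5m before alert")

def check_infra_events(incident):
    return ("infra", False, "no node or network events in window")

def check_dependency_errors(incident):
    return ("dependency", False, "downstream error rates nominal")

AGENTS = [check_recent_deploys, check_infra_events, check_dependency_errors]

def investigate(incident):
    """Run hypothesis agents in parallel; keep supported findings for the timeline."""
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        results = list(pool.map(lambda agent: agent(incident), AGENTS))
    return [evidence for _, supported, evidence in results if supported]

timeline = investigate("checkout latency alert")
```

Running the hypotheses concurrently rather than sequentially is what lets the controller "build initial theories before your on-call engineer even looks," since wall-clock time is bounded by the slowest check rather than the sum.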

Inference Optimization
  • Cost-vs-quality model selection (benchmarking to pick the model and operating point)
  • Operational practices implied for latency and scale (24/7 on-call inference), but no explicit techniques such as quantization or caching are described
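Cost-vs-quality model selection reduces to a small optimization: pick the cheapest model whose benchmark score clears the quality bar for the task class. The model names, scores, and prices below are invented for illustration:

```python
# Hypothetical benchmark table: (model, quality score, $ per 1M tokens).
MODELS = [
    ("small-fast", 0.78, 0.25),
    ("mid-tier",   0.88, 1.50),
    ("frontier",   0.95, 15.00),
]

def select_model(min_quality):
    """Cheapest model clearing the quality bar for this task class."""
    eligible = [m for m in MODELS if m[1] >= min_quality]
    if not eligible:
        raise ValueError("no model meets the quality bar")
    return min(eligible, key=lambda m: m[2])[0]

choice = select_model(0.85)
```

Choosing an "operating point" per task class (rather than one model globally) is what makes 24/7 on-call inference economically viable: routine triage can run on cheap models while root-cause synthesis escalates to frontier ones.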
Team
Spiros Xanthos • Co-founder and CEO • high technical

Co-creator of OpenTelemetry; led Splunk Observability as GM; Chief Architect at Splunk Observability; long-standing focus on production systems and observability; educated at University of Illinois Urbana-Champaign

Previously: Splunk, VMware

Mayank Agarwal • Co-founder and CTO • high technical

Co-creator of OpenTelemetry; led Splunk Observability as Chief Architect; experienced in enterprise software and production tooling; educated at University of Illinois Urbana-Champaign

Previously: Splunk, VMware

Founder-Market Fit

Strong: founders' backgrounds in OpenTelemetry and Splunk Observability align closely with building AI-driven production reliability and incident response platforms at scale.

Engineering-heavy • ML expertise • Domain expertise • Hiring: AI researchers • Hiring: observability engineers / SREs • Hiring: security/compliance specialists • Hiring: backend/infrastructure engineers
Considerations
  • Limited public detail beyond founders; broader team composition not well documented
  • In-person SF-based culture may limit remote recruitment and geographic diversification
  • Public GitHub footprint appears minimal, which may reflect early-stage branding or product readiness
Business Model
Go-to-Market

content marketing

Target: enterprise

Pricing

custom

Enterprise focus
Sales Motion

hybrid

Distribution Advantages
  • Enterprise-grade security and compliance (SOC 2 Type II, GDPR, HIPAA)
  • Unified, secure gateway with granular data access (Resolve AI satellite)
  • No data ingestion and exclusive fine-tuning to prevent data leakage across customers
  • Multi-agent orchestration across code, infrastructure, telemetry, and knowledge
Customer Evidence

• Meir Amiel testimonial

• Logos: Coinbase, Zscaler, Toast (and others)

• MTTR reduction and production incident improvements

Product
Stage:beta
Differentiating Features
  • Living model of production that continuously adapts to environment and guidance
  • Data privacy controls: no data ingestion, data remains in customer environments, least privilege, no cross-customer data mixing, exclusive fine-tuning
  • Tool-agnostic operation that reduces dependence on query languages (PromQL, Kubectl) and pre-defined tool expertise
  • Secure gateway (Resolve AI satellite) enabling granular data access control and RBAC/SSO
Integrations
PromQL • Kubectl • JIRA • Git (PRs, code changes) • Observability tools • Code, infrastructure, and telemetry sources
Primary Use Case

Automated on-call incident investigation and triage with AI agents

Novel Approaches
Metadata-first selective retrieval with configurable scraping • Novelty: 7/10 • Retrieval & Knowledge

The emphasis on metadata-only access and configurable scraping frequency is distinct from many RAG implementations that default to ingesting wide swaths of data; this reduces data exposure and aligns retrieval with least-privilege principles.
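The metadata-only pattern can be sketched as a two-stage lookup: search over a metadata catalog first, and fetch full content only for the specific artifact an agent needs. The `CATALOG` contents and helper names are assumptions for illustration:

```python
# Hypothetical metadata-first store: only names/kinds/labels are indexed
# up front; full content stays at the source until explicitly requested.
CATALOG = {
    "checkout-dashboard": {"kind": "dashboard", "labels": ["latency", "checkout"]},
    "payments-runbook":   {"kind": "runbook",   "labels": ["oncall", "payments"]},
}

def search_metadata(term):
    """Lookup over metadata only -- no raw content leaves the source."""
    return [name for name, meta in CATALOG.items() if term in meta["labels"]]

def fetch_on_demand(name, fetcher):
    """Pull full content for exactly one artifact, via a scoped fetcher."""
    return fetcher(name)

hits = search_metadata("checkout")
content = fetch_on_demand(hits[0], lambda n: f"content of {n}")
```

The contrast with ingestion-first RAG is in what `CATALOG` holds: labels and pointers rather than documents, so the blast radius of the index itself is limited to metadata.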

On-prem/edge-capable satellite gateway with enterprise auth and least-privilege controls • Novelty: 7/10 • Operations & Infrastructure (LLMOps)

Packaging a configurable satellite that can either scrape metadata or act as a proxy and explicitly enforce no-write/no-persist rules is a strong enterprise pattern for regulated environments; it's a pragmatic hybrid of on-prem control with cloud-managed models.

Customer-exclusive model adaptation with minimal raw-data footprint • Novelty: 7/10 • Data Strategy

The explicit claim of exclusive fine-tuning combined with a no-raw-data policy addresses a hard enterprise tension (models that learn from customers while preserving privacy) — the implementation details are critical and noteworthy if true.

Competitive Context

Resolve.ai operates in a competitive landscape that includes PagerDuty, Datadog, Splunk (including VictorOps/On-Call capabilities).

PagerDuty

Differentiation: Resolve.ai emphasizes autonomous multi‑agent investigations that reason across code, telemetry and infrastructure and can take remediation actions (PRs, kubectl, scripts). It also markets a privacy-first architecture (no data ingestion, tenant‑specific fine‑tuning) and deeper code/system-level integrations beyond PagerDuty’s incident routing and runbook automation.

Datadog

Differentiation: Datadog focuses on metrics/traces/logs and dashboards; Resolve.ai layers agentic AI on top of observability to autonomously investigate incidents, build causal timelines linking code changes to telemetry, and operate tools. Resolve markets tighter integration with code, PR generation, and automated remediation workflows rather than primarily telemetry visualization.

Splunk (including VictorOps/On-Call capabilities)

Differentiation: Resolve.ai claims to be built specifically for autonomous production operations with multi‑agent reasoning and agentic tool execution. Founders are ex‑Splunk and OpenTelemetry co‑creators, positioning Resolve as more focused on LLM agents that act across code/infra/telemetry rather than Splunk’s broader data platform approach.

Notable Findings

Privacy-first 'satellite' gateway that can be configured in metadata-only mode or act as a secure proxy — enabling deep runtime access without centralizing raw customer data. This is a distinct engineering tradeoff (edge proxy + ephemeral context) versus SaaS ingestion-first models.

Per-customer 'living model' and exclusive fine-tuning claims: they emphasize no cross-customer models and no raw-data ingestion while still offering continuous learning from runbooks/chats/incidents. That implies on-prem or customer-specific parameter-efficient fine-tuning (LoRA/adapter-style) or encrypted/ephemeral embedding strategies rather than standard multi-tenant training.
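The adapter-style setup this implies can be illustrated with toy matrix math: a shared base weight matrix plus a per-customer low-rank delta (A x B) that never leaves the tenant. This is a conceptual sketch of the LoRA idea, not Resolve's implementation; the matrices and ranks are invented:

```python
def matmul(X, Y):
    """Plain-Python matrix multiply for the sketch."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

BASE_W = [[1.0, 0.0], [0.0, 1.0]]  # shared weights, never customer-specific

def customer_weights(A, B):
    """Effective weights = base + low-rank delta; only A and B are tenant data."""
    return add(BASE_W, matmul(A, B))

A = [[0.5], [0.0]]   # rank-1 factors learned on one customer's data
B = [[0.0, 0.2]]
W = customer_weights(A, B)
```

The privacy property falls out of the factorization: the shared `BASE_W` sees no customer data, and the tiny `A`/`B` factors can live entirely in the customer environment.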

Multi-agent orchestration focused on parallel hypothesis-driven incident investigations: an orchestration layer that spawns specialized agents (telemetry, code, infra, knowledge), coordinates async evidence collection, and synthesizes causal timelines across traces, logs, and commits.

Tool-operating agents with action gating: the system not only analyzes but generates executable remediation (kubectl, PRs, Jira updates) under strict RBAC and SSO-controlled service accounts — coupling inference with audited effectors and safety constraints.
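Action gating with audited effectors reduces to a permission check plus an append-only log in front of every tool call. The `PERMISSIONS` table, actor name, and action strings below are assumptions for illustration:

```python
AUDIT_LOG = []

# Hypothetical RBAC table: service accounts mapped to permitted actions.
PERMISSIONS = {
    "oncall-agent": {"kubectl.rollout", "jira.update"},  # no delete rights
}

def execute_action(actor, action, command):
    """Gate every effector call on RBAC and record it for audit."""
    allowed = action in PERMISSIONS.get(actor, set())
    AUDIT_LOG.append({"actor": actor, "action": action, "allowed": allowed})
    if not allowed:
        return "denied"
    return f"executed: {command}"

ok = execute_action("oncall-agent", "kubectl.rollout",
                    "kubectl rollout restart deployment/checkout")
blocked = execute_action("oncall-agent", "kubectl.delete",
                         "kubectl delete pod checkout-abc")
```

Logging denied attempts as well as approved ones is the part that makes the effector auditable rather than merely gated.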

Deep integration with observability standards and lineage: co-creators of OpenTelemetry on the team signals a likely tight coupling to instrumentation/trace semantics (e.g., deterministic mappings from spans to service ownership and code commits), which reduces brittle mapping problems other vendors face.

Risk Factors
Wrapper Risk • high severity
Feature, Not Product • medium severity
No Clear Moat • high severity
Overclaiming • high severity
What This Changes

If Resolve.ai achieves its technical roadmap, it could become foundational infrastructure for the next generation of AI applications. Success here would accelerate the timeline for downstream companies to build reliable, production-grade AI products. Failure or pivot would signal continued fragmentation in the AI tooling landscape.

Source Evidence(17 quotes)
“Multi-agent system that connects code, services, infrastructure, and telemetry. Operates tools and reasons through complex problems like your expert engineers.”
“AI for prod. It works across your code, infrastructure, telemetry, and knowledge to help engineering teams run production more reliably and efficiently.”
“On-call 24/7 AI that is always debugging, triaging and debugging incidents with full context across telemetry, infra, code, and tools.”
“Autonomously investigates incidents and builds initial theories before your on-call engineer even looks. Gets you to root cause.”
“Executes remediation actions: generates Git PRs, kubectl commands, code fixes, or scripts that work with your setup.”
“AI agents deployed into complex, domain-specific workflows don't work out of the box. Here's why Forward Deployed Engineering is the critical path to enterprise AI adoption.”