Edge Delta Benchmarks
Open, neutral benchmarks for AI incident reasoning and observability pipeline performance. Every model gets the same data and the same tools. We measure the reasoning.

Blast Radius Bench
62% top pass rate
Can AI reconstruct the failure chain?
When five services are on fire at once, can a model separate the root cause from the blast radius and recover the directed path the failure took?
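To make "root cause versus blast radius" concrete, here is a minimal sketch. It is not the benchmark's scoring code. It assumes a hypothetical list of `(upstream, downstream)` propagation edges and made-up service names, treats any failing service with no failing upstream as a root cause, and walks the edges breadth-first to recover the order in which the failure spread.

```python
from collections import deque


def split_root_and_blast_radius(edges, failing):
    """Split failing services into root causes and blast radius.

    edges:   (upstream, downstream) pairs; a failure in `upstream`
             is assumed to propagate to `downstream` (illustrative only).
    failing: the services currently on fire.
    """
    failing = set(failing)
    downstream = {service: [] for service in failing}
    has_failing_upstream = set()
    for up, down in edges:
        # Only edges between two failing services matter here.
        if up in failing and down in failing:
            downstream[up].append(down)
            has_failing_upstream.add(down)

    # A root cause is a failing service that no other failing service feeds.
    roots = sorted(failing - has_failing_upstream)

    # Breadth-first walk from the roots gives one propagation order.
    order, seen, queue = [], set(roots), deque(roots)
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in downstream[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    blast_radius = [s for s in order if s not in roots]
    return roots, blast_radius


# Hypothetical five-service incident.
edges = [("db", "auth"), ("auth", "api"), ("auth", "checkout"), ("api", "web")]
failing = ["web", "api", "checkout", "auth", "db"]
roots, blast_radius = split_root_and_blast_radius(edges, failing)
print(roots)         # ['db']
print(blast_radius)  # ['auth', 'api', 'checkout', 'web']
```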

Noise Bench
88% top pass rate
Can AI tell a real incident from alert noise?
It's 2am and twenty alerts just fired. A few are real, most are noise. Can a model decide who to wake up, catching every real incident without drowning in the flaps?
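The trade-off in that question maps onto recall (did every real incident get paged?) and precision (how many pages were real?). The sketch below shows that framing under stated assumptions: the alert IDs, the helper name `triage_scores`, and the choice of metrics are illustrative and not taken from the benchmark itself.

```python
def triage_scores(paged, real_incidents):
    """Return (recall, precision) for one triage decision.

    paged:          alert IDs the model chose to page on.
    real_incidents: alert IDs that were genuine incidents.
    """
    paged = set(paged)
    real = set(real_incidents)
    true_positives = len(paged & real)
    # Recall: share of real incidents that were caught.
    recall = true_positives / len(real) if real else 1.0
    # Precision: share of pages that were worth waking someone for.
    precision = true_positives / len(paged) if paged else 1.0
    return recall, precision


# Hypothetical night: twenty alerts, three of them real.
alerts = [f"alert-{i}" for i in range(1, 21)]
real = {"alert-3", "alert-7", "alert-15"}
paged = {"alert-3", "alert-7", "alert-15", "alert-9"}  # one false page

recall, precision = triage_scores(paged, real)
print(f"recall={recall:.2f} precision={precision:.2f}")
# recall=1.00 precision=0.75
```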

RCA Bench
99% top pass rate
Can AI find the commit that broke prod?
RCA Bench drops a frontier model into a frozen incident and asks the only question that matters at 3am: which commit caused this?

Pipeline Performance Bench
4.6x faster
Can a pipeline handle petabytes of data?
HTTP log ingestion throughput compared across Edge Delta, Cribl, the OpenTelemetry Collector, and Fluentd under identical conditions.