Making Sense of Observability Data 

Chris Satter
Senior Solutions Engineer
Jul 25, 2025
4 minutes

In today’s increasingly complex, distributed, and cloud-native environments, observability is essential for maintaining system reliability, performance, and security. Unlike traditional monitoring, which focuses on predefined failures or thresholds, observability enables teams to ask new questions about system behavior, diagnose unknown issues, and perform real-time debugging and root cause analysis. By providing deep, correlated insights across infrastructure and applications, observability empowers developers, SREs, and operators to proactively detect anomalies, reduce downtime, and continuously improve user experience.

Observability spans a wide range of telemetry data types and tools, and it can be challenging to know what to focus on. In this post, we’ll break down the three core pillars of observability — metrics, logs, and traces — and map them to practical use cases. We’ll also discuss how Edge Delta’s Telemetry Pipelines not only ingest this data, but also transform, enrich, and correlate it into high-value, actionable signals that can be forwarded to any downstream destination.

Whether you’re building from scratch or refining your current approach, this guide is designed to help you get the most out of your observability and security data at scale.

Metrics

Metrics are used for monitoring system or application behavior by tracking quantitative measurements over time. They help answer questions such as “how much,” “how often,” or “how long” and are used in dashboards, alerts, anomaly detection, trend analysis, and performance baselines.
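
To make that concrete, here is a purely hypothetical example of what a metric looks like as data: a single measurement, sampled repeatedly over time and tagged with dimensions you can slice on. The field names and values below are illustrative placeholders, not a required schema.

```yaml
# Hypothetical metric time series: p95 request latency for one service,
# sampled once per minute and tagged with dimensions for dashboards/alerts.
metric: http.server.request.duration_p95
unit: ms
datapoints:
  - {timestamp: "2025-07-25T14:30:00Z", value: 182, service: checkout-api, region: us-east-1}
  - {timestamp: "2025-07-25T14:31:00Z", value: 431, service: checkout-api, region: us-east-1}  # spike worth alerting on
```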

General Use Case Examples for Metrics

Category | Use Case | Metric Examples
System Performance | Monitor infrastructure health and performance | CPU usage, memory consumption, disk I/O, network throughput
Application Health | Detect app slowdowns, errors, or bottlenecks | Request latency (p50/p95/p99), error rate, throughput, DB query time
Capacity Planning | Forecast resource needs and avoid scaling issues | Disk usage trends, memory pressure, concurrent users, container limits
Incident Detection | Alert on performance degradation or failures | 5xx error rate spikes, latency thresholds, service unavailability
Security Monitoring | Detect anomalies and suspicious activity | Login attempts per user, failed auth rate, unusual network ports accessed
Business Analytics | Track business KPIs and user behavior | Daily active users (DAU), conversion rate, cart abandonment, revenue per region
SLA/SLO Tracking | Monitor and enforce service-level agreements | % requests < 500ms, uptime percentage, error budget burn rate
Feature Rollout | Measure the impact of new features or releases | Feature adoption rate, session duration, crash frequency post-release
User Experience (UX) | Optimize UX by monitoring real user metrics | Page load time, API latency, time to first byte (TTFB), frontend errors
Developer Productivity | Monitor CI/CD health and engineering efficiency | Build times, deployment frequency, test failure rates
Cost Optimization | Identify underused resources and optimize cloud spend | Idle VM time, low-traffic services, high egress bandwidth costs
Custom Domain Metrics | Track domain-specific KPIs relevant to your product or service | Number of trades (fintech), dropped frames (streaming), battery usage (IoT)

Real-Time Metric Enrichment with Edge Delta

Metric data often lacks the contextual signals needed to correlate it with a specific service, component, or workflow — context that’s essential for effective issue detection and resolution. With Edge Delta’s Telemetry Pipelines, teams can enrich metrics with custom tags before they’re indexed, enabling more cohesive and actionable insights within their observability platforms.

For example, imagine we’re collecting Prometheus metrics and routing them into Edge Delta:

In this scenario, we need to tag the metrics with the correct team in order to clarify who’s responsible for monitoring them downstream. Using Edge Delta Live Capture, we can swiftly build OTTL enrichment logic to assign these metrics to the observability platform team:
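
In text form, the enrichment logic amounts to a couple of OTTL statements. The sketch below uses OpenTelemetry transform-processor-style configuration to illustrate the idea; the attribute names and values are placeholders, not Edge Delta's exact UI-generated configuration.

```yaml
# Sketch of OTTL enrichment statements (OpenTelemetry transform-processor
# style). The "team" and "escalation_channel" attributes are illustrative
# placeholders for whatever ownership metadata your org uses.
metric_statements:
  - context: resource
    statements:
      - set(attributes["team"], "observability-platform")
      - set(attributes["escalation_channel"], "#obs-platform-oncall")
```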

By tagging these Prometheus metrics in real time, we ensure they’re delivered downstream in a clean, standardized, and context-rich format for the right team.

Logs

Logs are unstructured or semi-structured records of discrete events. They help you understand what happened in a system — often after the fact — in response to unexpected or undefined behaviors like application crashes, failures, or potential security incidents. In these cases, you might not yet know which metrics to track, but you expect the relevant details are being logged by the application, system, or infrastructure components.

Logs are especially useful for providing narrative context during investigations of complex, time-sensitive, or multi-step issues. They’re often enriched with metadata such as timestamps, log level (e.g., INFO, WARN, ERROR), user IDs, and session IDs — all of which help you reconstruct the sequence of events, understand severity, correlate related events across systems, and trace issues back to specific users or sessions.
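
As a purely illustrative example, a semi-structured log record carrying that kind of metadata might look like the following. The field names are hypothetical, not a required schema.

```yaml
# Hypothetical semi-structured log record; every field name here is
# illustrative only.
timestamp: "2025-07-25T14:32:08.512Z"
level: ERROR
service: checkout-api
user_id: "u-48213"
session_id: "sess-9f27c1"
message: "Payment authorization failed: upstream timeout after 3 retries"
```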

General Use Case Examples for Logs

Category | Use Case | Log Examples
Debugging | Understand application failures, crashes, or unexpected behavior | Stack traces, exception messages, debug output
Incident Response | Investigate production incidents and outages | Error logs, HTTP 500s, timeouts, service restart messages
Security Monitoring | Detect unauthorized access or suspicious activity | Failed login attempts, privilege escalations, access log anomalies
Compliance and Auditing | Prove that systems are behaving according to policy and regulation | User activity logs, data access logs, configuration changes
Performance Tuning | Identify bottlenecks or inefficiencies through patterns in log messages | Long-running queries, slow API responses, retry loops
Change Tracking | Track configuration changes, deployments, and updates | Git commit logs, CI/CD pipeline logs, infrastructure change events
Workflow Monitoring | Monitor business logic and end-to-end process flow | Order processing steps, transaction state changes, job start/complete logs
User Behavior | Analyze how users interact with systems or applications | Login logs, clickstream logs, feature usage events
Root Cause Analysis | Reconstruct what happened before and during an incident | Log traces correlated across systems, error breadcrumbs
Threat Hunting | Proactively identify malicious or abnormal activity | Lateral movement, brute-force patterns, use of unusual tools or IPs
Distributed Tracing | Supplement trace spans with context-rich logs | Logs that include trace_id, span_id, user/session metadata
Infrastructure Monitoring | Diagnose issues with servers, containers, and cloud infrastructure | Systemd logs, container stdout/stderr, service crash dumps

Intelligent Patternization with Edge Delta

Log data is essential for deep investigations and forensic analysis, but it can be difficult to understand overall log behavior by looking at a running list of individual log items. With Edge Delta’s Telemetry Pipelines, teams can generate high-level log summaries by extracting recurring patterns from log bodies in real time — providing immediate insight into trends. 
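
The exact output format is Edge Delta's own, but the sketch below gives a rough sense of the idea: several raw log lines that differ only in their variable tokens collapse into a single pattern, which can then be counted and trended over time. All values and field names here are made up for illustration.

```yaml
# Illustrative only: three raw log lines and the single pattern they collapse
# into, with variable tokens replaced by wildcards.
raw_logs:
  - "ERROR payment failed for order 8841: upstream timeout after 3 retries"
  - "ERROR payment failed for order 9102: upstream timeout after 2 retries"
  - "ERROR payment failed for order 7764: upstream timeout after 3 retries"
pattern: "ERROR payment failed for order <*>: upstream timeout after <*> retries"
count: 3
sentiment: negative
```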

To enable this functionality, simply add a Log to Pattern processor to your desired multi-processor:

Once created, patterns can be sent to any downstream destination. This includes Edge Delta’s Observability Platform, which provides an out-of-the-box view into pattern behavioral trends:

From this “Patterns” page, teams can quickly understand log behavior at a glance, and easily drill into the full-fidelity log items associated with a particular negative-sentiment pattern:

Traces

Traces are end-to-end records of individual requests or transactions as they move through a system, broken into spans that represent each operation or service hop along the way. They help us understand the “where and why.”

Each span captures key details like timing, status, metadata, and parent-child relationships — all of which come together to form a complete request flow graph.
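
For illustration, a single span might look roughly like this. The field names follow common OpenTelemetry conventions, and every value is made up; it is a sketch of span structure, not output from any particular system.

```yaml
# Hypothetical span from a checkout request trace. Field names follow common
# OpenTelemetry conventions; values are invented for illustration.
trace_id: "4bf92f3577b34da6a3ce929d0e0e4736"
span_id: "00f067aa0ba902b7"
parent_span_id: "53995c3f42cd8ad8"   # links this span to its caller
name: "SELECT orders"
kind: SPAN_KIND_CLIENT
start_time: "2025-07-25T14:32:08.512Z"
end_time: "2025-07-25T14:32:08.947Z"  # 435 ms: the slowest hop in this trace
status: OK
attributes:
  service.name: order-db
  db.system: postgresql
```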

Tracing is especially valuable in microservices, serverless, and other distributed architectures, where logs and metrics can’t provide the context needed to understand performance bottlenecks or failure points.

General Use Case Examples for Traces

Category | Use Case | Trace Examples
Distributed Request Tracking | Follow a single request as it moves across microservices or tiers | Trace ID spans covering API Gateway → Auth Service → DB → Cache → Response
Latency Bottleneck Identification | Identify where time is spent in request flows | Slow span in DB query or downstream service call stands out in the trace timeline
Root Cause Analysis | Understand which component in a chain caused an error or slowdown | Error or exception tagged in the relevant span with stack trace or status code
SLA/SLO Enforcement | Track service performance against latency or availability objectives | 99th percentile latency for a specific endpoint exceeding the threshold
Service Dependency Mapping | Visualize service-to-service communication and dependencies | Auto-generated service map from collected spans and traces
Cold Start and Initialization Monitoring | Identify slow start times in serverless or container-based apps | First spans show long delays before actual processing starts
Load Testing and Stress Analysis | Understand system behavior under load | Spans show degraded performance or queue buildup under high traffic
Debugging Intermittent Failures | Spot flaky behavior or inconsistencies in service interactions | Traces reveal patterns (e.g., only fails when Service C is involved)
Feature Rollout Observability | Compare request behavior before and after feature flag toggles | Traces from different release versions or experiments show changed span durations
Queue and Message Bus Timing | Measure time spent in async systems like Kafka, RabbitMQ, and SQS | Separate spans for message enqueue, dequeue, processing, and acknowledgment
Contextual Alerting | Enrich alerts with trace context for faster resolution | Alert links to slow or failed traces with span hierarchy and service metadata
Security and Compliance | Track data flows or access to sensitive components | Traces showing which user accessed which resource, when, and how

Tail-Based Sampling with Edge Delta

Modern systems generate incredibly high volumes of trace data due to the scale and complexity of their service-to-service interactions. 

With Edge Delta’s Gateway Pipeline deployment, teams can apply tail-based sampling to filter out low-value traces before they’re shipped downstream, reducing ingestion costs and eliminating noise during troubleshooting. 

It’s easy to configure tail-based sampling within Edge Delta’s UI — let’s walk through how it’s done using a demo pipeline running in our Playground environment.

First, we’ll enter the pipeline, click on the “Trace Tail Sampling” multi-processor node, and search for the Tail Sampling processor:

Once added, the Tail Sampling processor can be configured to sample trace data probabilistically — or by evaluating attributes like trace latency, span count, and status code:

In this example, we want to focus on analyzing traces with a high span count. We can quickly add a new span count sampling policy to filter out small traces (in this case, those with fewer than 12 spans), which reduces trace volume by 63%:
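
For readers more comfortable with configuration files than the UI, the same policy expressed in OpenTelemetry Collector-style tail sampling configuration looks roughly like the sketch below. This is an equivalent-in-spirit illustration, not the exact configuration Edge Delta generates.

```yaml
# Sketch of an OpenTelemetry Collector-style tail sampling policy that keeps
# only traces containing at least 12 spans.
processors:
  tail_sampling:
    decision_wait: 10s            # buffer spans until each trace is complete
    policies:
      - name: keep-large-traces
        type: span_count
        span_count:
          min_spans: 12           # traces with fewer than 12 spans are dropped
```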

Once the change is saved, the Gateway Pipeline immediately begins dropping low-value traces at the source, before they clog up our costly downstream monitoring and analysis platforms — helping us evaluate critical trace data to pinpoint issues more efficiently.

Enhanced Observability with Edge Delta

Observability is no longer optional — it’s essential for operating and securing modern, distributed systems. Whether you’re tracking high-level performance through metrics, investigating unexpected behavior via logs, or analyzing request flows with traces, observability gives teams the visibility and insight they need to maintain reliability, resolve incidents quickly, and deliver exceptional user experiences.

But implementing an observability strategy requires more than data collection — it demands the ability to process, correlate, and act on that data in real time. That’s where Edge Delta comes in.

Edge Delta’s Telemetry Pipelines are purpose-built to help organizations harness the full power of observability. They not only ingest logs, metrics, and traces from your environment but also transform and enrich this data at the source to generate high-value, actionable signals. This enables you to detect anomalies earlier, reduce noise, and gain deeper insights — without waiting for data to reach a backend. Whether you’re storing data in Edge Delta’s own Observability Platform or forwarding it to other tools, you maintain control over performance, cost, and data fidelity.

No matter where you are on your observability journey, Edge Delta empowers you to build a smarter, more scalable observability architecture tailored to your needs. If you’d like to experiment with Edge Delta, check out our free, interactive Playground. You can also book a live demo with a member of our technical team.
