Content Delivery Network (CDN) logs provide valuable insight into the performance and user experience of your web-based content. However, analyzing CDN logs in real time and at scale is challenging when you must ship, ingest, and index massive log volumes in a centralized observability platform before you can analyze them. As a result, it can be difficult to detect changes in system behavior and anomalous activity before they impact your users.
Edge Delta solves this problem by analyzing log data at the source as it’s created, using software agents deployed in a distributed architecture. Edge Delta then routes the outputs – insights, statistics, and aggregates – to your downstream observability platform, rather than sending every raw log all the time. Compared with using a traditional observability platform alone, this approach generates faster insights, improves performance, and dramatically reduces ingest and index volumes.
However, there are scenarios where customers are not able to deploy Edge Delta’s agents within a given compute environment. This is the case with Fastly’s CDN SaaS offering. So, how can you receive the benefits detailed above? This blog post discusses how you can analyze Fastly CDN logs in real time, as they’re created, by combining Edge Delta’s Clustered Agent architecture with a hosted endpoint (“Hosted Agent”).
Overview of Edge Delta Clustered Agents
Edge Delta’s Clustered Agent architecture serves as a plug-and-play offering when combined with a Hosted Agent, where Edge Delta provides an endpoint to perform data processing. Hosted Agents are most useful in scenarios like this, where the customer cannot deploy the agent or analyze data within their own compute environment.
When using a Hosted Agent, it is difficult to predict the volume of data that will be ingested and processed by Edge Delta. To account for potentially high data volumes, Edge Delta takes a Clustered Agent approach, consisting of multiple “worker” agents running on a Kubernetes cluster, each processing a different part of the data being generated. In this architecture, all agents are given the same configuration – the goal is for them to act as if they were one agent.
Instead of routing the data processing outputs directly to the streaming destination (e.g., your observability platform), the worker agents route everything to an Aggregator Agent. The Aggregator Agent is automatically deployed behind the scenes to coordinate the worker agents, stitching together their outputs to eliminate gaps or redundancies in analysis. From there, aggregated metrics and observability data are reported to the streaming destinations of the workflow.
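To make the stitching step concrete, here is a minimal shell sketch of merging partial results. It is not Edge Delta's implementation: the `merge_partials` function and the `<status_code> <count>` line format are hypothetical, chosen only to illustrate how per-worker counts for the same key can be combined into one total, as if a single agent had seen all the data.

```shell
#!/bin/sh
# Hypothetical illustration only, not Edge Delta's aggregator code.
# Each worker is assumed to emit partial counts as "<status_code> <count>" lines.

# Sum the counts per key across all input, then sort for stable output.
merge_partials() {
  awk '{ total[$1] += $2 } END { for (key in total) print key, total[key] }' "$@" | sort
}

# Simulate two workers that each saw a different slice of the same log stream.
{
  printf '200 40\n503 2\n'   # worker 1
  printf '200 35\n404 5\n'   # worker 2
} | merge_partials
```

Both simulated workers report status 200, and the merged output carries a single combined line for it (`200 75`) alongside the keys that only one worker saw.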
This setup is well suited to external push-based sources, such as Fastly CDN logs delivered via HTTPS, because the worker agents process the source in parallel, thereby increasing throughput. Hosted Agents also provide high availability through a FIFO queue architecture that prevents requests from being lost or dropped.
Clustered Agents in a User's Own Infrastructure
There are also scenarios in which a customer might want to use Clustered Agents but host the processing endpoint within their own compute environment, rather than having Edge Delta host it via a Hosted Agent. In this case, customers can launch the Clustered Agent architecture via a Helm deployment. The same Clustered Agent is deployed to the user’s own Kubernetes infrastructure using the Helm charts published in our public repository:
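# Add the Edge Delta chart repository and refresh the local index.
helm repo add edgedelta https://edgedelta.github.io/charts
helm repo update
# Render the chart to a manifest with the aggregator and HTTP recorder (ingress) enabled.
helm template edgedelta edgedelta/edgedelta \
  --set apiKey=****** \
  --set aggregatorProps.enabled=true \
  --set httpRecorderProps.enabled=true \
  --set httpRecorderProps.ingress.enabled=true \
  --set httpRecorderProps.ingress.host="" \
  -n edgedelta --create-namespace > ed_clustered_agent.yml
# helm template only renders manifests, so create the namespace first if it does not exist yet.
kubectl create namespace edgedelta
kubectl apply -f ed_clustered_agent.yml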
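Once the manifest is applied, a quick check along these lines can confirm that the agents came up. This is a sketch that assumes the `edgedelta` namespace used in the commands above; the `check_edgedelta` wrapper is a hypothetical helper, and the exact pod and ingress names depend on the chart version.

```shell
#!/bin/sh
# Hypothetical helper: list what the rendered manifest created in the namespace.
# Assumes kubectl is pointed at the cluster you deployed to.
check_edgedelta() {
  ns="${1:-edgedelta}"
  # The agent pods are expected to reach the Running state.
  kubectl get pods -n "$ns"
  # An ingress is expected because httpRecorderProps.ingress.enabled=true was set.
  kubectl get ingress -n "$ns"
}

# Usage: check_edgedelta            (defaults to the edgedelta namespace)
```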
How to Use Edge Delta Clustered Agent in Edge Delta Hosted Agent Architecture
To start using the Edge Delta Clustered Agent Architecture within a Hosted Agent environment, follow the steps below.
1. Sign up for or log in to the Edge Delta web application.
2. Create a Hosted Agent by navigating down to “Hosted Agents” under “Data Pipeline.” From this page, click the “Create” button.
3. Adjust your worker specs according to your data rate. Feel free to reach out if you need help tuning these specs to your needs.
4. Copy the HTTPS endpoint for your CDN provider (Fastly). This is the address to which log data will be sent.
5. Paste the copied URL into your CDN provider’s configuration as the destination for log data. For Fastly CDN, you can follow the steps on their website for setting up an HTTPS logging endpoint.
After following these steps, you’ll gain visibility into your CDN logs in real time, while also keeping no- or low-value raw data out of your streaming destinations.
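Before pointing production traffic at the endpoint, you may want to send a single test record. The sketch below rests on several assumptions: `ENDPOINT_URL` is a placeholder for the URL you copied in step 4, the JSON fields are made up for illustration (Fastly lets you define your own log format), and whether the endpoint accepts this exact payload should be confirmed against Edge Delta's documentation.

```shell
#!/bin/sh
# Hypothetical smoke test for a Hosted Agent HTTPS endpoint.

# Build one sample log record. The field names are illustrative, not a required schema.
sample_log_line() {
  printf '{"timestamp":"%s","status":%s,"url":"%s"}\n' "$1" "$2" "$3"
}

# POST the record and print only the HTTP status code returned by the endpoint.
send_test_log() {
  endpoint_url="$1"   # placeholder: the URL copied from the Hosted Agent page
  sample_log_line "2024-01-01T00:00:00Z" 200 "/index.html" |
    curl -sS -o /dev/null -w '%{http_code}\n' \
      -X POST -H 'Content-Type: application/json' \
      --data-binary @- "$endpoint_url"
}

# Usage: send_test_log "$ENDPOINT_URL"
```

A 2xx status code printed by `send_test_log` suggests the endpoint is reachable and accepted the request; any other code is a cue to recheck the URL and the logging configuration.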
Conclusion
As you’ve seen in this post, Edge Delta provides a clustered approach that increases processing throughput and delivers more granular insights, especially for high-cardinality systems like Fastly’s CDN. Even when the data contains a large number of distinct time series, users get coherent, correlated insights. We've combined this approach with our Hosted Agent environment so that it can be set up with a one-click install.
Note: If you’d like support for another CDN provider in addition to Fastly, reach out to us here.