In August 2023, Datadog released Flex Logs – a lower-cost, warm storage tier for log data. Datadog encourages customers to use the Flex tier for high-volume datasets that are queried infrequently and do not require real-time results.
The addition of Flex Logs gives Datadog Log Management customers three tiers to store data:
|Tier|Pricing|Intended use|
|---|---|---|
|Standard Tier|$0.10 per GB ingested + $2.55 per million log events per month|Hot storage intended to support real-time use cases, including dashboarding, alerting, and troubleshooting.|
|Flex Tier|$0.10 per GB ingested + $0.05 per million log events per month + undisclosed query costs|Warm storage intended to support historical, less urgent investigations.|
|Archive Tier|$0.10 per GB ingested|Cold storage intended to support long-term retention for compliance or other use cases*.|
* To analyze data residing in the Archive Tier, you must “rehydrate” into the Standard Tier and pay accordingly.
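To see how far apart the list prices quoted above are, here is a minimal Python sketch. The function name `monthly_cost` and the workload figures (1,000 GB and 500 million events) are hypothetical, chosen only for illustration. Flex query costs are undisclosed, so the sketch leaves them out and the Flex figure is a lower bound.

```python
# List prices quoted above (USD). Flex query/compute costs are undisclosed,
# so they are deliberately excluded from this estimate.
INGEST_PER_GB = 0.10
EVENTS_PER_MILLION = {"standard": 2.55, "flex": 0.05, "archive": 0.0}


def monthly_cost(tier: str, gb_ingested: float, million_events: float) -> float:
    """Return ingestion cost plus per-event cost for one month in a tier."""
    ingest = gb_ingested * INGEST_PER_GB
    events = million_events * EVENTS_PER_MILLION[tier]
    return round(ingest + events, 2)


# Hypothetical workload: 1,000 GB and 500 million log events in one month.
for tier in EVENTS_PER_MILLION:
    print(tier, monthly_cost(tier, 1000, 500))
```

For this made-up workload, the result is $1,375 for Standard, $125 for Flex before query costs, and $100 for Archive before any rehydration.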
We applaud Datadog for providing a lower-cost option to support massive volumes of data. However, as you evaluate the offering, we want to draw your attention to two significant drawbacks of making Flex Logs a part of your observability strategy. In this post, we’ll walk through both.
The Shortcomings of Datadog Flex Logs
#1: Query Performance vs. Cost Tradeoff
Datadog can put a $0.05-per-million-event sticker price on Flex Logs because it decouples compute costs from storage and retention costs. At the time of writing, Datadog has not published its compute pricing. It has, however, disclosed that multiple levels of compute are available for querying logs in Flex indexes.
With the offering, there is an implicit tradeoff: sacrifice performance to reduce storage costs.
In other words, storing your data in the Flex tier is cheaper, but queries will take longer to run and return results than they would against the standard index. Each incremental improvement in query performance comes at the expense of those cost savings.
This trade-off requires you to know how you (and everyone in your organization) will use the data after you ingest it. As we’ll explore in the next shortcoming, this is a complex question to answer.
#2: What Data Goes in What Tier?
To realize the value of Flex Logs, you must first choose which logs belong in each storage tier: Standard (hot) vs. Flex (warm) vs. Archive (cold). And to understand this, you must be intimately familiar with the data you’re ingesting into Datadog’s log management solution. That means answering questions like:
What’s the overall shape, composition, and volume of the data ingested daily?
How valuable is the data? Is it redundant and noisy, critical to our observability practice, or somewhere in between?
How does my team use the data? For populating dashboards, running ad hoc queries, and more? Or do we only need it for compliance? What about other teams that may use the same data for different purposes?
How frequently do users query the data? With what urgency do they need results returned when querying a given dataset?
Datadog provides no in-application assistance to help you answer these questions. The burden of figuring out which logs to route to which tier falls solely on the customer (you). That is, unless you are willing to pay an additional cost for Premier Support Services.
Most observability teams don’t know this information off the top of their heads, nor do they have the spare cycles to figure it out.
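To make the questions above concrete, here is a rough Python sketch of a routing rule you might write once you have the answers. Everything in it is an assumption made for illustration: the `Dataset` fields, the rule in `suggest_tier`, and the example datasets are hypothetical, not Datadog features or Datadog guidance.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical summary of how one log dataset is used."""
    name: str
    queries_per_week: int   # how often anyone queries this data
    needs_realtime: bool    # dashboards or alerts depend on it
    compliance_only: bool   # retained purely for audits


def suggest_tier(ds: Dataset) -> str:
    """Illustrative rule of thumb mapping usage to a storage tier."""
    if ds.needs_realtime:
        return "standard"
    if ds.compliance_only and ds.queries_per_week == 0:
        return "archive"
    return "flex"


# Made-up examples; real values would have to come from your own usage data.
examples = [
    Dataset("checkout-errors", queries_per_week=40, needs_realtime=True, compliance_only=False),
    Dataset("vpc-flow", queries_per_week=2, needs_realtime=False, compliance_only=False),
    Dataset("audit-trail", queries_per_week=0, needs_realtime=False, compliance_only=True),
]
for ds in examples:
    print(ds.name, "->", suggest_tier(ds))
```

The rule itself is trivial. The hard part is filling in those fields accurately for every dataset and every team that touches it.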
The Cost of Getting Flex Logs Wrong
It’s important to understand these shortcomings because there is a cost to “getting it wrong.”
Datadog’s marketing materials provide examples of best-fit datasets: security, transaction, and network logs. These examples are broad buckets such as “application logs” and “security logs.” When we dive deeper into these buckets, the guidance breaks down.
Let’s take application logs, for example. Application logs can contain a wealth of information about different use cases:
Information about how the application is performing
Information to facilitate troubleshooting
Information relating to fraud, such as a user doing something malicious in your app
Information pertaining to a security breach
Which bullets above should reside in the Standard tier, and which are security logs intended for the Flex tier? It isn’t clear. Datadog leaves it to you to decide whether to pay a premium for the Standard tier or push the data to the Flex tier.
Moreover, when you do need to query your Flex tier logs, will slower query performance suffice for your investigation?
If you make the wrong call, it may cost you during a breach or incident. And even when you make the “right call,” there is still risk. Let’s say you retain a portion of your logs for an extra 90 days in the warm Flex tier in case you need them, and you get breached 180 days later. By then, those logs have aged out of the Flex tier, so you’d still need to rehydrate the data. And most attackers loiter for over 90 days.
Conclusion: There Are Better Ways to Reduce Logging Costs
When adding new storage tiers (or any feature for that matter), it’s imperative to provide guidance and help your customers be successful. Since Datadog provides no in-application assistance, it’s on the customer to determine how to succeed.
Given the time and effort it takes to understand your data and Datadog usage patterns at this level, a feature like Flex Logs is unlikely to add value unless you have a pressing overage problem. (Even in that case, the vendor ought to help you figure out why you’re experiencing so many overages and what data you can exclude from the platform.)
Edge Delta Log Search & Analytics provides one fair price point to help you avoid these questions. For $0.20 per GB, you can store all your data in a hot storage target suitable for any use case. After 30 days, retention is $0.03 per GB/month.
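As a quick illustration of that pricing, here is a short Python sketch. It assumes the $0.20 is charged per GB ingested and covers the first 30 days, with $0.03 per GB added for each further month. The function name `edge_delta_cost` and the 1,000 GB workload are hypothetical.

```python
# Prices quoted above (USD).
PRICE_PER_GB = 0.20             # assumed to be charged per GB ingested
RETENTION_PER_GB_MONTH = 0.03   # assumed to apply per month beyond the first 30 days


def edge_delta_cost(gb: float, extra_retention_months: int = 0) -> float:
    """Return the base price plus optional extended-retention cost."""
    base = gb * PRICE_PER_GB
    retention = gb * RETENTION_PER_GB_MONTH * extra_retention_months
    return round(base + retention, 2)


# Hypothetical workload: 1,000 GB, first with default retention,
# then kept for two additional months.
print(edge_delta_cost(1000))
print(edge_delta_cost(1000, extra_retention_months=2))
```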
Learn more about Log Search & Analytics by signing up for a demo here.