Many of us have hobbies. Many of them are beautiful or useful to the world. Mine is not. My personal white whale is to find the perfect way to peel an orange. Years of research and experimentation have not yet led to an ideal solution, but that's also precisely why it's taught me four key principles about observability.
Number one, have a clear vision of success. I know exactly what my ideal peeled orange looks like: whole segments, spotlessly clean hands, zero pith, all achieved with simple, repeatable techniques. By contrast, observability is often playing catch-up in organizations, and a clear vision is rarely articulated. Usually, change is driven either by a contract expiring or by something in the current bubble-gum-and-tape solution finally falling apart. Observability needs a clear vision to know how to evolve. What are you capturing, and when? Who owns what's sent and what's saved? What does the ideal investigation flow look like?
Number two, a do-anything tool doesn't have to be a do-everything tool. A kitchen knife is a great tool that, with enough skill and a little luck, could theoretically peel an orange perfectly, but relying on it alone would ignore the other techniques that are more effective for some parts of the process. Similarly, observability tools today are tremendously powerful, but doing everything in one tool isn't always the best approach. Observability is an inherently distributed system, and with that decentralization come options to make changes all along the data path. There's no need to wait until the end of the process to apply a hammer to a problem that really needs a little finesse at the beginning.
Number three, crystallize tradeoffs. Every orange-peeling technique has tradeoffs: you get your hands sticky, or you break the natural segmentation of the wedges, or you guess at how deep the rind is, and on and on. Observability is the same way. Analyzing more data means putting more compute somewhere along the data path. More alerts bring more noise. Tighter-fitting, regular-expression-based configurations for alerts and filters lead to more brittle systems as code and log lines change. All of these techniques have their place, but they should be weighed against a clear vision of success so that teams don't play Whac-A-Mole, running from one friction point to another.
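To make the regex brittleness above concrete, here is a minimal sketch in Python. The service name and log lines are hypothetical; the point is only how a tightly fitting alert pattern silently stops matching after a routine change to a log message:

```python
import re

# A tightly fitting alert filter: matches one exact log shape, end to end.
ALERT_PATTERN = re.compile(r"^ERROR payment-service: timeout after (\d+)ms$")

old_line = "ERROR payment-service: timeout after 3000ms"
# After a routine refactor, the service appends a request ID to the message.
new_line = "ERROR payment-service: timeout after 3000ms (req_id=abc123)"

print(bool(ALERT_PATTERN.match(old_line)))  # True  -> alert fires
print(bool(ALERT_PATTERN.match(new_line)))  # False -> alert silently stops firing

# A looser pattern trades precision for resilience to log-line drift.
LOOSE_PATTERN = re.compile(r"ERROR payment-service: timeout")
print(bool(LOOSE_PATTERN.search(new_line)))  # True
```

Neither pattern is "right": the tight one captures the timeout value but breaks on change; the loose one survives change but matches more than intended. That is exactly the tradeoff to weigh against your vision of success.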
Number four, what's popular might not be for you. After asking 100 people how they peel an orange, I can say with confidence that some of the most common techniques are absolutely incorrect, at least for my goals as I define them. In observability, you have to ask yourself the same questions: Do our practices and tooling help us ship faster? Do we do so with more confidence? Do the right people have the ability to make changes without too many approvals? Is there a gap that most teams don't care about, but our team definitely does? The details matter, and they should all tie back to your vision.