Compare Scalability, Cost, and Performance
There has been no shortage of self-proclaimed "Splunk Killers" and log analytics products throughout the years as hype and buzzwords get thrown about like candy at a parade. We know... we personally experienced this problem. Unlike candy, however, these offerings left a rotten taste in our mouths. If you're in the market for a log management platform and you're evaluating Gravwell, or any other tool, there are some crucial factors to consider. In this post we'll go through 5 important questions to ask that can help identify whether a solution may be a sweet fit.
1. Are your data requirements increasing but not your budget?
It's a surprise to no one that log volumes are growing exponentially and that legacy tools were not designed to handle the data requirements of today. That is especially true of ephemeral cloud services and Kubernetes, where logs are often the only way to actually see what's going on in these systems and environments. No organization should be forced to drop data on the floor because of budgetary constraints.
Gravwell makes it possible to log everything. Our customers laud our clear total cost of ownership and reasonable pricing. They collect more data and spend less! We can provide this raw value through our purpose-built data lake. Our secret sauce storage system reduces on-disk requirements and increases search performance (we're happy to dive into the weeds if you'd like to chat about it). Combine that with our always unlimited pricing model and you've got a truly unique solution in the space.
2. Does your team need the ability to ask questions and go "off rails" from pre-fabbed search?
This question is a quick way to separate tools that do great in a canned demo from tools that will scale and mold to your unique organizational needs. Many solutions in this space start out solving a specific use case such as "I need to analyze logs from XYZ." They'll choose a database backend with little thought given to scalability or maintenance (see the next question) and then add more use cases one at a time. There is a huge difference between this approach and creating a more extensible data science solution.
Gravwell's backend storage is our data lake, which is purpose-built to handle data of any kind and enable use cases our engineers had never even heard of, let alone built some canned demo for. That's not to say that common use cases aren't covered. Gravwell does have Kits, which contain pre-built dashboards and tools to analyze common data sources like NetFlow or Zeek. The real power of Gravwell, however, is our pipeline query system, which enables analysts to apply cutting-edge techniques to extract the insights they need.
One Gravwell customer noticed unusual DNS activity: large, high-entropy DNS requests, a key indicator of possible data exfiltration. Because the Gravwell search system is extensible and pipeline-driven, they were able to extract the DNS query payload, decode the value, and pipe that directly into a packet dissection module. The result was the ability to analyze TCP traffic obfuscated over DNS just as if the DNS was never present. The DNS tunneling exfiltration turned out to be a red-team activity, and being able to detect it and provide detailed analysis got the defenders huge kudos from the pentesters.
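To make "large high-entropy DNS requests" concrete, here is a minimal Python sketch of how such a check might be scored outside of any particular product. It is not Gravwell's implementation or query syntax; the function names and both thresholds (`min_length`, `min_entropy`) are hypothetical values chosen for illustration and would need tuning against real traffic.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_tunneling(query_name: str,
                         min_length: int = 40,
                         min_entropy: float = 3.5) -> bool:
    """Flag query names that are both long and high-entropy.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    # Drop the label separators so the dots don't skew the character mix.
    payload = query_name.replace(".", "").lower()
    return len(payload) >= min_length and shannon_entropy(payload) >= min_entropy
```

Ordinary hostnames are short and reuse a small set of characters, so they tend to score low on both measures, while encoded payloads stuffed into query names tend to be long and close to uniformly distributed. A check like this only surfaces candidates; decoding the payload and dissecting the traffic, as the customer did, is what confirms it.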
3. Is your team comfortable maintaining solutions that require care and feeding?
Unfortunately, it's not at all uncommon for solutions in this space to be nothing more than a few minor cosmetic changes in front of some glued-together open-source tools. Elasticsearch is a fantastic project, but it's just not built for unstructured time-series data and notoriously requires significant care and feeding to keep operational when used in this manner.
Secondly, scaling software is a very difficult problem. Multiple vendors in this space originally supported multi-cloud and on-prem capabilities but switched to being a managed-only solution specifically because of this issue. At Gravwell, we are very blessed to have a fantastic team of engineers with rich backgrounds in supercomputing and emulytics. These folks know what it takes to scale a system. A minor technical slowdown when ingesting 100GB/day becomes destructive at 10TB/day. When scale of storage and search isn't considered from line 0 of product development, there is massive performance risk. This risk can manifest in non-obvious ways that show up in product restrictions like supporting only one cloud provider, weird pricing models, or not allowing customers to gain control over their own data.
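The jump from 100GB/day to 10TB/day can be put into rough numbers. The sketch below is back-of-the-envelope arithmetic only: the average entry size (500 bytes) and the per-entry overhead (50 microseconds) are made-up assumptions for illustration, not measurements of Gravwell or any other product.

```python
SECONDS_PER_DAY = 86_400


def overhead_seconds_per_day(bytes_per_day: float,
                             avg_entry_bytes: float,
                             overhead_us_per_entry: float) -> float:
    """CPU seconds per day spent on a fixed per-entry overhead."""
    entries_per_day = bytes_per_day / avg_entry_bytes
    return entries_per_day * overhead_us_per_entry / 1_000_000


# Hypothetical inputs: 500-byte entries, 50 microseconds of extra work each.
small = overhead_seconds_per_day(100 * 10**9, 500, 50)   # 100GB/day
large = overhead_seconds_per_day(10 * 10**12, 500, 50)   # 10TB/day

print(f"100GB/day: {small / 3600:.1f} CPU-hours of overhead per day")
print(f"10TB/day:  {large / SECONDS_PER_DAY:.1f} CPU-days of overhead per day")
```

Under these assumed numbers the same overhead costs under three CPU-hours per day at the lower volume and more than eleven CPU-days per day at the higher one, meaning a dozen cores would be kept busy on the overhead alone. The inputs are invented, but the hundredfold multiplier is the point.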
Don't be afraid to demand performance benchmark testing when evaluating data analytics tools.
At Gravwell, we take pride in our unparalleled ease of setup and "near-zero" maintenance requirements. A new release upgrade usually requires nothing more than `apt upgrade gravwell` and even highly complicated large enterprise environments ingesting over 100TB/day are straightforward to architect and maintain.
4. Do you need the freedom to bring data insights to other business units?
The fact is, you can only get insights from data that you actually collect. One very common gripe we hear from organizations about log analytics is that there is a lot of infighting.
At the last conference I attended, I spoke with 3 people at an organization - one from DevOps, one from QA, and one from Security. In our discussion, I asked about log collection and how that works at their company. Though these people were clearly friends at the organization and the banter was jovial, there was real frustration in how they ribbed each other about who got to put data in Splunk. The QA department was frustrated that the Security team didn't let them put what they wanted into their log analytics tool. The DevOps team was upset because they didn't have access to view the data they needed to do their jobs.
It's important for solutions in this space to have the ability to support multiple business units within an organization, which means ingesting lots of data types from lots of places as well as supporting appropriate permissions requirements to make sure users can do their jobs without additional security risk.
We knew this was necessary when building Gravwell, and that's why we insisted on an unlimited pricing model, data permissions capabilities that support simple or complex requirements, and architecture that handles multiple business units appropriately.
5. Do you have a dynamic IT environment with multi-cloud or on-prem needs?
There are a number of solutions that will provide cloud logging but lock you into a specific provider. This causes major problems for any organization with a presence across providers. Sending data generated in AWS over to GCP for analysis comes with a hefty data transmission bill and more opportunities for latency and failure.
For those solutions that are actually cloud agnostic, it usually means you have a separate instance per cloud provider. The result is a swivel-chair workflow where analysts have to log into multiple systems, which defeats the purpose of centralized event management in the first place.
Finally, there has been immense neglect of on-prem in the collective push to cloud computing. The fact is, a large number of organizations still have significant on-prem requirements. Further, many things (like cyberphysical systems where 1s and 0s create kinetic activity) can never be put in the cloud because of their very nature. Organizations shouldn't be shamed for being who they are. It's possible, if not likely, that the pendulum will swing back from thin clients, where cloud providers own everything and you are the product, to a world where organizations control their own destiny.
Gravwell excels in this area. We have manufacturing and power companies who rely on Gravwell to gain observability into their on-prem and multi-cloud infrastructure. Going even further, Gravwell runs great even in isolated environments without any internet access. This planning is what enables us to do so well in multi-cloud environments where Gravwell can sit near the point of data generation and provide a federated approach to analytics, which I presented during this Security Weekly episode:
Collecting & Analyzing Logs in Hybrid Cloud Environments - Gravwell webcast
Gravwell helps customers to "own their orbit" and puts control back in their hands. These questions are helpful for evaluating any solution against another, but if you'd like to get further answers about Gravwell or see it in action, schedule a demo to talk to an engineer.
Co-Founder of Gravwell