Frankly Speaking 9/27/22 - Data is changing security!
Security products will need to change their value proposition.
Disclaimer: Opinions expressed are solely my own and do not express the views or opinions of my employer or any other entities with which I am affiliated.
Now that I feel pretty well adjusted to being back in my operating role, I want to start reading more about security news and what’s going on. Usually, I go to Black Hat, RSA, etc., but I’m hoping to get my news from other sources as well… Any suggestions for good Substacks/blogs/newsletters are appreciated!
LET’S BE FRANK
In the past decade, security tools have evolved quite a bit. We have gone from firewalls and anti-virus to CSPMs and EDRs. Other than the fact that we seem to use acronyms more often, these tools accomplish similar goals to their predecessors but have adapted to more “modern” architectures and technologies. For example, IDSes have largely been replaced by WAFs and SWGs to adapt to a zero-trust world. Although we have created new categories, the purpose has largely remained the same.
However, new products have been able to disrupt legacy products and cause what I call a “category transformation”: a newcomer enters an existing market, such as email security, and displaces the current products by building a better product for the same purpose. In my opinion, the main differentiator for these successful products is better data analysis. For example, EDRs were able to perform better than anti-virus because they moved away from signature-based approaches and were able to do analysis in the cloud with data aggregated across all customers to better detect attacks. Another key insight is that the availability of “unlimited” computing and storage from the creation of the public cloud has enabled this change to be successful.
Much has been said about how the public cloud has created a new category of security products, but what people don’t discuss enough is that the public cloud has also enabled a new generation of security products in many existing categories, such as email security and endpoint security, by providing the computing and storage necessary to improve core components of the product.
What’s the next trend? It’s the modern data stack. Working at a company (dbt Labs) that sits at the center of it, I believe the next generation of successful security products will take advantage of this trend.
In this newsletter, I’ll cover the following:
The modern data stack and what it means for cybersecurity
Why certain cybersecurity products are going in the wrong direction
How products can take advantage of this trend
What is the modern data stack?
I’m not going to rehash our CEO Tristan’s blog post on the modern data stack. I agree with most, if not all of it. (I do work for the company after all.) I encourage you to read it as I believe it’s well-written and provides good context in general about the data market, of which cybersecurity will be an increasingly big consumer.
I wouldn’t do it justice by summarizing it here, but there are a couple of important points:
Data warehouses have driven much of the innovation in the modern data stack as they have allowed for access to unlimited computation and storage for data analysis
Verticalized, lightweight applications built on top of business intelligence tools will emerge
How does this translate into cybersecurity?
… and what does it mean for the industry?
To start, it’s important to recognize that there’s been a major shift from prevention to detection and response. Companies have accepted that it’s increasingly difficult to prevent all attacks, so what matters is detecting issues and responding to them before they escalate.
This shift in mentality has led to tools focusing on visibility and detecting issues in real time. In order to accomplish this, security tools have had to ingest large amounts of data from various sources. For example, CSPMs ingest configuration data from cloud provider APIs. Vulnerability management products ingest data from vulnerability databases and add some of their own research to match against packages and dependencies in a company’s application.
For security teams buying these products, there is some value in ingesting the data and having it in one spot with dashboards around key metrics, but the substantial value is in the analysis of this data. Specifically, teams want to know whether any of the tools can detect actionable problems given the data.
However, providing meaningful security analysis is very difficult, much like the problems BI tools faced early on. These tools operate in data silos. They lack context from other tools, which have data on other parts of the application and infrastructure. As a result, many companies end up buying a SIEM as their security program matures because security tools in isolation are not useful without context from the other tools. However, SIEMs have historically been problematic. They have been difficult to use and have primarily been log infrastructure that allows basic queries. They are expensive and don’t have substantial functionality. Interestingly, they suffer from the same problems that led to the rise of data warehouses.
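To make the silo problem concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the two record lists stand in for exports from an EDR and a CSPM, and the field names (`host`, `alert`, `finding`) are made up rather than taken from any real product. Neither list says much on its own, but grouping them by host surfaces the one asset both tools flagged.

```python
from collections import defaultdict

# Hypothetical records from two separate tools; the field names are made up
# for illustration and do not match any real product's schema.
edr_alerts = [
    {"host": "web-01", "alert": "suspicious_process"},
    {"host": "db-02", "alert": "credential_dump"},
]
cspm_findings = [
    {"host": "web-01", "finding": "public_ingress_open"},
    {"host": "batch-07", "finding": "unencrypted_disk"},
]


def correlate(alerts, findings):
    """Group signals from both tools by host and keep hosts seen by both."""
    by_host = defaultdict(lambda: {"alerts": [], "findings": []})
    for a in alerts:
        by_host[a["host"]]["alerts"].append(a["alert"])
    for f in findings:
        by_host[f["host"]]["findings"].append(f["finding"])
    # A host flagged by both tools is more interesting than either signal alone.
    return {
        host: signals
        for host, signals in by_host.items()
        if signals["alerts"] and signals["findings"]
    }


combined = correlate(edr_alerts, cspm_findings)
print(combined)
```

In practice the hard part is not the join itself but getting every tool’s data into one place with a shared key, which is the gap SIEMs were meant to fill.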
Unfortunately, until recently, SIEMs have been the only solution. (Of course, you can always solve the problem with more people, but that’s not a sustainable approach.) Without data from all the security tools, any analysis will be riddled with inaccurate insights. The tools themselves have realized this, so there has been a push for tool “extensions” that ingest data from other sources, such as XDRs, which use EDR data as a core dataset and ingest data from other tools. The thesis is that security teams believe that endpoints are the main source of issues and attacks, so having more context around them is beneficial.
I don’t believe this is the right play for most tools (or even EDRs). Let’s look at the EDR/XDR play through the lens of the modern data stack. The EDR companies are data sources that have applications but are trying to pivot into being a data warehouse. That’s a risky play given that they have to build a data warehouse while the actual data warehouses like Snowflake and Databricks are pivoting into security use cases.
What should cybersecurity companies do?
It seems like security tools look more and more like the BI tools of the past, so I believe they will undergo a similar evolution toward the modern data stack. Security practitioners will demand more context, which will require data silos to be broken. Given the increased focus on actionable insights, security practitioners will need to develop their own “stack” to provide these insights.
This means that we will see tools similar to those in the modern data stack: data sources, ingestion, transformation, warehouses, and verticalized applications. Some of the existing players, such as the data warehouses, might move into security use cases because security data is not that different from other data.
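As a rough illustration of how those layers might fit together, here is a toy sketch in Python. It is not how any particular product works: an in-memory SQLite database stands in for the warehouse, and the event payloads, table names, and the `SEVERITY_WORDS` mapping are all hypothetical. The point is only that each layer has one job, and the “application” at the end is a narrow query over a shared table.

```python
import json
import sqlite3

# Data sources: raw events as each (hypothetical) tool might emit them.
RAW_EVENTS = [
    ("edr", json.dumps({"hostname": "WEB-01", "sev": 8, "msg": "suspicious process"})),
    ("cspm", json.dumps({"resource": "web-01", "severity": "high", "msg": "open ingress"})),
    ("cspm", json.dumps({"resource": "batch-07", "severity": "low", "msg": "missing tag"})),
]

# Assumed mapping from one tool's severity words to a shared numeric scale.
SEVERITY_WORDS = {"low": 2, "medium": 5, "high": 8}


def ingest(conn, events):
    """Ingestion: land raw payloads untouched in the warehouse."""
    conn.execute("CREATE TABLE raw_events (source TEXT, payload TEXT)")
    conn.executemany("INSERT INTO raw_events VALUES (?, ?)", events)


def transform(conn):
    """Transformation: normalize each source into one shared schema."""
    conn.execute(
        "CREATE TABLE findings (source TEXT, asset TEXT, severity INTEGER, msg TEXT)"
    )
    rows = conn.execute("SELECT source, payload FROM raw_events").fetchall()
    for source, payload in rows:
        event = json.loads(payload)
        if source == "edr":
            asset, severity = event["hostname"].lower(), event["sev"]
        else:
            asset, severity = event["resource"], SEVERITY_WORDS[event["severity"]]
        conn.execute(
            "INSERT INTO findings VALUES (?, ?, ?, ?)",
            (source, asset, severity, event["msg"]),
        )


def risky_assets(conn, min_severity=7):
    """Application: assets with severe findings from more than one source."""
    query = """
        SELECT asset, COUNT(DISTINCT source) AS sources
        FROM findings
        WHERE severity >= ?
        GROUP BY asset
        HAVING COUNT(DISTINCT source) > 1
        ORDER BY asset
    """
    return conn.execute(query, (min_severity,)).fetchall()


conn = sqlite3.connect(":memory:")  # stand-in for a real data warehouse
ingest(conn, RAW_EVENTS)
transform(conn)
result = risky_assets(conn)
print(result)
```

Keeping the raw table separate from the normalized one mirrors the usual warehouse pattern: the transformation can be rewritten and rerun without going back to the data sources.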
Now, cybersecurity products have an identity crisis. They have to decide which bucket they fall into. Are they the applications? Are they the data sources? Are they the ingestion engine?
In my opinion, most companies will need to pivot to verticalized, lightweight applications built on top of a data warehouse or a BI-like tool. Cloud providers and engineering tools, such as observability tools like Datadog, will serve as data sources. There still seems to be space for transformation, analysis, and ingestion tools. Unfortunately, cybersecurity companies have to pick a category. They cannot continue to operate across categories, for the same reason BI tools couldn’t succeed in doing it. A product that spans categories ends up duplicating functionality that already exists elsewhere in the security “stack” without adding value.
I do think this is going to happen, as companies are already exploring using data warehouses as SIEMs to consolidate data infrastructure and reduce the burden on engineering infrastructure teams. I am not sure of the time scale, but I do think this transformation will be a net positive for the industry: better and more effective data analysis will make security operations more efficient.