Frankly Speaking 7/26/22 - Let's get rid of security operation centers (SOCs)!
Disclaimer: Opinions expressed are solely my own and do not express the views or opinions of my employer or any other entities with which I am affiliated.
Every week, I wonder if I will have an interesting topic to write on, but throughout the week, some interesting security engineering topic comes up, and I feel the need to collect my thoughts. I started this Substack because, throughout my career, writing has allowed me to refine my opinions and be more coherent, especially in critical strategic decisions. So, with any topic where I am asked to share my opinion, I find it extremely helpful to write it out. It’s a good exercise for formulating more coherent and precise opinions, and it serves as good documentation of what I considered when I made a decision at the time.
Anyway, enough of my advocating for writing. The fact that I’m able to come up with an interesting topic every week attests to the fact that security and the concept of security engineering are evolving almost as rapidly as software development principles. There’s a lot of change, and we haven’t definitively converged on best practices for everything. Even when we have, some change in tech will force security engineering to rethink its approach. To be clear, I feel like security engineering should be enabling rather than restricting software development efforts. In an ideal world, software engineers should actively seek security engineers’ input.
Enough of that, moving on to “Let’s Be Frank.”
LET’S BE FRANK
I’ve talked about how datacenter security is dead. I do think one aspect that should die with it is the notion of security operations centers (SOCs). Sorry, old-school CISOs and RSA, with your fancy SOC display every year at RSAC. No one cares about those anymore, even though they look cool. Unfortunately, coolness doesn’t usually result in practical solutions.
I’ve mentioned in the past how I believe incidents should be handled in the cloud, and how traditional SOCs are going to disappear. As I spend more time as both a security practitioner and an employee at a company in the center of the modern data stack (yes, a plug for dbt Labs!), I’ve gotten more clarity on what the next evolution of SOCs/incident response will look like.
In this newsletter, I will talk about the following:
Summarize and refine my previous thoughts on SOCs and incident response
Why SOCs are disappearing
How we should be handling incident response in the cloud
How modern tech companies should model their incident response and properly resource them (especially in a world where good security practitioners are hard to find)
To clarify, I am advocating for the elimination of SOCs where analysts sit in a room with a lot of screens and stare at dashboards and talk to each other randomly about weird anomalies in the dashboard. I am not advocating for the elimination of incident response. I still believe that we need incident detection and response (IR) teams, but those teams will not need or want a SOC.
SOCs are going to disappear!
Let’s level set. If you’re an organization in the cloud, there’s no real reason to have a SOC. In the past, I’ve written this about traditional SOCs disappearing:
Traditional SOCs are going away. Centralized security operation centers (SOCs) where analysts sit in one place with various dashboards will go away in the cloud world. This model assumed that IR needed to monitor and detect malicious operations, but now DevOps has automated most of the operations. IR teams no longer need to, and no longer can, respond to incidents by themselves; they need to coordinate with DevOps instead. I believe we will see more “decentralized SOCs” that will triage events to DevOps. How will it work or look exactly? I’m not sure.
I still believe most of that, and I am more convinced that the way IR teams respond to potential incidents will trend closer and closer to the way DevOps and product teams respond to incidents. DevOps and product teams don’t have a dedicated operations center, and they probably never will. However, at scale, DevOps/SRE is responsible for managing and triaging these incidents, and I imagine in the future IR teams will behave similarly.
Moreover, to emphasize why SOCs are disappearing, there is no reason for analysts to sit in a single room and monitor dashboards. As with DevOps and software development, there should be monitors in place that alert when something seems anomalous and route the alert to the appropriate team or person in charge. Having 24/7/365 SOCs feels outdated and inefficient.
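To make the "alert and route" idea a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the Alert fields, the team names in OWNERS, and the rule that high-severity alerts also go to IR are hypothetical, not a description of any particular tool or of how any company actually does this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    source: str    # system that raised the alert, e.g. "iam" (hypothetical)
    severity: str  # "low", "medium", or "high"
    summary: str


# Hypothetical ownership map: which team owns alerts from which system.
OWNERS = {
    "iam": "identity-team",
    "ci": "devops-oncall",
    "app": "product-oncall",
}
DEFAULT_OWNER = "security-ir"


def route(alert: Alert) -> list[str]:
    """Return the teams that should be notified about an alert."""
    owner = OWNERS.get(alert.source, DEFAULT_OWNER)
    recipients = [owner]
    # Assumed policy: IR is looped in on anything high severity.
    if alert.severity == "high" and owner != DEFAULT_OWNER:
        recipients.append(DEFAULT_OWNER)
    return recipients


if __name__ == "__main__":
    print(route(Alert("ci", "low", "unusual deploy time")))
    print(route(Alert("iam", "high", "admin role granted")))
```

The point of the sketch is that nobody has to watch a screen: the owning team gets the alert first, and IR only gets pulled in when the (assumed) policy says so.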
So where do we go from here?
It’s clear that we still need some form of IR. How do organizations properly structure this function? There’s definitely going to be a shift in tooling, away from tools that support SOCs and toward tools that support IR. I’ve always believed this, because IR teams are the ones picking up on issues that have slipped through security tools.
Given the number of legacy processes and marketing in this area, it’s tempting to convert or make engineering adjustments to the current processes. However, that can sometimes create even more inefficiencies and cause frustrations around ownership. In my mind, the easiest way to start thinking about how to deal with security issues and incidents is from first principles.
What is the crux of the problem? We will continue to have a series of security tools and other engineering tools with audit logs that will give us visibility into important functions in the organization, e.g. access control, critical actions, etc.
In some less mature organizations, a security analyst regularly goes through these audit logs and monitors for anomalous activities, or uses some tool built on top of them. In some larger organizations, there are teams whose job is to implement and maintain these specific tools, which works if your organization has very strict security needs and policies. I don’t see a strong need for this outside of the Fortune 500 and/or until a solid security engineering process is established.
Another way to do this is by aggregating them in a SIEM or log infrastructure. This way, the security team can have all the logs in one spot and can properly analyze potential signals that represent an attack from the MITRE ATT&CK framework. But wait… this sounds like a familiar problem.
It sounds like the business analytics problem! Businesses have data in various sources, aggregate it into a data warehouse, and then plug the refined data into an analytics tool. Whoa!
Why now?
Why do I think security engineering and IR might undergo this transformation? Security engineering always lags a little bit in the adoption curve of new stacks and processes because well… security wants to see whether they are a good idea before implementing changes. In general, an organization would want security not to be the most innovative part of the organization.
With that said, there’s been more consensus on what the modern data stack will look like. Benn has a great Substack article dedicated to this topic.
A quick digression for additional context. For those unfamiliar with the rise of the modern data stack, many attribute it to the rise of data warehouses like Snowflake, Databricks, and BigQuery. Similar to the cloud, companies no longer have to run their own data infrastructure and now essentially have access to unlimited storage and compute to perform analytics. This has allowed ETL (extract, transform, load) to become ELT (extract, load, transform). Now, we can load all our data into one spot and transform it there, rather than spending all our time transforming the data to make sure we can properly load it into our data infrastructure. dbt Labs (my employer) represents the T in ELT.
The future now becomes clear to me. IR will be using an analog of the modern data stack to detect security events. Many modern tech companies already heavily rely on these analytics and data stacks to run their business, so they have expertise in building out these types of systems and can apply them to security. Similarly, even legacy companies can fuel this transformation because many are moving toward the modern data stack, so they can apply many of their learnings and/or re-use them for security.
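For readers who haven't seen ELT in practice, here is a minimal Python sketch of the ordering, using the standard library's sqlite3 as a stand-in for a real warehouse. The table and view names (raw_audit_logs, stg_audit_logs), the column layout, and the sample events are all hypothetical; a real setup would use an actual warehouse and a transformation tool rather than this toy.

```python
import sqlite3

# Hypothetical raw audit events, kept exactly as the source emitted them
# (inconsistent casing, stray whitespace and all).
RAW_EVENTS = [
    ("  Alice@Example.com", "LOGIN", "Failure", "2022-07-26T10:00:00Z"),
    ("alice@example.com ", "login", "FAILURE", "2022-07-26T10:00:05Z"),
    ("bob@example.com", "Login", "success", "2022-07-26T10:01:00Z"),
]


def load_raw(conn, events):
    """E and L: land the events untouched in a raw table."""
    conn.execute(
        "CREATE TABLE raw_audit_logs (actor TEXT, action TEXT, outcome TEXT, ts TEXT)"
    )
    conn.executemany("INSERT INTO raw_audit_logs VALUES (?, ?, ?, ?)", events)


def transform(conn):
    """T: clean the data inside the 'warehouse', after it has been loaded."""
    conn.execute(
        """
        CREATE VIEW stg_audit_logs AS
        SELECT lower(trim(actor))          AS actor,
               lower(action)               AS action,
               lower(outcome) = 'failure'  AS failed,
               ts
        FROM raw_audit_logs
        """
    )


def run_elt(events):
    conn = sqlite3.connect(":memory:")
    load_raw(conn, events)
    transform(conn)
    return conn.execute(
        "SELECT actor, action, failed FROM stg_audit_logs ORDER BY ts"
    ).fetchall()


if __name__ == "__main__":
    for row in run_elt(RAW_EVENTS):
        print(row)
```

Note that nothing is cleaned before loading. The messy rows go in as-is, and the cleanup lives in a SQL view on top of them, which is roughly the role a transformation layer plays in this picture.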
What does this really look like?
I won’t be surprised if many companies push to have one or two data warehouses, similar to how companies have one or two cloud providers. In an ideal world, audit logs would be extracted and loaded into a data warehouse like Snowflake, transformed using a tool like dbt, and then sent to an analytics engine. There can be a metrics layer and everything! Why re-invent a data stack for security when the data experts have invented it already?
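As a rough illustration of the last step, here is what a detection could look like once the audit logs are already loaded and transformed. This is a sketch under assumptions: sqlite3 stands in for the warehouse, the stg_audit_logs table and its columns are hypothetical, and "three or more failed logins per actor" is an arbitrary example threshold rather than a recommended rule.

```python
import sqlite3

# Hypothetical, already-transformed audit events: (actor, action, failed, ts).
STAGED = [
    ("alice@example.com", "login", 1, "2022-07-26T10:00:00Z"),
    ("alice@example.com", "login", 1, "2022-07-26T10:00:05Z"),
    ("alice@example.com", "login", 1, "2022-07-26T10:00:09Z"),
    ("bob@example.com", "login", 1, "2022-07-26T10:01:00Z"),
    ("bob@example.com", "login", 0, "2022-07-26T10:01:30Z"),
]


def failed_login_alerts(rows, threshold=3):
    """Return (actor, failures) for actors at or above the failure threshold."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stg_audit_logs (actor TEXT, action TEXT, failed INTEGER, ts TEXT)"
    )
    conn.executemany("INSERT INTO stg_audit_logs VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT actor, COUNT(*) AS failures
        FROM stg_audit_logs
        WHERE action = 'login' AND failed = 1
        GROUP BY actor
        HAVING COUNT(*) >= ?
        ORDER BY failures DESC, actor
    """
    return conn.execute(query, (threshold,)).fetchall()


if __name__ == "__main__":
    print(failed_login_alerts(STAGED))
```

The detection is just a query over a well-defined table, which is the same shape as any business metric. That's why I think the existing data stack, and the people who already know how to run it, can carry a lot of this work.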
This is tough for an organization because it means re-purposing many of its current security roles. It’s great for organizations that have yet to mature their security engineering functions because they no longer have to hire certain outdated security functions. However, it’s important to note that security experts will still be required to run this stack.
In fact, there will obviously still be security tools to generate the appropriate logs. SIEMs and log infrastructures dedicated to security will probably go away. However, there will be applications built on top of data warehouses that will allow them to behave more like SIEMs. One interesting company in this space is Panther Labs, which is helping companies use Snowflake for security incident response. Finally, XDRs and SOARs will probably convert into analytics engines dedicated to finding problems in the transformed logs fed into them.
So, we have brought the SOC and IR into the modern data stack!
Some open questions:
How do the various roles in the former security operations center change? Do we have data analysts, analytics engineers, etc. learn security or embed them into the security team? Or, do we ask these former security operations analysts to learn about data?
How can organizations make this change incrementally?
Are there going to be new roles in security as a result of this change?
What new tools will result from this change? Will current tools need to evolve? What tools can evolve, and what tools will become outdated?
Will parts of the security function be embedded into the data function?