A blueprint for the modern government security operations center
- By Shai Gabay
- Apr 13, 2017
Government cybersecurity teams are facing increasingly sophisticated threats, as malicious actors of all kinds set their sights on disrupting and damaging the reliable delivery of public services. In the past few years, government agencies have suffered painful attacks, including the theft of 4 million federal employee records from the Office of Personnel Management by Chinese hackers -- the biggest data breach in U.S. government history.
I have had the good fortune to work with and lead some of the best security operations teams in both industry and government, giving me a unique perspective on the differences between public- and private-sector security operations. Below is my outline for building a modern government security operations center (SOC) to effectively face the new landscape of cyber threats.
This may sound obvious to some, but the day-to-day realities of responding to an endless barrage of cyberattacks don’t leave much room for pursuing proactive risk mitigation. To get the upper hand on attackers, government SOCs must break out of the defensive position of constantly reacting to external cyber events and go on the offensive.
In the past, we focused most of our energy on securing the perimeter and preventing network penetration. Today, we must assume that malicious actors have already managed to enter the network, and we must shift our focus to detecting and responding to active breaches. Make sure security teams have the tools, training and time to proactively hunt for breaches.
Automate as much as possible
Automation has become a hot topic for futuristic developments in everything from cars to medicine, but IT security can benefit from automation today. Much of the work done manually today by SOC team members can and should be automated. Any process that is governed by a clear set of rules and doesn’t pose a production-related risk can be automated. Automation allows SOC managers to do more with fewer resources. It saves analysts time, shortens both time to detect and time to respond, and reduces human mistakes.
But how do we know which processes should be automated and which should remain in the hands of human analysts? Any operation that is repetitive and is carried out according to solid, well-defined rules is a good candidate for automation. Information gathering, data enrichment and internal processes like documentation should all be automated. User actions like making verification phone calls, sending email notifications and carrying out defined steps in the approval process can all be automated. These actions carry little to no risk as they can’t cause real damage to the organization.
Moderate- to high-risk actions should not be automated. Start by examining what could go wrong if an automated action is taken incorrectly. Generate as many scenarios as possible to try to discover circumstances in which the action could cause damage. From my experience, remediation steps like blocking IPs or disconnecting users should not be automated.
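The rule of thumb above can be written down as a simple policy check. What follows is only a minimal sketch: the `SocAction` fields, the `Risk` levels and the example action names are hypothetical, and the risk rating for each action is something each SOC would have to assign after walking through its own failure scenarios.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = 1
    MODERATE = 2
    HIGH = 3


@dataclass(frozen=True)
class SocAction:
    name: str
    rule_based: bool  # governed by solid, well-defined rules
    repetitive: bool  # carried out the same way over and over
    risk: Risk        # worst case if the action is taken incorrectly


def can_automate(action: SocAction) -> bool:
    """Automate only repetitive, rule-based actions that carry low risk."""
    return action.rule_based and action.repetitive and action.risk is Risk.LOW


# Hypothetical catalog; the ratings are illustrative assumptions.
actions = [
    SocAction("enrich_alert_with_asset_data", True, True, Risk.LOW),
    SocAction("send_email_notification", True, True, Risk.LOW),
    SocAction("block_ip", True, True, Risk.HIGH),
    SocAction("disconnect_user", True, True, Risk.HIGH),
]

automated = [a.name for a in actions if can_automate(a)]
needs_analyst = [a.name for a in actions if not can_automate(a)]
```

In this sketch, enrichment and notification land in `automated`, while blocking IPs and disconnecting users stay in `needs_analyst` even though they are repetitive and rule-based, because the risk rating alone is enough to keep them in human hands.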
Scale must also be considered when deciding which processes are appropriate for automation. For example, running a tool against a single system to gather some information can be automated when performed on a small scale, but if the same action is run on thousands of hosts, it can have a detrimental effect on network stability. For these types of actions, set thresholds for the type, amount and time frame of automated activities to protect the network.
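One way to picture such thresholds is a small sliding-window limiter keyed by action type. This is a sketch under assumptions, not a recommendation for specific values: the class name `AutomationThrottle`, the `host_scan` action type and the limit of three runs per 60 seconds are all hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional, Tuple


class AutomationThrottle:
    """Cap automated runs per action type within a sliding time window."""

    def __init__(self, limits: Dict[str, Tuple[int, float]]):
        # limits maps action type -> (max runs, window length in seconds)
        self.limits = limits
        self._runs: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, action_type: str, now: Optional[float] = None) -> bool:
        if action_type not in self.limits:
            return False  # unlisted action types are left to an analyst
        now = time.monotonic() if now is None else now
        max_runs, window = self.limits[action_type]
        runs = self._runs[action_type]
        # Forget runs that have aged out of the window.
        while runs and now - runs[0] >= window:
            runs.popleft()
        if len(runs) >= max_runs:
            return False  # threshold reached; hold or escalate instead
        runs.append(now)
        return True


# Hypothetical limit: at most 3 automated host scans per 60 seconds.
throttle = AutomationThrottle({"host_scan": (3, 60.0)})
decisions = [throttle.allow("host_scan", now=t) for t in (0, 1, 2, 3, 61)]
```

Here the fourth request inside the window is refused, and requests are accepted again once earlier runs have aged out. Refusing any action type that has no configured limit is a deliberate choice in this sketch, so that nothing is automated at scale by default.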
Activities that cannot be fully automated can still be performed more efficiently and accurately if they can all be performed from one central console or dashboard. Consolidating everything into a single pane of glass and control center makes it much easier for SOC team members to perform their tasks, improves oversight and makes documentation and teamwork more efficient. It also makes information sharing within the SOC team and with other departments (IT, finance, physical security, etc.) easier, and it reduces human error.
In this age of big data, we now have enormous amounts of information at our fingertips. Now, the challenge is putting that data to good use. Behavioral analytics and machine learning can help security operations teams effectively analyze large datasets from several sources and quickly discover correlations and anomalies that could indicate a serious threat.
Ponemon has reported that advanced attacks can take over 200 days to detect, contain and remediate, so SOC teams must be able to investigate months' worth of historical data, which by definition makes it big data. In addition to IT security logs, teams must be able to interact with the business environment and detect new kinds of threats in it, so they must have visibility into business process data.
Take, for example, a typical case of a DNS exfiltration attempt. An alert comes in that includes basic information about the user, host, etc. But this is just the tip of the iceberg. Was it caused by an inside actor? Malware? To determine the root cause, the SOC team must be able to combine different data sources, such as physical security or logical security, to understand the wider context and determine if the user was interacting with the host during the time of the incident.
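To make the idea of combining data sources concrete, here is a minimal sketch that joins an alert with physical access records to check whether the user was on site when the alert fired. Everything in it is assumed for illustration: the `BadgeEvent` record, the alert fields and the wording of the hints are hypothetical. A real investigation would also weigh remote access, shared workstations and other logical security logs.

```python
from datetime import datetime
from typing import Iterable, NamedTuple


class BadgeEvent(NamedTuple):
    user: str
    time: datetime
    direction: str  # "in" or "out" (hypothetical physical-security feed)


def user_on_site(user: str, at: datetime, events: Iterable[BadgeEvent]) -> bool:
    """True if the user's latest badge event at or before `at` was an entry."""
    prior = [e for e in events if e.user == user and e.time <= at]
    if not prior:
        return False
    return max(prior, key=lambda e: e.time).direction == "in"


def triage_hint(alert: dict, events: Iterable[BadgeEvent]) -> str:
    """Turn the correlation into a starting point, not a verdict."""
    if user_on_site(alert["user"], alert["time"], events):
        return "user on site: check for insider activity or user-triggered malware"
    return "user not on site: check for malware or compromised credentials"


badge_log = [
    BadgeEvent("jdoe", datetime(2017, 4, 13, 9, 0), "in"),
    BadgeEvent("jdoe", datetime(2017, 4, 13, 17, 0), "out"),
]
day_alert = {"user": "jdoe", "host": "ws-042", "time": datetime(2017, 4, 13, 10, 0)}
night_alert = {"user": "jdoe", "host": "ws-042", "time": datetime(2017, 4, 13, 23, 0)}

day_hint = triage_hint(day_alert, badge_log)
night_hint = triage_hint(night_alert, badge_log)
```

The output is only a hint that tells the analyst where to look first. The same alert leads to very different next steps depending on whether the user could have been at the keyboard.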
To do this analysis effectively, combine all relevant data sources into a single decision point or dashboard that presents a clear picture of everything related to the alert. Security teams must be able to explore and search data covering long periods of time without being limited by data structure or having to decide in advance what data to save. Furthermore, analytics will help reduce the noise of alerts and inform the security information and event management (SIEM) alert threshold tuning processes.
Advanced analytics allows faster, more efficient investigation and ensures that SOC managers haven’t missed any important indicators. It will help them improve SIEM rules and detection ratios and identify bottlenecks and other SIEM inefficiencies. With the right tools and processes in place, new recruits will be able to begin working in the SOC with much shorter training times.
Balance people, process and technology
There is no denying the skill shortage; well-trained, experienced cyber professionals are increasingly difficult to find. In fact, the lack of quality team members may soon become the Achilles’ heel of cybersecurity. Until this situation improves, SOC managers can incorporate technologies that will help them get new recruits qualified faster and help every team member be more productive.
SOC orchestration tools simplify the work and reduce the number of tools analysts must master, limiting the amount of training needed. Automation and automatic enforcement of rulebooks prevent mistakes and mitigate some of the risk of putting a newly trained analyst on duty in the SOC. Finally, invest in training entry-level candidates and don’t rely on recruiting top people. Training programs can quickly prepare new hires for their first day in the SOC with ample real-life, hands-on experience and ongoing reinforcement.
Collaborate with other agencies
Interagency cooperation is a popular buzzword, but despite its importance, it is much easier said than done. Today’s malicious actors begin an attack by launching a series of reconnaissance activities to find the vulnerabilities or potential attack vectors on a number of organizational networks -- usually targeting a large swath of a particular sector, like government. Hackers know they face substantial IT security defenses, so when they find a soft spot to exploit in one government agency, the chances are very high that other agencies have also been targeted with the same or a very similar tactic.
Therefore, information sharing must be a top priority and included as an automatic step in SOC detection and response procedures. By sharing threats among agencies, we can effectively reduce the number of “unknown” threats, making agency networks much simpler to defend.