Four Steps for Faster Security Alert Triage
Updated: Apr 19, 2021
Discover how to use your existing controls to consolidate alerts and tune severity scores to respond to the highest probability of active threats
The volume of security events and alerts remains the bane of most cybersecurity programs. In 2019, Imperva polled RSA Security conference attendees and found that 55% of the 179 IT professionals who responded said their companies received over 10,000 alerts per day, with 27% saying the figure was over 1 million! Most organizations admit they receive more alerts than they have the capacity to triage and investigate, which forces a greater dependence on prioritization.
Is the problem the tools or how we are using them?
What leads to these extremely high rates of alerts? It is not just false positives. In Enterprise Management Associates’ (EMA) report, “InfoBrief: A Day in the Life of a Cyber Security Pro”, they found “on average, analysts were spending 24 to 30 minutes to investigate each incident they received” and only 31% of the time was the alert found to be a false positive. More likely, alerts were found to be incorrectly marked as critical and misprioritized.
Some misidentified alerts may be indicators of actual emerging threats. Without additional evidence and without sufficient time to investigate thoroughly, it’s easy to dismiss the alert as a false positive. However, many of our existing security controls offer ways to increase the integrity of our alerts while reducing the overall alert volume.
Think in Incidents, Not Alerts. Many organizations make the mistake of setting every correlation rule or behavior anomaly to generate an alert. Unless there is a high degree of confidence that a single behavior captures a credible Indicator of Compromise (IOC) every time, you should instead configure your correlation rules and behavioral rules to generate “events”. Note that the actual term may vary based on your security control (e.g. “notable events”, “observances”, etc.). With observable events, you can create “master” rules that combine multiple observable events into an “incident”, and alert only on these incidents. Master rules should look for a commonality across observable events, e.g. all stemming from the same user or the same target host within a specified amount of time. Think of each master rule as a use case, a scenario, or a previously established IOC. While only your master rules generate alerts, you can still use all the observable events in dashboards to enable greater precision in threat hunting activities. This lets you manage where you sit on the precision-recall curve: you retain visibility into activity without increasing your false positives, while consolidating multiple observable events in your master rules reduces false negatives.
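To make the master-rule idea concrete, here is a minimal sketch in Python of how observable events might be consolidated into incidents. It is not tied to any particular product: the `ObservableEvent` fields, the 30-minute window, and the threshold of three distinct rules are all illustrative assumptions that you would replace with whatever your security control exposes.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ObservableEvent:
    rule: str            # name of the correlation/behavior rule that fired (hypothetical field)
    entity: str          # the commonality: a user name or a target host
    timestamp: datetime


def find_incidents(events, window=timedelta(minutes=30), min_distinct_rules=3):
    """Return (entity, events) pairs where at least `min_distinct_rules`
    different rules fired for the same entity within `window`."""
    by_entity = defaultdict(list)
    for event in events:
        by_entity[event.entity].append(event)

    incidents = []
    for entity, entity_events in by_entity.items():
        entity_events.sort(key=lambda e: e.timestamp)
        start = 0
        for end in range(len(entity_events)):
            # Advance the left edge until the span fits inside the window.
            while entity_events[end].timestamp - entity_events[start].timestamp > window:
                start += 1
            in_window = entity_events[start:end + 1]
            if len({e.rule for e in in_window}) >= min_distinct_rules:
                incidents.append((entity, in_window))
                break  # this sketch reports one incident per entity
    return incidents
```

Only the incidents returned here would raise an alert; the individual observable events stay available for dashboards and threat hunting.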
Tune Your Analytics. Now that we have established that there are specific behaviors we just want to “note” rather than “alarm” on, we can tune accordingly. Even the correlation rules and ML/AI analytics that the vendor says will just “work out of the box” require tuning. Vendors’ off-the-shelf content makes general assumptions about IT environments, while every IT environment has its own idiosyncrasies. For each correlation and behavior rule you deploy in production, recognize that you have signed up for both an initial tuning phase and ongoing, periodic tuning. Monthly is a good start while you learn which environmental changes affect your rules most. Later, you can reduce the frequency or even shift to a trigger-based system. The goal is not to tune rules so rigorously that they trigger only on rare occasions; your “master” rules will temper any observable events seen with a high frequency. You will need to tune and adjust content constantly, so plan accordingly.
Tune Your Risk/Severity Calculation. Learn how your security controls calculate risk or severity scores, and tune the parameters to align with your environment. This should also be an ongoing exercise, as frequent as weekly, since your environment changes constantly: new Internet-facing assets, new subdomains/networks, new data sources feeding your security controls. Tuning allows you to sort your new alerts (which now come only from master rules) by risk/severity without relying on the vendor’s default outcomes. This gives you the best chance of reviewing and acting first on the alerts that point to a qualified, emerging threat.
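As an illustration of what tuning a score to your environment can mean, the following Python sketch weights a vendor-supplied severity by asset criticality and Internet exposure. Every name and number here (`ASSET_CRITICALITY`, the 1.5 multiplier, the `Incident` fields) is a hypothetical placeholder; your security control will have its own parameters and formula.

```python
from dataclasses import dataclass

# Hypothetical environment-specific parameters. Revisit these as assets,
# networks, and data sources change.
ASSET_CRITICALITY = {
    "domain-controller": 1.0,
    "web-server": 0.8,
    "workstation": 0.4,
}
DEFAULT_CRITICALITY = 0.5          # used for asset types not listed above
INTERNET_FACING_MULTIPLIER = 1.5   # boost for exposed assets


@dataclass
class Incident:
    name: str
    base_severity: int      # e.g. the vendor's default 1-10 severity
    asset_type: str
    internet_facing: bool


def risk_score(incident):
    """Scale the base severity by asset criticality and exposure."""
    weight = ASSET_CRITICALITY.get(incident.asset_type, DEFAULT_CRITICALITY)
    score = incident.base_severity * weight
    if incident.internet_facing:
        score *= INTERNET_FACING_MULTIPLIER
    return round(score, 2)


def prioritize(incidents):
    """Return incidents sorted from highest to lowest tuned risk score."""
    return sorted(incidents, key=risk_score, reverse=True)
```

With these example weights, a vendor severity of 9 on a workstation ranks below a severity of 6 on an Internet-facing web server, which is the kind of reordering the vendor defaults cannot know to make.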
Use Automation. While you can invest in a SOAR solution, you can get started by automating low-hanging fruit without making a purchase. The easiest place to find low-hanging fruit is in retrieving additional details or context about the observable events captured in the master rule. For example, if you manually copy and paste every external IP into VirusTotal when investigating external IPs, take the time to create a simple script that does it for you. Be careful when selecting your “low-hanging fruit” if you have to interface with a product’s API. You will find that many vendors’ APIs change frequently, requiring more of your time to maintain these scripts. Just as you sign up for tuning with each rule you deploy, you sign up for maintenance with each script, so prioritize and choose accordingly. Script maintenance means checking that the script, when executed, still does what it is supposed to do. For this purpose, it is handy to have test inputs you can run on a periodic basis that won’t disrupt or interfere with your security controls and integrated systems. With most security controls, you can set these simple scripts to execute automatically each time the corresponding alert triggers. If this isn’t an option, having the script on hand still saves you plenty of time. If you are not a programmer, it may be time to learn simple shell scripting such as bash or sh. Most SOAR products use Python for scripting and formats like YAML for configuration.
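The VirusTotal lookup described above could be scripted roughly as follows. This sketch assumes the VirusTotal v3 IP address endpoint, the `x-apikey` header, and the `last_analysis_stats` response field; verify all three against the current API documentation before relying on them, since vendor APIs change. The function names are illustrative, and the `fetch` parameter exists so you can substitute a canned response when checking periodically that the script still works.

```python
import ipaddress
import json
import urllib.request

# Assumed VirusTotal v3 endpoint; confirm against the current API docs.
VT_URL = "https://www.virustotal.com/api/v3/ip_addresses/{ip}"


def is_external(ip):
    """True for publicly routable addresses, False for private/reserved ones."""
    return ipaddress.ip_address(ip).is_global


def fetch_vt_report(ip, api_key):
    """Query VirusTotal for one IP and return the parsed JSON response."""
    request = urllib.request.Request(
        VT_URL.format(ip=ip), headers={"x-apikey": api_key}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def summarize(report):
    """Pull out the verdict counts an analyst usually looks at first."""
    stats = (
        report.get("data", {}).get("attributes", {}).get("last_analysis_stats", {})
    )
    return {
        "malicious": stats.get("malicious", 0),
        "suspicious": stats.get("suspicious", 0),
    }


def enrich(ips, api_key, fetch=fetch_vt_report):
    """Look up every external IP and skip internal ones."""
    results = {}
    for ip in ips:
        if is_external(ip):
            results[ip] = summarize(fetch(ip, api_key))
    return results
```

A typical call would be `enrich(ips_from_incident, api_key)` with the key read from an environment variable rather than hard-coded. Passing a stub as `fetch` gives you the non-disruptive test run mentioned above.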
If you take a use-case-oriented approach to steps 1–4, you will see big dividends: fewer alerts, specific use cases fulfilled for yourself, and demonstrable value back to your company.