Efficacy of Controls
An exploration of Critical Function Assurance and the hazards pyramid applied to cybersecurity and controls efficacy in critical infrastructure
Use a Cyber FMEA to understand how your security products actually work, and embed this understanding into your own security operations plans.
One of the core principles in the Department of Energy Cyber-Informed Engineering (CIE) is the concept of Resilient Layered Defenses. The idea is that we should assume compromise and ensure that the defenses we rely on to protect us will fail in predictable ways and that when they do fail, we still have sufficient protection to meet our mission objectives. This allows us to minimize the opportunity for a single failure to negatively impact our critical processes or create undesirable downstream impacts.
OK, this all sounds good, and a little bit like zero trust architecture. Well, except that zero trust is not really what the name implies and tends to shift the trust decisions to more robust elements of your architecture. What we are really talking about in this article is how we can ensure a defense-in-depth approach or, at the very least, identify when and how defenses can fail so we are prepared with mitigating controls.
One popular way to manage this is through a purple team exercise, where controls are implemented and then tested to validate their effectiveness. That is certainly one valuable approach. Another, less talked-about way to accomplish this is through a failure mode and effects analysis, or FMEA.
The concept of FMEA is not new. It is one of the elements taught in the Six Sigma curriculum, and its roots go back to the 1940s, when the US military used it to reduce quality variances in munitions. We can apply a similar approach to the evaluation of cybersecurity controls, but first let’s look at what is involved in an FMEA:
STEP 1: Deconstruct the process
STEP 2: Identify potential failure modes
STEP 3: Document the potential effects of each failure
STEP 4: Assign a Severity Rating
STEP 5: Assign Occurrence metrics
STEP 6: Assign Detection metrics
STEP 7: Calculate the Risk Priority Number or RPN
STEP 8: Create your prioritized Action Plan
STEP 9: Take action
STEP 10: Calculate the resulting RPN
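To make Steps 4 through 8 concrete, here is a minimal Python sketch of the scoring arithmetic. It relies on the conventional FMEA formula, in which the RPN is the product of the Severity, Occurrence, and Detection ratings, and it assumes the usual direction for Detection (a higher rating means the failure is harder to detect). The `FailureMode` class, the `prioritize` helper, and the three example entries with their ratings are hypothetical placeholders for illustration, not values from a real assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One row of an FMEA worksheet (hypothetical structure)."""
    name: str
    severity: int    # 1 (negligible) to 10 (most severe)
    occurrence: int  # 1 (rare) to 10 (near certain)
    detection: int   # 1 (easily detected) to 10 (very hard to detect) - assumed direction

    def __post_init__(self) -> None:
        # Reject ratings outside the 1 to 10 scale.
        for field_name in ("severity", "occurrence", "detection"):
            value = getattr(self, field_name)
            if not 1 <= value <= 10:
                raise ValueError(f"{field_name} must be between 1 and 10, got {value}")

    @property
    def rpn(self) -> int:
        # Step 7: Risk Priority Number = Severity x Occurrence x Detection.
        return self.severity * self.occurrence * self.detection


def prioritize(modes):
    """Step 8: order failure modes from highest to lowest RPN."""
    return sorted(modes, key=lambda mode: mode.rpn, reverse=True)


# Illustrative ratings only; a real worksheet comes from the assessment itself.
EXAMPLE_MODES = [
    FailureMode("Example failure A", severity=7, occurrence=3, detection=4),
    FailureMode("Example failure B", severity=9, occurrence=6, detection=8),
    FailureMode("Example failure C", severity=5, occurrence=5, detection=2),
]

for mode in prioritize(EXAMPLE_MODES):
    print(f"RPN {mode.rpn:>4}  {mode.name}")
```

Because the three ratings are multiplied, a failure that scores high on all three dimensions rises to the top of the list much faster than one that is severe but rare and easy to spot.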
While not an exhaustive review of all the steps for a firewall, here are a few that came to mind as I was authoring the article. A proper Cyber FMEA should be far more exhaustive. In real-world scenarios where I have performed this type of analysis, I typically plan for it to take several days, including some RFC research time, interviews with stakeholders, and reviews of control product documentation. Ideally it also includes some hands-on time with the product being reviewed, but this is not always feasible. The rating scale we will use here runs from 1 to 10, with 10 being the most severe.
Cyber FMEA Objective: Identify failure modes in a firewall's ability to mitigate malicious traffic
Once you have conducted your Cyber FMEA, you need to create an action plan that aligns with the recommendations produced by the assessment. By applying the same scoring methodology to the planned actions, you can start to see how your efforts are addressing the failures, or where gaps occur that may require risk acceptance or other risk management measures. By understanding where your weaknesses are, you are better prepared to respond and recover quickly, even if you cannot remediate them immediately.
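As a rough sketch of that before-and-after comparison, the Python below rescores each failure after a planned action and reports how much the RPN dropped. The failure names, the ratings, and the `ACCEPTANCE_THRESHOLD` cut-off are all hypothetical; a real threshold would come from your own risk management process.

```python
def rpn(severity, occurrence, detection):
    """Risk Priority Number: the product of the three 1-to-10 ratings."""
    return severity * occurrence * detection


def reduction_percent(before, after):
    """Percentage drop from the initial RPN to the resulting RPN."""
    return round(100 * (before - after) / before, 1)


# Hypothetical cut-off: a residual RPN above this needs further action
# or an explicit risk acceptance decision.
ACCEPTANCE_THRESHOLD = 60

# Hypothetical (severity, occurrence, detection) ratings before and after
# the planned action. These are placeholders, not assessment results.
PLAN = {
    "Example failure A": {"before": (8, 5, 5), "after": (8, 2, 3)},
    "Example failure B": {"before": (6, 4, 5), "after": (4, 4, 5)},
}


def summarize(plan):
    """Return one summary row per failure mode."""
    rows = []
    for name, ratings in plan.items():
        before = rpn(*ratings["before"])
        after = rpn(*ratings["after"])
        rows.append({
            "name": name,
            "before": before,
            "after": after,
            "reduction_pct": reduction_percent(before, after),
            "needs_review": after > ACCEPTANCE_THRESHOLD,
        })
    return rows


for row in summarize(PLAN):
    flag = "review or accept risk" if row["needs_review"] else "within threshold"
    print(f'{row["name"]}: {row["before"]} -> {row["after"]} '
          f'({row["reduction_pct"]}% lower, {flag})')
```

Keeping the initial and resulting ratings side by side makes it easy to see which dimension an action actually moved: occurrence, detection, or severity.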
Using the FMEA RPNs above, we will prioritize which failures to address first. For this example, we will only look at the top 3 failures, ignoring the Fragmentation Attack and the Payload Smuggling for the time being, until we can tackle the more serious concerns.
RPN 800 – Any/Any Rule at Top of Rules – This one is by far the most concerning and the easiest to fix. The problem is this firewall was never properly implemented. By performing a firewall rule review and requiring a business justification for each entry in the firewall, we can drastically reduce the problem with this one. This may be more of a project than a task if this has never been done. It will likely require many conversations with various business stakeholders to accomplish.
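A first pass at such a rule review can be partly automated. The Python sketch below assumes the rule base has been exported into a simple list of dictionaries; the field names (`position`, `source`, `destination`, `service`, `action`, `justification`) and the sample rules are hypothetical and do not reflect any particular vendor's export format. It flags rules that permit any source to reach any destination on any service, and rules with no recorded business justification.

```python
def review_rules(rules):
    """Return (position, finding) tuples for rules that need attention."""
    findings = []
    for rule in rules:
        is_any_any = (
            rule["action"] == "allow"
            and rule["source"] == "any"
            and rule["destination"] == "any"
            and rule["service"] == "any"
        )
        if is_any_any:
            findings.append(
                (rule["position"], "allows any source to any destination on any service")
            )
        if not rule.get("justification"):
            findings.append((rule["position"], "no business justification recorded"))
    return findings


# Hypothetical exported rule base, in evaluation order.
EXAMPLE_RULES = [
    {"position": 1, "source": "any", "destination": "any", "service": "any",
     "action": "allow", "justification": None},
    {"position": 2, "source": "10.0.1.0/24", "destination": "10.0.2.10",
     "service": "tcp/443", "action": "allow",
     "justification": "Example: operator workstation access to a web console"},
    {"position": 3, "source": "10.0.3.0/24", "destination": "10.0.2.20",
     "service": "tcp/22", "action": "allow", "justification": ""},
]

for position, finding in review_rules(EXAMPLE_RULES):
    print(f"Rule {position}: {finding}")
```

A script like this only surfaces candidates. Deciding whether a rule is genuinely needed still depends on the conversations with business stakeholders.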
RPN 160 – Log Injection Attacks – While not terribly frequent, these are severe in consequence, since they can compromise the firewall administrator's machine and gain elevated permissions. They are not difficult to detect, but identifying when one is occurring may require logging on the administrator's actual desktop. It may be difficult to completely mitigate this one, depending on how the administrator interacts with the logs. If this is done via a web interface or a built-in firewall administration interface, mitigation may require a modification by the product vendor. But most products support a CLI, which in almost all cases will mitigate this attack. Therefore, this is the recommended action we will take, while working in parallel with the vendor to address the lack of untrusted script blocking.
RPN 128 – Regex Mismatch – This may be challenging to mitigate at the firewall level unless heuristic capabilities are available. This attack largely stems from an over-reliance on signature-based detection. Mitigating it may necessitate a newer firewall with these capabilities, or better correlation rules within the SIEM. One key challenge as you start to use behavior-based detection methods is that they tend to be very computationally expensive, and you may find that you cannot enable all the features you want unless you drastically oversize your new appliance.
All the above – A separate set of recommended actions seeks to reduce the severity of these attacks. By implementing the following desktop-level protections, each of these scenarios becomes less severe:
You will note that because of these actions, the top two failures were greatly reduced in risk, while the upgrade to a new firewall, despite the firewall vendor's sales advice, delivered a far smaller, though still significant, reduction in the risk of these failure conditions. In closing, these RPNs would be further reduced by the above host-level controls. If you extend this activity further, you may find many other opportunities for failure reduction through changes in process or configuration that provide far better risk reduction than investment in expensive security products.
For more information on how to build a Resilient Layered Defense or on CIE in general, get in touch with the Opswright team.