Efficacy of Controls
An exploration of Critical Function Assurance and the hazards pyramid applied to cybersecurity and controls efficacy in critical infrastructure
In the first article on prioritization of configuration weaknesses – EPSS for Configuration Flaws – I posed the question of whether we need EPSS or an EPSS-like mechanism for prioritizing these issues. As you will see, we are going to focus heavily on classes of flaws such as those the Common Weakness Enumeration (CWE) describes, because it is the closest thing we have at the moment to a useful taxonomy. We will get more specificity on the hardening configurations themselves from vendor-specific guidance, but here we are focused on the flaw, not the control itself. The two are obviously closely related, though.
One of the important considerations here is establishing what risk mitigation looks like, and recognizing that the relative simplicity of the vulnerability discussion becomes quite a bit more complex in the realm of configuration flaws. What I mean is that you typically have a single choice with regard to a software vulnerability: do I patch or not? You may decide on timing, now versus later, but the basic assumption is that patching makes the problem go away, while not patching means accepting the risk or mitigating the issue another way, perhaps through configuration. But the core issue is a CVE. It has a CVSS score. It is documented in the CVE record, with many external references to vendor and security advisories, hotfixes, and research, all explicitly tied to a singular issue. There really is not a lot of ambiguity here. We know what product is affected, under what conditions, what versions, and what the impact is.
With vulnerabilities, we have entire security product ecosystems built around software vulnerability testing, triaging, remediation, and full lifecycle management. We have multiple prioritization models such as EPSS and others. As an industry, if we adopt these approaches, we don't have to think too hard about it; we just do the work. Install the patch and the pain goes away. Or decide that the patch itself does not provide sufficient value and take a different approach. But it's well understood. It does get a bit muddier when we consider component-level vulnerabilities, reachability analysis, and other factors, but the level of industry focus here means that most of our decisions can be made pretty defensibly.
With configuration flaws, we don’t have a CVE record. The specificity is not there. It can be hard to articulate why the use of hardcoded credentials will lead to exploitation. There’s no catch-all proof-of-concept exploit code that we can demonstrate. The issue only becomes clear when we zero in on a specific instance. As a class of flaw, it is too abstract for concrete evidence, and it stretches us to imagine what could go wrong instead of showing what is likely to go wrong based on code we found on GitHub or Pastebin.
The reality is that as a class of flaw, the risk is much greater. Let’s use the hardcoded credential example and look at CWE-798, Use of Hard-coded Credentials at opencve.org.
There are 1121 CVEs associated with this flaw. If your scanner just told you that you had this issue, what would you do about it? You’d need to know exactly where it was, and if it was in software you did not control, you’d have to contact your software vendor for each instance, a CVE would need to be created if one did not already exist, and you would wait for security patches to apply. Typically, I’ve seen this waiting period run about two weeks for every layer deep in the dependency map. So a component of a component of your vulnerable software is probably six weeks or more, if you are lucky. But without this understanding, without an explicit CVE for the flaw, there probably is not any action taking place.
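As a back-of-the-envelope illustration of that rule of thumb (the two-weeks-per-layer figure is my own observation, not a published metric), a quick sketch:

```python
# Rough sketch of the "two weeks per dependency layer" rule of thumb
# described above. The numbers are illustrative, not a published metric.

WEEKS_PER_LAYER = 2  # observed average wait per layer of the dependency map

def estimated_patch_wait(dependency_depth: int) -> int:
    """Estimate the wait (in weeks) for a fix to reach you when the flaw
    sits dependency_depth layers down (1 = the product you run directly)."""
    return dependency_depth * WEEKS_PER_LAYER

# A component of a component of your vulnerable software is three layers deep:
print(estimated_patch_wait(3))  # -> 6 weeks, if you are lucky
```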
As I alluded to in my prior post, we can get to a CVSS score here, or more accurately a CCSS score. We’ve released an open-source tool to do this, but it still requires certain inputs that would typically come from a CVE record. Must we manually answer these questions? CWE does not give us these answers, at least not completely, as we will demonstrate in a moment.
In the example below I’ve scored one instance of this flaw using our initial CCSS tool, imaginatively named ccss.py, but we are talking about configuration issues here, and across many different products. Implementation matters! You can’t score at this level of granularity without understanding how the credentials are accessed or what the credentials are used for in real-world implementations.
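If you want to experiment with this kind of scoring yourself, a rough approximation can be had with the open-source cvss Python package. To be clear, this is not our ccss.py tool, and the vector below is an illustrative guess at how one instance of CWE-798 might be rated, not an authoritative score.

```python
# Minimal sketch using the open-source `cvss` package (pip install cvss).
# This is NOT our ccss.py tool; the vector is an illustrative guess at how
# one instance of CWE-798 might be scored, not an authoritative rating.
from cvss import CVSS3

vector = "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"  # hypothetical instance
score = CVSS3(vector)

print(score.scores())      # (base, temporal, environmental) scores
print(score.severities())  # corresponding severity labels
```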
Let’s consider for a moment that we want to develop an EPSS model to do this work for us. But let's be very clear: EPSS is about calculating likelihood, the probability that a flaw will be exploited in the next 30 days. Very little of this approach will help you understand the consequences of exploitation. Likelihood is certainly half the battle, perhaps less depending on your perspective and your focus on Black Swan types of events. But more on that in a future article!
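For comparison, this is what the likelihood signal looks like for CVEs today. A minimal sketch against the public FIRST EPSS API (the CVE identifier is just an example):

```python
# Minimal sketch: pull today's EPSS probability and percentile for a CVE
# from the public FIRST API. The CVE identifier is just an example.
import json
import urllib.request

def epss_score(cve_id: str) -> dict:
    url = f"https://api.first.org/data/v1/epss?cve={cve_id}"
    with urllib.request.urlopen(url) as resp:
        payload = json.load(resp)
    # The API returns a list under "data"; it is empty if the CVE is unknown.
    return payload["data"][0] if payload["data"] else {}

print(epss_score("CVE-2021-44228"))  # e.g. {"cve": ..., "epss": ..., "percentile": ..., "date": ...}
```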
Using the standard approach for CVEs, these are some likely candidate data sources. These are the same methods documented in the EPSS project for CVEs, so it’s a reasonable starting point for our discussion.
But before we dive in, let's examine the feasibility criteria we will use for this analysis. We have a catalog of hundreds of feasibility questions used in secure engineering workflows in our product, but that's a bit much for a blog post, so I will break it down into three simple questions.
We will make this very easy and assign each source a qualitative value of Low, Moderate, or High. This might not be very scientific, but it should be enough to paint a general picture of feasibility.
Identifies the vendor of the product affected by the vulnerability.
EPSS treats certain vendors like Microsoft, Adobe, and Apache as a signal of greater risk, since these are more commonly targeted by adversaries. We can’t do this generically across all CWEs; we would need to consider the intersection of the CWE (if that is the basis for our flaws) with the list of vendor products. For every instance where we had an intersection of, say, Microsoft and CWE-798, we would have a match.
If we are capturing this generically as a class of flaw, unless we are at the variant level of abstraction for CWE as you will see below, we just don't have this information. The variant list is also very sparse. For instance, CWE-1173 only has mappings for Apache Struts and ASP.NET, but clearly these are not the only validation frameworks; Django has a robust form validation framework that can be used to mitigate SQLi, but it has no entry here. You could, however, theoretically use your own product inventory and make some assumptions, as sketched below.
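A minimal sketch of that inventory-driven assumption, with entirely made-up data, might look like this:

```python
# Hypothetical sketch: intersect your own product inventory with CVEs that
# have been tagged with a given CWE. The vendor names and CVE records are
# made up for illustration; in practice the CVE-to-vendor mapping would come
# from CPE or a similar identifier.
inventory_vendors = {"microsoft", "apache", "django"}

cves_tagged_cwe_798 = [
    {"cve": "CVE-XXXX-0001", "vendor": "microsoft"},        # placeholder records
    {"cve": "CVE-XXXX-0002", "vendor": "someothervendor"},
]

matches = [c for c in cves_tagged_cwe_798 if c["vendor"] in inventory_vendors]
print(matches)  # instances where your vendor exposure and CWE-798 intersect
```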
Feasibility? No - there is no data source in question that maps CPE or PURL or SWID or anything with unique vendor references to configuration flaws.
Considers how long a vulnerability has been known since its publication in the MITRE CVE list.
Now, weakness age is interesting. Consider something like SQL injection, first discussed by rain.forest.puppy in 1998. We have known about SQLi for 26+ years. It’s a well-understood flaw. There are countless articles on how to identify and mitigate it from OWASP and others around the web. It has earned a place on the OWASP Top 10 for decades. It’s taught in every web app hacking course I’ve ever taken. In 2024, it is unacceptable to see this flaw. But it still occurs.
Certainly, this amount of history should accelerate the risk prioritization. We have a CWE we can map to here as well: CWE-89, Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection'), with 10,864 CVEs related to this flaw. But the lack of a CVE record, and of a publish date for the flaw itself, means we have to derive this information manually. That means we don't really have a data source to use here; this category relies on CVE, and without it, we have a gap.
Feasibility? No - there is no data source in question that maps the age of a flaw to a taxonomy of flaws.
Uses sources like the MITRE CVE List and NVD for detailed vulnerability descriptions.
We have no CVE record unless we are evaluating a specific CVE that relates to a configuration. If we want to secure the class of flaw, we probably need to leverage design patterns that align to CWE as we’ve discussed so far. The data just does not exist today.
Common Configuration Enumeration (CCE) takes a similar approach to CVE, except that it describes configurations and not the flaws themselves, but let's take a look at the actual data. First off, it is not very current, with the last releases now close to two years old.
The CCE project does not actually specify the schema, but the Security Content Automation Protocol (SCAP) project does have information useful for data mapping. Below we have listed the identifiers to be used:
What is interesting is that when you look at the actual CCE files submitted by vendors, there is very little consistency in the data structure, even though they are in a structured data format. Some columns are misnamed or missing, and the reference columns differ wildly between vendors. I spent the better part of an afternoon trying to write a CCE parser before I stumbled on this very discomforting issue. It's probably not an easy data source to use for this purpose, and it is still focused on the config, not the flaw. Not surprising given the name.
But there are scattered instances of keywords we can trigger on, such as 'vuln', which mostly references a STIG vulnid, or in some cases 'weakness' identification. I found zero references to CWE, though, even though vendors have the ability to specify any references they desire. The best information I found came from SUSE Linux and Red Hat, closely followed by Apache. One might surmise from this that the open-source community is a bit more closely aligned with our security use cases. We did see some evidence from Apple and Microsoft as well, but again this was mostly due to STIG references.
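For the curious, the keyword scan itself is trivial once the vendor files are exported; a rough sketch, treating every cell as free text because of the column inconsistency noted above (file paths and keywords are illustrative):

```python
# Rough sketch of the keyword scan described above: walk the vendor CCE
# spreadsheets (exported to CSV) and flag rows that mention weakness-related
# terms. Because the columns are so inconsistent between vendors, every cell
# is treated as free text. File names and terms are illustrative.
import csv
import glob

KEYWORDS = ("vuln", "weakness", "cwe")

def scan_cce_exports(pattern: str = "cce_exports/*.csv"):
    hits = []
    for path in glob.glob(pattern):
        with open(path, newline="", encoding="utf-8", errors="ignore") as fh:
            for row in csv.reader(fh):
                text = " ".join(row).lower()
                if any(k in text for k in KEYWORDS):
                    hits.append((path, text[:120]))
    return hits

for path, snippet in scan_cce_exports():
    print(path, "->", snippet)
```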
One alternative is to leverage the configuration catalogs from CIS and DISA STIGs, but we still have an issue with correlating these recommendations to the configuration weakness they mitigate. These are closer to security patches than to a CVE catalog; after all, this is how you resolve the concern. But what is the concern being mitigated? These are just policy statements without any regard for the why.
Even if I follow the references in CIS to the vendor recommendation, such as this one from Microsoft, there is no content on that page that I can directly link to any sort of taxonomy for configuration flaws.
Looking at the DISA STIG content we see a similar issue. There is a great place to do this with the Control Correlation Identifier (CCI) standard, but it’s just mapping STIG controls to NIST publications. More policy statements. More compliance.
Feasibility? Moderate. NVD does list some CVEs related to config flaws, but it is not categorized in a way that would make it useful for this purpose. See our conclusion for actions we have taken to address this.
Extracts terms from the vulnerability description that help in understanding the vulnerability context.
Here is where natural language processing (NLP) and machine learning (ML) can aid in the understanding of CWE entries. The reality is that the CWE catalog itself suffers from inconsistencies at times, and attempts to map this data to CVE may fall prey to bias born of weak data governance in the taxonomy.
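As a toy illustration of the kind of term extraction involved, one could compare a CVE description against CWE description text with something as simple as TF-IDF similarity; the description strings below are abbreviated placeholders, not the full catalog text:

```python
# Toy illustration of term-based matching between a CVE description and CWE
# entries using TF-IDF cosine similarity (scikit-learn). The description
# strings are abbreviated placeholders, not the full catalog text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

cwe_descriptions = {
    "CWE-798": "The product contains hard-coded credentials such as a password or cryptographic key.",
    "CWE-89":  "Improper neutralization of special elements used in an SQL command (SQL injection).",
}

cve_description = "The device uses a hard-coded password for the admin account."

vec = TfidfVectorizer(stop_words="english")
cwe_matrix = vec.fit_transform(list(cwe_descriptions.values()))
cve_vec = vec.transform([cve_description])
scores = cosine_similarity(cve_vec, cwe_matrix).flatten()

for cwe_id, score in zip(cwe_descriptions, scores):
    print(cwe_id, round(float(score), 3))
```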
Given the rise of LLMs, it's clear this may be a promising area to explore further. In fact, in my previous article I explored using ChatGPT to start prioritizing the types of controls that need to be implemented, and it seemed to do a decent job. For now, until we have better data sources, I think we need to table this part of the discussion. We need the data first.
Feasibility? Unknown, as this is not a data source but processing of other data sources. This will require further analysis to determine feasibility.
Classifies the type of software weakness the vulnerability represents.
Let’s look at an example of this. Consider our SQLi scenario, where we have two related CWEs.
In CVE-2023-22794 we see explicit references to lack of input validation, but it is a SQLi issue categorized as CWE-89. Why is this? You could argue that these CWE entries are at different levels of abstraction, which could explain the discrepancy, but let’s look briefly at the CWE taxonomy.
There are four primary levels of abstraction: Pillar, Class, Base, and Variant. This is one of the more confusing aspects of CWE, and although they do an excellent job defining parent and child relationships, it's sometimes challenging to remember which level means what and how CWEs that do not exist in the same tree interrelate. Category is just a logical grouping that can occur at different layers of abstraction.
You can read more about the CWE schema at their website.
Back to our SQLi example: both of these CWEs live in completely separate branches of the same tree. CWE-89 actually rolls up to CWE-943 (Class): Improper Neutralization of Special Elements in Data Query Logic. This sounds very similar to CWE-707 and in fact passes through two Class-level relationships before it gets there. This can all be very convoluted to map and frequently leads to misclassification of the CWE.
But let’s assume for a moment that the CWE dataset is perfect. We do have some information here that we can map to our CCSS/CVSS use case, but first the data needs to be extracted. This is where we need human-level understanding of text data that cannot be easily evaluated quantitatively without being heavily processed.
Sections in the CWE record include:
Common Consequences – which assigns a scope such as Confidentiality with a list of impacts. Great. But it still requires some processing. In CWE-798 we see “Read Application Data”, but is this a full or partial confidentiality impact? Hard to say. But it’s still a promising area to explore (see the sketch after this list).
Likelihood of Exploit – In the example CWE-798 we see it is “High”. Not sure what this means, but it's somewhat useful as a ballpark estimate of likelihood. Not scientific enough for my tastes, but perhaps when combined with cyber threat intelligence we can get a bit closer here.
Demonstrative Examples – in this section we get a number of real-world examples of this flaw, which could be used to create detection patterns. One reason I like CWE-798 is that we also get some great examples of ICS-specific CVEs related to the OT:ICEFALL report, which focused on security-by-design weaknesses in OT products. Something my use cases find extremely relevant.
Potential Mitigations – In this example, these are largely architecture and design phase mitigations, which can be very useful in creating design patterns. We have our own model for documenting design patterns that I will share in a future post, but regardless of how you store this information, these Mitigations can be restated in a way that aligns to your secure design methodology.
Detections – This becomes useful in identifying mechanisms to detect these issues and which techniques are most effective for this class of flaw.
Taxonomy Mappings – this is both a fantastic and a terrible example of how to map design flaws to control statements. In the case of CWE-798, we have discrete references to IEC 62443 3-3 Req SR 1.5 and 4-2 Req CR 1.5. But very few CWEs have received this treatment. This particular mapping is the result of a project between ISA and MITRE as part of the ICS/OT SIG that focused on the highest-profile vulnerabilities as specified in the SEI ETF project on the categories of security vulnerabilities within ICS. I participated in an early draft of this project, and it is a great effort, but highly unscalable. At the moment, though, I don’t think we have anything better in industry. They have taken a prioritized approach to the work, and it's really awesome stuff.
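To give a sense of the extraction work these sections imply, here is a minimal sketch that pulls Common Consequences and Likelihood of Exploit for CWE-798 out of a downloaded copy of the CWE XML catalog. The element names reflect my reading of the published schema, so verify them against the catalog version you actually download:

```python
# Minimal sketch: extract Common Consequences and Likelihood of Exploit for
# one weakness from a downloaded CWE XML catalog (e.g. cwec_latest.xml from
# cwe.mitre.org). Element names reflect my reading of the CWE schema; check
# them against the catalog version you actually download.
import xml.etree.ElementTree as ET

CATALOG = "cwec_latest.xml"   # path to your downloaded catalog
TARGET_ID = "798"

tree = ET.parse(CATALOG)
for elem in tree.iter():
    # Weakness elements carry an ID attribute; Related_Weakness and friends do not.
    if elem.tag.endswith("Weakness") and elem.get("ID") == TARGET_ID:
        likelihood = elem.find(".//{*}Likelihood_Of_Exploit")
        print("Likelihood:", likelihood.text if likelihood is not None else "n/a")
        for consequence in elem.findall(".//{*}Consequence"):
            scopes = [s.text for s in consequence.findall("{*}Scope")]
            impacts = [i.text for i in consequence.findall("{*}Impact")]
            print("Scopes:", scopes, "Impacts:", impacts)
```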
Feasibility? Moderate, though this is one of our best data sources at the moment. There are challenges with the accuracy and depth of information, including the specificity needed for mathematical operations on the data. It is mostly qualitative, and records may not be updated or accurate. But MITRE is a very credible source.
Provides a standardized score to reflect the severity of the vulnerability.
So why does severity matter if the purpose of EPSS is to prioritize based on likelihood? Attackers will focus on the flaws that accomplish their objectives most easily, so the assumption is that severity is a good indicator of what an adversary will seek to accomplish. We covered this above with the CCSS tool, but without a reference record we need to create these scores ourselves; there just is not a repository of this data today. We will come back to this in our conclusion because, although CVSS is not perfect, there are elements of it we find useful, such as the updates in CVSS 4.0.
Feasibility? No - there is no data source in question that maps severity scores to a taxonomy of flaws.
Tracks whether the vulnerability is being actively discussed in security communities or listed on key vulnerability sites.
So, this is also an opportunity to ingest unstructured data such as social media and other formats that require textual processing to produce value. The challenge in these types of analysis is the lack of a standard taxonomy universally used in everyday language. CWE fits the bill, and if people talked about CWE identifiers as explicitly as they do CVE identifiers, this would be an easier problem. But it’s still a worthwhile activity to explore. Think about it: do you talk about the issue as CWE-798, or do you talk about it as hard-coded credentials? And do you hyphenate the term or not? For vulnerabilities, people probably use the CVE identifier unless it’s a celebrity vulnerability like log4j or Log4Shell.
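A crude sketch of the alias matching this implies, using a couple of made-up phrase mappings, might look like this:

```python
# Crude sketch of mapping free-text chatter to CWE IDs via phrase aliases.
# The alias list is made up for illustration and nowhere near exhaustive.
import re

ALIASES = {
    "CWE-798": [r"hard[- ]?coded credential", r"hard[- ]?coded password"],
    "CWE-89":  [r"\bsql injection\b", r"\bsqli\b"],
}

def tag_cwes(text: str) -> set:
    text = text.lower()
    return {cwe for cwe, patterns in ALIASES.items()
            if any(re.search(p, text) for p in patterns)}

print(tag_cwes("Found a hardcoded password in the controller firmware"))
# -> {'CWE-798'}
```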
Feasibility? Moderate, though this will require some data processing to be useful. This will be largely unstructured text requiring lexical analysis.
Indicates if exploit code is available in repositories like Exploit-DB or GitHub, or tools like Metasploit.
Exploit code is usually immature. It’s written for one specific scenario, sometimes down to the specific memory address it is meant to write to. It might not be a general-purpose tool that exploits a class of flaw. But some tools ARE written this way. Consider Responder, which specifically targets the lack of LLMNR hardening in multiple versions of Microsoft Windows, or ike-scan, which targets vulnerable IKE VPN implementations. These are classes of flaw for which we could theoretically build a catalog of tools and identify a mapping. We will explore this in the next section.
Feasibility? Low, though it is probably cheating to double-count this category since we have somewhat co-opted the last category below. I've reduced it one grade to compensate. Again, not scientific, but it helps me in thinking through the process.
Lists tools that could potentially be used to exploit the vulnerability.
In fact, if you look at MITRE ATT&CK and explore the mapping of TTPs in this way, you will quickly see that threat-informed models can be effective in understanding where tools are readily available to aid in exploitation. But there’s no quantification of risk in ATT&CK and it does not explicitly map to CWE.
Enter the Common Attack Pattern Enumeration and Classification (CAPEC), which is very similar to CWE in its taxonomy and can serve as the glue between a CWE and the TTPs in question.
So, let’s say you were focused on our good friend CWE-798 and you wanted to understand the likelihood of attack using threat-informed methods. One approach could be to start with the CWE, walk to its related CAPEC attack patterns, map those to ATT&CK techniques, and then check for readily available tooling.
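Since that workflow is easier to show than to describe, here is a rough sketch of the stitching with illustrative placeholder mappings; the authoritative CWE-to-CAPEC and CAPEC-to-ATT&CK links live in the catalogs themselves:

```python
# Rough sketch of the stitching: start from a CWE, walk to related CAPEC
# attack patterns, then to ATT&CK techniques, then check a (hypothetical)
# tool catalog. The mappings below are illustrative placeholders; the
# authoritative CWE->CAPEC and CAPEC->ATT&CK links live in the catalogs.
CWE_TO_CAPEC = {
    "CWE-798": ["CAPEC-70", "CAPEC-191"],   # illustrative related attack patterns
}
CAPEC_TO_ATTACK = {
    "CAPEC-70": ["T1078"],                  # Valid Accounts (illustrative mapping)
    "CAPEC-191": [],
}
ATTACK_TO_TOOLS = {
    "T1078": ["hypothetical-credential-sprayer"],  # placeholder tool catalog entry
}

def tools_for_cwe(cwe_id: str) -> set:
    tools = set()
    for capec in CWE_TO_CAPEC.get(cwe_id, []):
        for technique in CAPEC_TO_ATTACK.get(capec, []):
            tools.update(ATTACK_TO_TOOLS.get(technique, []))
    return tools

print(tools_for_cwe("CWE-798"))
```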
Feasibility? Moderate, but it does require stitching together multiple data sources to produce value, as the sketch above suggests. This is an example of something that could be resolved in a data catalog.
One other key category EPSS uses is exploitation activity. We have not included it in our analysis, and we generally find it to be useful; however, most cyber threat intelligence does not map to the config flaw in this way. Again, it's the lack of structured references to the CWE or category of flaw that makes this challenging at times. The best we can hope for is TTP mapping and then reverse-engineering the flaws from that information.
So, what's the sum result? We have a mostly Low to Moderate capability to perform this work, with some pretty big gaps, but likely enough to get us started. It's not quite as bleak as it looks: the analysis was biased towards automated results because we need this process to scale, and if we performed some manual data work, we could fill in more of these boxes. Let's dive into what we need to get started and move beyond the basics.
We are partway there, but there are a few more dimensions I want to cover. Some of this comes to us by way of the additional data attributes in CVSS 4.0, but as we've discussed, we also need a data catalog to perform this work at scale.
The supplemental metrics in CVSS v4 were a nice addition, and once this information becomes available to us it creates some useful prioritization methods, especially within critical infrastructure. Let's take a look at what has been added.
There are a couple here I want to spend a bit of extra focus on, the first being Automatable. I will cover Safety in a future article on consequences, but right now we are focused on likelihood, and safety probably only drives likelihood for actors that seek to do us personal harm, a small subset of motivations considering that the overwhelming majority are focused on financial gain.
Automatability speaks directly to the ease with which an attacker can execute their campaigns, and this is interesting because it is also one of the key decision criteria in the CISA SSVC decision trees for vulnerability prioritization. I'll cover decision trees in a future article, but they can certainly be a viable approach to vulnerability management. Understanding when a configuration flaw can be abused through automation is also a valuable piece of information, and it will be nice to see.
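To make the decision-tree idea concrete without getting ahead of that future article, here is a deliberately simplified, SSVC-inspired sketch; the decision points and outcomes are abbreviated from the published model and should not be read as the official CISA tree:

```python
# Deliberately simplified, SSVC-inspired decision sketch. The decision points
# (exploitation, automatable, technical impact) and outcomes are abbreviated
# from the published model; this is NOT the official CISA SSVC tree.
def triage(exploitation: str, automatable: bool, technical_impact: str) -> str:
    """exploitation: 'none' | 'poc' | 'active'; technical_impact: 'partial' | 'total'"""
    if exploitation == "active" and automatable and technical_impact == "total":
        return "act"        # drop everything and remediate
    if exploitation == "active" or (automatable and technical_impact == "total"):
        return "attend"     # remediate ahead of routine cycles
    if exploitation == "poc":
        return "track*"     # watch closely
    return "track"          # handle in the normal cycle

print(triage("active", True, "total"))   # -> act
print(triage("none", False, "partial"))  # -> track
```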
Another is Value Density. This one is a bit weirdly worded for me, but the concept is extremely important because it speaks to the economics of attack. The idea is that a single exploitation results in a greater resource compromise, so less effort equals more reward. This is absolutely what we need to be thinking about, and it is why environment access and authentication-based attacks are so prized: they maximize the reward by providing many more avenues of attack from a single action.
Lastly, we really need to think about what our data catalog should look like, though a detailed exploration will be the target of a future article. If we look at the biggest gaps in the approach so far, it really comes down to a lack of config-based CVE records, or a classification in the CVE schema for configuration-type issues. We do have a configurations element in the CVE schema, but it is really about mapping CPE and CPE combinations, and it does not identify specific configuration weaknesses beyond the CWE mapping that led to the CVE.
What would be extremely useful is the ability to filter on the subset of CVEs that are configuration flaws and map their CWE and other related metadata, such as CPE and age, which could help support our use case here. As near as I can tell, there is no data catalog anywhere that directly supports configuration flaws, so we will think about that and what it would entail. In the meantime, I have submitted an issue to the cve-schema project requesting the ability to classify flaws that are based on configuration as an initial stopgap. This is not my first time doing this; I previously requested that PURL be added for software identification, and the CVEProject team is very receptive to these requests. It just takes some time to work into the next release cycle. If you agree with this approach, or have comments to add, please weigh in on the GitHub issue linked above.
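As a strawman for that discussion, a single record in such a catalog might look something like this; every field name and value is hypothetical:

```python
# Strawman record for an open-source configuration-flaw catalog. Every field
# name and value here is hypothetical, intended only to seed the discussion.
catalog_entry = {
    "flaw_id": "CFG-0001",                  # hypothetical catalog identifier
    "cwe": "CWE-798",
    "title": "Hard-coded credentials in controller firmware",
    "affected": ["cpe:2.3:a:example:controller:-:*:*:*:*:*:*:*"],  # placeholder CPE/PURL
    "first_known": None,                    # publication/first-discussion date, if derivable
    "ccss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",  # illustrative
    "capec": ["CAPEC-70"],                  # illustrative related attack pattern
    "attack_techniques": ["T1078"],         # illustrative ATT&CK mapping
    "tooling_available": True,
    "automatable": True,                    # CVSS 4.0 supplemental-style metric
    "mitigations": ["Rotate and externalize credentials via a secrets manager"],
}
```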
Let us know your thoughts: what else would you want to see in such an open-source data catalog? We'd love to have you join our efforts here to drive prioritization for configuration flaws.