Collapse of Data into Unsafe Value

Status: Draft
Abstraction: Base
Structure: Simple
Description

This vulnerability occurs when an application's data filtering or transformation process incorrectly merges or simplifies information, producing a result that violates security rules. Essentially, safe input gets collapsed into a dangerous value.

Extended Description

Collapse of Data into Unsafe Value happens when a filter or transformation removes or reduces part of an input (for example by stripping sequences, trimming, or canonicalizing) and the data that remains collapses into a value the filter was meant to block. For example, a filter that deletes "<script>" in a single pass turns "<scr<script>ipt>" into "<script>", and a filter that deletes "../" turns ".../...//" into "../". The core failure is that the check is applied to the input before or during the transformation, not to the value the transformation produces.

Developers can prevent this by validating the final, assembled data in the exact context where it will be used, not just its individual parts. Treat any data aggregation or transformation step as a potential new input source that requires its own security evaluation. Allow-list validation of the complete output string and context-aware encoding libraries are key defensive strategies.
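The collapse can be reproduced in a few lines. The sketch below is illustrative only: `strip_once` and `is_safe_fragment` are hypothetical helpers, and the allow-list pattern is an assumed policy, not a general-purpose XSS defense. It shows a single-pass filter producing the very token it removes, and a check on the final value catching the result.

```python
import re

# Assumed policy for illustration: letters, digits, spaces and basic punctuation.
SAFE_FRAGMENT = re.compile(r"[A-Za-z0-9 .,!?-]*")


def strip_once(value: str, banned: str) -> str:
    """Remove every non-overlapping occurrence of `banned` in a single pass."""
    return value.replace(banned, "")


def is_safe_fragment(value: str) -> bool:
    """Allow-list check applied to the final value, after all transformations."""
    return SAFE_FRAGMENT.fullmatch(value) is not None


filtered = strip_once("<scr<script>ipt>", "<script>")
print(filtered)                    # the filter's own output is "<script>"
print(is_safe_fragment(filtered))  # False: the final-value check rejects it
```

The point of the design is the ordering: the allow-list runs on what leaves the filter, so it does not matter how the dangerous value was assembled.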

Common Consequences 1
Scope: Access Control

Impact: Bypass Protection Mechanism

Detection Methods 1
Automated Static Analysis

Effectiveness: High

Automated static analysis, commonly referred to as Static Application Security Testing (SAST), can find some instances of this weakness by analyzing source code (or binary/compiled code) without having to execute it. Typically, this is done by building a model of data flow and control flow, then searching for potentially vulnerable patterns that connect "sources" (origins of input) with "sinks" (destinations where the data interacts with external components, a lower layer such as the OS, etc.).
Potential Mitigations 4
Phase: Architecture and Design

Strategy: Input Validation

Avoid making decisions based on names of resources (e.g. files) if those resources can have alternate names.
Phase: Implementation

Strategy: Input Validation

Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does. When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue." Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation. However, denylists can be useful for detecting potential attacks or determining which inputs are so malformed that they should be rejected outright.
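A minimal sketch of the "accept known good" strategy, using the color example from the text. The names `ALLOWED_COLORS`, `MAX_LENGTH`, and `validate_color` are hypothetical, and the length limit is an assumed value chosen only for illustration.

```python
# Hypothetical allow-list: the only values the application expects.
ALLOWED_COLORS = frozenset({"red", "blue"})
MAX_LENGTH = 16  # assumed upper bound, checked before any other processing


def validate_color(value: object) -> str:
    """Return the value unchanged if it is a known-good color, else raise."""
    if not isinstance(value, str):
        raise TypeError("color must be a string")
    if len(value) > MAX_LENGTH:
        raise ValueError("color is too long")
    if value not in ALLOWED_COLORS:
        # "boat" is alphanumeric, but it is not an expected color.
        raise ValueError("color is not in the allow-list")
    return value
```

Because the function rejects instead of stripping or rewriting, there is no transformation step whose output could collapse into something unexpected.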
Phase: Implementation

Strategy: Input Validation

Inputs should be decoded and canonicalized to the application's current internal representation before being validated (Incorrect Behavior Order: Validate Before Canonicalize). Make sure that the application does not decode the same input twice (Double Decoding of the Same Data). Such errors could be used to bypass allowlist validation schemes by introducing dangerous inputs after they have been checked.
Canonicalize the name to match that of the file system's representation of the name. This can sometimes be achieved with an available API (e.g. in Win32 the GetFullPathName function).
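The canonicalize-then-validate order can be sketched as follows. This assumes the input arrives percent-encoded (as in a URL path) and that access should be confined to one base directory; `resolve_under_base` is a hypothetical helper, and `os.path.realpath` stands in here for a platform API such as `GetFullPathName`.

```python
import os
from urllib.parse import unquote


def resolve_under_base(base_dir: str, user_path: str) -> str:
    """Decode once, canonicalize, then validate the canonical result."""
    decoded = unquote(user_path)  # decoded exactly once, never again afterwards
    base = os.path.realpath(base_dir)
    # Canonicalize before checking: resolves "..", ".", "//" and symlinks.
    candidate = os.path.realpath(os.path.join(base, decoded))
    # Validate the final value, not the raw input.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the base directory")
    return candidate
```

The check compares canonical paths only, so variants such as "////./../" or "%2e%2e/" are judged by where they actually resolve and not by how they are spelled.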
Observed Examples 6
CVE-2004-0815: "/.////" in pathname collapses to absolute path.
CVE-2005-3123: "/.//..//////././" is collapsed into "/.././" after ".." and "//" sequences are removed.
CVE-2002-0325: ".../...//" collapsed to "..." due to removal of "./" in web server.
CVE-2002-0784: Chain: HTTP server protects against ".." but allows "." variants such as "////./../.../". If the server removes "/.." sequences, the result would collapse into an unsafe value "////../" (Collapse of Data into Unsafe Value).
CVE-2005-2169: MFV. Regular expression intended to protect against directory traversal reduces ".../...//" to "../".
CVE-2001-1157: XSS protection mechanism strips a <script> sequence that is nested in another <script> sequence.
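Several of the path-based entries above share one mechanism. The sketch below reproduces the reduction described for CVE-2005-2169 with a plain string replacement (the affected product used a regular expression, so this is an analogy and not its code) and contrasts it with a hypothetical `strip_until_stable` helper that repeats the removal until the value stops changing.

```python
def strip_once(value: str, banned: str) -> str:
    """Single-pass removal: the leftover characters may re-form `banned`."""
    return value.replace(banned, "")


def strip_until_stable(value: str, banned: str) -> str:
    """Repeat the removal until a pass makes no further change."""
    if not banned:
        raise ValueError("banned sequence must be non-empty")
    previous = None
    while previous != value:
        previous = value
        value = value.replace(banned, "")
    return value


print(strip_once(".../...//", "../"))          # "../" survives the single pass
print(strip_until_stable(".../...//", "../"))  # empty string: nothing re-forms
```

Repeated stripping removes the collapse for one banned sequence, but rejecting such input outright, or validating the canonical result, is often the simpler and safer choice.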
References 1
The Art of Software Security Assessment
Mark Dowd, John McDonald, and Justin Schuh
Addison Wesley
2006
ID: REF-62
Applicable Platforms
Languages:
Not Language-Specific (Undetermined Prevalence)
Modes of Introduction
Implementation
Taxonomy Mapping
  • PLOVER
  • The CERT Oracle Secure Coding Standard for Java (2011)
Notes
Relationship: Overlaps regular expressions, although an implementation might not necessarily use regexps.