Permissive Regular Expression

Status: Draft
Abstraction: Base
Structure: Simple
Description

This weakness occurs when a regular expression is too permissive, failing to properly validate or sanitize input by allowing unintended values or patterns.

Extended Description

A permissive regex often arises from forgetting to anchor the pattern to the start (^) and end ($) of the input string. Without anchors, the regex engine reports success as soon as any substring fits the pattern, so the check no longer validates the entire input. For example, the pattern \d{5}, meant to validate a 5-digit ZIP code, would accept 'abc12345def' because the substring '12345' matches. Other common mistakes include using overly broad wildcards (like .*) instead of specific character classes, or crafting patterns that fail to exclude dangerous or malformed data. This lax validation can lead to data corruption, injection attacks, or logic flaws downstream, because the application processes input it assumed was already safe.
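The ZIP code case above can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the use of re.search for the permissive check and re.fullmatch for the strict check is an assumption about how such a validator might be written.

```python
import re

ZIP_PATTERN = r"\d{5}"  # pattern from the text above, with no anchors


def is_zip_permissive(value: str) -> bool:
    # re.search succeeds if the pattern matches anywhere in the string.
    return re.search(ZIP_PATTERN, value) is not None


def is_zip_strict(value: str) -> bool:
    # re.fullmatch succeeds only if the entire string matches the pattern.
    return re.fullmatch(ZIP_PATTERN, value) is not None


if __name__ == "__main__":
    for candidate in ("12345", "abc12345def", "123456"):
        print(candidate, is_zip_permissive(candidate), is_zip_strict(candidate))
```

The permissive check accepts all three candidates because each contains five consecutive digits; the strict check accepts only "12345".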

Common Consequences 1
Scope: Access Control

Impact: Bypass Protection Mechanism

Detection Methods 1
Automated Static Analysis
Effectiveness: High
Automated static analysis, commonly referred to as Static Application Security Testing (SAST), can find some instances of this weakness by analyzing source code (or binary/compiled code) without having to execute it. Typically, this is done by building a model of data flow and control flow, then searching for potentially-vulnerable patterns that connect "sources" (origins of input) with "sinks" (destinations where the data interacts with external components, a lower layer such as the OS, etc.).
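As a much cruder illustration than the data-flow modeling described above, the following sketch parses Python source with the standard ast module and flags calls to re.search, re.match, or re.findall whose literal pattern lacks a start or end anchor. It is not how any particular SAST product works; the function names and the list of checked functions are assumptions chosen for the example.

```python
import ast

# Functions of the `re` module checked by this sketch (an arbitrary choice).
CHECKED_FUNCS = {"search", "match", "findall"}


def is_anchored(pattern: str) -> bool:
    # Crude textual test: it does not understand escapes, so a pattern
    # ending in an escaped dollar sign would be misclassified as anchored.
    starts = pattern.startswith(("^", r"\A"))
    ends = pattern.endswith(("$", r"\Z"))
    return starts and ends


def find_unanchored_patterns(source: str) -> list:
    """Return (line number, pattern) pairs for unanchored literal patterns."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call) or not node.args:
            continue
        func = node.func
        is_re_call = (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "re"
            and func.attr in CHECKED_FUNCS
        )
        if not is_re_call:
            continue
        first = node.args[0]
        # Only literal string patterns can be inspected without data flow.
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            if not is_anchored(first.value):
                findings.append((node.lineno, first.value))
    return findings


SAMPLE = r'''
import re

def check(zip_code):
    return re.search(r"\d{5}", zip_code)

def check_strict(zip_code):
    return re.search(r"^\d{5}$", zip_code)
'''

if __name__ == "__main__":
    for lineno, pattern in find_unanchored_patterns(SAMPLE):
        print(f"line {lineno}: unanchored pattern {pattern!r}")
```

A check like this sees only literal patterns passed directly to the re module; patterns built at run time or passed through variables would need the source-to-sink analysis described above.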
Potential Mitigations 1
Phase: Implementation
When applicable, ensure that the regular expression marks beginning and ending string patterns, such as "/^string$/" for Perl.
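A Python counterpart to the Perl form above might look like the sketch below; the pattern and function names are hypothetical. One detail worth knowing in Python is that "$" also matches just before a trailing newline, so \A and \Z (or re.fullmatch) give a stricter bound than ^ and $.

```python
import re

# Anchored with ^ and $, as in the Perl-style mitigation.
ANCHORED = re.compile(r"^\d{5}$")

# Anchored with \A and \Z, which match only at the true start and end.
STRICT = re.compile(r"\A\d{5}\Z")


def accepts_anchored(value: str) -> bool:
    return ANCHORED.search(value) is not None


def accepts_strict(value: str) -> bool:
    return STRICT.search(value) is not None


if __name__ == "__main__":
    for candidate in ("12345", "abc12345def", "12345\n"):
        print(repr(candidate), accepts_anchored(candidate), accepts_strict(candidate))
```

Both versions reject "abc12345def", but only the \A...\Z version rejects "12345\n".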
Demonstrative Examples 2

ID : DX-37

The following code takes phone numbers as input, and uses a regular expression to reject invalid phone numbers.

Code Example:

Bad (Perl)

# looks like it only has hyphens and digits
An attacker could provide an argument such as "; ls -l ; echo 123-456". This would pass the check, since "123-456" is sufficient to match the "\d+-\d+" portion of the regular expression.
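The bypass described above can be reproduced in a short Python sketch. This is not the entry's Perl code: the function names are hypothetical, only the "\d+-\d+" pattern and the attacker string come from the text, and no shell command is actually run.

```python
import re

PHONE_PATTERN = r"\d+-\d+"  # pattern quoted in the text above


def passes_permissive_check(value: str) -> bool:
    # Unanchored: succeeds if any substring looks like digits-hyphen-digits.
    return re.search(PHONE_PATTERN, value) is not None


def passes_strict_check(value: str) -> bool:
    # The whole string must consist of digits, a hyphen, and digits.
    return re.fullmatch(PHONE_PATTERN, value) is not None


if __name__ == "__main__":
    payload = "; ls -l ; echo 123-456"
    print("permissive:", passes_permissive_check(payload))
    print("strict:", passes_strict_check(payload))
```

Under the permissive check the payload would flow on to whatever consumes the value; under the strict check it is rejected while "123-456" is still accepted.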

ID : DX-154

This code uses a regular expression to validate an IP string prior to using it in a call to the "ping" command.

Code Example:

Bad (Python)

# The ping command treats zero-prepended IP addresses as octal
Because the regular expression has no anchors (Regular Expression without Anchors), that is, it is not bounded by ^ or $ characters, an IP address with a 0 or 0x prepended still matches the pattern. The ping command accepts octal- and hex-prefixed IP addresses, so it will act on the unexpectedly valid address (Incorrect Parsing of Numbers with Different Radices). For example, "0x63.63.63.63" would be considered equivalent to "99.63.63.63". As a result, the attacker could potentially ping systems that the attacker cannot reach directly.
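One way to see the difference is the sketch below, which is not the entry's original code. It assumes a hypothetical dotted-decimal pattern in which each octet is 0 to 255 with no leading zeros, and compares an unanchored re.search against re.fullmatch; ping itself is never invoked.

```python
import re

# Hypothetical octet pattern: 0-255, no leading zeros, decimal only.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]\d|\d)"
IPV4 = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}")


def is_ip_permissive(value: str) -> bool:
    # Unanchored: any dotted-decimal substring is enough to pass.
    return IPV4.search(value) is not None


def is_ip_strict(value: str) -> bool:
    # The whole string must be a dotted-decimal address.
    return IPV4.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("99.63.63.63", "0x63.63.63.63", "099.63.63.63"):
        print(candidate, is_ip_permissive(candidate), is_ip_strict(candidate))
```

With the unanchored search, "0x63.63.63.63" passes because the substring "63.63.63.63" matches; with the full match, the hex and zero-prefixed forms are rejected before they could reach a command such as ping.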
Observed Examples 8
CVE-2021-22204: Chain: regex in EXIF processor code does not correctly determine where a string ends (Permissive Regular Expression), enabling eval injection (Improper Neutralization of Directives in Dynamically Evaluated Code ('Eval Injection')), as exploited in the wild per CISA KEV.
CVE-2006-1895: ".*" regexp leads to static code injection.
CVE-2002-2175: Insertion of username into regexp results in partial comparison, causing wrong database entry to be updated when one username is a substring of another.
CVE-2006-4527: Regexp intended to verify that all characters are legal only checks that at least one is legal, enabling file inclusion.
CVE-2005-1949: Regexp for IP address isn't anchored at the end, allowing appending of shell metacharacters.
CVE-2002-2109: Regexp isn't "anchored" to the beginning or end, which allows spoofed values that have trusted values as substrings.
CVE-2006-6511: Regexp in .htaccess file allows access of files whose names contain certain substrings.
CVE-2006-6629: Allows loading of macro files whose names contain certain substrings.
References 1
The Art of Software Security Assessment
Mark Dowd, John McDonald, and Justin Schuh
Addison Wesley
2006
ID: REF-62
Applicable Platforms
Languages:
Perl : Undetermined
PHP : Undetermined
Modes of Introduction
Implementation
Taxonomy Mapping
  • The CERT Oracle Secure Coding Standard for Java (2011)