Improper Neutralization of Data within XPath Expressions ('XPath Injection')

Status: Incomplete
Abstraction: Base
Structure: Simple
Description

XPath Injection occurs when an application uses unvalidated user input to build an XPath query for an XML database. Without proper sanitization, attackers can manipulate the query's structure.

Extended Description

This weakness allows an attacker to alter the intended logic of the XPath expression. By injecting quote characters or XPath operators, the attacker can change which data is retrieved from the XML source, potentially bypassing application logic, authentication, or access controls. Successful exploitation can lead to unauthorized disclosure of data or manipulation of application flow. Developers must treat all user input used in XPath queries as untrusted and apply validation or parameterization to prevent these attacks.

Common Consequences
Scope: Access Control

Impact: Bypass Protection Mechanism

The attacker can control application flow (e.g., bypass authentication).

Scope: Confidentiality

Impact: Read Application Data

The attacker could read restricted XML content.

Detection Methods
Automated Static Analysis
Effectiveness: High
Automated static analysis, commonly referred to as Static Application Security Testing (SAST), can find some instances of this weakness by analyzing source code (or binary/compiled code) without having to execute it. Typically, this is done by building a model of data flow and control flow, then searching for potentially vulnerable patterns that connect "sources" (origins of input) with "sinks" (destinations where the data interacts with external components, a lower layer such as the OS, etc.).
Potential Mitigations
Phase: Implementation
Use parameterized XPath queries (e.g. using XQuery). This will help ensure separation between data plane and control plane.
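As a minimal sketch of this mitigation, the following uses XPath variable references resolved through the standard `javax.xml.xpath.XPathVariableResolver` interface rather than XQuery. The class name, the XML layout (`login`, `password`, `home_dir` elements), and the sample credentials are assumptions made for illustration only.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class ParameterizedXPathLogin {
    // Hypothetical credential store; element names and values are illustrative.
    static final String USERS_XML =
        "<users>"
      + "<user><login>john</login><password>abracadabra</password>"
      + "<home_dir>/home/john</home_dir></user>"
      + "</users>";

    static String homeDirFor(String userName, String password)
            throws XPathExpressionException {
        // User input is bound as variable values and never spliced into the query text.
        Map<String, Object> vars = new HashMap<>();
        vars.put("login", userName);
        vars.put("password", password);

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setXPathVariableResolver(name -> vars.get(name.getLocalPart()));

        // The query structure is a constant; $login and $password are data only.
        String query =
            "//users/user[login/text()=$login and password/text()=$password]/home_dir/text()";
        return xpath.evaluate(query, new InputSource(new StringReader(USERS_XML)));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(homeDirFor("john", "abracadabra"));          // valid credentials
        System.out.println(homeDirFor("john", "' or ''='").isEmpty());  // injection attempt finds no match
    }
}
```

Because the password value is compared as a plain string, quote characters in it cannot close the literal or introduce new operators.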
Phase: Implementation
Properly validate user input. Reject data where appropriate, filter where appropriate and escape where appropriate. Make sure input that will be used in XPath queries is safe in that context.
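One way to apply input validation is an allow-list check performed before a value is used in a query. The sketch below is illustrative only: the class name, the permitted character set, and the length limit are assumptions, and a policy this strict suits identifiers such as user names better than free-form values such as passwords.

```java
import java.util.regex.Pattern;

final class XPathInputValidator {
    // Hypothetical policy: letters, digits, underscore, dot, and hyphen, 1 to 64 characters.
    private static final Pattern SAFE = Pattern.compile("[A-Za-z0-9_.-]{1,64}");

    // Returns the value unchanged if it satisfies the allow-list, otherwise rejects it.
    static String requireSafe(String value) {
        if (value == null || !SAFE.matcher(value).matches()) {
            throw new IllegalArgumentException("input not permitted in an XPath query");
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(requireSafe("john"));
        try {
            requireSafe("' or ''='");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Rejecting quote characters outright removes the attacker's ability to terminate a string literal, which is the step most XPath injection payloads depend on.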
Demonstrative Examples

ID: DX-211

Consider the following simple XML document that stores authentication information, and a snippet of Java code that uses an XPath query to retrieve authentication information:

Code Example (Informative, XML): [listing missing from source]

The Java code used to retrieve the home directory based on the provided credentials is:

Code Example (Bad, Java): [listing missing from source]

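The listings for this example are not reproduced here. As a minimal sketch of the vulnerable pattern described, the following builds the XPath query by concatenating user input directly into the expression. The class name, the XML layout (`login`, `password`, `home_dir` elements), and the stored credentials are assumptions for illustration, not the original example's contents.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class VulnerableXPathLogin {
    // Hypothetical stand-in for the XML document that stores authentication information.
    static final String USERS_XML =
        "<users>"
      + "<user><login>john</login><password>abracadabra</password>"
      + "<home_dir>/home/john</home_dir></user>"
      + "<user><login>cbc</login><password>1mgr8</password>"
      + "<home_dir>/home/cbc</home_dir></user>"
      + "</users>";

    // VULNERABLE: userName and password become part of the query text,
    // so quote characters in either value can change the query's structure.
    static String homeDirFor(String userName, String password)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        String query = "//users/user[login/text()='" + userName
                     + "' and password/text()='" + password
                     + "']/home_dir/text()";
        return xpath.evaluate(query, new InputSource(new StringReader(USERS_XML)));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(homeDirFor("john", "abracadabra"));   // valid credentials
        System.out.println(homeDirFor("john", "wrong").isEmpty()); // no match yields an empty string
    }
}
```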
Assume that user "john" wishes to leverage XPath Injection and log in without a valid password. By providing the username "john" and the password "' or ''='", the XPath expression now becomes:

Code Example (Attack): [listing missing from source]
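The resulting expression can be reproduced with a short sketch. It assumes a query template of the form `//users/user[login/text()='…' and password/text()='…']/home_dir/text()`; the template, the class name, and the one-user XML document are illustrative assumptions.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class XPathInjectionAttack {
    // Assumed query template: both values are concatenated into the expression text.
    static String buildQuery(String userName, String password) {
        return "//users/user[login/text()='" + userName
             + "' and password/text()='" + password
             + "']/home_dir/text()";
    }

    public static void main(String[] args) throws Exception {
        String query = buildQuery("john", "' or ''='");
        // The injected quote closes the password literal and appends "or ''=''".
        System.out.println(query);

        // Hypothetical one-user document; the stored password is never supplied by the attacker.
        String xml = "<users><user><login>john</login><password>secret</password>"
                   + "<home_dir>/home/john</home_dir></user></users>";
        String homeDir = XPathFactory.newInstance().newXPath()
                .evaluate(query, new InputSource(new StringReader(xml)));
        System.out.println(homeDir);
    }
}
```

In XPath, `and` binds more tightly than `or`, so the predicate reads as `(login = 'john' and password = '') or ('' = '')`. The second operand is always true, which makes the predicate match regardless of the stored password.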
This lets user "john" log in without a valid password, thus bypassing authentication.
Likelihood of Exploit

High

Applicable Platforms
Languages: Not Language-Specific (Prevalence: Undetermined)
Modes of Introduction
Implementation
Taxonomy Mapping
  • WASC
  • Software Fault Patterns
Notes
Relationship: This weakness is similar to other weaknesses that enable injection-style attacks, such as SQL injection, command injection, and LDAP injection. The main difference is that the target of attack here is the XML database.