This vulnerability occurs when software fails to correctly process input that contains multiple character encodings within the same data stream.
Mixed encoding vulnerabilities happen when an application receives data that switches between different character sets—like UTF-8, UTF-16, ISO-8859-1, or others—without proper normalization. Attackers exploit this by injecting malicious payloads in one encoding that, when misinterpreted by the parser, bypass validation routines designed for a different encoding. This mismatch between what the security checks see and what the backend interpreter processes creates a dangerous gap in your defenses. To prevent this, developers should implement a strict input validation pipeline that first decodes all incoming data into a single, consistent internal character set (like UTF-8) before performing any security checks or processing. This normalization step ensures that validation logic and downstream components are all evaluating the same canonical representation of the data, eliminating the ambiguity that mixed encodings introduce.
Impact: Unexpected State
Strategy: Input Validation
Strategy: Input Validation
Strategy: Output Encoding
Strategy: Input Validation