CWE-838 Base Incomplete

Inappropriate Encoding for Output Context

Definition

What is CWE-838?

This vulnerability occurs when a system uses one type of encoding for its output, but the component receiving that data expects a different encoding. The mismatch causes the downstream component to interpret the data incorrectly.
When the wrong encoding is applied, even one that is similar to the correct one, the receiving component may decode characters into unexpected control commands or special elements. This breaks the intended separation between data and executable instructions, potentially allowing injection attacks to bypass security checks such as input validation. The issue is common in web security (for example, HTML entity encoding used in a JavaScript context, where it is ineffective), but it can affect any system where data passes between components that use different encoding rules. The core problem is not a lack of encoding, but encoding that does not match the context in which the data will be interpreted.
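
The JavaScript case mentioned above can be sketched as follows. This is a minimal illustration, assuming a value that is written into an inline script block; the function names `renderScriptUnsafe` and `renderScriptSafer` are hypothetical and not taken from any framework.

```php
<?php
// Hypothetical helpers for illustration only.

// Mismatched encoding: HTML entity encoding applied to a value that lands in
// JavaScript code. The payload below contains no HTML special characters, so
// htmlspecialchars() returns it unchanged.
function renderScriptUnsafe(string $userId): string
{
    $encoded = htmlspecialchars($userId, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    return '<script>var userId = ' . $encoded . ';</script>';
}

// Encoding that matches the context: json_encode() emits a quoted string
// literal, and the JSON_HEX_* flags turn <, >, &, ' and " into \uXXXX escapes
// so the value cannot terminate the surrounding <script> element.
function renderScriptSafer(string $userId): string
{
    $flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR;
    return '<script>var userId = ' . json_encode($userId, $flags) . ';</script>';
}

$payload = '1; alert(document.cookie)';
echo renderScriptUnsafe($payload), "\n";
// Prints: <script>var userId = 1; alert(document.cookie);</script>
echo renderScriptSafer($payload), "\n";
// Prints: <script>var userId = "1; alert(document.cookie)";</script>
```

Both functions encode the input, but only the second uses an encoding that the JavaScript parser treats as data.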
Real-world impact

Real-world CVEs caused by CWE-838

  • Server does not properly handle requests that do not contain UTF-8 data; browser assumes UTF-8, allowing XSS.
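
One way to reduce that kind of mismatch is to state the encoding explicitly and reject input that is not valid in it. The sketch below assumes an application that standardizes on UTF-8; `isValidUtf8` and `sendHtmlPage` are hypothetical helper names, and the CVE summary above does not say how the affected server was actually fixed.

```php
<?php
// Returns true only if $value is well-formed UTF-8.
// With the /u modifier, PCRE refuses to match a subject that is not valid
// UTF-8, so preg_match() returns false instead of 1.
function isValidUtf8(string $value): bool
{
    return preg_match('//u', $value) === 1;
}

// Hypothetical response helper: declares the charset instead of leaving the
// browser to guess, and encodes the value with that same charset.
function sendHtmlPage(string $untrustedName): void
{
    if (!isValidUtf8($untrustedName)) {
        http_response_code(400);
        echo 'Bad request: input is not valid UTF-8';
        return;
    }
    header('Content-Type: text/html; charset=UTF-8');
    echo '<p>Hello, '
        . htmlspecialchars($untrustedName, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8')
        . '</p>';
}
```

Declaring the charset in the response header and passing the same charset to the encoder keeps the producer and the browser on the same interpretation of the bytes.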

How attackers exploit it

Step-by-step attacker path

  1. The code dynamically builds an HTML page using POST data.

  2. The programmer attempts to avoid XSS exploits (CWE-79) by encoding the POST values so they will not be interpreted as valid HTML. However, htmlentities() encoding is not appropriate when the data is used inside HTML attributes: with its default flags before PHP 8.1 it leaves single quotes unencoded, so more attributes can be injected into a single-quoted attribute.

  3. For example, an attacker can set picAltText to a value that closes the alt attribute and adds an onload attribute, such as altTextHere' onload='alert(document.cookie)

  4. The generated HTML image tag then carries the injected onload attribute next to the alt attribute.

  5. The attacker can inject arbitrary JavaScript into the tag because of this incorrect encoding.
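
The steps above can be reproduced with a short script. This is a sketch under stated assumptions: `pic.jpg` is a placeholder value for picsource, `buildVulnerableImgTag` is a hypothetical wrapper around the vulnerable echo statement, and `ENT_COMPAT` is passed explicitly because it was the default flag set before PHP 8.1 (newer versions encode single quotes by default).

```php
<?php
// Hypothetical wrapper that mirrors the vulnerable image tag construction.
// ENT_COMPAT encodes double quotes but leaves single quotes untouched.
function buildVulnerableImgTag(string $picSource, string $picAltText): string
{
    return "<img src='" . htmlentities($picSource, ENT_COMPAT, 'UTF-8')
        . "' alt='" . htmlentities($picAltText, ENT_COMPAT, 'UTF-8') . "' />";
}

// The payload listed in the attacker payload section of this page.
$payload = "altTextHere' onload='alert(document.cookie)";

echo buildVulnerableImgTag('pic.jpg', $payload), "\n";
// Prints: <img src='pic.jpg' alt='altTextHere' onload='alert(document.cookie)' />
```

The single quote in the payload ends the alt attribute early, and the rest of the value is parsed as a separate onload attribute.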

Vulnerable code example

Vulnerable PHP

This code dynamically builds an HTML page using POST data:

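```php
<?php
$username = $_POST['username'];
$picSource = $_POST['picsource'];
$picAltText = $_POST['picalttext'];
// ...

// htmlentities() is the wrong encoding for single-quoted attributes: with the
// pre-PHP 8.1 default flags (ENT_COMPAT) it leaves single quotes unescaped.
echo "<title>Welcome, " . htmlentities($username) . "</title>";
echo "<img src='" . htmlentities($picSource) . "' alt='" . htmlentities($picAltText) . "' />";

// ...
```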
Attacker payload

For example, an attacker can set picAltText to:

"altTextHere' onload='alert(document.cookie)"
Secure code example

Secure pseudo

```
// Validate, sanitize, or use a safe API before reaching the sink.
function handleRequest(input) {
  const safe = validateAndEscape(input);
  return executeWithGuards(safe);
}
```
What changed: the output is encoded for the context in which it will be interpreted (or the unsafe sink is replaced), so the same payload no longer triggers the weakness.
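
For the PHP example above, a concrete counterpart to the pseudocode might look like the following. It is a sketch rather than a definitive fix: it assumes a UTF-8 page, `encodeHtmlAttribute` and `buildImgTag` are hypothetical names, and it only addresses attribute injection (a real application would also validate the URL placed in src).

```php
<?php
// Encodes a value for use inside a quoted HTML attribute.
// ENT_QUOTES covers both single and double quotes; ENT_SUBSTITUTE replaces
// invalid byte sequences instead of returning an empty string.
function encodeHtmlAttribute(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Hypothetical replacement for the vulnerable echo statement.
function buildImgTag(string $picSource, string $picAltText): string
{
    return '<img src="' . encodeHtmlAttribute($picSource)
        . '" alt="' . encodeHtmlAttribute($picAltText) . '" />';
}

$payload = "altTextHere' onload='alert(document.cookie)";
echo buildImgTag('pic.jpg', $payload), "\n";
// Prints: <img src="pic.jpg" alt="altTextHere&#039; onload=&#039;alert(document.cookie)" />
```

Passing the flags and charset explicitly avoids depending on defaults that differ between PHP versions.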
Prevention checklist

How to prevent CWE-838

  • Implementation: Use context-aware encoding. That is, understand which encoding is being used by the downstream component, and ensure that this encoding is used. If an encoding can be specified, do so, instead of assuming that the default encoding is the same as the default being assumed by the downstream component.
  • Architecture and Design: Where possible, use communications protocols or data formats that provide strict boundaries between control and data. If this is not feasible, ensure that the protocols or formats allow the communicating components to explicitly state which encoding/decoding method is being used. Some template frameworks provide built-in support.
  • Architecture and Design: Use a vetted library or framework that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid. For example, consider using the ESAPI Encoding control [REF-45] or a similar tool, library, or framework. These will help the programmer encode outputs in a manner less prone to error. Note that some template mechanisms provide built-in support for the appropriate encoding.
Detection signals

How to detect CWE-838

Automated Static Analysis (effectiveness: High)

Automated static analysis, commonly referred to as Static Application Security Testing (SAST), can find some instances of this weakness by analyzing source code (or binary/compiled code) without having to execute it. Typically, this is done by building a model of data flow and control flow, then searching for potentially vulnerable patterns that connect "sources" (origins of input) with "sinks" (destinations where the data interacts with external components, a lower layer such as the OS, etc.).

Plexicus auto-fix

Plexicus automatically detects CWE-838 and opens a fix PR in under 60 seconds.

Codex Remedium scans every commit, identifies this specific weakness, and delivers a review-ready pull request with the fix. No tickets. No hand-offs.

FAQ

Frequently asked questions

What is CWE-838?

This vulnerability occurs when a system uses one type of encoding for its output, but the component receiving that data expects a different encoding. The mismatch causes the downstream component to interpret the data incorrectly.

How severe is CWE-838?

MITRE has not published a likelihood-of-exploit rating for this weakness. Treat it as medium impact until your threat model proves otherwise.

Which languages or platforms are affected by CWE-838?

MITRE has not specified the affected platforms for this CWE; it can apply to most application stacks.

How can I prevent CWE-838?

Use context-aware encoding. That is, understand which encoding is being used by the downstream component, and ensure that this encoding is used. If an encoding can be specified, do so, instead of assuming that the default encoding is the same as the default being assumed by the downstream component. Where possible, use communications protocols or data formats that provide strict boundaries between control and data. If this is not feasible, ensure that the protocols or formats allow the…

How does Plexicus detect and fix CWE-838?

The Plexicus SAST engine recognizes the CWE-838 data-flow signature on every commit. When a match is found, our Codex Remedium agent opens a fix PR with the corrected code, tests, and a one-line summary for the reviewer.

Where can I learn more about CWE-838?

MITRE publishes the canonical definition at https://cwe.mitre.org/data/definitions/838.html. You can also consult the OWASP and NIST documentation for adjacent guidance.
