CWE-120 Base Incomplete High likelihood

Buffer Copy without Checking Size of Input ('Classic Buffer Overflow')

This vulnerability occurs when a program copies data from one memory location to another without first verifying that the source data will fit within the destination buffer's allocated space.

Definition

What is CWE-120?

This vulnerability occurs when a program copies data from one memory location to another without first verifying that the source data will fit within the destination buffer's allocated space.
A classic buffer overflow happens when software blindly trusts input data size. When more data is copied into a fixed-size buffer than it can hold, the excess spills over into adjacent memory. This can corrupt other variables, crash the program, or, most critically, allow an attacker to overwrite critical control data like function return addresses, potentially redirecting the program's execution to malicious code. To prevent this, developers must always validate input sizes before copying operations. Use safe functions that specify buffer limits (like `strncpy` instead of `strcpy` in C), or better yet, employ modern languages with built-in bounds checking. The core principle is never to trust external input and to enforce strict boundaries for all memory operations.
Vulnerability Diagram CWE-120
Classic Stack Buffer Overflow char buf[8] saved EBP return address stack ↓ strcpy(buf, attacker_input_64_bytes) → overflow → Oversized input rewrites saved registers and the return address.
Auswirkungen in der Praxis

Real-world CVEs caused by CWE-120

  • buffer overflow using command with long argument

  • buffer overflow in local program using long environment variable

  • buffer overflow in comment characters, when product increments a counter for a ">" but does not decrement for "<"

  • By replacing a valid cookie value with an extremely long string of characters, an attacker may overflow the application's buffers.

  • By replacing a valid cookie value with an extremely long string of characters, an attacker may overflow the application's buffers.

Wie Angreifer es ausnutzen

Angreiferpfad Schritt für Schritt

  1. 1

    The following code asks the user to enter their last name and then attempts to store the value entered in the last_name array.

  2. 2

    The problem with the code above is that it does not restrict or limit the size of the name entered by the user. If the user enters "Very_very_long_last_name" which is 24 characters long, then a buffer overflow will occur since the array can only hold 20 characters total.

  3. 3

    The following code attempts to create a local copy of a buffer to perform some manipulations to the data.

  4. 4

    However, the programmer does not ensure that the size of the data pointed to by string will fit in the local buffer and copies the data with the potentially dangerous strcpy() function. This may result in a buffer overflow condition if an attacker can influence the contents of the string parameter.

  5. 5

    The code below calls the gets() function to read in data from the command line.

Verwundbares Codebeispiel

Vulnerable C

The following code asks the user to enter their last name and then attempts to store the value entered in the last_name array.

Verwundbar C
char last_name[20];
  printf ("Enter your last name: ");
  scanf ("%s", last_name);
Sicheres Codebeispiel

Secure pseudo

Sicher pseudo
// Validate, sanitize, or use a safe API before reaching the sink.
function handleRequest(input) {
  const safe = validateAndEscape(input);
  return executeWithGuards(safe);
}
What changed: the unsafe sink is replaced (or the input is validated/escaped) so the same payload no longer triggers the weakness.
Präventions-Checkliste

How to prevent CWE-120

  • Requirements Use a language that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid. For example, many languages that perform their own memory management, such as Java and Perl, are not subject to buffer overflows. Other languages, such as Ada and C#, typically provide overflow protection, but the protection can be disabled by the programmer. Be wary that a language's interface to native code may still be subject to overflows, even if the language itself is theoretically safe.
  • Architecture and Design Use a vetted library or framework that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid. Examples include the Safe C String Library (SafeStr) by Messier and Viega [REF-57], and the Strsafe.h library from Microsoft [REF-56]. These libraries provide safer versions of overflow-prone string-handling functions.
  • Operation / Build and Compilation Use automatic buffer overflow detection mechanisms that are offered by certain compilers or compiler extensions. Examples include: the Microsoft Visual Studio /GS flag, Fedora/Red Hat FORTIFY_SOURCE GCC flag, StackGuard, and ProPolice, which provide various mechanisms including canary-based detection and range/index checking. D3-SFCV (Stack Frame Canary Validation) from D3FEND [REF-1334] discusses canary-based detection in detail.
  • Implementation Consider adhering to the following rules when allocating and managing an application's memory: - Double check that your buffer is as large as you specify. - When using functions that accept a number of bytes to copy, such as strncpy(), be aware that if the destination buffer size is equal to the source buffer size, it may not NULL-terminate the string. - Check buffer boundaries if accessing the buffer in a loop and make sure there is no danger of writing past the allocated space. - If necessary, truncate all input strings to a reasonable length before passing them to the copy and concatenation functions.
  • Implementation Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does. When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue." Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation. However, denylists can be useful for detecting potential attacks or determining which inputs are so malformed that they should be rejected outright.
  • Architecture and Design For any security checks that are performed on the client side, ensure that these checks are duplicated on the server side, in order to avoid CWE-602. Attackers can bypass the client-side checks by modifying values after the checks have been performed, or by changing the client to remove the client-side checks entirely. Then, these modified values would be submitted to the server.
  • Operation / Build and Compilation Run or compile the software using features or extensions that randomly arrange the positions of a program's executable and libraries in memory. Because this makes the addresses unpredictable, it can prevent an attacker from reliably jumping to exploitable code. Examples include Address Space Layout Randomization (ASLR) [REF-58] [REF-60] and Position-Independent Executables (PIE) [REF-64]. Imported modules may be similarly realigned if their default memory addresses conflict with other modules, in a process known as "rebasing" (for Windows) and "prelinking" (for Linux) [REF-1332] using randomly generated addresses. ASLR for libraries cannot be used in conjunction with prelink since it would require relocating the libraries at run-time, defeating the whole purpose of prelinking. For more information on these techniques see D3-SAOR (Segment Address Offset Randomization) from D3FEND [REF-1335].
  • Operation Use a CPU and operating system that offers Data Execution Protection (using hardware NX or XD bits) or the equivalent techniques that simulate this feature in software, such as PaX [REF-60] [REF-61]. These techniques ensure that any instruction executed is exclusively at a memory address that is part of the code segment. For more information on these techniques see D3-PSEP (Process Segment Execution Prevention) from D3FEND [REF-1336].
Erkennungssignale

How to detect CWE-120

Automated Static Analysis High

This weakness can often be detected using automated static analysis tools. Many modern tools use data flow analysis or constraint-based techniques to minimize the number of false positives. Automated static analysis generally does not account for environmental considerations when reporting out-of-bounds memory operations. This can make it difficult for users to determine which warnings should be investigated first. For example, an analysis tool might report buffer overflows that originate from command line arguments in a program that is not expected to run with setuid or other special privileges.

Automated Dynamic Analysis

This weakness can be detected using dynamic tools and techniques that interact with the software using large test suites with many diverse inputs, such as fuzz testing (fuzzing), robustness testing, and fault injection. The software's operation may slow down, but it should not become unstable, crash, or generate incorrect results.

Manual Analysis

Manual analysis can be useful for finding this weakness, but it might not achieve desired code coverage within limited time constraints. This becomes difficult for weaknesses that must be considered for all inputs, since the attack surface can be too large.

Automated Static Analysis - Binary or Bytecode High

According to SOAR [REF-1479], the following detection techniques may be useful: ``` Highly cost effective: ``` Bytecode Weakness Analysis - including disassembler + source code weakness analysis Binary Weakness Analysis - including disassembler + source code weakness analysis

Manual Static Analysis - Binary or Bytecode SOAR Partial

According to SOAR [REF-1479], the following detection techniques may be useful: ``` Cost effective for partial coverage: ``` Binary / Bytecode disassembler - then use manual analysis for vulnerabilities & anomalies

Dynamic Analysis with Automated Results Interpretation SOAR Partial

According to SOAR [REF-1479], the following detection techniques may be useful: ``` Cost effective for partial coverage: ``` Web Application Scanner Web Services Scanner Database Scanners

Plexicus Auto-Fix

Plexicus erkennt CWE-120 automatisch und öffnet in unter 60 Sekunden einen Fix-PR.

Codex Remedium scannt jeden Commit, identifiziert genau diese Schwachstelle und liefert einen reviewer-ready Pull Request mit dem Patch. Keine Tickets. Keine Hand-offs.

Häufig gestellte Fragen

Frequently asked questions

Was ist CWE-120?

This vulnerability occurs when a program copies data from one memory location to another without first verifying that the source data will fit within the destination buffer's allocated space.

Wie gravierend ist CWE-120?

MITRE stuft die Exploit-Wahrscheinlichkeit als hoch ein — diese Schwachstelle wird aktiv in freier Wildbahn ausgenutzt und sollte priorisiert behoben werden.

Welche Sprachen oder Plattformen sind von CWE-120 betroffen?

MITRE lists the following affected platforms: C, C++, Assembly.

Wie kann ich CWE-120 verhindern?

Use a language that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid. For example, many languages that perform their own memory management, such as Java and Perl, are not subject to buffer overflows. Other languages, such as Ada and C#, typically provide overflow protection, but the protection can be disabled by the programmer. Be wary that a language's interface to native code may still be subject to overflows, even if the language itself is…

Wie erkennt und behebt Plexicus CWE-120?

Die SAST-Engine von Plexicus erkennt die Datenfluss-Signatur von CWE-120 bei jedem Commit. Bei einem Treffer öffnet unser Codex-Remedium-Agent einen Fix-PR mit korrigiertem Code, Tests und einer einzeiligen Zusammenfassung für den Reviewer.

Wo erfahre ich mehr über CWE-120?

MITRE veröffentlicht die kanonische Definition unter https://cwe.mitre.org/data/definitions/120.html. Für ergänzende Hinweise kannst du auch die OWASP- und NIST-Dokumentation heranziehen.

Bereit, wenn du es bist

Schluss mit dem Bezahlen pro Entwickler.
Schließ den Kreislauf.

Plexicus ist die KI-native ASPM, die scannt, filtert, fixt, pentestet und erklärt — autonom. Unbegrenzte Entwickler, unbegrenzte Repos, Fair-Use-KI-Aktionen. Echter kostenloser Tarif, €269/mo jährlich, wenn du bereit bist.