Overview
CVE-2025-49847 is a high-severity vulnerability in llama.cpp, a C/C++ library for running inference on several LLM families. The flaw can allow an attacker to cause arbitrary memory corruption and potentially execute unauthorized code, which could lead to system compromise and data leakage in any application or service that loads models with an affected version of llama.cpp. Given the potential severity of the impact, organizations should understand this vulnerability and take appropriate measures to mitigate it.
Vulnerability Summary
CVE ID: CVE-2025-49847
Severity: High (8.8 CVSS Score)
Attack Vector: Network
Privileges Required: None
User Interaction: None
Impact: System compromise and potential data leakage
Affected Products
Product | Affected Versions
--- | ---
llama.cpp | Prior to version b5662
How the Exploit Works
The vulnerability lies in the vocabulary-loading code of llama.cpp. A helper function, _try_copy in llama_vocab::impl::token_to_piece(), casts a very large size_t token length down to int32_t. Because the cast can wrap the value, the length check (if (length < (int32_t)size)) is bypassed, yet memcpy is still called with the original oversized size. An attacker-supplied GGUF model vocabulary can exploit this to overwrite memory beyond the intended buffer, leading to arbitrary memory corruption and potential unauthorized code execution.
Conceptual Example Code
Below is a conceptual example of how this vulnerability might be exploited, written as pseudocode for an attacker-supplied GGUF model vocabulary containing an oversized token. The functions createOversizedToken() and load_from_string() are hypothetical stand-ins, not real llama.cpp APIs.
// Attacker crafts a GGUF vocabulary containing a token whose length
// exceeds what an int32_t can represent (hypothetical helper)
std::string malicious_vocab = createOversizedToken();
// Victim loads the malicious vocabulary (stand-in for llama.cpp's
// GGUF vocabulary-loading path)
llama_vocab vocab = llama_vocab::load_from_string(malicious_vocab);
// token_to_piece() narrows the length to int32_t, the bounds check
// passes, and the subsequent memcpy overflows the destination buffer
vocab.token_to_piece(oversized_token);
In this example, createOversizedToken() is a hypothetical function that creates a token whose length exceeds what an int32_t can represent. The oversized token is loaded into llama.cpp through the (likewise hypothetical) load_from_string function, and the buffer overflow is triggered when token_to_piece is called on it. This could lead to memory corruption and unauthorized code execution.