Overview
A high-severity vulnerability, CVE-2025-2099, has been disclosed in the `transformers.testing_utils` module of huggingface/transformers, a popular machine learning library. The flaw lies in the `preprocess_string()` function, which is exposed to Regular Expression Denial of Service (ReDoS) attacks. A crafted input can drive CPU usage high enough to cause application downtime and threaten system stability.
Vulnerability Summary
CVE ID: CVE-2025-2099
Severity: High (7.5 CVSS Score)
Attack Vector: Network
Privileges Required: None
User Interaction: None
Impact: Denial of service through excessive CPU consumption and application downtime
Affected Products
Product | Affected Versions
huggingface/transformers | v4.48.3
How the Exploit Works
The vulnerability exists in the `preprocess_string()` function of the `transformers.testing_utils` module in huggingface/transformers. The regular expression used for processing code blocks in docstrings has nested quantifiers. This causes exponential backtracking when processing input with a large number of newline characters. An attacker can exploit this by providing a specially crafted payload, causing high CPU usage and potential application downtime. This effectively allows for a Denial of Service (DoS) scenario.
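To make the backtracking behavior concrete, here is a minimal sketch using a classic nested-quantifier pattern, `^(a+)+$`. This pattern is a stand-in chosen for illustration and is not the regular expression used in `preprocess_string()`; the helper name `time_failed_match` is likewise hypothetical. The input sizes are kept deliberately small so the script finishes quickly.

```python
import re
import time

# Classic nested-quantifier pattern (illustrative only, not the library's regex):
# the inner `a+` and the outer `(...)+` can split a run of "a" characters
# in exponentially many ways.
NESTED = re.compile(r"^(a+)+$")


def time_failed_match(n: int) -> float:
    """Return the seconds spent failing to match n 'a' characters followed by 'b'."""
    text = "a" * n + "b"  # the trailing "b" guarantees the overall match fails
    start = time.perf_counter()
    result = NESTED.match(text)
    elapsed = time.perf_counter() - start
    assert result is None  # the engine only gives up after exhausting every split
    return elapsed


if __name__ == "__main__":
    # Each additional character roughly doubles the work on a backtracking engine.
    for n in (10, 14, 18, 20):
        print(f"n={n:2d}  {time_failed_match(n):.4f}s")
```

On a backtracking engine such as Python's `re`, the reported time should grow steeply as `n` increases, which is the same failure mode the advisory describes for long runs of newline characters.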
Conceptual Example Code
The following Python sketch shows how an attacker might trigger the vulnerability:
import transformers.testing_utils as utils

# A long run of newline characters, as described above
malicious_payload = "\n" * 100000
# Assumption: preprocess_string() also takes a skip_cuda_tests flag in this release
utils.preprocess_string(malicious_payload, skip_cuda_tests=False)
In this conceptual example, the `malicious_payload` string consists of a large number of newline characters. When passed to the `preprocess_string()` function, it triggers the vulnerability, leading to high CPU usage and potential denial of service.
