Databricks researchers have identified six heap overflow vulnerabilities in the GGML library's GGUF file format parser that could allow attackers to execute code on victims' computers through malicious machine learning model files.
The GGUF file format has emerged as a popular binary format for storing and loading machine learning model weights, particularly for Llama-family models run locally with tools such as llama.cpp. However, the research shows that seemingly innocuous model files in this format can be weaponized to compromise the systems that load them.
The Discovery
Databricks security researchers identified six heap overflow vulnerabilities in the GGML library, which is responsible for parsing GGUF files. These vulnerabilities stem from insufficient validation of input data during the parsing process, creating multiple attack vectors that could be exploited to execute arbitrary code on victim machines.
The vulnerabilities affect various implementations that use the GGML library, including llama.cpp, the Python llm module, and the ctransformers library, when loading GGUF files from sources such as Hugging Face.
Technical Analysis of the Vulnerabilities
CVE-2024-25664: Unchecked KV Count
The primary entry point for loading GGUF files is the gguf_init_from_file() function. This function reads the GGUF header, verifies the magic value "GGUF", and then parses key-value pairs from the file. The vulnerability occurs when reading these key-value pairs:
- The function allocates memory for an array of gguf_kv structures based on a count read from the file
- This count is not properly validated and can be manipulated by an attacker
- Each gguf_kv structure contains a key and a value field, providing powerful primitives for heap exploitation
- By providing a large count value, an attacker can cause the allocation size calculation to wrap, resulting in a small allocation followed by a large write loop that overflows adjacent heap memory
A proof-of-concept demonstrated this vulnerability by using a count value of 0x55555555555555a to trigger an allocation of only 0xe0 bytes, after which 0x500 key-value pairs were written into the undersized chunk, smashing adjacent heap memory. The core pattern is sketched below.
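Reduced to its essentials, the flaw looks like the following. The kv_sketch struct and parse_kv_sketch function are illustrative stand-ins, not the actual GGML source; only the unchecked multiplication and the write loop mirror the reported bug.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct kv_sketch {          /* simplified stand-in for gguf_kv */
    char    *key;
    uint64_t value;
};

void parse_kv_sketch(FILE *f) {
    uint64_t n_kv = 0;
    fread(&n_kv, sizeof(n_kv), 1, f);               /* attacker-controlled count */

    /* n_kv * sizeof(struct kv_sketch) is computed modulo 2^64, so a huge
       count yields a tiny allocation while the loop below still runs the
       full n_kv iterations. */
    struct kv_sketch *kv = malloc(n_kv * sizeof(*kv));
    if (!kv) return;

    for (uint64_t i = 0; i < n_kv; i++) {
        fread(&kv[i], sizeof(kv[i]), 1, f);         /* writes past the chunk */
    }
}
```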
CVE-2024-25665: Reading String Types
Another vulnerability exists in the gguf_fread_str() function, which reads length-encoded strings from the file:
- The function reads a length value using gguf_fread_el()
- It then allocates memory for the string plus one byte for a null terminator
- By providing a size of 0xffffffffffffffff, the addition wraps back to 0
- The allocator returns the smallest possible chunk
- A subsequent copy operation using the large, unwrapped size causes a heap overflow (see the sketch below)
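A minimal sketch of the same flaw, assuming a hypothetical read_str_sketch helper rather than the real gguf_fread_str implementation:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

char *read_str_sketch(FILE *f) {
    uint64_t len = 0;
    fread(&len, sizeof(len), 1, f);     /* attacker supplies 0xffffffffffffffff */

    char *buf = malloc(len + 1);        /* 0xffffffffffffffff + 1 wraps to 0,
                                           so the allocator hands back the
                                           smallest possible chunk */
    if (!buf) return NULL;

    fread(buf, 1, len, f);              /* the copy still uses the unwrapped
                                           length and overruns the tiny chunk */
    buf[len] = '\0';
    return buf;
}
```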
CVE-2024-25666: Tensor Count Unchecked
A similar vulnerability exists when parsing gguf_tensor_infos:
- The ctx->header.n_tensors value is read from the file without validation
- This value is multiplied by the size of a tensor info structure
- The multiplication can wrap, resulting in a smaller allocation than expected
- A loop then copies each tensor info element into the undersized buffer, causing a heap overflow; the general fix pattern is sketched below
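The count-driven overflows above all share one fix shape: validate the product before allocating. The helper below sketches that general pattern; it is a generic illustration, not the exact change made in commit 6b14d73.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns NULL instead of allocating when n * elem_size would wrap. */
void *checked_array_alloc(uint64_t n, size_t elem_size) {
    if (elem_size == 0 || n > SIZE_MAX / elem_size) {
        return NULL;                        /* product would overflow: bail out */
    }
    return calloc((size_t) n, elem_size);   /* calloc re-checks the product */
}
```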
CVE-2024-25667: User-Supplied Array Elements
When unpacking key-value values of array type (GGUF_TYPE_ARRAY):
- The code reads the array element type and count from the file
- It multiplies the type size from the GGUF_TYPE_SIZE array by the element count
- Since the element count is user-supplied and arbitrary, this calculation can wrap
- This results in a small allocation followed by a large copy loop (reduced sketch below)
- The compact layout of array data gives a very controlled overflow over heap contents
CVE-2024-25668: Unpacking KV String Type Arrays
During array KV unpacking for string type arrays:
- An element count is read directly from the file without validation
- This value is multiplied by the size of the gguf_str struct
- The multiplication can wrap, resulting in a small allocation
- A loop then populates the chunk up to the count value n, causing an out-of-bounds write of string struct contents
Unbounded Array Indexing
An additional vulnerability exists during array parsing:
- The required size of a type is determined via the GGUF_TYPE_SIZE[] array
- The index used to access this array is read directly from the file without sanitization
- An attacker could provide an index outside the bounds of this array
- The out-of-bounds read returns a size that causes an integer wrap, which is then used in both allocation and copy operations (a fix sketch follows)
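The natural fix shape here is a plain bounds check before the table lookup. The sketch below uses hypothetical names (GGUF_TYPE_COUNT_SKETCH, type_size_checked, and illustrative sizes); the actual enum and patch live in the GGML repository.

```c
#include <stddef.h>
#include <stdint.h>

enum { GGUF_TYPE_COUNT_SKETCH = 13 };

static const size_t GGUF_TYPE_SIZE_SKETCH[GGUF_TYPE_COUNT_SKETCH] = {
    1, 1, 2, 2, 4, 4, 4, 1, 0, 0, 8, 8, 8,   /* illustrative sizes only */
};

/* Returns 0 for any file-supplied type index outside the table, so callers
   can reject the file instead of reading out of bounds. */
size_t type_size_checked(uint32_t type) {
    if (type >= GGUF_TYPE_COUNT_SKETCH) {
        return 0;
    }
    return GGUF_TYPE_SIZE_SKETCH[type];
}
```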
Timeline and Response
- January 23, 2024: Vendor contacted, vulnerabilities reported
- January 25, 2024: CVE requests submitted
- January 28, 2024: Fixes reviewed in GGML GitHub repository
- January 29, 2024: Patches merged into master branch
All six vulnerabilities were addressed in commit 6b14d73, which implements the necessary security fixes.
Security Implications
These vulnerabilities represent a significant security concern for the machine learning community. Attackers could leverage these flaws to:
- Execute arbitrary code on victim machines through malicious GGUF files
- Distribute malware disguised as legitimate machine learning models
- Compromise developers and researchers working with ML models
- Create supply chain attacks through popular model repositories
Conclusion
The discovery of these vulnerabilities highlights the critical need for rigorous security review in the rapidly evolving field of machine learning. As the GGUF format becomes increasingly popular for distributing trained models, ensuring the security of the underlying parsing libraries becomes paramount.
The collaborative response between Databricks and the GGML.ai team demonstrates the importance of responsible disclosure and rapid patching in maintaining the security of open-source machine learning infrastructure. However, this incident serves as a reminder that security must be a foundational consideration in the development of new file formats and parsing libraries, particularly those handling untrusted data from external sources.
The patches are now available in the GGML library, and users are strongly encouraged to update their implementations to protect against these heap overflow vulnerabilities. As machine learning continues to grow in popularity and adoption, the security community must remain vigilant in identifying and addressing potential attack vectors in the tools and formats that underpin this transformative technology.
