Critical Heap Overflow Vulnerabilities Discovered in Popular GGUF Model Format

Tech Essays Reporter

Databricks researchers have identified six heap overflow vulnerabilities in the GGML library's GGUF file format parser that could allow attackers to execute code on victims' computers through malicious machine learning model files.

The GGUF file format has emerged as a popular binary format for storing and distributing machine learning model weights, particularly for Llama-family models used with low-level inference engines such as llama.cpp. However, recent security research has uncovered critical vulnerabilities that could allow attackers to compromise systems through seemingly innocuous model files.

The Discovery

Databricks security researchers identified six heap overflow vulnerabilities in the GGML library, which is responsible for parsing GGUF files. These vulnerabilities stem from insufficient validation of input data during the parsing process, creating multiple attack vectors that could be exploited to execute arbitrary code on victim machines.

The vulnerabilities affect various implementations that utilize the GGML library, including llama.cpp, the Python llm module, and the ctransformers library, when loading GGUF files from sources such as Hugging Face.

Technical Analysis of the Vulnerabilities

CVE-2024-25664: Unchecked KV Count

The primary entry point for loading GGUF files is the gguf_init_from_file() function. This function reads the GGUF header, verifies the magic value "GGUF", and then parses key-value pairs from the file. The vulnerability occurs when reading these key-value pairs:

  • The function allocates memory for an array of gguf_kv structures based on a count read from the file
  • This count is not properly validated and can be manipulated by an attacker
  • Each gguf_kv structure contains a key and value field, providing powerful primitives for heap exploitation
  • By providing a large count value, an attacker can cause the allocation-size calculation to wrap, resulting in a small allocation followed by a large write loop that overflows adjacent heap memory

A proof-of-concept demonstrated this vulnerability by causing an allocation of 0xe0 bytes while using a count value of 0x55555555555555a, resulting in heap memory being smashed with 0x500 key-value pairs.

CVE-2024-25665: Reading String Types

Another vulnerability exists in the gguf_fread_str() function, which reads length-encoded strings from the file:

  • The function reads a length value using gguf_fread_el()
  • It then allocates memory for the string plus one byte for a null terminator
  • By providing a size of 0xffffffffffffffff, the addition wraps back to 0
  • The allocator returns the smallest possible chunk size
  • A subsequent copy operation that uses the original, unwrapped length then overflows the undersized heap chunk

CVE-2024-25666: Unchecked Tensor Count

A similar vulnerability exists when parsing gguf_tensor_infos:

  • The ctx->header.n_tensors value is read from the file without validation
  • This value is multiplied by the size of a tensor info structure
  • The multiplication can wrap, resulting in a smaller allocation than expected
  • A loop then copies each tensor info element, causing a heap overflow

CVE-2024-25667: User-Supplied Array Elements

When unpacking key-value values of array type (GGUF_TYPE_ARRAY):

  • The code reads the array element type and count from the file
  • It multiplies the type size from the GGUF_TYPE_SIZE array with the element count
  • Since the element count is user-supplied and arbitrary, this calculation can wrap
  • This results in a small allocation followed by a large copy loop
  • The compact nature of array data gives the attacker a very controlled overflow of heap contents

CVE-2024-25668: Unpacking KV String Type Arrays

During array KV unpacking for string type arrays:

  • An element count is read directly from the file without validation
  • This value is multiplied by the size of the gguf_str struct
  • The multiplication can wrap, resulting in a small allocation
  • A loop then populates the chunk for all n elements, causing an out-of-bounds write of string struct contents

Unbounded Array Indexing

An additional vulnerability exists during array parsing:

  • The required size of a type is determined via the GGUF_TYPE_SIZE[] array
  • The index used to access this array is read directly from the file without sanitization
  • An attacker could provide an index outside the bounds of this array
  • The out-of-bounds read yields an arbitrary size value, which is then used in both the allocation and copy calculations, enabling a further integer wrap

Timeline and Response

  • January 23, 2024: Vendor contacted, vulnerabilities reported
  • January 25, 2024: CVE requests submitted
  • January 28, 2024: Fixes reviewed in GGML GitHub repository
  • January 29, 2024: Patches merged into master branch

All six vulnerabilities were addressed in commit 6b14d73, which implements the necessary security fixes.

Security Implications

These vulnerabilities represent a significant security concern for the machine learning community. Attackers could leverage these flaws to:

  • Execute arbitrary code on victim machines through malicious GGUF files
  • Distribute malware disguised as legitimate machine learning models
  • Compromise developers and researchers working with ML models
  • Create supply chain attacks through popular model repositories

Conclusion

The discovery of these vulnerabilities highlights the critical need for rigorous security review in the rapidly evolving field of machine learning. As the GGUF format becomes increasingly popular for distributing trained models, ensuring the security of the underlying parsing libraries becomes paramount.

The collaborative response between Databricks and the GGML.ai team demonstrates the importance of responsible disclosure and rapid patching in maintaining the security of open-source machine learning infrastructure. However, this incident serves as a reminder that security must be a foundational consideration in the development of new file formats and parsing libraries, particularly those handling untrusted data from external sources.

The patches are now available in the GGML library, and users are strongly encouraged to update their implementations to protect against these heap overflow vulnerabilities. As machine learning continues to grow in popularity and adoption, the security community must remain vigilant in identifying and addressing potential attack vectors in the tools and formats that underpin this transformative technology.
