GNU C Library Lands x86_64 FMA'ed cosh For A ~35% Improvement

Hardware Reporter

GNU C Library gains a 35% performance boost for hyperbolic cosine calculations on modern x86_64 processors through FMA optimization.

The GNU C Library (glibc) has gained a faster cosh() on modern x86_64 processors. Adhemerval Zanella of Linaro has enabled an FMA-optimized cosh() function for the x86_64-v3 micro-architecture feature level, delivering approximately 35% better performance than the existing SSE2 implementation.

Figure: glibc CORE-MATH change performance benchmark

The hyperbolic cosine function, commonly used in scientific computing, engineering calculations, and various mathematical applications, now benefits from the fused multiply-add (FMA) instructions available on modern Intel and AMD processors. The optimization targets the x86_64-v3 feature level, which includes FMA alongside AVX2 and so allows more efficient floating-point code than the SSE2 baseline.

FMA instructions compute a multiplication and an addition (a × b + c) in a single instruction with a single rounding step, reducing latency and improving throughput for mathematical computations. For the cosh() function specifically, this means fewer CPU cycles are required to calculate the hyperbolic cosine of a given value, resulting in the observed improvement of roughly 35%.

This optimization is part of a broader effort to modernize glibc's mathematical functions for contemporary hardware. The patches landed in glibc Git today alongside several other performance-related changes, notably around using the tanh and sinh functions from CORE-MATH. The results for those additional functions were mixed, however, especially when targeting older CPUs or instruction set architectures.

The timing of this work aligns with preparations for the upcoming glibc 2.44 release, scheduled for August 2026. Based on historical patterns, we can expect additional performance optimizations to land before the final release, continuing the trend of making glibc more efficient on modern hardware.

These improvements matter most for applications that perform heavy mathematical computations, such as scientific simulations, financial modeling, and data analysis tools. A cosh() that runs roughly 35% faster can translate into noticeable overall gains in workloads that call it heavily, though the benefit to a whole application depends on how much of its time is spent in that function.

For developers and system administrators, this optimization is automatic: applications using glibc's mathematical functions will benefit without requiring any code changes. The performance improvement is available to anyone building their software for the x86_64-v3 target, which covers modern x86_64 processors from Intel and AMD that support FMA instructions.
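One way to check whether a program is itself being built for an FMA-capable target is to test the compiler's predefined macros. The sketch below relies on two facts: GCC and Clang define `__FMA__` when FMA code generation is enabled (for example with `-mfma` or `-march=x86-64-v3`), and the C standard defines `FP_FAST_FMA` in `<math.h>` when `fma()` is about as fast as a separate multiply and add. The function names are hypothetical, and the check says nothing about which cosh() variant the installed glibc actually uses.

```c
#include <math.h>

/* Returns 1 if this translation unit was compiled with FMA code
   generation enabled, 0 otherwise. */
int built_with_fma(void)
{
#ifdef __FMA__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the C implementation advertises fma() as roughly as
   fast as a separate multiply and add, 0 otherwise. */
int fast_fma_advertised(void)
{
#ifdef FP_FAST_FMA
    return 1;
#else
    return 0;
#endif
}
```

Compiling the same file once with the default flags and once with `-march=x86-64-v3` should flip the result of `built_with_fma()` on an x86_64 toolchain.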

This work represents the ongoing effort to ensure glibc remains optimized for contemporary hardware while maintaining compatibility with older systems. As processors continue to evolve with new instruction set capabilities, we can expect similar targeted optimizations to appear in future glibc releases, further improving performance for mathematical and scientific computing workloads on Linux systems.
