A Java Performance Quest: Taming Unsafe Code, Embracing Idiomatic Style & Debugging the Linux Kernel

Cloud Reporter

This article explores Java performance optimization strategies through the lens of QuestDB, a high-performance time-series database. We examine the trade-offs between idiomatic Java and low-level optimizations, the evolution from Unsafe to modern Java features like Vector API and Project Valhalla, and insights into debugging at the kernel level.

In the world of high-performance computing, Java often faces skepticism about its capabilities compared to lower-level languages. Yet, as demonstrated by QuestDB, a time-series database processing millions of rows per second, Java can deliver exceptional performance when properly optimized. In this comprehensive analysis, we explore the strategies and technologies that enable Java to compete at the highest levels of performance while maintaining code quality and maintainability.

QuestDB: Architecture and Design Principles

QuestDB represents a fascinating case study in high-performance database design. Built on principles borrowed from high-frequency trading systems, it employs a three-tiered architecture optimized for different aspects of data handling:

  1. Ingestion Tier: A write-ahead log optimized for maximum throughput
  2. Query Tier: Data organized by time for efficient retrieval
  3. Archive Tier: Parquet files stored in object storage for long-term retention

This tiered approach allows QuestDB to achieve ingestion rates in the millions of rows per second while maintaining query efficiency for recent data and cost-effective storage for historical data.

"The low tier is ingestion optimized. The mid-tier is query optimized and the last tier is archiving optimized," explains Jaromir Hamala, a software engineer at QuestDB.

Java in High-Performance Systems

The use of Java for a high-performance database challenges conventional wisdom about the language's capabilities. QuestDB's core is predominantly Java (Hamala estimates 85% or more of the lines of code), supplemented with components in C, C++, and Rust for specialized tasks.

"QuestDB, the core is technically Java," Hamala notes. "If you check our GitHub stats, then, I don't know by heart, but my guess is that 85% if not more of the total lines of code is Java. So nominally it's Java, but it's a rather unorthodox Java, at least people tend to tell us when they see our code base."

The project's origins in high-frequency trading influenced its design philosophy, emphasizing avoidance of garbage collection through careful memory management and object reuse rather than allocation.
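To make the "reuse rather than allocate" idea concrete, here is a minimal sketch of a buffer pool. It is not taken from QuestDB's code base; the class name `BufferPool` and its methods are hypothetical, and the pool is deliberately single-threaded to keep the example short.

```java
import java.util.ArrayDeque;

/** Hypothetical single-threaded pool that hands out reusable byte buffers. */
final class BufferPool {
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private final int bufferSize;
    private int allocations; // how many buffers were actually allocated

    BufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    /** Returns a pooled buffer if one is available, allocating only as a fallback. */
    byte[] acquire() {
        byte[] buffer = free.poll(); // null when the pool is empty
        if (buffer == null) {
            allocations++;
            buffer = new byte[bufferSize];
        }
        return buffer;
    }

    /** Hands the buffer back so the next acquire() can reuse it. */
    void release(byte[] buffer) {
        free.push(buffer);
    }

    int allocations() {
        return allocations;
    }
}
```

A loop that acquires and releases a buffer on every iteration allocates only once, so the garbage collector has nothing new to track in the steady state. The cost is that the caller now owns the buffer's lifecycle and must not touch it after releasing it.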

The Evolution from Unsafe to Modern Java

QuestDB's journey reflects the broader evolution of Java performance optimization. Early versions relied heavily on Unsafe for direct memory access, bypassing Java's memory safety for performance gains.
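For readers who have not worked with off-heap memory, the sketch below shows the general idea using a direct `ByteBuffer`, a safe standard-library stand-in rather than `Unsafe` itself. The class `OffHeapLongColumn` is hypothetical and is not how QuestDB stores columns; it only illustrates keeping data outside the Java heap and addressing it by byte offset.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical fixed-capacity column of longs stored outside the Java heap. */
final class OffHeapLongColumn {
    private final ByteBuffer buffer;
    private int size;

    OffHeapLongColumn(int capacity) {
        // allocateDirect reserves native memory; the GC does not scan or move its contents
        this.buffer = ByteBuffer.allocateDirect(capacity * Long.BYTES)
                                .order(ByteOrder.nativeOrder());
    }

    /** Appends a value using an absolute, offset-based write. */
    void append(long value) {
        buffer.putLong(size * Long.BYTES, value);
        size++;
    }

    /** Reads the value stored at the given row index. */
    long get(int index) {
        return buffer.getLong(index * Long.BYTES);
    }

    int size() {
        return size;
    }
}
```

Unlike `Unsafe`, every `putLong` and `getLong` here is bounds-checked, so writing past the capacity throws an exception instead of corrupting memory. That check is part of the safety that `Unsafe`-based code gives up.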

"The core right now, as it is, it still relies on the old school Unsafe base," Hamala explains. "One reason is that up until recently, the Java client was part of the main QuestDB jar for some historical reasons, and this was preventing us from upgrading the Java base version aggressively."

Recent developments have enabled migration to more modern Java versions. "We just recently split it, so now the core is 17 or we just bumped it, or we are about to bump it, to 21. And then because of this client service split, we are now in a position to do this very aggressively."

Vector API and Performance Optimization

The Vector API represents a significant advancement in Java's performance capabilities, enabling SIMD (Single Instruction, Multiple Data) operations directly from Java code.

"I've been playing with Vector API just last weekend because it's super exciting for things like filtering where you have SQL predicate," Hamala shares. "Because we are filtering oftentimes over hundreds of millions and billions of rows, the machine code must be the most efficient machine code possible."
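To picture what such a filter boils down to, here is a plain scalar sketch of a predicate like `price > threshold` applied to a column. It does not use the Vector API itself, which currently lives in the incubator module `jdk.incubator.vector` and needs `--add-modules` to compile. The class and method names are hypothetical.

```java
/** Hypothetical scalar filter: collects the indexes of rows matching "value > threshold". */
final class ScalarFilter {
    /**
     * Writes matching row indexes into out and returns how many matched.
     * out must be at least as long as values.
     */
    static int filterGreaterThan(double[] values, double threshold, int[] out) {
        int matches = 0;
        for (int i = 0; i < values.length; i++) {
            out[matches] = i;                          // always store the candidate index...
            matches += values[i] > threshold ? 1 : 0;  // ...but only advance on a match
        }
        return matches;
    }
}
```

The loop body avoids an explicit `if`, which gives the JIT a better chance of producing straight-line code, though whether it actually vectorizes depends on the JVM and the hardware. A Vector API version would instead compare a whole lane-width of values per instruction and turn the resulting mask into row indexes.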

QuestDB's current approach involves a custom JIT compiler for filters, but the Vector API offers potential advantages:

  • Better cross-platform support, particularly for ARM architectures
  • More maintainable code than the current C++ implementation
  • Filters emitted as ordinary bytecode and compiled by HotSpot, although warm-up time remains a difficulty

"The results are, say, promising. There are still some typical Java difficulties, like with warm-up time," Hamala notes. "So what I ended up doing is consuming the same intermediate representation that the filter C++ backend consumes right now. And I would generate runtime bytecode, which would represent that particular filter."

Project Valhalla and Panama

Two OpenJDK projects, Valhalla and Panama, promise to further bridge the gap between idiomatic Java and high-performance code:

Project Valhalla

Project Valhalla aims to introduce value types to Java, enabling more efficient memory layouts without sacrificing type safety.

"One of our principles is to control memory layout of our data structures, which right now in Java is not easy," Hamala explains. "So, being able to use Valhalla for value types and have a fine control over memory layout, I think that's going to be fantastic, once it's there."

Valhalla would allow QuestDB to maintain its "mechanical sympathy"—awareness of how hardware works—while writing more idiomatic Java code.
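The layout problem is easy to see in today's Java. In the hypothetical sketch below, an array of `Trade` objects is really an array of references to separately allocated heap objects, while the columnar variant keeps all prices contiguous in one primitive array. The names are illustrative only and are not taken from QuestDB.

```java
/** Hypothetical comparison of an array-of-objects layout with a columnar layout. */
final class LayoutDemo {
    /** Idiomatic Java: every Trade is its own heap object with an object header. */
    static final class Trade {
        final long timestamp;
        final double price;

        Trade(long timestamp, double price) {
            this.timestamp = timestamp;
            this.price = price;
        }
    }

    /** Follows one reference per element before it can read the price. */
    static double sumObjects(Trade[] trades) {
        double total = 0;
        for (Trade trade : trades) {
            total += trade.price;
        }
        return total;
    }

    /** Reads prices back to back from one contiguous primitive array. */
    static double sumColumn(double[] prices) {
        double total = 0;
        for (double price : prices) {
            total += price;
        }
        return total;
    }
}
```

Both methods compute the same result, but the second one touches memory sequentially and is the shape that hand-managed layouts aim for. Valhalla's goal, as described above, is to let the first style of code get a flattened layout closer to the second.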

Panama

Project Panama focuses on improving foreign function and memory access, potentially replacing JNI with more efficient alternatives.

"With Panama, we could get rid of JNI because we could just use the Panama bindings to call this mmap from the standard library," Hamala says. "And we could also get rid of the Unsafe access, because then hopefully one day we will be able to have some kind of zero-cost abstraction to read all the memory, mmap memory, without paying the price, or too much of a price, because right now with Unsafe, it can be fiddly."
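For context, the standard library already offers one route to memory-mapped files without JNI: `FileChannel.map`. The sketch below uses it as a stand-in and does not show the Panama API (`java.lang.foreign`) or QuestDB's own mmap wrapper. The class `MappedSum` and the temporary file it writes are purely illustrative.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical example: sum a file of 8-byte longs through a memory mapping. */
final class MappedSum {
    /** Maps the whole file read-only and sums it as big-endian longs. */
    static long sumLongs(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long total = 0;
            while (map.remaining() >= Long.BYTES) {
                total += map.getLong(); // reads straight from the mapped pages
            }
            return total;
        }
    }

    /** Writes three longs to a temporary file and sums them via the mapping. */
    static long demo() {
        try {
            Path file = Files.createTempFile("mapped-sum", ".bin");
            file.toFile().deleteOnExit();
            ByteBuffer data = ByteBuffer.allocate(3 * Long.BYTES);
            data.putLong(1L).putLong(2L).putLong(3L);
            Files.write(file, data.array());
            return sumLongs(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

One known limitation of this route is that a single `FileChannel.map` mapping cannot exceed `Integer.MAX_VALUE` bytes, which is awkward for a database that maps large column files.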

Linux Kernel Debugging Experience

Hamala's experience debugging a Linux kernel bug provides insights into the depths of system-level troubleshooting.

"I was trying to reproduce a performance issue for one of our customers. I attached my async-profiler to QuestDB and my whole computer froze," he recounts. "What is going on? I restarted the computer, tried to attach the profiler again, the same thing happened. It was completely frozen."

The issue turned out to be a kernel bug related to timer cancellation. "The kernel basically deadlocked itself because internally it was trying to cancel the timer, but that task which was canceling the timer could not cancel the timer because the timer actually triggered that task."
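The shape of that deadlock can be mimicked in user space. The sketch below is only an analogy in Java, not the kernel code path: a handler running on a single worker thread waits for its own completion, which can never happen while it is waiting. A timeout stands in for the freeze so the example terminates; all names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical user-space analogy of a task that waits for itself to finish. */
final class SelfWaitDemo {
    /** Returns true if the handler's wait on its own completion timed out. */
    static boolean handlerWouldDeadlock() {
        ExecutorService timerThread = Executors.newSingleThreadExecutor();
        AtomicReference<Future<Boolean>> self = new AtomicReference<>();
        CountDownLatch ready = new CountDownLatch(1);
        try {
            Future<Boolean> handler = timerThread.submit(() -> {
                ready.await(); // wait until our own Future has been published
                try {
                    // "Cancel and wait for the handler to finish", called from the handler itself.
                    self.get().get(200, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException e) {
                    return true; // without the timeout this wait would never end
                }
            });
            self.set(handler);
            ready.countDown();
            return handler.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            timerThread.shutdownNow();
        }
    }
}
```

In the kernel there is no such timeout to fall back on, which is consistent with the whole machine freezing rather than a single process hanging.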

Through careful debugging with GDB, Hamala was able to understand and work around the issue, demonstrating the value of deep system knowledge even for application developers.

The One Billion Row Challenge

Hamala's third-place finish in the One Billion Row Challenge highlights the extreme optimizations possible when pushing Java to its limits.

"My contribution was that I exploited this. Basically each line looks like copy pasted twice, except with different arguments, and that is to exploit the fact that CPU has multiple arithmetic logic units," he explains. "So it can do multiple logical or mathematical, algebraical, you name it, operations at the same time."
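A much simplified illustration of the same idea is a sum with several independent accumulators. This is a sketch of the general technique, not Hamala's actual challenge entry, and the class name is hypothetical. Because the four running totals do not depend on each other, a CPU with multiple execution units is free to work on them in parallel.

```java
/** Hypothetical illustration of exposing instruction-level parallelism in a sum. */
final class IlpSum {
    /** Baseline: one accumulator, so every addition depends on the previous one. */
    static long sumSingle(int[] data) {
        long total = 0;
        for (int value : data) {
            total += value;
        }
        return total;
    }

    /** Same statement "copy pasted" four times with different arguments. */
    static long sumFourLanes(int[] data) {
        long a = 0, b = 0, c = 0, d = 0;
        int i = 0;
        int limit = data.length - 3; // last index where a full group of four still fits
        for (; i < limit; i += 4) {
            a += data[i];
            b += data[i + 1];
            c += data[i + 2];
            d += data[i + 3];
        }
        for (; i < data.length; i++) { // leftover elements
            a += data[i];
        }
        return a + b + c + d;
    }
}
```

Both methods return the same result. Whether the second is faster depends on the JIT compiler and the hardware, so a change like this is only worth keeping if a benchmark confirms it.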

This experience revealed the surprising capabilities of modern hardware. "The lesson for me is, again, the computers are extremely fast, and if you are not sabotaging them, they are surprising."

AI in Coding

The emergence of AI coding assistants has transformed how developers approach complex codebases and learning new technologies.

"I have different use cases. Sometimes it's just investigation of unknown code base," Hamala notes. "So when I was doing my Vector API exercise last weekend, I wanted to see how the mapping of the Vector API all the way through the compilation pipeline in HotSpot to specific vectorized instruction, how that works."

AI tools enable exploration of complex systems like the HotSpot compiler that would be impractical through traditional methods alone. "It doesn't mean that I would not be able to trace it all the way down, but I wouldn't be able to trace it all the way down in one Saturday. So that means practically it would not be possible because I have a family too."

Balancing Performance and Maintainability

A key theme throughout the discussion is the tension between performance optimization and code maintainability. Hamala emphasizes that extreme optimizations like those used in the One Billion Row Challenge aren't appropriate for most production code.

"Again, it doesn't mean that in order to write fast code, it has to be ugly. This was really about squeezing the very last drops and that required some very ugly tricks. Again, I would not recommend this to anyone. We don't even use this level of trickery in QuestDB because the one billion rows challenge had one massive advantage. The code didn't have to be maintained after the deadline, so you could do whatever."

The Future of Java Performance

Looking ahead, several developments promise to further improve Java's performance characteristics:

  • Project Valhalla: Value types for more efficient memory layouts
  • Project Panama: Improved foreign function and memory access
  • Vector API: SIMD operations without native code
  • Continued JVM improvements: Better compilation and optimization techniques

These advancements aim to eliminate the need for many low-level optimizations while maintaining performance.

"It allows us to keep our mechanical sympathy and write Java code because right now there is a bit of tension between these two, right? Either you are writing idiomatic Java, everything is an object with its own identity or the primitives. And if you embrace this, I think most developers should, then you can write nice idiomatic Java, but then sometimes this abstraction is too high and it removes some of the control," Hamala explains.

Conclusion

QuestDB's journey demonstrates that Java can compete with lower-level languages for high-performance applications while offering the advantages of the Java ecosystem. The evolution from Unsafe to modern Java features like Vector API and Project Valhalla represents a maturation of the language's performance capabilities.

As Java continues to evolve, the line between idiomatic and high-performance code will blur, enabling developers to write maintainable applications that don't sacrifice performance. For specialized applications like QuestDB, this means the ability to focus on business logic rather than low-level optimizations.

The experiences shared—from kernel debugging to extreme performance optimization—highlight the value of deep system knowledge even as abstraction layers continue to evolve. As AI tools augment developer capabilities, the balance between understanding and implementation will continue to shift, but the fundamental principles of performance optimization remain relevant.

For organizations building high-performance systems in Java, the path forward involves embracing modern Java features while understanding the underlying hardware that ultimately determines performance. The future of Java performance lies not in bypassing the language's safety features, but in enhancing them to enable both safety and speed.
