Ampere AmpereOne M A192-32M: First Look at the 192-Core 12-Channel DDR5 Arm CPU

Hardware Reporter

A first look at Ampere's AmpereOne M A192-32M, featuring 192 cores, 12 DDR5 memory channels, and a redesigned package compared to the 8-channel A192-32X variant.

Server CPUs are evolving rapidly, and Ampere's AmpereOne family represents a significant push into the high-performance computing space with its Arm-based processors. Today, we're taking a first look at the AmpereOne M A192-32M, a 192-core processor that brings some interesting changes compared to its predecessor, the A192-32X.

The AmpereOne M A192-32M: 192 Cores, 12 Memory Channels

The A192-32M is part of Ampere's AmpereOne M series, where the "M" stands for "Memory", a clear indicator of its primary differentiator. While it keeps the same 192 cores and 192 threads as the A192-32X, the M variant has 12 channels of DDR5 memory instead of 8. Both processors run at 3.2GHz, hence the "32" designation.
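The naming pattern described above can be sketched as a small parser. This is only an illustration inferred from the two model names in this article (A192-32M and A192-32X); the `AmpereSku` type and `parse_sku` function are hypothetical names, and the pattern is an assumption that may not hold for other Ampere parts.

```python
import re
from typing import NamedTuple


class AmpereSku(NamedTuple):
    """Hypothetical container for the fields encoded in a model name."""
    cores: int
    clock_ghz: float
    suffix: str


def parse_sku(name: str) -> AmpereSku:
    """Split a name such as 'A192-32M' into cores, clock, and variant suffix.

    Assumes the layout A<cores>-<clock x10><suffix>, as seen in this article.
    """
    match = re.fullmatch(r"A(\d+)-(\d{2})([A-Z])", name)
    if match is None:
        raise ValueError(f"unrecognized model name: {name!r}")
    cores, clock_digits, suffix = match.groups()
    # "32" encodes 3.2GHz, so divide the two digits by ten.
    return AmpereSku(int(cores), int(clock_digits) / 10, suffix)


if __name__ == "__main__":
    for model in ("A192-32M", "A192-32X"):
        print(model, parse_sku(model))
```

Both names parse to 192 cores at 3.2GHz and differ only in the trailing letter, which is where the 12-channel versus 8-channel distinction lives.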

This is the AmpereOne A192-32M in a 2U single-socket Gigabyte server without its heatsink installed. At SC25, we saw the Gigabyte R1A3-T40-AAV1, a 1U version of the system we have here.

GIGABYTE R2A3 T40 AAV1 AmpereOne M Processor 1

Physical Package Differences

The most noticeable difference between the A192-32M and A192-32X is in their physical packages. Similar to the AmpereOne A192-32X, the A192-32M has the center compute die with the PCIe and memory tiles under the heat spreader.

Ampere AmpereOne A192 32M Top 1

However, the actual package is different. Here is the A192-32X (below) that you can compare to the A192-32M (above).

ASRock Rack AMPONED8 2T CM Ampere AmpereOne A192 32X 1

Memory Controller Architecture

When we flip the chips over to examine the pads, the expansion in I/O pads becomes clear. The A192-32M on top shows additional pads to handle the extra DDR5 channels compared to the A192-32X below.

Ampere AmpereOne A192 32M Pads 1

The design of AmpereOne includes I/O chiplets for both PCIe and DDR5 controllers. On the PCIe side, you'll find two chiplets on one edge - this is identical to the A192-32X configuration.

On the memory side, the difference becomes apparent. The original A192-32X has two DDR5 controller chiplets on each side, with each controller managing two DDR5 channels, giving us eight channels total.

ASRock Rack AMPONED8 2T CM Ampere AmpereOne A192 32X 1

On the AmpereOne M, there are three of these controller chiplets on each side. Three chiplets on each of the two sides, each handling two DDR5 channels, gives us six chiplets and 12 channels of DDR5.
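The chiplet arithmetic for both packages can be written out as a quick check. This is a minimal sketch using only the counts given in the text; the function name and parameter names are made up for illustration.

```python
def ddr5_channels(chiplets_per_side: int,
                  sides: int = 2,
                  channels_per_chiplet: int = 2) -> int:
    """Total DDR5 channels from the memory-controller chiplet layout."""
    return chiplets_per_side * sides * channels_per_chiplet


if __name__ == "__main__":
    # A192-32X: two memory chiplets per side.
    print("A192-32X channels:", ddr5_channels(2))  # 8
    # A192-32M: three memory chiplets per side.
    print("A192-32M channels:", ddr5_channels(3))  # 12
```

Adding one chiplet per side is the whole change: two more chiplets in total, four more channels.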

System Integration

When the chips are installed into their sockets, you can see that the memory controllers sit on the sides of the package that face the DDR5 memory slots.

GIGABYTE R2A3 T40 AAV1 AmpereOne M Processor 1

This leaves the PCIe chiplets facing towards the front and rear of the system, optimizing the layout for both memory bandwidth and PCIe connectivity.

GIGABYTE R2A3 T40 AAV1 AmpereOne M Processor 1

Performance Implications

The benefits of having 12-channel DDR5 memory support are substantial. Not only do you get 50% more memory channels, but these channels also run at DDR5-5600 speeds versus DDR5-5200 in the original AmpereOne. This translates to significantly higher memory bandwidth, which is crucial for memory-intensive workloads.
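To put rough numbers on that, here is a back-of-the-envelope calculation of theoretical peak bandwidth. It assumes a standard 64-bit (8-byte) data path per DDR5 channel, with ECC bits excluded, and it ignores real-world efficiency, so measured bandwidth will be lower. The function and constant names are illustrative.

```python
# Assumption: each DDR5 channel moves 8 bytes of data per transfer.
BYTES_PER_TRANSFER = 8


def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes)."""
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000


a192_32x = peak_bandwidth_gbs(channels=8, mega_transfers_per_s=5200)
a192_32m = peak_bandwidth_gbs(channels=12, mega_transfers_per_s=5600)

if __name__ == "__main__":
    print(f"A192-32X: {a192_32x:.1f} GB/s")   # 332.8 GB/s
    print(f"A192-32M: {a192_32m:.1f} GB/s")   # 537.6 GB/s
    print(f"Uplift:   {a192_32m / a192_32x - 1:.0%}")  # 62%
```

Under those assumptions, the M variant has about 62% more theoretical peak bandwidth. With 192 cores on both parts, that works out to roughly 2.8 GB/s per core versus about 1.7 GB/s per core.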

However, there's a trade-off: PCIe lanes drop from 128 on the A192-32X to 96 on the AmpereOne M, a 25% reduction. That is the price paid for the additional memory channels.

Final Thoughts

We'll have more on the AmpereOne M soon, but it was fascinating to get hands-on time with this 12-channel DDR5 Arm processor. These are really interesting chips that we don't get to see all the time, so we wanted to take a moment to let folks see this one in its native environment.

GIGABYTE R2A3 T40 AAV1 AmpereOne M Processor 1

The AmpereOne M represents an interesting design choice in the Arm server CPU space, prioritizing memory bandwidth over PCIe connectivity. For workloads that are memory-bound rather than I/O-bound, this could be an excellent solution. We're looking forward to diving deeper into performance benchmarks and real-world testing in our upcoming review.
