Google’s AlphaTensor uses an AlphaZero‑style reinforcement‑learning agent to discover provably correct matrix‑multiplication algorithms that beat the best human‑designed methods for several small matrix sizes and even improve practical performance on GPUs and TPUs.
AlphaTensor Shows How Deep Reinforcement Learning Can Find Faster Matrix‑Multiplication Algorithms

The problem: multiplying matrices faster
Matrix multiplication is a primitive that appears in virtually every modern computation – from training deep neural networks to solving large linear systems in scientific simulations. The naïve algorithm needs n³ scalar multiplications for two n × n matrices, but clever hand‑crafted tricks such as Strassen’s 2 × 2 algorithm (7 multiplications instead of 8) have shown that fewer operations are possible. Finding the optimal low‑rank decomposition of the matrix‑multiplication tensor is an NP‑hard search problem; the space of possible algorithms grows astronomically even for modest sizes like 4 × 4 matrices.
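For concreteness, Strassen’s 2 × 2 scheme fits in a few lines of Python — a textbook sketch of the hand-crafted trick, not AlphaTensor output:

```python
# Strassen's 2x2 matrix multiplication: 7 multiplications instead of 8.
# Inputs are 2x2 matrices given as nested lists [[a, b], [c, d]].
def strassen_2x2(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4,           m1 - m2 + m3 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

The seven products trade one multiplication for extra additions — exactly the kind of trade-off AlphaTensor searches for automatically.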
AlphaTensor’s approach
Google DeepMind reframed the search as a single‑player game called TensorGame. The game starts with the full matrix‑multiplication tensor and, at each step, the player selects a rank‑one factor (u, v, w) to subtract. The goal is to reach the zero tensor in the smallest number of moves, because each move corresponds to one scalar multiplication in the final algorithm.
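TensorGame is easy to simulate. A minimal NumPy sketch (not the paper’s code) builds the 2 × 2 multiplication tensor and “plays” Strassen’s seven factors as moves, ending at the zero tensor:

```python
import numpy as np

# Build the <2,2,2> matrix-multiplication tensor T, where
# T[a, b, c] = 1 iff multiplying A-entry a by B-entry b contributes
# to C-entry c (entries flattened row-major, C = A @ B).
n = 2
T = np.zeros((n * n, n * n, n * n), dtype=int)
for i in range(n):
    for j in range(n):
        for k in range(n):
            T[i * n + j, j * n + k, i * n + k] = 1

# Strassen's factorization written as seven (u, v, w) TensorGame moves.
moves = [
    ((1, 0, 0, 1), (1, 0, 0, 1), (1, 0, 0, 1)),
    ((0, 0, 1, 1), (1, 0, 0, 0), (0, 0, 1, -1)),
    ((1, 0, 0, 0), (0, 1, 0, -1), (0, 1, 0, 1)),
    ((0, 0, 0, 1), (-1, 0, 1, 0), (1, 0, 1, 0)),
    ((1, 1, 0, 0), (0, 0, 0, 1), (-1, 1, 0, 0)),
    ((-1, 0, 1, 0), (1, 1, 0, 0), (0, 0, 0, 1)),
    ((0, 1, 0, -1), (0, 0, 1, 1), (1, 0, 0, 0)),
]

# Each move subtracts the rank-one tensor u (x) v (x) w from the state.
state = T.copy()
for u, v, w in moves:
    state -= np.einsum('i,j,k->ijk', np.array(u), np.array(v), np.array(w))

print(np.count_nonzero(state))  # 0: the game is won in seven moves
```

Reaching zero in seven moves certifies a seven-multiplication algorithm, which is why the move count is the quantity the agent minimizes.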
To explore this massive action space (over 10¹² possible moves for a 5 × 5 case) the team built AlphaTensor, a reinforcement‑learning agent based on the AlphaZero architecture:
- a transformer‑style neural network ingests the current 3‑D tensor and predicts a policy over actions and a distribution over future returns;
- Monte‑Carlo Tree Search (MCTS) samples actions from the policy, evaluates leaf states with the network, and backs up the results;
- synthetic demonstrations are generated by randomly sampling factorizations and feeding them to the network, giving the agent a useful prior before any self‑play.
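Synthetic demonstrations can be generated by running the game in reverse: sample random factors, sum their outer products, and the resulting tensor comes with a known solution attached. A hedged sketch — the function name and settings here are illustrative, not the paper’s exact configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def synthetic_demonstration(size=4, rank=10, coeffs=(-2, -1, 0, 1, 2)):
    """Sample a random rank-<= `rank` tensor together with the factor
    list that produces it: a ready-made (state, solution) training pair.
    `size`, `rank` and `coeffs` are illustrative choices."""
    factors = [tuple(rng.choice(coeffs, size=size) for _ in range(3))
               for _ in range(rank)]
    tensor = sum(np.einsum('i,j,k->ijk', u, v, w) for u, v, w in factors)
    return tensor, factors

T, factors = synthetic_demonstration()
# Replaying the sampled factors as moves returns the state to zero,
# so the factor list is a guaranteed-correct demonstration.
residual = T - sum(np.einsum('i,j,k->ijk', u, v, w) for u, v, w in factors)
assert not residual.any()
```

Because the demonstration is correct by construction, the network gets exact supervised targets without any expensive search.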
A key trick is to randomise the basis of the tensor at the start of each game. Because tensor rank is invariant under basis change, a low‑rank decomposition found in any basis proves an algorithm for the canonical case, while the random bases inject diversity and expose factorizations that would be invisible in the standard representation.
What AlphaTensor discovered
Training a single agent on all matrix‑multiplication tensors up to size 5 × 5 (in both standard arithmetic and arithmetic modulo 2) yielded a library of thousands of algorithms. Highlights include:
| Case | Naïve rank | Previous best rank | AlphaTensor rank |
|---|---|---|---|
| 4 × 4 (mod 2) | 64 | 49 (two‑level Strassen) | 47 |
| 4 × 5 by 5 × 5 | 100 | 80 | 76 |
| 3 × 3 | 27 | 23 (Laderman) | 23 (re‑discovered) |
For the 4 × 4 case (in arithmetic modulo 2) the algorithm uses 47 scalar multiplications, improving on two levels of Strassen’s method (which needs 7² = 49). When the 4 × 4 routine is applied recursively, the asymptotic complexity drops to O(n^(log₄ 47)) ≈ O(n^2.778), a modest but concrete improvement over Strassen’s O(n^(log₂ 7)) ≈ O(n^2.807) bound.
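The exponents are easy to check, assuming the standard recursion argument that an n × n scheme with r multiplications yields an O(n^(log_n r)) algorithm:

```python
import math

# Exponent from recursing an <n,n,n> scheme of rank r: log_n(r).
strassen = math.log(7, 2)      # 2x2 in 7 multiplications
alphatensor = math.log(47, 4)  # 4x4 in 47 multiplications (mod 2)
print(round(strassen, 3), round(alphatensor, 3))  # 2.807 2.777
```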
AlphaTensor also found non‑equivalent factorizations – more than 14 000 for the 4 × 4 tensor alone – showing that the space of optimal algorithms is far richer than previously assumed.
Tailoring algorithms to hardware
The same framework can optimise for runtime instead of rank. By adding a terminal reward equal to the negative of the measured execution time on a target device, AlphaTensor learned algorithms that run faster on specific accelerators.
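Such a reward is conceptually just negative wall‑clock time. A toy CPU‑only harness sketches the idea (illustrative only; the paper benchmarks compiled kernels on the target accelerator):

```python
import time
import numpy as np

def runtime_reward(matmul_fn, size=1024, repeats=5):
    """Terminal reward for the hardware-tailored variant: the negative
    of the median wall-clock time of a candidate matmul routine.
    Function name and parameters are illustrative, not the paper's."""
    A = np.random.rand(size, size).astype(np.float32)
    B = np.random.rand(size, size).astype(np.float32)
    times = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        matmul_fn(A, B)
        times.append(time.perf_counter() - t0)
    return -float(np.median(times))

reward = runtime_reward(np.matmul)  # less negative = faster candidate
```

Because the reward is measured rather than derived, the agent is free to trade extra additions for better cache or compiler behaviour on the specific device.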
On an Nvidia V100 GPU the agent discovered a 4 × 4 block algorithm that beats the hand‑tuned Strassen‑square implementation by 3.2 % on an 8 192 × 8 192 multiplication.
On a Google TPU v2 the same approach yielded a 2.8 % speed‑up over the standard matrix‑multiplication baseline.
The discovered GPU algorithm performs more additions than Strassen‑square, but the extra additions are arranged in a way that the XLA compiler can fuse them efficiently, turning the theoretical disadvantage into a practical win.
Broader implications
AlphaTensor proves that deep reinforcement learning can handle exact, combinatorial mathematics at a scale where traditional SAT or continuous‑optimization methods stall. The methodology is not limited to plain matrix multiplication:
- It can be applied to structured bilinear operations: for multiplying a skew‑symmetric matrix by a vector, AlphaTensor’s small‑case discoveries generalize to an algorithm using roughly n²/2 multiplications, asymptotically half the naïve count.
- By swapping the reward function, one could target numerical stability, energy consumption, or even memory footprint – metrics that matter for edge devices.
- The same game formulation extends to other tensor‑rank problems (border rank, non‑negative factorization), opening a path toward automated discovery in algebraic complexity theory.
Where to find the code and data
The full set of discovered algorithms, along with JAX implementations for the GPU/TPU experiments, is released under an open licence:
- Repository: https://github.com/deepmind/alphatensor
- Interactive notebook for checking non‑equivalence: https://github.com/deepmind/alphatensor/notebooks/check_equivalence.ipynb
- Detailed benchmark scripts: https://github.com/deepmind/alphatensor/benchmarks
Outlook
AlphaTensor’s success rests on three ideas that are likely to influence future research:
- Game‑based formulation – turning a mathematical discovery problem into a reinforcement‑learning game makes it compatible with self‑play and MCTS.
- Synthetic demonstrations – generating exact factorizations at random gives the network a strong prior, dramatically reducing the number of expensive self‑play games.
- Transfer across targets – training a single agent on many tensor sizes enables knowledge sharing, so improvements on a 3 × 3 case help the 5 × 5 case.
The main limitation is the need to pre‑define a discrete coefficient set F (e.g. {−2, −1, 0, 1, 2}). Extending the search to learn the coefficient set itself could uncover yet more efficient factorizations.
In any case, AlphaTensor shows that reinforcement learning is ready to move from games to genuine mathematical research, offering a new tool for the community that studies the fundamental limits of computation.
This article was written for HackerNoon’s “New Story” series and reflects a skeptical but curious view of the startup‑like research ecosystem surrounding AI‑driven algorithm discovery.
