Empirical Research Assistance (ERA): From a Nature paper to a new era of computational discovery

Google Research’s Empirical Research Assistance (ERA) – an AI system built on Gemini that writes and optimizes scientific code – has been validated across six disciplines in Nature and is now powering the Computational Discovery prototype in Google Labs. Early adopters report breakthroughs in epidemiological forecasting, water‑runoff prediction, satellite‑based CO₂ mapping, solar‑energy design and retail‑sales modeling.

Published 19 May 2026 – by Lizzie Dorfman, Product Manager, and Michael Brenner, Research Scientist, Google Research


The problem scientists keep hitting

Most modern research relies on custom code that must be written, debugged, and tuned for every new dataset. Even seasoned programmers spend weeks iterating on a model before they can evaluate whether it answers the scientific question at hand. That bottleneck limits the speed at which hypotheses can be tested and makes high‑quality computational work inaccessible to many labs that lack dedicated software engineers.

What ERA does

ERA (Empirical Research Assistance) is an AI‑driven coding assistant that takes a problem description and a success metric (for example, “minimize mean‑absolute error on a validation set”) and then:

  1. Scours the literature for relevant algorithms and data‑processing pipelines.
  2. Generates candidate Python/R/Julia code that implements each approach.
  3. Runs a tree‑search over thousands of variations, measuring each candidate against the supplied metric.
  4. Returns the highest‑scoring implementation together with a short rationale and a reproducible notebook.
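
To make steps 3 and 4 concrete, here is a minimal sketch of a metric-driven search. It is not ERA's implementation: the candidates are toy (slope, intercept) pairs rather than generated programs, and the names `variations` and `best_first_search` are hypothetical. It only shows the general shape of the loop: expand the most promising candidate, score each variation against the supplied metric, and return the best one found.

```python
import heapq
from typing import Callable, Iterable

# Assumption: a "candidate" here is just the parameters of a toy linear model.
# In ERA the candidates are programs; this stands in for them.
Candidate = tuple[float, float]  # (slope, intercept)


def mean_absolute_error(candidate: Candidate, data: list[tuple[float, float]]) -> float:
    """The user-supplied success metric: lower is better."""
    slope, intercept = candidate
    return sum(abs(y - (slope * x + intercept)) for x, y in data) / len(data)


def variations(candidate: Candidate, step: float = 0.5) -> Iterable[Candidate]:
    """Children of a search node: nudge one parameter at a time."""
    slope, intercept = candidate
    for delta in (-step, step):
        yield (slope + delta, intercept)
        yield (slope, intercept + delta)


def best_first_search(
    root: Candidate,
    score: Callable[[Candidate], float],
    expand: Callable[[Candidate], Iterable[Candidate]],
    budget: int = 200,
) -> tuple[Candidate, float]:
    """Expand the lowest-error candidate first until the evaluation budget is spent."""
    best, best_score = root, score(root)
    frontier = [(best_score, root)]  # min-heap ordered by metric value
    seen = {root}
    evaluations = 1
    while frontier and evaluations < budget:
        _, node = heapq.heappop(frontier)
        for child in expand(node):
            if evaluations >= budget:
                break
            if child in seen:
                continue
            seen.add(child)
            child_score = score(child)
            evaluations += 1
            heapq.heappush(frontier, (child_score, child))
            if child_score < best_score:
                best, best_score = child, child_score
    return best, best_score


# Toy validation set generated from y = 2x + 1.
validation = [(float(x), 2.0 * x + 1.0) for x in range(10)]
best, err = best_first_search(
    root=(0.0, 0.0),
    score=lambda c: mean_absolute_error(c, validation),
    expand=variations,
)
print(best, err)
```

The only thing the search knows about the problem is the metric, which is why the choice of metric in the brief matters so much: swap in a different `score` function and the same loop optimizes for something else.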

The system is powered by Gemini, Google’s multimodal foundation model, and a custom optimizer called AlphaEvolve that treats code fragments as genes in an evolutionary process. The result is expert‑level software without the need for a full‑time programmer.
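
The "code fragments as genes" idea can be illustrated with a small evolutionary loop. This is a toy sketch under stated assumptions, not AlphaEvolve: the fragment pool, the genome layout, and the functions `mutate`, `crossover`, and `evolve` are all hypothetical. A genome is a short list of named fragments that are applied in sequence, and fitness is the mean absolute error against sample outputs.

```python
import random

# Assumption: a tiny pool of "code fragments". Each gene names one fragment.
FRAGMENTS = {
    "double": lambda x: 2 * x,
    "increment": lambda x: x + 1,
    "square": lambda x: x * x,
    "negate": lambda x: -x,
    "identity": lambda x: x,
}
GENES = list(FRAGMENTS)


def run(genome, x):
    """Execute the program encoded by a genome: apply each fragment in order."""
    for gene in genome:
        x = FRAGMENTS[gene](x)
    return x


def error(genome, samples):
    """Fitness: mean absolute error against the target outputs (lower is better)."""
    return sum(abs(run(genome, x) - y) for x, y in samples) / len(samples)


def mutate(genome, rng):
    """Replace one randomly chosen gene with a random fragment."""
    child = list(genome)
    child[rng.randrange(len(child))] = rng.choice(GENES)
    return child


def crossover(a, b, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(samples, genome_length=3, population_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.choice(GENES) for _ in range(genome_length)]
        for _ in range(population_size)
    ]
    history = []  # best error at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda g: error(g, samples))
        history.append(error(population[0], samples))
        parents = population[: population_size // 2]  # the fitter half survives
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return min(population, key=lambda g: error(g, samples)), history


# Target behaviour: f(x) = (2x + 1)^2, reachable as double -> increment -> square.
samples = [(x, (2 * x + 1) ** 2) for x in range(-5, 6)]
best_genome, history = evolve(samples)
print(best_genome, error(best_genome, samples))
```

Because the fitter half of each generation is carried over unchanged, the best error never gets worse from one generation to the next. Whether the loop reaches the exact target depends on the random seed and the budget.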

Validation in Nature

The research team released a pre‑print in late 2025 and, after peer review, the full study appeared in Nature under the title “AI system designed to help scientists write expert‑level empirical software.” The paper describes benchmark experiments in six domains:

| Domain | Typical task | ERA’s performance |
| --- | --- | --- |
| Genomics | Variant‑calling pipeline | Within 2 % of best‑in‑class tools |
| Public health | State‑level hospital‑admission forecasts | Top of CDC leaderboard |
| Satellite imagery | Land‑cover classification | Matches specialist‑crafted CNNs |
| Neuroscience | Spike‑train prediction | Exceeds published baselines |
| Time‑series | General forecasting (M4 dataset) | 0.8 × the error of the champion model |
| Mathematics | Symbolic integration | Solves 94 % of test problems |

Across the board, ERA reached or surpassed the accuracy of code written by domain experts, while cutting development time from weeks to hours.
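
Two of the figures in this article are stated as relative errors: "0.8 × the error of the champion model" in the time-series row, and "15 % lower error" in the runoff result further down. The short sketch below shows how the two phrasings relate. The function names are hypothetical and the inputs are placeholder values, not numbers from the paper.

```python
def error_ratio(candidate_error: float, reference_error: float) -> float:
    """Candidate error expressed as a multiple of the reference error."""
    if reference_error <= 0:
        raise ValueError("reference_error must be positive")
    return candidate_error / reference_error


def relative_reduction(candidate_error: float, reference_error: float) -> float:
    """Fraction by which the candidate lowers the reference error."""
    return 1.0 - error_ratio(candidate_error, reference_error)


# Placeholder errors in arbitrary units, chosen only to show the arithmetic.
ratio = error_ratio(candidate_error=8.0, reference_error=10.0)
reduction = relative_reduction(candidate_error=8.0, reference_error=10.0)
print(f"ratio = {ratio:.2f}, reduction = {reduction:.0%}")
```

A ratio of 0.8 is the same statement as a 20 % reduction, and a 15 % reduction is the same statement as a ratio of 0.85. Neither says anything about the absolute size of the error, which depends on the metric and the dataset.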


From paper to product: Computational Discovery

The same technology that powered the benchmarks now underlies Computational Discovery, a prototype platform released through a trusted‑tester program in Google Labs. Users upload a scientific brief, define a metric, and receive a ready‑to‑run notebook that they can iterate on. The platform also exposes the underlying AlphaEvolve engine, allowing power users to steer the search process.

How to try it: Register at labs.google/science and request access to the Computational Discovery experiment.


Early scientific wins

1. Epidemiological forecasting

A collaboration between Google researchers and the CDC used ERA to predict weekly hospital admissions for flu, COVID‑19 and RSV at the state level. The model produced four‑week‑ahead forecasts that consistently ranked first on the CDC’s Weighted Interval Score leaderboard. The approach is fully reproducible and can be adapted to any disease with reliable case data.
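
For readers unfamiliar with the Weighted Interval Score, the sketch below implements the standard definition used in the forecast-evaluation literature: an absolute-error term for the median plus, for each central prediction interval, the interval width and a penalty when the observation falls outside it. The article does not spell out the exact configuration the leaderboard uses, so the interval levels and numbers here are illustrative assumptions.

```python
def interval_score(lower: float, upper: float, observed: float, alpha: float) -> float:
    """Score for one central (1 - alpha) prediction interval; lower is better.

    Width of the interval plus a penalty, scaled by 2 / alpha, for every unit
    by which the observation falls outside it.
    """
    width = upper - lower
    below = (2.0 / alpha) * max(lower - observed, 0.0)
    above = (2.0 / alpha) * max(observed - upper, 0.0)
    return width + below + above


def weighted_interval_score(
    median: float,
    intervals: dict[float, tuple[float, float]],
    observed: float,
) -> float:
    """WIS for one forecast.

    `intervals` maps alpha to (lower, upper); alpha = 0.1 means the 90 % interval.
    The median term has weight 1/2 and each interval has weight alpha / 2.
    """
    total = 0.5 * abs(observed - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, observed, alpha)
    return total / (len(intervals) + 0.5)


# Illustrative forecast of weekly admissions (made-up numbers).
forecast_intervals = {0.5: (90.0, 110.0), 0.1: (70.0, 130.0)}
wis = weighted_interval_score(median=100.0, intervals=forecast_intervals, observed=120.0)
print(round(wis, 3))  # about 11.2
```

The score rewards forecasts that are both sharp (narrow intervals) and calibrated (intervals that contain the observation), so a model cannot rank well simply by issuing very wide intervals.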

2. Snow‑fed runoff in California

Using ERA, a team built a hydrological model that predicts spring runoff in the Sierra Nevada with 15 % lower error than the state’s official Bulletin 120 outlook. The model ingests snow‑pack measurements, temperature forecasts, and terrain data, delivering daily runoff estimates that could inform water‑allocation decisions for agriculture and urban supply.

3. High‑resolution CO₂ mapping from geostationary satellites

ERA combined GOES‑East infrared imagery with ancillary data (land cover, traffic, vegetation indices) to generate global CO₂ concentration maps at 10‑minute intervals. The resulting product reveals urban emission plumes over Los Angeles and diurnal plant‑uptake cycles that were previously invisible at that temporal resolution.

Figure: ERA‑derived CO₂ concentration (right) versus raw satellite swath (left). The AI fills in gaps every 10 minutes, producing a continuous field.

4. 3‑D solar‑energy design

In partnership with the Google Antigravity team, ERA explored thousands of panel geometries using a physics‑based ray‑tracing simulator. The optimizer identified a 500‑triangle volumetric fan that captures scattered sunlight without backward shading, boosting theoretical energy capture by 12 % compared with a flat panel of equal area.
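
As a point of reference for the flat-panel baseline, the sketch below scores a triangle mesh by direct-beam capture only: each triangle contributes its area times the cosine of the sun's incidence angle, and faces turned away from the sun contribute nothing. This is a deliberately simplified stand-in, not the ray-tracing simulator described above. It ignores self-shading and scattered light, and the names `direct_beam_capture` and `flat_panel` are hypothetical.

```python
import math

Vec3 = tuple[float, float, float]
Triangle = tuple[Vec3, Vec3, Vec3]


def subtract(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def normalize(v: Vec3) -> Vec3:
    length = math.sqrt(dot(v, v))
    return (v[0] / length, v[1] / length, v[2] / length)


def direct_beam_capture(triangles: list[Triangle], sun_direction: Vec3) -> float:
    """Projected area facing the sun, per unit of direct irradiance.

    `sun_direction` points from the panel toward the sun. Each triangle's front
    face is set by its winding order (counter-clockwise when seen from the front).
    """
    sun = normalize(sun_direction)
    total = 0.0
    for p0, p1, p2 in triangles:
        # The cross product has length 2 * area and points along the face normal,
        # so half its dot product with the sun vector is area * cos(incidence).
        normal_times_two_area = cross(subtract(p1, p0), subtract(p2, p0))
        total += max(0.0, dot(normal_times_two_area, sun)) / 2.0
    return total


# A 1 x 1 flat panel in the xy-plane, split into two upward-facing triangles.
flat_panel: list[Triangle] = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)),
    ((0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)),
]
overhead = direct_beam_capture(flat_panel, (0.0, 0.0, 1.0))
tilted = direct_beam_capture(
    flat_panel, (math.sin(math.radians(60)), 0.0, math.cos(math.radians(60)))
)
print(overhead, tilted)
```

With the sun directly overhead the flat panel captures its full area, and at 60 degrees from vertical it captures half. A search over 3‑D geometries needs a richer objective than this one, since the gains described above come from light that a flat panel misses.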

5. Retail‑sales forecasting

By feeding macro‑economic indicators, Google Trends, and sentiment data into an ERA‑generated model, researchers produced monthly retail‑sales forecasts that outperformed both the Chicago Fed’s CARTS consensus and a leading commercial forecasting suite. The model’s interpretability layer highlighted the strongest drivers—consumer confidence and online search volume for “discounts.”


Funding and community support

ERA is an internal Google Research project; no external venture capital was raised. The effort is funded through Google’s AI for Science budget, which allocated $250 M in FY 2025 to accelerate AI‑driven research tools. The codebase is open‑source under the Apache 2.0 license, with the repository available on GitHub: github.com/google/era.

The team has also published a series of companion blog posts that walk new users through setup, best practices, and case studies.


What comes next?

Google’s roadmap lists three near‑term milestones:

  1. Public beta of Computational Discovery (Q3 2026) – broader access beyond the trusted‑tester pool.
  2. Integration with Gemini for Science – allowing ERA to call specialized scientific models (e.g., quantum‑chemistry simulators) directly from generated code.
  3. Community‑driven plug‑in ecosystem – researchers will be able to contribute domain‑specific evaluation metrics and data loaders, expanding ERA’s reach into fields like climate‑model intercomparison and high‑energy physics.

The broader implication is modest but clear: by automating the coding step, ERA lets domain experts concentrate on designing and reasoning about experiments rather than on implementation. If the early results hold up, we may see a steady increase in the pace of hypothesis testing across the scientific enterprise.


Takeaway

ERA moves the promise of AI‑assisted science from theory to practice. Its Nature validation, open‑source release, and integration into the Computational Discovery platform give the research community a tangible tool for accelerating computational work. The next few months will reveal whether the early performance gains translate into sustained, reproducible scientific breakthroughs.


For more technical details, see the full Nature article (doi:10.1038/s41586‑026‑01234) and the accompanying GitHub repository.
