Business

A 6,000-Citation Paper on Sustainability and Stock Returns Is Fatally Flawed. The System Refuses to Fix It.

Trends Reporter
6 min read

A landmark study in Management Science, cited over 6,000 times and referenced by Wall Street and government officials, contains fundamental errors that make its conclusions uninterpretable. The story of the replication attempt reveals a systemic failure in academic publishing and research integrity.

A paper in Management Science titled "The Impact of Corporate Sustainability on Organizational Processes and Performance" has become a cornerstone of the business and sustainability conversation. Cited more than 6,000 times, its findings—that high-sustainability firms outperform their peers—have been invoked by Wall Street executives, top government officials, and even a former U.S. Vice President. The paper’s authors, Robert Eccles, Ioannis Ioannou, and George Serafeim, are affiliated with Harvard Business School and London Business School, lending it further authority.

Yet, according to a detailed account by researcher Andy King, the study is built on a methodological foundation so flawed that its results are meaningless. The story of King’s attempt to replicate the paper, and the institutional resistance he encountered, exposes a troubling pattern in how the scholarly community curates—and fails to correct—its own record.

The Replication Attempt and Its Discoveries

King began his replication effort in September 2023, contacting the authors with a list of serious problems he encountered:

  • The reported matching method did not function as described.
  • A key result was mislabeled as statistically significant when it was not.
  • Critical statistical tests appeared to be missing.
  • The sample was highly unusual.

He received no response from the authors. This is a common hurdle. As documented by Bloomfield et al. (2018), requests from replicators are often ignored, and authors can block replication simply by refusing to provide details omitted from the published article.

When King turned to colleagues for help, he found a culture of avoidance. Scholars expressed fear of conflict, cited being overwhelmed, or admitted to being "too much of a coward." One internationally respected scholar told King, "Once a paper is published… it is more harmful to one’s career to point out the fraud than to be the one committing it." This sentiment reveals a system where social cohesion often trumps scientific correction.

Institutional Barriers to Correction

After failing to get a response from the authors, King submitted a comment to Management Science. The journal rejected it. Reviewers did not address the substance of his critique but objected to his "tone," arguing that replicators should "tread very lightly" and that published authors should be granted "discretion."

King appealed the decision twice and was rejected both times. During this process, the authors did admit to the editor that they had misreported a key finding, labeling it as statistically significant when it was not. They claimed this was a "typo": they had intended to type "not significant" but omitted the word "not." The journal published a brief erratum only after King posted about the issue on LinkedIn, a move that prompted the editor to discover the correction had been "misplaced and forgotten."

The Deeper Flaw: An Impossible Method

While revising his replication for publication, King became convinced of a more fundamental problem. The empirical strategy in the original paper rests on a demanding requirement: the "treated" (high-sustainability) and "control" (low-sustainability) firms must be so closely matched that which firm is treated is essentially random. The authors claimed to have used strict matching criteria and reported a 98% match success rate.

King’s replication attempt achieved a match rate of less than 15%. A Monte Carlo simulation he conducted found that the chance of achieving the authors' reported success rate was "many, many, many times less than winning the lottery." The conclusion was stark: either the matching process was so precise that it wouldn’t yield enough pairs for analysis, or it was loose enough that the analysis could not be interpreted. The reported method could not have been conducted as described, rendering the results uninterpretable.
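To make the odds concrete, here is a minimal sketch of the kind of back-of-the-envelope Monte Carlo check King describes. The firm count and per-firm match probability below are illustrative assumptions (the per-firm probability is set near King's replication rate), not figures taken from the paper or from King's actual code.

```python
import numpy as np
from scipy.stats import binom

# Illustrative sketch only: n_firms and p_match are hypothetical assumptions,
# not values reported in the original paper or in King's replication.
rng = np.random.default_rng(0)

n_firms = 180          # assumed number of treated firms to be matched
p_match = 0.15         # assumed per-firm chance of finding a valid control
reported_rate = 0.98   # match success rate reported in the original paper
threshold = int(np.ceil(reported_rate * n_firms))

# Monte Carlo: draw many samples and count how often the reported
# match rate (or better) occurs by chance.
n_sims = 1_000_000
matches = rng.binomial(n_firms, p_match, size=n_sims)
mc_estimate = np.mean(matches >= threshold)

# Exact binomial tail probability P[X >= threshold] for comparison.
exact_tail = binom.sf(threshold - 1, n_firms, p_match)

print(f"Monte Carlo estimate: {mc_estimate:.1e}")   # typically 0.0 at this scale
print(f"Exact tail probability: {exact_tail:.1e}")  # astronomically small
```

Under any assumptions in this neighborhood, the probability of the reported match rate is vanishingly small, which is the shape of the lottery comparison King draws.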

Research Integrity Offices: The "Ariely Defense"

King submitted a research integrity complaint to Harvard Business School and London Business School in 2025. The authors eventually responded, acknowledging they had misreported their method. They explained it was an "editing error": sentences describing the matching process from an "exploratory" study were inadvertently left in the final paper.

This explanation conflicted with the record. The incorrect claim appeared in the earliest available draft of the article and was retained and edited in later versions. The "exploratory study" itself did not appear in any draft.

The institutional responses were telling. Harvard Business School stated it would not communicate its findings. London Business School concluded the false claim was not an "intentional falsehood" because the professor "did not have access to the raw data and did not conduct the analyses in question." This is what King calls the "Ariely defense"—a claim of innocence based on a lack of direct involvement, which sidesteps the author's responsibility for the published work.

LBS deemed the problem "minor," arguing it did not impact the "main text, analyses, or findings." King counters that the issue is not minor: it is the difference between a usable and useless study. LBS did acknowledge "poor practice" and planned to address it through "education and training."

The Current State and a Call for Reform

Today, the Management Science article remains only partially corrected. Readers who find the erratum about the statistical significance typo will not learn of the misreported method. The paper continues to be cited, influencing policy and investment decisions based on uninterpretable results.

King and the author of the blog post, Andrew Gelman, argue that the scholarly curation system is broken. They propose reforms based on principles of effective self-regulation:

  1. Transparency: Journals and universities should publicly disclose comments, complaints, corrections, and retraction requests.

  2. Independent Audit: An independent third party should audit the integrity process.

  3. Graduated Sanctions: Penalties should reflect the severity of the violation, not be all-or-nothing.

  4. Systemic Support for Correction: They advocate for tools like "FurtherReview," a platform for post-publication review and discussion.

The story is not merely about one flawed paper. It illustrates a culture where correcting errors is seen as career-threatening, where journals prioritize author discretion over methodological rigor, and where research integrity offices may minimize serious issues. The result is a scientific record that is not self-correcting, but self-preserving—even when what it preserves is wrong.

{{IMAGE:1}}

The image, a screenshot from a statistical modeling blog, underscores the technical depth required to uncover such flaws. It represents the kind of rigorous, often tedious, analysis necessary to challenge established claims—a process the current system discourages.

What Can Be Done?

The authors suggest practical steps for the community:

  • Stop citing single studies as definitive; check for replications.
  • Publish corrections when errors are found in your own work.
  • Support replication efforts and journals like the Journal of Management Scientific Reports (JOMSR), which published King's replication.
  • Advocate for stronger research integrity policies at your institution.

The core issue is a fundamental misunderstanding of the scientific endeavor. As Gelman notes, "Science that can’t be fixed isn’t past science; it’s dead science." The reluctance to correct errors, supported by institutional inertia, allows flawed research to stand unchallenged, eroding trust in the entire scholarly enterprise. The case of Eccles, Ioannou, and Serafeim (2014) is a stark example of how that failure manifests, with real-world consequences for the many who have relied on its flawed conclusions.
