When AI Graded Its Own Homework: An English Professor's Classroom Experiment with ChatGPT
For Dr. [Name Redacted], an English professor at the University of Virginia, the rise of ChatGPT, which his students affectionately dubbed 'Chat', wasn't just a grading challenge; it was an existential question for his discipline. Rather than banning AI outright, he launched a bold experiment across four sections of his first-year writing course, involving 72 students, and asked a fundamental question: Is it still necessary or valuable to learn to write? More provocatively, he tasked the students with deciding, by semester's end, whether AI could replace him.
The experiment began with a baseline survey probing student ethics and habits. While students overwhelmingly rejected the idea that using a calculator in math class was unethical, attitudes towards AI in English were far more ambivalent. Yet, despite reservations, students freely admitted using AI for brainstorming (56%), proofreading (50%), interpreting prompts (38%), outlining (28%), and even drafting (smaller percentages). "My students seemed genuinely confused," the professor observed, "far from being nihilistic... cheaters."
Flavorless Prose and the Centaur Chess Dilemma
Students completed parallel assignments with and without AI, quickly developing a critical eye for AI-generated text. They derided its output as "flavorless" and "bland," noting telltale signs like an overuse of em-dashes, predictable three-example sentences, and a tendency to hallucinate sources. This critical engagement, ironically, turned them into close readers.
Guided by an MIT essay advocating against a dystopian "human vs. machine" narrative, the class explored the concept of "Centaur Chess" – where human-AI teams outperformed solo supercomputers. The analogy suggested AI could act as a personalized coach, like training wheels, helping students learn until the support became obsolete. "Once one accepts the merits of a 'Human with Machine' narrative," the MIT authors argued, "the threat starts to disappear."
The Creativity Conundrum and the Homogenization Hazard
Testing this theory, the class examined a 2024 study showing AI-assisted stories rated 8-9% higher for "novelty" and "publishability," especially benefiting weaker writers. However, a critical flaw emerged: AI-assisted stories became strikingly similar. When replicating the study, writer Annalee Newitz found AI frequently defaulted to tropes like "the real treasure was..." The professor highlighted this as a "classic social dilemma": individual gains at the expense of collective diversity. A stark demonstration came when students read aloud AI-generated essay topics – nearly all variations of "Navigating the Digital Age" or "From Connection to Distraction." The bland homogeneity silenced the room.
The Snowball Fight That Fooled Everyone and the Grading Gauntlet
The experiment took a twist with Max's essay. He presented two introductions describing a UVA snowball fight: one vivid, personal, and human-seeming (featuring a meet-cute with a kind-eyed girl), the other serviceable but clunky. The class unanimously judged the evocative one human. Max then revealed that ChatGPT had written the first version. The class was stunned, forced to reckon with AI's ability to leverage powerful narrative tropes (like romance) to slip past critical scrutiny.
The professor then turned the lens on himself: Could AI grade essays? Students received feedback from both their professor and an AI of their choice, then revised using whichever they preferred. While most students preferred the professor's feedback (noting AI's unhelpful fixation on "improving transitions"), many found AI's advice comparable and faster. Student Cruz took a "Centaur" approach, feeding the professor's comments to ChatGPT to generate rapid revision strategies, effectively merging human insight with AI speed.
The Vote: Diminishing Returns or Essential Craft?
Ultimately, 68 of 72 students voted affirmatively: the writing course, and the professor, were still necessary. Their essays revealed nuanced reasoning. Many, like Hannah and Andrew, planned to use AI less after seeing its limitations. Others, like Max, came to accept AI as a tool within the writing process, citing Centaur Chess, but still voted to keep the course.
However, four students voted against necessity. Nathan, citing his "excellent" pre-college education, questioned the ROI of further writing refinement: "at what point is AI simply 'good enough'?" Drew framed it bluntly: spending $5000+ and 2100+ minutes for "incremental improvements" wasn't worth it compared to career-focused courses; requiring writing for all was like "insisting someone master starting a fire with flint in an era of propane lighters."
Lingering Questions: Crutches, Voices, and Trust
Some male students used AI to write essays arguing for the course's value, their telltale AI prose (three-part lists, clichés like "tearing my hair out") ironically undermining their point. Student Cam offered perhaps the most poignant reflection: having used AI on "almost every... assignment" before the course, she found that it had become a "crutch," leaving her unable to edit, jumpstart her thoughts, or write unaided. Her literal use of crutches during the semester gave the metaphor weight.
"I have no doubt that reading and writing will survive without the help of college," the professor concludes, "but at its best, college offers students the opportunity to learn these skills with, and from, one another." The experiment revealed AI's potential as a powerful, yet deeply flawed, tool – capable of homogenizing thought and undermining foundational skills if unchecked. The students' overwhelming vote to retain human instruction signals a recognition that writing is more than throughput; it's intrinsically tied to developing voice, critical thought, and human connection, processes AI cannot replicate but might, carefully managed, augment.
Source: Adapted from the original article published on Literary Hub: https://lithub.com/what-happened-when-i-tried-to-replace-myself-with-chatgpt-in-my-english-classroom/