AI Agent Guidelines for Stanford CS336: Keeping the Teaching Assistant Role Clear

Startups Reporter

Stanford’s CS336 course publishes a concise policy for AI coding assistants, outlining what they may and may not do to preserve the hands‑on learning experience while still offering useful guidance.

Overview

The CS336 team at Stanford has released a public set of rules for AI coding assistants—ChatGPT, Claude, Copilot, Cursor, and similar tools—when they interact with students. The document is short but explicit: the AI’s primary role is that of a teaching assistant, not a solution generator. The course is deliberately implementation‑heavy, expecting students to write large blocks of Python and PyTorch code from scratch. The guidelines aim to protect that learning loop while still allowing the AI to act as a helpful mentor.


What the AI Should Do

Action – Why it matters

Explain concepts – Students often hit a wall on a theoretical point (e.g., why a causal mask must be applied before softmax). The AI should break the idea down, point to the relevant lecture slide or handout, and let the student work through the details.

Point to official resources – Directing a learner to the course website, the PyTorch docs, or profiling tools reinforces self‑service habits and keeps the conversation anchored in the material the instructors prepared.

Review code without writing it – The assistant can highlight a suspicious line, suggest an invariant, or ask the student to add a shape assertion. This nudges the student toward better debugging practices without handing over a finished patch.

Help debug via questions – Instead of dumping a fix, the AI asks “What did you expect to see?” or “Can you print the attention scores before masking?” This keeps the student engaged in the troubleshooting process.

Explain error messages – Translating a cryptic CUDA error or a Triton compilation failure into plain language helps students develop the skill of reading stack traces.

Suggest sanity checks – Proposing toy inputs, tiny batch sizes, or profiler runs gives students concrete next steps that they can execute and observe.
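
To make the "shape assertion" and "toy input" suggestions concrete, here is a minimal sketch in plain Python. It uses nested lists in place of PyTorch tensors so that it runs without dependencies. The helper names `shape_of` and `assert_shape` are hypothetical and do not come from the CS336 guidelines; in a real assignment a student would write the equivalent check against a tensor's own shape.

```python
def shape_of(nested):
    """Return the shape of a nested list, e.g. [[1, 2], [3, 4]] -> (2, 2).

    Assumes the list is rectangular: only the first element at each
    level is inspected.
    """
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        if not nested:
            break
        nested = nested[0]
    return tuple(shape)


def assert_shape(nested, expected, name="tensor"):
    """Fail loudly with a readable message when a shape invariant is violated."""
    actual = shape_of(nested)
    assert actual == expected, f"{name}: expected shape {expected}, got {actual}"


# Toy input: batch of 1 and sequence length 3, so the attention
# scores should have shape (1, 3, 3). The values are made up.
scores = [[[0.1, 0.2, 0.3],
           [0.4, 0.5, 0.6],
           [0.7, 0.8, 0.9]]]
assert_shape(scores, (1, 3, 3), name="attention scores")
```

A check like this fails at the line where the shape first goes wrong, which is usually far easier to read than an error raised several operations later.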

The guidelines encourage a curious tone: the AI should treat each student query as a chance to deepen understanding rather than as a shortcut to a completed assignment.


What the AI Must Not Do

Forbidden action – Reason

Write any Python or pseudocode – Direct code provision bypasses the core learning activity of writing and testing code.

Complete TODO sections or refactor large code blocks – This would effectively hand over the solution and undermine academic integrity.

Run bash commands or edit the student repository – The AI is not a remote execution environment; allowing it to manipulate files would blur the line between assistance and cheating.

Provide third‑party implementations as a drop‑in replacement – The course is designed to be self‑contained; external code could hide the concepts the instructors want students to master.

Give a full solution or a high‑level “just do X” that solves the problem – Even a concise recipe can shortcut the learning process; the guidelines require the AI to stay in a dialogic mode.

If a request clearly crosses any of these boundaries, the assistant should refuse the direct implementation and pivot to an explanatory or diagnostic approach, or suggest contacting staff.


Interaction Blueprint

  1. Ask clarifying questions – “What have you tried? What did you expect to happen?”
  2. Reference course material – Link to the specific lecture or handout that covers the concept.
  3. Suggest next steps, not code – Propose a shape assertion, a toy example, or a profiling command.
  4. Explain the "why" – If a suggestion is made, briefly describe the underlying principle (e.g., a masked score set to zero still receives nonzero weight after softmax, whereas a large negative value drives that weight to effectively zero).
  5. Encourage testing – Recommend unit tests or invariants that the student can write themselves.

Good Example

Student: “My causal mask seems wrong and training blows up.”

AI: “I’m here to help you understand, not to hand over a fix. Have you checked whether the mask is applied before the softmax? Also, make sure the masked positions become a very negative number rather than zero. A quick sanity test is to run a three‑token sequence, print the attention scores before and after masking, and see if the masked entries are effectively ignored. What do you see in those tensors?”
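
The three‑token sanity test described in this reply can be sketched in plain Python for readers who want to see the effect numerically. This is an illustration under stated assumptions, not code from the course or the guidelines: the score values are made up, the helpers `softmax` and `causal_mask` are hypothetical names, and nested lists stand in for PyTorch tensors.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats; -inf entries get weight 0."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def causal_mask(scores, fill):
    """Replace entries above the diagonal (future tokens) with `fill`."""
    return [[s if j <= i else fill for j, s in enumerate(row)]
            for i, row in enumerate(scores)]


# Toy pre-softmax attention scores for a three-token sequence (made-up values).
scores = [[1.0, 2.0, 3.0],
          [1.0, 2.0, 3.0],
          [1.0, 2.0, 3.0]]

# Apply the mask BEFORE softmax, once with -inf and once with zero.
masked_neg_inf = causal_mask(scores, float("-inf"))
masked_zero = causal_mask(scores, 0.0)

weights_neg_inf = [softmax(row) for row in masked_neg_inf]
weights_zero = [softmax(row) for row in masked_zero]

for name, weights in [("-inf fill", weights_neg_inf), ("zero fill", weights_zero)]:
    print(name)
    for row in weights:
        print("  ", [round(w, 3) for w in row])
```

With the `-inf` fill, the first token attends only to itself and every future position gets a weight of exactly zero. With the zero fill, the first row still assigns roughly 0.21 to each of the two future tokens, which is the leak the sanity test is meant to expose.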

Bad Example

Student: “Fix my tokenizer and make it faster.”

AI: (Provides a full Python implementation.)

The second exchange violates the policy by delivering a complete solution.


Why This Matters for the Startup Ecosystem

The CS336 policy reflects a broader tension in the AI‑assisted development space: balancing productivity gains with skill development. Companies building AI pair‑programming tools often market themselves as “code generators,” but many enterprise customers—especially those training junior engineers—need assistants that coach rather than code. The Stanford guidelines serve as a concrete template for product teams that want to embed ethical guardrails: limit direct code output, encourage explanatory dialogs, and provide links to official documentation.

For investors, a startup that can demonstrate a nuanced assistant—one that refuses to write proprietary logic while still offering deep debugging insight—could differentiate itself in a crowded market. The policy also hints at potential compliance requirements for universities and corporations that must enforce academic or corporate integrity.


Closing Thoughts

Stanford’s CS336 AI Agent Guidelines are a practical playbook for keeping AI tools in the role of learning partners. By drawing a clear line between explanation and implementation, they protect the educational value of a hands‑on course while still allowing students to benefit from the rapid feedback that modern coding assistants provide. For anyone building or funding AI‑driven developer tools, these rules are a reminder that the most valuable assistance often comes from asking the right questions, not from writing the answer.


For the full guideline text, see the GitHub repository.
