Code Review
This course does not have a written final exam. Instead, your team will undergo a live code review during either the last week of classes or the final exam slot. This is the single highest-signal assessment in the course: it tells me whether you understand the system you built.
Format
Your team will be assigned a 20-minute window during the last week of classes or the final exam slot. All team members must be present. You will screen-share your repository and walk through your pipeline with me.
The session has two parts.
Part 1: Team Walkthrough
5 minutes
Your team will present a guided walkthrough of your complete pipeline, from input to output. You choose the structure of this walkthrough. You may divide it by pipeline stage, by module, or by team member responsibility — whatever best tells the story of your system.
During the walkthrough, I will not interrupt. I am listening for three things:
- Can your team explain how data flows through the system from raw input to final output without hand-waving over any stage?
- When you describe a design choice, do you explain why you made it, not just what it does?
- Do you acknowledge where your pipeline is fragile, where your assumptions are weak, or where your results should be interpreted with caution?
Part 2: Individual Questions
15 minutes
After the walkthrough, I will ask questions directed at individual team members. You may not defer to a teammate. If you do not know the answer, say so; that is always preferable to guessing.
These questions will be drawn from your own codebase, your sprint reports, and your commit history. They will focus on the following areas.
Implementation decisions
If you wrote a function, I may ask you to explain a specific block of code line by line. I may ask why you chose one library or method over an alternative. I may ask what happens if a particular input is malformed or missing.
Debugging and failure
If your sprint report documents a struggle, I may ask you to walk me through it again in real time. I may point to a specific commit and ask what you were thinking when you wrote it. I may ask what you would do differently now.
Scientific reasoning
I may ask you to interpret a result from your pipeline. I may ask what a particular score or metric means biologically. I may ask how you would explain your findings to the project sponsor.
Connections across the pipeline
I may ask how your work depends on a teammate’s module, or how a change in one stage would propagate downstream. This tests whether you understand the system as a whole, not just your assigned piece.
Grading
If the code review reveals that a team member cannot explain the work attributed to them in the commit history and sprint reports, their individual code-review grade will be lowered. If the code review reveals that the entire team has only a shallow understanding of a pipeline that appears sophisticated in the repository, the team's grade will be lowered accordingly.
Conversely, if your pipeline has rough edges but your team demonstrates deep understanding of why things work the way they do, where the limitations are, and what you would improve given more time, that understanding will be reflected positively in your grade.
I am not looking for perfection. I am looking for ownership.
How to Prepare
You do not need to study for this review. If you wrote the code, debugged the errors, and documented your struggles honestly throughout the semester, you already know everything I am going to ask.
The best preparation is to re-read your own sprint reports and walk through your own commit history. If there are sections of the codebase you have not touched or do not understand, now is the time to sit with a teammate and have them explain it to you, not so you can recite it back to me, but so you actually understand the system you are delivering.
The Purpose of This Review
I want to talk to you directly for a moment, not as your instructor evaluating your work, but as someone who has spent years in both software engineering and research and who cares about what happens to you after you leave this classroom.
I know that large language models can write functional code. I know that some of you have used them during this course. I am not naive about this, and I am not going to pretend otherwise.
What I want you to understand is what the evidence actually says about what that costs you, and why this review exists.
The research is not ambiguous
A 2025 randomized controlled trial with nearly 1,000 students found that students given unrestricted access to GPT-4 performed 48% better during practice — and 17% worse on exams when the tool was taken away. They had learned less than students who had no AI at all. Worse, they did not realize it. They rated their own learning just as highly as the control group.
An MIT Media Lab study tracked brain activity over four months and found that LLM users showed progressively weaker neural engagement. When the LLM was removed, their cognitive activity did not bounce back. The researchers describe this as “cognitive debt”: short-term convenience purchased at the cost of deeper encoding.
A Gallup/Walton Foundation survey of 2,500 adults your age found that 79% believe AI makes people lazier and 62% worry it makes people less intelligent — and yet 74% used it in the past month anyway. One in six reported using it at work even when told not to. Your generation is not unaware of the problem. You are caught in it.
What is happening in industry
A Stanford Digital Economy Lab study tracking millions of U.S. workers through ADP payroll data found that employment for software developers aged 22–25 has declined nearly 20% from its late-2022 peak, while employment for developers aged 35–49 has grown.
Entry-level hiring at major tech companies dropped roughly 25% year-over-year in 2024. In an industry survey of 800 hiring managers, 70% said they believe AI can do the work of interns. The junior roles that used to be the on-ramp to a career are disappearing.
Think about what that means for you. The people getting hired are the ones who can do things that LLMs cannot: reason about why a system fails, debug a problem they have never seen before, make architectural decisions under uncertainty, and explain their thinking to a team. Those skills are not built by watching an LLM generate code. They are built by struggling through problems yourself, getting stuck, forming hypotheses about what is wrong, and working through it until something clicks.
That is exactly what this course is designed to make you do. This course exists to give you one last structured opportunity to build real skill before you are on your own.
The discomfort of not knowing what to do next, and then working through it, is not a flaw in the course design. It is the entire point. The confidence you need for your career is not built by getting the right answer. It is built by proving to yourself that you can figure things out when no one hands you the solution.
Martin Schwartz’s one-page essay, “The Importance of Stupidity in Scientific Research,” makes this argument better than I can. If you have not read it yet, now is a good time.
This is not a punishment
It is preparation.
You are graduating into a market that is more competitive than any in recent memory. Graduate school admissions committees and hiring managers are going to put you in a room, alone, and ask you to demonstrate what you know. If your knowledge is shallow (if you can describe what your pipeline does but not why, or recite a workflow but not debug it), that will become visible immediately. The code review simulates that moment.
A codebase can look clean, well-documented, and functional and still represent very little learning. If an LLM wrote your docking wrapper, you will not be able to explain why you chose those parameters. If an LLM designed your data pipeline, you will not be able to tell me what breaks when the input format changes. If an LLM generated your filtering logic, you will not be able to reason about what a PAINS flag actually means chemically. The code review is where understanding becomes visible. There is no shortcut for it.
I am not saying never to use AI
I use it. Most working scientists and engineers use it in some capacity. But there is a critical distinction that one of your peers articulated well: the first time you do something, you should do it yourself. The fifth, tenth, fifteenth time, AI can increase your productivity because you have the mental model to evaluate whether its output is correct. You cannot evaluate what you do not understand, and you cannot understand what you have never struggled through.
The core insight is simple: using an LLM to bypass the work feels efficient in the moment, but it is borrowing against your future competence. In science, that is not just an academic problem. A wrong computational result is not a bad grade. It is bad science that wastes months of wet-lab validation and erodes trust with your collaborators.