The interview debrief is where good hires die. You can run four careful interviews, capture clean independent signal, and then destroy all of it in a 30-minute meeting where the first confident sentence sets the verdict. An interview debrief is the meeting where a hiring panel converts its separate assessments into one decision, and it is the single most neglected, highest-leverage step in hiring. The interview measures the candidate. The debrief decides their fate, and it is almost always run on memory and seniority instead of evidence.

The problem is not that interviewers fail to gather signal. It is that the debrief aggregates that signal *socially* rather than statistically. Whoever speaks first or loudest anchors the room. When the hiring manager, usually the most senior person present, states their vote before anyone else, junior interviewers quietly drift toward it. The evidence was in the room. The meeting threw it away.

## Why the debrief, not the interview, is where hiring breaks

Interviews can produce good independent signal. The debrief is where that signal gets pooled, and pooling people in a room introduces three well-documented failure modes: anchoring, conformity, and seniority deference.

Anchoring bias, first formalized by Tversky and Kahneman in *Science* (1974), means the first number or judgment introduced disproportionately shapes every judgment that follows. In a debrief, the first confident "I think this is a hire" becomes the gravitational center the rest of the conversation orbits. The halo effect, documented by Thorndike back in 1920, compounds it: one salient trait (a great answer, an impressive logo on the resume) colors the entire rating.

Now stack seniority on top. The most senior voice usually speaks first and carries the most weight, so junior interviewers face a choice between contradicting their boss with a half-formed argument or nodding along. Most nod. The result is a meeting that looks like consensus but is actually one person's opinion wearing four people's faces. The signal from the other three was real. It just never made it onto the table.

## How do you run an interview debrief?

Run an interview debrief in five steps that protect the independent signal you already collected:

1. **Lock independent scorecards before the meeting.** Each interviewer submits an evidence-tied score within about 24 hours of their interview, with no editing once the debrief opens.
2. **Vote before you discuss.** Open with a silent hire / no-hire tally, not an open conversation.
3. **Go junior to senior.** The most junior interviewers speak first; the hiring manager speaks last, so seniority can't anchor the room.
4. **Discuss only the disagreements.** Spend time where votes split, and tie every claim to specific evidence. Reject "I just had a bad feeling."
5. **Record the decision.** Write down the outcome, the votes behind it, and who made the call, as structured data with an audit trail.

The order is the whole point. Capture happens before influence, voting happens before discussion, and the loudest voice goes last. Everything below explains why each step earns its place.

## The science: structure beats vibes

The evidence on structured versus unstructured hiring is among the most replicated in industrial-organizational psychology, and it points one direction: structure wins.

Schmidt and Hunter's landmark 1998 meta-analysis in *Psychological Bulletin* put the predictive validity of structured interviews at roughly **.51** against roughly **.38** for unstructured ones. Squared, those correlations mean structured interviews account for about 26% of the variance in later job performance versus about 14% for unstructured ones. That gap is the difference between a hiring method that reliably tracks future performance and one that leans heavily on how much the interviewer liked the candidate.

Reliability is even more damning. Conway, Jako and Goodman's 1995 meta-analysis in the *Journal of Applied Psychology*, pooling 111 reliability estimates, found interrater reliability for unstructured interviews averaging about **.37**. Read that plainly: two interviewers assessing the same candidate in an unstructured interview produce ratings that only weakly agree. If your panel's raw assessments don't agree, a debrief that just averages their gut feelings is averaging noise.

Structure (standardized questions, anchored rating scales, and a defined rule for combining ratings) is the single biggest lever for both validity and agreement. The debrief is where that combining rule either exists or doesn't. Most debriefs don't have one. They have a conversation.

> If interrater reliability for unstructured interviews averages only about .37, a debrief that averages the panel's feelings is averaging noise. Structure is the only thing that turns four opinions into one signal.

This is the same logic behind a [structured interview scorecard](/blog/structured-interview-scorecards-predictive-validity): the scorecard is the artifact that makes validity possible, and the debrief is where you either honor it or override it with the loudest opinion in the room.

## The three levers that protect the signal

Every fix for a broken debrief is upstream of the conversation. You cannot debias a meeting from inside the meeting once anchoring has already happened. There are three levers, and all three operate *before* anyone speaks.

### Lever 1: Lock scorecards before the meeting

Independent scores, submitted before the debrief and frozen once it starts, block anchoring, halo, and post-hoc rationalization in one move. If a junior interviewer commits to a 3-out-of-4 on a specific competency, with notes citing what the candidate actually said, that score can't quietly migrate to the manager's number during the meeting. The commitment is already on record.

No single clean statistic quantifies the effect, but the mechanism is well supported: independent judgment formed before exposure to the group is the standard countermeasure to anchoring and conformity. The practical rule is simple. No editing after the debrief opens. A score you can revise once you've heard the room isn't an independent score.
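
If your scorecards live in a script or an internal tool rather than a dedicated product, the "frozen once the debrief opens" rule can be enforced in a few lines. This is a minimal sketch, not any vendor's implementation: the `DebriefScorecards` class, the 1-to-4 scale, and the evidence-note requirement are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


class ScorecardLockedError(Exception):
    """Raised when a score is added or changed after the debrief opens."""


@dataclass
class DebriefScorecards:
    # Hypothetical store: interviewer name -> (score on a 1-4 scale, evidence note)
    scores: dict = field(default_factory=dict)
    debrief_open: bool = False

    def submit(self, interviewer, score, evidence):
        # Once the meeting has started, nothing can be added or revised.
        if self.debrief_open:
            raise ScorecardLockedError(
                f"{interviewer}: scores are frozen once the debrief opens"
            )
        if not 1 <= score <= 4:
            raise ValueError("score must be between 1 and 4")
        if not evidence.strip():
            raise ValueError("every score needs an evidence note")
        self.scores[interviewer] = (score, evidence)

    def open_debrief(self):
        self.debrief_open = True
```

Before `open_debrief()` is called, an interviewer can still resubmit to correct a mistake. Afterward, any attempt raises `ScorecardLockedError`, so a 3 cannot drift to the manager's 4 mid-meeting.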

### Lever 2: Round-robin, junior first

Speaking order is not a courtesy. It is a bias control. When the hiring manager speaks last, their seniority can't set the anchor, and junior interviewers state their real assessment instead of a pre-edited version of it. Go around the table from least senior to most senior, every time.
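
If you want the order decided before anyone walks into the room, it is a one-line sort. The sketch below assumes a hypothetical `Interviewer` record with a numeric `level` (higher means more senior) and a flag for the hiring manager. Neither field comes from a real system.

```python
from typing import NamedTuple


class Interviewer(NamedTuple):
    name: str
    level: int                      # hypothetical seniority level; higher = more senior
    is_hiring_manager: bool = False


def speaking_order(panel):
    # The hiring manager always sorts last, whatever their level;
    # everyone else goes from least senior to most senior.
    return sorted(panel, key=lambda p: (p.is_hiring_manager, p.level))
```

Putting the hiring-manager flag first in the sort key means the manager speaks last even when someone more senior sits on the panel.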

### Lever 3: Vote, then discuss

Start with a tally, not a debate. A quick hire / no-hire vote surfaces where the panel actually disagrees in about ten seconds, and then you spend the meeting only on the splits. Unanimous calls don't need 20 minutes of discussion; they need to be recorded and closed. The disagreements are where the real information lives, and they're where evidence, not volume, has to win.
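
The tally itself is trivial to compute, which is part of the argument for doing it first. A rough sketch, assuming votes arrive as a plain mapping of interviewer to `"hire"` or `"no_hire"` (the function name and return shape are illustrative only):

```python
from collections import Counter


def tally(votes):
    """votes: dict of interviewer name -> 'hire' or 'no_hire'."""
    counts = Counter(votes.values())
    return {
        "hire": counts.get("hire", 0),
        "no_hire": counts.get("no_hire", 0),
        # Anything short of a unanimous vote is a split worth discussing.
        "needs_discussion": len(counts) != 1,
    }
```

A unanimous panel comes back with `needs_discussion` set to `False`, so it can be recorded and closed. A split comes back `True`, and that is where the meeting time goes.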

<div class="blog-inline-cta">
  <p><strong>Tired of debriefs that run on memory?</strong> Kit captures each interviewer's review independently, applies an explicit decision rule instead of the loudest voice, and surfaces only the close calls that need a human.</p>
  <p><a href="/users/sign_up">Start your free trial</a></p>
</div>

## The 80% rule: a self-diagnostic for your debriefs

Here is a number you can run against your own team this quarter. Jill Macri, the former global recruiting lead at Airbnb and now at Growth by Design Talent, uses roughly **80%** as the benchmark: at least four out of every five debriefs should end in a clear hire or no-hire call. If yours don't, the problem isn't the candidate pool. It's the rubric or the loop.

This is a practitioner heuristic, not a peer-reviewed law, so treat it as a diagnostic rather than a target to game. But the logic is sound. When a debrief can't reach a clear decision, it usually means one of two things: the panel was never aligned on what the role actually requires, or the hiring manager doesn't yet know what they need. Both are fixable, and both are invisible until you start counting.

So count. What percentage of your debriefs in the last quarter ended in a clean call versus a "let's bring them back for one more conversation"? If it's well under 80%, you have a calibration problem masquerading as a candidate problem, and no amount of additional interviews will solve it. The cost of getting this wrong is real: a bad hire is widely estimated to cost at least **30% of the employee's first-year salary**, a figure commonly attributed to the U.S. Department of Labor and best treated as a conservative floor. For senior roles, fuller estimates run far higher. The debrief is the last gate before that cost locks in.
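
Counting takes a few lines if you can export debrief outcomes from your ATS or a spreadsheet. The sketch below assumes each debrief is logged as one of three labels (`"hire"`, `"no_hire"`, or anything else, such as `"bring_back"`). Those labels are made up for the example, so map them to whatever your own records use.

```python
CLEAR_OUTCOMES = {"hire", "no_hire"}


def clear_decision_rate(outcomes):
    """Share of debriefs that ended in a clear hire or no-hire call."""
    if not outcomes:
        raise ValueError("no debriefs to measure")
    clear = sum(1 for outcome in outcomes if outcome in CLEAR_OUTCOMES)
    return clear / len(outcomes)


# Ten debriefs from a hypothetical quarter: eight clear calls, two punts.
last_quarter = [
    "hire", "no_hire", "bring_back", "hire", "no_hire",
    "no_hire", "bring_back", "hire", "no_hire", "hire",
]
rate = clear_decision_rate(last_quarter)  # 8 / 10 = 0.8
```

Here the team sits exactly on the 80% line. A quarter with four punts out of ten would come back at 0.6 and point at the rubric or the loop, not the candidates.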

## You don't need more interviewers, you need independent ones

The instinct when a hire feels risky is to add another interview. The data says don't. Google's internal analysis, documented by Laszlo Bock in *Work Rules!* and credited to analyst Todd Carlisle, found that **four interviewers predict a candidate's performance with about 86% reliability**, and each additional interviewer beyond the fourth adds **less than 1%**.

Read that carefully: four *interviewers*, meaning four independent evaluators contributing separate scores, not four sequential rounds of conversation. The reliability comes from combining independent signals, not from more debate or more meetings. A fifth, sixth, or seventh interviewer mostly adds scheduling drag and candidate fatigue while contributing almost nothing to the decision. This is the same reason a [bloated interview loop loses your best candidates](/blog/too-many-interview-rounds-lose-best-candidates): more rounds feel safer and aren't.

The lesson for the debrief is direct. Your job in that meeting is not to generate new consensus through discussion. It is to *aggregate the independent signal you already have* using a rule everyone agreed to in advance. Four good independent scores, combined properly, beat ten people talking until someone gives in.

## Turn the debrief into a logged, calibrated decision

A good debrief is a data-integrity problem, and the integrity has to be enforced *before* the meeting, not requested during it. You can ask people nicely to score independently and speak junior-first, or you can build a process that makes the right behavior the default. This is exactly where [Kit](/users/sign_up) sits.

Kit's **`team_review` stage** handles async team voting and scoring with blind-review visibility rules. Each interviewer submits an evidence-tied review, and a junior interviewer sees what they need to score the candidate *without* seeing the manager's verdict first. That's "lock the scorecard before the meeting," enforced by the data model instead of by good intentions. The independent score is captured before influence can reach it.

When it's time to decide, Kit applies an **explicit decision rule** rather than a naive average. Per-interviewer scores are visible, a threshold is applied, and a strong dissent isn't washed out by three lukewarm yeses or a single confident voice. The vote tally is preserved, not blended into a number that hides the disagreement.
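
To make the difference between a naive average and an explicit rule concrete, here is one possible rule written out. It is a sketch under stated assumptions, not Kit's actual implementation: the `decide` function, the 1-to-4 scale, the 2.5 threshold, and the choice to treat a score of 1 as "strong dissent" are all hypothetical.

```python
from statistics import mean


def decide(scores, threshold=2.5, dissent_floor=1, lead_veto=False):
    """Return (outcome, reason) for a dict of interviewer -> score on a 1-4 scale."""
    if lead_veto:
        return ("pending", "lead veto")
    average = mean(list(scores.values()))
    if average < threshold:
        return ("no_hire", f"average {average:.2f} is below {threshold}")
    # A strong dissent blocks an automatic hire even when the average passes.
    dissenters = sorted(name for name, s in scores.items() if s <= dissent_floor)
    if dissenters:
        return ("pending", "strong dissent from " + ", ".join(dissenters))
    return ("hire", f"average {average:.2f} meets {threshold}")
```

Three lukewarm 3s and a single 1 average exactly 2.5, so a naive average would wave the candidate through. Under this rule the same scores come back as `("pending", "strong dissent from ...")`, and a human has to look at why one interviewer objected so strongly.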

Most importantly, Kit surfaces exactly the decisions that need a human. The **pending-decisions queue** returns the reviews that *didn't* resolve cleanly: split votes, below-threshold scores, or a lead's veto, each row showing the tally, the threshold, whether a lead vetoed, and how long it's been waiting. That's the operational form of the 80% rule. The roughly 20% that don't resolve cleanly are explicitly queued for deliberate resolution instead of being rammed through by whoever spoke first. When the call is made, Kit records it as **structured data with an audit trail**, so the decision, the votes behind it, and who made it are all recoverable later.
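
For teams without such a queue, the same view can be approximated from a log of reviews. The sketch below is illustrative only and does not reflect Kit's data model. The `Review` fields and the `pending_decisions` function are hypothetical, and it assumes each review has already been labeled `"hire"`, `"no_hire"`, or `"pending"` by whatever decision rule you use.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Review:
    candidate: str
    outcome: str            # "hire", "no_hire", or "pending"
    hire_votes: int
    no_hire_votes: int
    threshold: float
    lead_vetoed: bool
    submitted_at: datetime


def pending_decisions(reviews, now):
    """Rows that still need a human, longest-waiting first."""
    rows = [
        {
            "candidate": r.candidate,
            "tally": f"{r.hire_votes}-{r.no_hire_votes}",
            "threshold": r.threshold,
            "lead_vetoed": r.lead_vetoed,
            "waiting_days": (now - r.submitted_at).days,
        }
        for r in reviews
        if r.outcome == "pending"
    ]
    return sorted(rows, key=lambda row: row["waiting_days"], reverse=True)
```

Passing `now` in explicitly keeps the function easy to test. Sorting by waiting time puts the decision that has been stuck longest at the top of the list.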

The recommended protocol, end to end: each interviewer submits an independent, evidence-tied score in a blind `team_review` before the debrief, with no editing once the debrief opens. Open the meeting with the tally already locked. Discuss only the disagreements, junior voices first, every claim tied to specific evidence. Let split, below-threshold, and vetoed cases route to the pending-decisions queue and resolve them deliberately. Track your clear-decision rate, and aim for 80% or higher.

Stop running debriefs on memory and seniority. Capture independent reviews blind, let the system flag the close calls, and record the decision with the evidence attached. The interview is where you collect the signal. The debrief is where you either keep it or throw it away, and now you can keep it. [Start a free trial](/users/sign_up) and run your next debrief as a calibrated decision, not a popularity contest.