Objective measures aren't perfect at predicting real-life clinical ability

“Can we please try to be objective about this!” I said these words to myself over and over during this year’s interview season as we formulated our residency rank list. At my institution, the residents and faculty have equal sway in forming the rank list. The chief resident facilitates the resident half of the process. As the hours wore on during our last meeting, the discussion gradually deviated from assessing the applicants’ clinical potential. The focus shifted to peripheral elements of character: “[Applicant] is just really cool, and I want them to be my friend,” and similar arguments. I was concerned.

One’s ability to get along with other residents is important. However, it isn’t the only factor to consider when ranking residency applicants. You want a clinically solid incoming class that knows its stuff and can keep up with the rigors of residency. Trying to move away from haphazard character judgments, I focused on applicants’ board scores and clerkship grades. Consider a situation where applicants “A” and “B” were equally liked by resident interviewers and were similarly involved in extracurriculars. The fair thing to do would be to use their “stats” to sort out who deserved to be ranked higher. Right? And then I started thinking about those criteria.

I realize that this site is read by individuals of many different backgrounds and at varying stages of medical training. Depending on who you are, the ensuing comments might be controversial, or maybe they are old news. The first, potentially obvious observation: Objective measures aren’t perfect at predicting real-life clinical ability.

“Objective” measures aren’t the be-all and end-all of ranking

In my experience, USMLE and COMLEX scores (especially Step 1) correlate only loosely with clinical ability. Step 1 isn’t written to be a primary discriminating factor in the residency selection process. Beyond pass/fail, we should care less about the three-digit score. Seriously, how many practicing physicians need to remember the Krebs cycle? I am not the first to suggest that Step 1 be de-emphasized.

Although their clinical significance is questionable, board scores certainly do correlate with demographic characteristics. Namely, higher board scores are associated with having a non-minority background and speaking English as a first language. Incorrect use of board scores in the ranking process solidifies bias against nontraditional and diverse applicants.

I argue that clerkship grades are not much better. Often they are heavily weighted by standardized testing and thus subject to biases similar to those affecting the USMLE. Clerkship standards also vary from institution to institution. Clerkship performance has some validity within a specific medical school, but it has less external validity when comparing applicants across schools.

Toward eliminating bias

I cannot say whether there is one completely perfect way to conduct applicant ranking. The Match process provides a decent place to start. I especially appreciate the standardization it brings to the process. But the Match doesn’t eliminate all sources of bias, especially when it comes to test scores and clerkship grades. Despite many efforts to increase diversity in medical education, the proportions of under-represented minorities in medical schools haven’t changed much since 1980.

Consider the imperfections inherent in the metrics we use to evaluate medical students. This should beget humility in the resident selection process. Humility that discovers and celebrates applicants’ life experiences. Humility that de-emphasizes the “objective” measures we use to evaluate our applicants. Humility that breeds a culture of curiosity rather than exclusivity.

The Match this year happened months ago, but soon enough, fall will return and with it, another interview season. If you are involved in a residency, I hope these thoughts challenge next year’s interview process. Lower your USMLE thresholds for interviews. See past the test scores. Look for character. Above all, I propose emphasizing a central evaluation criterion: An applicant’s potential to serve their community and the common good with their medical training.

Scott Hippe is a family medicine chief resident who blogs at Insights on Residency Training, a part of NEJM Journal Watch.
