Arthur C. Clarke and Stanley Kubrick predicted supercomputers more intelligent than humans. In 2001: A Space Odyssey, HAL states, with typical human immodesty, “The 9000 series is the most reliable computer ever made … We are all, by any practical definition of the words, foolproof and incapable of error.” Forty years later, IBM’s Watson pummeled humans in Jeopardy – a distinctly human game.
Watson is a big shot oncology fellow at MD Anderson; he is already impressing nurses and the attendings. The supercomputer presented patients in the morning rounds, parsed data within seconds, and made few mistakes. The real oncology fellow, the human I mean, flabbergasted by the efficiency of his binary colleague, relayed to the Washington Post, “Even if you work all night, it would be impossible to be able to put this much information together like that.” Watson doesn’t have to worry about duty hour restrictions.
CEO of IBM, Ginni Rometty, claims that Watson 2.0 will interpret medical imaging like a radiologist. In its third iteration, the supercomputer will “debate and reason.” Why hire radiologists who sap productivity with lunch breaks and sleep? Watson will never complain about the dearth of vegan food in the cafeteria, never get tired, and — best of all — never whine about Medicare reimbursement cuts.
But forgive me for snoring at night without fear of the robo-radiologist. The reasons are simple.
There are tasks a toddler can do that have no easy computational solution, like recognizing moms, dads, and aunts. “Aunt Minnies” are diagnoses that can be identified instantly, the same way you might recognize the face of your aunt in a crowd. These are the easiest diagnoses for a radiologist but are so difficult for a computer that computer scientists have invented heuristics: shortcuts that trade accuracy for speed. Heuristics are “good enough” algorithms, but may not be good enough for the high stakes in medicine.
Facebook’s facial recognition might pick your face from a group picture most of the time, but it also makes laughable mistakes. Before Watson replaces radiologists, it must meet a higher bar than Facebook. “Might” is not good enough.
Suppose that Watson can spot Aunt Minnies. It must still communicate, which it does using natural language processing. But medical lingo is anything but natural. A helpful radiology request might read, “75 yo M w/ MM, AAA s/p TEVAR c/b EL on 2/2013 p/w CP r/t back.” A less helpful one: “Unspecified.” Medical lingo has typographical errors, missing punctuation, and ambiguous acronyms.
MM: Is that Multiple myeloma? Mediastinal mass? Malignant mesothelioma? Metastatic melanoma? Or Mr. Mean?
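To see how quickly the ambiguity compounds, here is a minimal sketch that counts the possible readings of a request. The "MM" expansions are the ones listed above; the "CP" and "EL" expansions, the `CANDIDATES` table, and the `readings` helper are illustrative assumptions of mine, not a vetted clinical dictionary and not how Watson works.

```python
from itertools import product

# Candidate expansions per abbreviation. "MM" comes from the article;
# "CP" and "EL" are assumed examples for illustration only.
CANDIDATES = {
    "MM": [
        "multiple myeloma",
        "mediastinal mass",
        "malignant mesothelioma",
        "metastatic melanoma",
    ],
    "CP": ["chest pain", "cerebral palsy"],
    "EL": ["endoleak"],
}


def readings(tokens):
    """Return every combination of expansions for a list of tokens.

    Tokens without an entry in CANDIDATES are passed through unchanged.
    """
    options = [CANDIDATES.get(token, [token]) for token in tokens]
    return [" / ".join(combo) for combo in product(*options)]


if __name__ == "__main__":
    for reading in readings(["MM", "CP"]):
        print(reading)
```

Two abbreviations already yield eight candidate readings, and a lookup table cannot say which one the referring physician meant. That choice depends on context.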
During the 2011 Jeopardy competition, a clue asked for the “anatomic oddity of U.S. gymnast George Eyser.” Watson answered, “What is ‘legs’?” The correct response: “What is missing a leg?” Watson misinterpreted the question because it knew “anatomic” but not “oddity.”
Misunderstandings in medicine do not arise from bad grammar and split infinitives. They come from the wrong context. Radiologists do not get partial credit for “pulmonary embolism” when the right diagnosis is “no pulmonary embolism.”
Watson does not need vacations, reimbursement or oxygen. It may not need to physically exist: Watson already works from the cloud. But its only way of maximizing utility is through a radiologist, not instead of one. The computer can be a decision support tool, not a doctor.
Howard Chen is a radiology resident. This article originally appeared in the Health Care Blog.