PubMed misses the big picture when it comes to nutrition

As medical librarians, we’re certainly the first to say that PubMed is a superb database, elegantly crafted at the National Library of Medicine to do fast and efficient searches for almost all medical and health subjects. Much of the power of PubMed is that it makes it possible to search broad subjects easily. When the user searches “cancer,” for example, PubMed quickly finds thousands of citations on all types of cancer, from melanoma to leukemia, whether the word “cancer” appears in the citation or not. Likewise, a search for “antidepressants” finds articles on all specific types of antidepressants. The seeming simplicity of these kinds of broad subject searches, of course, belies the amount of intellectual effort that’s expended to classify subjects into searchable “bundles.”

As much as PubMed is a superb resource, there are a few subjects that are not easy to search. One of the most notable of these, especially important with the obesity epidemic and the renewed awareness of the importance of preventive medicine, is nutrition. The basic reason that nutrition is so hard to search in PubMed is that, unlike the subjects mentioned above that are organized into easily searchable bundles, nutrition is broken out into many different places in the Medical Subject Headings classification or “tree.” Contrary to expectations, the largest group of related citations is not under “nutrition,” but rather under “food.” And the second largest bundle is under “diet.”

Ironically, the high quality of indexing in PubMed probably contributes to the difficulty of searching nutrition. With the generally superior organization of subjects in PubMed, people may be lulled into thinking that the same tight organization also applies to nutrition. So a person not experienced at searching for nutrition-related topics might search for one nutrition-related term (nutrition, food, diet), retrieve several thousand citations, and assume that they have covered the field, not realizing that they have missed a large group of citations that are under other terms that they didn’t search.

Because nutrition-related terms are fragmented in PubMed, a comprehensive search requires searching several different terms and combining them into a bundle, or “hedge,” to use medical librarian language. We have developed such a hedge, free for anyone to use, which includes all nutrition-related terms, and it generally works well. However, some articles are not retrieved by the hedge because their indexing contains nothing that marks them as part of any general nutrition subject. This is especially a problem for articles on specific foods, particularly plant-based foods. Here are some examples from a series of our blog articles:

Chocolate. A particular problem in searching plant-based foods is that they often are indexed in plants (medical subject headings are indicated by italics), but not in any nutrition-related category. So chocolate is indexed by its botanical name, cacao. Sometimes articles on chocolate are retrieved by searching for nutrition-related terms because they’re indexed with other nutrition terms. But in many other cases they are not found, because they are only under cacao. So, for instance, if you do a general search for food-related causes of migraine (using our broad nutrition hedge), you will not retrieve this article: Chocolate is a migraine-provoking agent.

Cranberries. Like chocolate, cranberry is indexed by its botanical name, Vaccinium macrocarpon, which is in plants, and not in any nutrition category. Like chocolate, articles on cranberries may or may not be found with a broad nutrition or food search. Here’s an example of an article that is not found: Cranberry for preventing urinary tract infection.

Olive oil has a more basic problem in PubMed than chocolate and cranberries. Inexplicably, it doesn’t even have its own subject term. Instead, it’s put under plant oils, which is in chemicals and drugs, and not in any food or nutrition cluster. This is especially confounding because the dietary fats cluster, which is found in broad nutrition searches, contains such things as corn oil, safflower oil and soybean oil. If olive oil had its own subject heading, it would likely be placed in this bundle. Here’s an article not found with a broad nutrition or food search: Olive oil may reduce coronary artery disease risk.

Red meat. It’s not just plant-based foods that have a problem in PubMed. With red meat getting much attention as a possible contributor to cardiovascular disease, it’s surprising that PubMed has no subject term, or corresponding category, for it. Even more surprisingly, there are no subject terms for specific kinds of meat (beef, pork). Instead, everything is lumped together under the one term meat, which includes red meats as well as fish and poultry.

So, what’s our advice if you want to search for the nutritional aspects of a disease or other subject? Your best bet would be to use a “hedge” that includes all food and nutrition terms, like the one we mentioned above, available here. But if you’re doing it on the fly, and don’t have time to grab a hedge, a simplified version of the hedge is this:

food OR diet OR nutrition

Go to Advanced Search in PubMed and combine this (using AND) with other subjects. This isn’t quite as comprehensive as our hedge, but it does a fairly good job.
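
For readers who prefer to script their searches, the combination described above can be sketched in a few lines of Python. This is only an illustration, not part of our hedge: the function names and the example topic "migraine" are hypothetical, and the optional URL assumes NCBI's public E-utilities esearch endpoint with its db and term parameters. Nothing here sends a request; it only builds the strings.

```python
from urllib.parse import urlencode

# The simplified hedge from the article: three broad terms joined with OR.
SIMPLE_NUTRITION_HEDGE = ["food", "diet", "nutrition"]


def build_query(topic, hedge_terms=SIMPLE_NUTRITION_HEDGE):
    """Combine a topic with the hedge using AND (hypothetical helper).

    Parentheses keep the OR group together so that AND applies
    to the whole hedge rather than only its first term.
    """
    hedge = " OR ".join(hedge_terms)
    return f"({topic}) AND ({hedge})"


def esearch_url(query):
    """Build an E-utilities esearch URL for the query (assumed endpoint)."""
    base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    return base + "?" + urlencode({"db": "pubmed", "term": query})


query = build_query("migraine")
print(query)               # paste this string into the PubMed search box
print(esearch_url(query))  # or open this URL to run the same search
```

The query string can be pasted directly into PubMed instead of using the URL. To widen the search, pass a longer list of terms as hedge_terms; as with the simplified hedge itself, this will not catch every article that is indexed only under a specific food heading.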

Eric Rumsey and Janna Lawrence are medical librarians who blog at the University of Iowa Libraries. Mr. Rumsey can be reached on Twitter @EricRumsey.
