MEDLINE

From Citizendium
Revision as of 16:58, 30 June 2011 by imported>Robert Badgett (→‎Methods to improve searching MEDLINE)
This article is developing and not approved.

According to the U.S. National Library of Medicine, "MEDLINE® (Medical Literature Analysis and Retrieval System Online) is the U.S. National Library of Medicine's® (NLM) premier bibliographic database that contains over 16 million references to journal articles in life sciences with a concentration on biomedicine. A distinctive feature of MEDLINE is that the records are indexed with NLM's Medical Subject Headings (MeSH®)."[1]

PubMed is the National Library of Medicine's free online search system for MEDLINE.

Structure

MEDLINE® (Medical Literature Analysis and Retrieval System Online) is a database of predominantly biomedical bibliographic citations maintained by the U.S. National Library of Medicine (NLM).[2] Each citation includes bibliographic data, an abstract if available, links to the full text of the article, and keywords.

The NLM describes its process for selecting journals in a fact sheet.[3]

The keywords are indexed with the NLM's Medical Subject Headings (MeSH®)[4] and subheadings[5]. Human indexers who assign MeSH terms are assisted by the Medical Text Indexer (MTI).[6]

The important MeSH terms “Randomized Controlled Trial” and “Controlled Clinical Trial” were introduced in 1991 and 1995, respectively.[7] The Cochrane Collaboration helps MEDLINE correctly retag articles with these terms.[7]

The National Library of Medicine's Indexing Initiative is investigating whether the assignment of MeSH terms can be fully or semi-automated.[8]

PubMed provides relevance feedback with its "See related" feature.[9][10]

Methods to improve searching MEDLINE

Studies of searching MEDLINE.[11][12][13][14]
Study | Setting | Method | Precision | Comments
Wallace et al.[11] (2010) | Locating relevant studies for systematic reviews | Machine learning: Support Vector Machines; ensemble of four SVM classifiers (title; abstract; MeSH; UMLS) | - | Reduced the number of articles to manually review by 40% to 50%.
Kilicoglu[13] (2009) | Identifying high quality studies in Internal Medicine | - | - | -
Aphinyanaphongs[12] (2005) | Identifying high quality studies in Internal Medicine | Machine learning: Support Vector Machines; four feature spaces (title; abstract; MeSH; publication type) | - | -
Cohen[14] (2006) | Locating relevant studies for updating systematic reviews | - | - | -


There is much ongoing research into improving MEDLINE search results.[15][16]

Citation tracking

Citation tracking may help identify relevant studies in MEDLINE.[17][18]

Clustering

Clustering search results may help users explore the citations retrieved from MEDLINE.[19]

Filters (hedges)

MEDLINE filters, also called hedges, are Boolean combinations of search terms, both text words and MeSH terms, that have been optimized to retrieve a particular type of article. Many filters have been made by the Hedges Team and are available as Clinical Queries at PubMed. The Clinical Queries at PubMed may improve the quality of articles retrieved.[20]

Filters have been criticized because they can omit relevant studies.[21]

Filters for article types

Evolution of search filters
Year | Purpose category | Strategy with high sensitivity | Strategy with high specificity
1994[22] | Treatment | randomized controlled trial[Publication Type] OR drug therapy[MeSH Subheading] OR therapeutic use[MeSH Subheading] OR random*[Title/Abstract] | placebo*[Title/Abstract] OR (double[Title/Abstract] AND blind*[Title/Abstract])
1994[22] | Diagnosis | - | -
2005[23] | Treatment | (clinical[Title/Abstract] AND trial[Title/Abstract]) OR clinical trials[MeSH Terms] OR clinical trial[Publication Type] OR random*[Title/Abstract] OR random allocation[MeSH Terms] OR therapeutic use[MeSH Subheading] | randomized controlled trial[Publication Type] OR (randomized[Title/Abstract] AND controlled[Title/Abstract] AND trial[Title/Abstract])
2005[23] | Diagnosis | sensitiv*[Title/Abstract] OR sensitivity and specificity[MeSH Terms] OR diagnos*[Title/Abstract] OR diagnosis[MeSH:noexp] OR diagnostic * [MeSH:noexp] OR diagnosis,differential[MeSH:noexp] OR diagnosis[Subheading:noexp] | specificity[Title/Abstract]

Many MEDLINE filters have been developed by the Hedges Team[23], supported by a grant from the National Library of Medicine.[24] The filters were initially published in 1994[22] and were revised and republished in 2005.[25] One of these filters identifies randomized controlled trials.

Examples include filters for randomized controlled trials[26] and systematic reviews[27].
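A filter is normally combined with a topic search using AND. The following is a minimal sketch of that composition, assuming the 2005 high-specificity treatment strategy listed in the table above; the function name apply_filter and the example topic are illustrative assumptions, not part of any PubMed interface.

```python
# Filter text copied from the 2005 treatment row (high specificity) above.
TREATMENT_SPECIFIC_2005 = (
    "randomized controlled trial[Publication Type] OR "
    "(randomized[Title/Abstract] AND controlled[Title/Abstract] "
    "AND trial[Title/Abstract])"
)


def apply_filter(topic_query: str, filter_query: str) -> str:
    """AND a topic query with a methodological filter.

    Both parts are parenthesized so that the OR operators inside the
    filter cannot bind to terms of the topic query.
    (Hypothetical helper written for this example.)
    """
    topic = topic_query.strip()
    if not topic:
        raise ValueError("topic_query must not be empty")
    return f"({topic}) AND ({filter_query})"


# Example topic chosen only for illustration.
query = apply_filter("hypertension[MeSH Terms]", TREATMENT_SPECIFIC_2005)
print(query)
```

The resulting string can be pasted into the PubMed search box. Parenthesizing both halves is the important design choice, since the filter strategies are themselves long OR expressions.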

Filters for subject types

Filters have been developed for articles about kidney disease[28], dentistry[29], and specific age ranges[30].

Relevancy ranking

Although MEDLINE is usually searched for exact matches using Boolean terms, relevancy ranking has been studied. In an early comparison, relevancy ranking performed well; however, the Boolean version of MEDLINE did not fully use MeSH terms.[31][32]

eTBLAST uses text mining to search for similar publications.[33][34]
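The difference between exact Boolean matching and relevancy ranking can be illustrated with a toy example. The sketch below ranks documents by cosine similarity of word counts; it is a generic bag-of-words illustration under simplifying assumptions (no stemming, no term weighting) and is not the algorithm used by eTBLAST or PubMed. The document texts and identifiers are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return ids of documents sharing at least one query word, best first."""
    q = Counter(tokenize(query))
    scored = []
    for doc_id, text in documents.items():
        score = cosine(q, Counter(tokenize(text)))
        if score > 0:
            scored.append((score, doc_id))
    # Sort by descending score, breaking ties by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Invented toy collection.
docs = {
    "a": "randomized trial of aspirin for stroke prevention",
    "b": "aspirin pharmacology review",
    "c": "kidney disease in children",
}
ranking = rank("aspirin stroke trial", docs)
print(ranking)
```

A Boolean search for aspirin AND stroke AND trial would return only document "a", whereas the ranked search also returns "b" as a weaker partial match.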

Citation analysis or PageRank

There are conflicting results over the role of ranking results based on citation counts or PageRank. A study using Google's own PageRank found PubMed's clinical queries to be better.[35] However, a comparative study found better results for a metric analogous to PageRank for biomedical journals based on:[36][37]

Machine learning

Machine learning methods, in which the search engine seeks articles that most resemble a set of included articles, may be more accurate than Boolean methods (see EBMSearch below).[12][38] However, the study by Aphinyanaphongs compared machine learning to the 1994 Boolean filters rather than to the 2005 revision.[12]

Machine learning may be improved by an ensemble learning method that uses stacked generalization (stacking) to emphasize the role of UMLS concepts and title words.[13]

Machine learning may[39][38] or may not[36] be more accurate than citation-based strategies. Citation or link strategies may improve upon text categorization.[40]

A machine learning model built to categorize articles against one gold standard may not work as well in another setting.[39]

Research methods for comparative studies

For more information, see: Information retrieval.

In comparing the information retrieval of search strategies, there are two experimental methods.

  1. If a complete test collection of articles is available that is already divided into articles meeting inclusion criteria and articles not meeting criteria, then each strategy is compared for its ability to successfully identify the articles meeting criteria (sensitivity) and to successfully exclude the articles not meeting criteria (specificity). Sensitivity is also called "recall".[41]
  2. If only a partial test collection is available, consisting solely of articles meeting inclusion criteria (for example, articles meeting inclusion criteria for ACP Journal Club[12], articles included in a systematic review of a clinical topic, or articles in an annotated bibliography[37]), then the sensitivity is again the proportion of relevant articles identified by the strategy. However, the specificity is not computable. Instead, one of several related measures is calculated. These measures are all based on the positive predictive value (PPV) of the strategy. As with the PPV used in diagnostic testing, the PPV varies directly with the prevalence of relevant articles in the collection and thus is not stable across prevalences.[42]
    1. Precision is "the proportion of retrieved articles that meet criteria" and thus is the same as the PPV.[43][44]
    2. Number Needed to Read (NNR) is 1/precision and is "how many papers in a journal have to be read to find one of adequate clinical quality and relevance."[45][46][42][35] Of note, the NNR has been proposed as a metric to help libraries decide which journals to subscribe to.[45]
    3. Hit curve "is the number of important articles among the first n results."[47][36]
    4. The 11-point precision-recall graph is similar to a receiver operating characteristic curve.[12]
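The measures above can be computed from simple counts. The following is a minimal sketch; the function names and the example counts (a collection of 1000 articles, 50 of them relevant, with a strategy that retrieves 100 articles including 40 relevant ones) are assumptions made for illustration and do not come from any cited study.

```python
def sensitivity(tp, fn):
    """Proportion of relevant articles that the strategy retrieves (recall)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of non-relevant articles that the strategy excludes."""
    return tn / (tn + fp)


def precision(tp, fp):
    """Proportion of retrieved articles that are relevant (same as PPV)."""
    return tp / (tp + fp)


def number_needed_to_read(tp, fp):
    """1/precision: articles read per relevant article found."""
    return (tp + fp) / tp


def hit_curve(ranked_ids, relevant_ids):
    """Number of relevant articles among the first n results, for each n."""
    hits, curve = 0, []
    for doc_id in ranked_ids:
        if doc_id in relevant_ids:
            hits += 1
        curve.append(hits)
    return curve


def ppv_from_prevalence(sens, spec, prevalence):
    """PPV implied by sensitivity, specificity and prevalence (Bayes' rule)."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Invented example counts: 1000 articles, 50 relevant, 100 retrieved.
tp, fp, fn, tn = 40, 60, 10, 890
print("sensitivity:", sensitivity(tp, fn))
print("specificity:", specificity(tn, fp))
print("precision:", precision(tp, fp))
print("NNR:", number_needed_to_read(tp, fp))

# The same strategy applied where relevant articles are rarer has a lower PPV.
print("PPV at 5% prevalence:", ppv_from_prevalence(0.8, 890 / 950, 0.05))
print("PPV at 1% prevalence:", ppv_from_prevalence(0.8, 890 / 950, 0.01))
```

The last two lines illustrate why precision and NNR are not stable across collections: sensitivity and specificity are held fixed, yet the PPV falls as the prevalence of relevant articles falls.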

Methods to access MEDLINE

There are many third-party interfaces to search MEDLINE, such as Ovid[48]. The National Library of Medicine's own search interface is PubMed (http://pubmed.gov). The National Library of Medicine maintains a list of search engines at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/search/.

PubMed

For more information, see: PubMed.

PubMed (http://pubmed.gov) is the National Library of Medicine's own free Internet access to MEDLINE. PubMed has been freely available since 1997.

EBM Search

EBM Search (http://www.ahsl.arizona.edu/ebmsearch/) is a federated medical search engine.[49]

EBMSearch

EBMSearch (http://ebmsearch.org/) maintains its own copy of MEDLINE and uses machine learning to rank articles.[12]

eTBLAST

eTBLAST uses text mining to search for similar publications.[33][34]

GoPubMed

GoPubMed (http://www.GoPubMed.org/) uses the Gene Ontology to categorize and explore PubMed search results.[50]

HubMed

HubMed (http://www.hubmed.org/) does not maintain its own copy of MEDLINE, but rather uses PubMed's EUtils web service to retrieve MEDLINE records stored at PubMed.[51]
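As an illustration of how a third-party interface might call the EUtils web service, the sketch below builds a request URL for the ESearch utility. It only constructs the URL and makes no network request; the helper name build_esearch_url and the example query are assumptions, and this is not a description of how HubMed itself is implemented.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, retmax=20):
    """Return an ESearch URL that searches PubMed for `term`.

    db, term, retmax and retmode are ESearch parameters; the helper
    itself is hypothetical and exists only for this example.
    """
    params = {
        "db": "pubmed",      # search the PubMed database
        "term": term,        # query in PubMed syntax
        "retmax": retmax,    # maximum number of PMIDs to return
        "retmode": "json",   # ask for a JSON response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_esearch_url("asthma[MeSH Terms]", retmax=5)
print(url)
```

Fetching the URL returns a list of PubMed identifiers (PMIDs); the full records would then be retrieved in a second call to another EUtils utility such as EFetch.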

Ovid

SUMSearch

SUMSearch (http://sumsearch.uthscsa.edu/) is a federated medical search engine. It does not maintain its own copy of MEDLINE, but rather queries PubMed and revises searches if too few or too many citations are retrieved. At the same time, SUMSearch queries the National Guideline Clearinghouse, DARE, Wikipedia, and other resources.

References

  1. MEDLINE Fact Sheet. National Library of Medicine. Retrieved on 2008-01-22.
  2. National Library of Medicine. MEDLINE Fact Sheet. Retrieved on 2007-11-09.
  3. Anonymous (2007). MEDLINE® Journal Selection Fact Sheet. National Library of Medicine. Retrieved on 2010-04-04.
  4. National Library of Medicine. Medical Subject Headings (MESH®) Fact Sheet. Retrieved on 2007-11-09.
  5. Anonymous (2008). Qualifiers - 2008. National Library of Medicine. Retrieved on 2008-03-19.
  6. Anonymous. Medical Text Indexer (MTI). National Library of Medicine
  7. Glanville JM, Lefebvre C, Miles JN, Camosso-Stefinovic J (2006). "How to identify randomized controlled trials in MEDLINE: ten years on". J Med Libr Assoc 94 (2): 130-6. PMID 16636704. PMC 1435857.
  8. National Library of Medicine. Indexing Initiative. Retrieved on 2007-11-25.
  9. Lin J, Wilbur WJ (2007). "PubMed related articles: a probabilistic topic-based model for content similarity". BMC Bioinformatics 8: 423. DOI:10.1186/1471-2105-8-423. PMID 17971238. PMC 2212667.
  10. Anonymous (2011). PubMed Help: Computation of Related Citations
  11. Wallace BC, Trikalinos TA, Lau J, Brodley C, Schmid CH (2010). "Semi-automated screening of biomedical citations for systematic reviews". BMC Bioinformatics 11: 55. DOI:10.1186/1471-2105-11-55. PMID 20102628. PMC 2824679.
  12. Aphinyanaphongs Y, Tsamardinos I, Statnikov A, Hardin D, Aliferis CF (2005). "Text categorization models for high-quality article retrieval in internal medicine". J Am Med Inform Assoc 12 (2): 207-16. DOI:10.1197/jamia.M1641. PMID 15561789. PMC 551552.
  13. Kilicoglu H, Demner-Fushman D, Rindflesch TC, Wilczynski NL, Haynes RB (2009). "Towards automatic recognition of scientifically rigorous clinical research evidence". J Am Med Inform Assoc 16 (1): 25-31. DOI:10.1197/jamia.M2996. PMID 18952929. PMC 2605595.
  14. Cohen AM, Hersh WR, Peterson K, Yen PY (2006). "Reducing workload in systematic review preparation using automated citation classification". J Am Med Inform Assoc 13 (2): 206-19. DOI:10.1197/jamia.M1929. PMID 16357352. PMC 1447545.
  15. Lu Z (2011). "PubMed and beyond: a survey of web tools for searching biomedical literature". Database (Oxford) 2011: baq036. DOI:10.1093/database/baq036. PMID 21245076. PMC 3025693.
  16. Kim JJ, Rebholz-Schuhmann D (2008). "Categorization of services for seeking information in biomedical literature: a typology for improvement of practice". Brief Bioinform 9 (6): 452-65. DOI:10.1093/bib/bbn032. PMID 18660511.
  17. Bakkalbasi N, Bauer K, Glover J, Wang L (2006). "Three options for citation tracking: Google Scholar, Scopus and Web of Science". Biomed Digit Libr 3: 7. DOI:10.1186/1742-5581-3-7. PMID 16805916.
  18. Kuper H, Nicholson A, Hemingway H (2006). "Searching for observational studies: what does citation tracking add to PubMed? A case study in depression and coronary heart disease". BMC Med Res Methodol 6: 4. DOI:10.1186/1471-2288-6-4. PMID 16483366.
  19. Lin Y, Li W, Chen K, Liu Y (2007). "A document clustering and ranking system for exploring MEDLINE citations". J Am Med Inform Assoc 14 (5): 651–61. DOI:10.1197/jamia.M2215. PMID 17600104.
  20. Lokker C, Haynes RB, Wilczynski NL, McKibbon KA, Walter SD (2011). "Retrieval of diagnostic and treatment studies for clinical use through PubMed and PubMed's Clinical Queries filters". J Am Med Inform Assoc. DOI:10.1136/amiajnl-2011-000233. PMID 21680559.
  21. Leeflang MM, Scholten RJ, Rutjes AW, Reitsma JB, Bossuyt PM (2006). "Use of methodological search filters to identify diagnostic accuracy studies can lead to the omission of relevant studies". J Clin Epidemiol 59 (3): 234-40. DOI:10.1016/j.jclinepi.2005.07.014. PMID 16488353.
  22. Haynes RB, Wilczynski N, McKibbon KA, Walker CJ, Sinclair JC (1994). "Developing optimal search strategies for detecting clinically sound studies in MEDLINE". J Am Med Inform Assoc 1 (6): 447-58. PMID 7850570. PMC 116228.
  23. Hedges Team. Search Strategies. Retrieved on 2011-03-15.
  24. Project Information - NIH RePORTER – NIH Research Portfolio Online Reporting Tool Expenditures and Results. Retrieved on 2007-11-25.
  25. Haynes RB, McKibbon KA, Wilczynski NL, Walter SD, Werre SR, Hedges Team (2005). "Optimal search strategies for retrieving scientifically strong studies of treatment from Medline: analytical survey". BMJ 330 (7501): 1179. DOI:10.1136/bmj.38446.498542.8F. PMID 15894554. PMC 558012.
  26. McKibbon KA, Wilczynski NL, Haynes RB (2009). "Retrieving randomized controlled trials from MEDLINE: a comparison of 38 published search filters". Health Info Libr J 26 (3): 187-202. DOI:10.1111/j.1471-1842.2008.00827.x. PMID 19712211.
  27. Wilczynski NL, Haynes RB (2009). "Consistency and accuracy of indexing systematic review articles and meta-analyses in MEDLINE". Health Info Libr J 26 (3): 203-10. DOI:10.1111/j.1471-1842.2008.00823.x. PMID 19712212.
  28. Garg AX, Iansavichus AV, Wilczynski NL, Kastner M, Baier LA, Shariff SZ et al. (2009). "Filtering Medline for a clinical discipline: diagnostic test assessment framework". BMJ 339: b3435. DOI:10.1136/bmj.b3435. PMID 19767336.
  29. Niederman R, Chen L, Murzyn L, Conway S (2002). "Benchmarking the dental randomised controlled literature on MEDLINE". Evidence-Based Dentistry 3: 5-9. DOI:10.1038/sj/ebd/4600095.
  30. Kastner M, Wilczynski NL, Walker-Dilks C, McKibbon KA, Haynes B (2006). "Age-specific search strategies for Medline". J Med Internet Res 8 (4): e25. DOI:10.2196/jmir.8.4.e25. PMID 17213044. PMC 1794003.
  31. Hersh WR, Hickam DH (1992). "A comparison of retrieval effectiveness for three methods of indexing medical literature". Am. J. Med. Sci. 303 (5): 292–300. PMID 1580316.
  32. Hersh WR, Hickam DH, Haynes RB, McKibbon KA (1994). "A performance and failure analysis of SAPHIRE with a MEDLINE test collection". J Am Med Inform Assoc 1 (1): 51–60. PMID 7719787.
  33. Errami M, Wren JD, Hicks JM, Garner HR (2007). "eTBLAST: a web server to identify expert reviewers, appropriate journals and similar publications". Nucleic Acids Res 35 (Web Server issue): W12-5. DOI:10.1093/nar/gkm221. PMID 17452348. PMC 1933238.
  34. Lewis J, Ossowski S, Hicks J, Errami M, Garner HR (2006). "Text similarity: an alternative way to search MEDLINE". Bioinformatics 22 (18): 2298-304. DOI:10.1093/bioinformatics/btl388. PMID 16926219.
  35. Haase A, Follmann M, Skipka G, Kirchner H (2007). "Developing search strategies for clinical practice guidelines in SUMSearch and Google Scholar and assessing their retrieval performance". BMC Med Res Methodol 7: 28. DOI:10.1186/1471-2288-7-28. PMID 17603909.
  36. Bernstam EV, Herskovic JR, Aphinyanaphongs Y, Aliferis CF, Sriram MG, Hersh WR (2006). "Using citation data to improve retrieval from MEDLINE". J Am Med Inform Assoc 13 (1): 96–105. DOI:10.1197/jamia.M1909. PMID 16221938. This study may have been biased towards ranking systems because 1) all retrieval methods analyzed a preliminary result set obtained using simple PubMed queries, and 2) the Boolean filters were developed in 1994, as the authors probably completed the study prior to the 2005 update of PubMed filters.
  37. Herskovic JR, Bernstam EV (2005). "Using incomplete citation data for MEDLINE results ranking". AMIA Annu Symp Proc: 316–20. PMID 16779053.
  38. Fu LD, Wang L, Aphinyanaphongs Y, Aliferis CF (2007). "A comparison of impact factor, clinical query filters, and pattern recognition query filters in terms of sensitivity to topic". Stud Health Technol Inform 129 (Pt 1): 716-20. PMID 17911810. This study may be biased due to using the 1994 version of the clinical filters.
  39. Aphinyanaphongs Y, Statnikov A, Aliferis CF (2006). "A comparison of citation metrics to machine learning filters for the identification of high quality MEDLINE documents". J Am Med Inform Assoc 13 (4): 446-55. DOI:10.1197/jamia.M2031. PMID 16622165. PMC 1513679.
  40. Lin J (2008). "PageRank without hyperlinks: reranking with PubMed related article networks for biomedical text retrieval". BMC Bioinformatics 9: 270. DOI:10.1186/1471-2105-9-270. PMID 18538027. PMC 2442104.
  41. Hersh, William R. (2008). Information Retrieval: A Health and Biomedical Perspective (Health Informatics). Berlin: Springer. ISBN 0-387-78702-X.
  42. Bachmann LM, Coray R, Estermann P, Ter Riet G (2002). "Identifying diagnostic studies in MEDLINE: reducing the number needed to read". J Am Med Inform Assoc 9 (6): 653–8. PMID 12386115.
  43. Haynes RB, Wilczynski NL (2004). "Optimal search strategies for retrieving scientifically strong studies of diagnosis from Medline: analytical survey". BMJ 328 (7447): 1040. DOI:10.1136/bmj.38068.557998.EE. PMID 15073027.
  44. Zhang L, Ajiferuke I, Sampson M (2006). "Optimizing search strategies to identify randomized controlled trials in MEDLINE". BMC Med Res Methodol 6: 23. DOI:10.1186/1471-2288-6-23. PMID 16684359. PMC 1488863.
  45. Toth B, Gray JA, Brice A (2005). "The number needed to read-a new measure of journal value". Health Info Libr J 22 (2): 81–2. DOI:10.1111/j.1471-1842.2005.00568.x. PMID 15910578.
  46. McKibbon KA, Wilczynski NL, Haynes RB (2004). "What do evidence-based secondary journals tell us about the publication of clinically important articles in primary healthcare journals?". BMC Med 2: 33. DOI:10.1186/1741-7015-2-33. PMID 15350200.
  47. Herskovic JR, Iyengar MS, Bernstam EV (2007). "Using hit curves to compare search algorithm performance". J Biomed Inform 40 (2): 93–9. DOI:10.1016/j.jbi.2005.12.007. PMID 16469545.
  48. Anonymous. MEDLINE® - Ovid's MEDLINE. Retrieved on 2007-11-09.
  49. Bracke PJ, Howse DK, Keim SM (April 2008). "Evidence-based Medicine Search: a customizable federated search engine". J Med Libr Assoc 96 (2): 108–13. DOI:10.3163/1536-5050.96.2.108. PMID 18379665. PMC 2268222.
  50. Doms A, Schroeder M (July 2005). "GoPubMed: exploring PubMed with the Gene Ontology". Nucleic Acids Res 33 (Web Server issue): W783–6. DOI:10.1093/nar/gki470. PMID 15980585. PMC 1160231.
  51. Eaton AD (July 2006). "HubMed: a web-based biomedical literature search interface". Nucleic Acids Res 34 (Web Server issue): W745–7. DOI:10.1093/nar/gkl037. PMID 16845111. PMC 1538859.

External links