Incorporating Values for Indexing Method in MEDLINE/PubMed XML. NLM Tech Bull. 2018 Jul-Aug;(423):e2.
[Editor's note: This change was implemented in PubMed on October 2, 2018.]
The MEDLINE/PubMed DTD was modified in 2017 to incorporate the attribute "IndexingMethod" for the element <MedlineCitation> (see MEDLINE/PubMed XML Element Descriptions and their Attributes). Values will now be applied to this attribute, as appropriate, in citations indexed for MEDLINE to document the method by which the set of Medical Subject Headings (MeSH) indexing terms was determined for a citation. IndexingMethod values are intended for computational analysis of MEDLINE XML and are not searchable in PubMed. They are particularly important for researchers who use MEDLINE indexing as a gold standard for training machine learning algorithms, who need to identify in the MEDLINE XML those citations that were indexed solely by a human method versus those indexed by a semi-automated method (algorithm results reviewed by a human) or an automated method (algorithm alone).
IndexingMethod is an implied attribute, meaning that it is present only if a value is specified. If the IndexingMethod attribute is not present, the citation was indexed entirely by a human method.
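For readers analyzing MEDLINE XML computationally, the following is a minimal sketch of how the attribute might be read, not an NLM-provided tool. It assumes the values Curated and Automated named in this article, treats an absent attribute as human indexing, and assumes the input contains only citations indexed for MEDLINE. The label "Human", the function names, and the sample XML (including its placeholder PMIDs and the surrounding PubmedArticleSet/PubmedArticle wrapper) are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up sample for illustration; real records carry many more child elements.
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation Status="MEDLINE" Owner="NLM">
      <PMID Version="1">1</PMID>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedArticle>
    <MedlineCitation Status="MEDLINE" Owner="NLM" IndexingMethod="Curated">
      <PMID Version="1">2</PMID>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedArticle>
    <MedlineCitation Status="MEDLINE" Owner="NLM" IndexingMethod="Automated">
      <PMID Version="1">3</PMID>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""

# "Human" is a label chosen for this sketch; it is not a value in the XML.
HUMAN = "Human"


def indexing_method(citation):
    """Return the IndexingMethod value, or HUMAN when the attribute is absent."""
    # An implied attribute is simply missing when no value was specified.
    return citation.get("IndexingMethod", HUMAN)


def tally_methods(xml_text):
    """Count citations by indexing method."""
    root = ET.fromstring(xml_text)
    return Counter(indexing_method(c) for c in root.iter("MedlineCitation"))


def human_indexed_pmids(xml_text):
    """List PMIDs of citations that lack the IndexingMethod attribute."""
    root = ET.fromstring(xml_text)
    return [
        c.findtext("PMID")
        for c in root.iter("MedlineCitation")
        if "IndexingMethod" not in c.attrib
    ]


if __name__ == "__main__":
    print(tally_methods(SAMPLE))
    print(human_indexed_pmids(SAMPLE))
```

A researcher assembling a gold-standard training set could use a filter like human_indexed_pmids to keep only citations indexed solely by a human method. For full baseline files, a streaming parser would likely be preferable to loading the whole document into memory.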
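The values to be added are:
Curated: MeSH indexing terms were determined by a semi-automated method (algorithm results reviewed by a human).
Automated: MeSH indexing terms were determined by an automated method (algorithm alone).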
The algorithm that currently supports MEDLINE indexing is the Medical Text Indexer (MTI), a product of the National Library of Medicine (NLM) Indexing Initiative.
Beginning in September 2018, these values will be added as appropriate for newly completed MEDLINE citations. For previously completed citations that were indexed by one of these methods, values will be added with the 2019 MEDLINE/PubMed baseline file that is produced in December.
Citations completed by an indexing method of Automated or Curated represent a small proportion of all MEDLINE citations. MEDLINE citations that have been completed by a human indexing method currently number approximately 22 million.
While MEDLINE indexing has traditionally involved full human curation, these automated and semi-automated methods of MEDLINE indexing have been explored in recent years to increase our efficiency and focus expert human effort in key areas to keep up with the ever-expanding volume of biomedical literature. In addition, NLM recently initiated MEDLINE 2022: A Five-Year Development Plan to maintain the usefulness of MEDLINE as a tool for discovering and analyzing the biomedical literature. One of the goals of the MEDLINE 2022 project is to implement a range of indexing methods to ensure the timely assignment of MeSH to MEDLINE citations. Providing XML data on the method used to index citations for MEDLINE supports our effort to be transparent about all facets of the MEDLINE 2022 project.
Additional information about the projects and citation sets mentioned in this article can be found here:
Please send any comments and questions regarding changes to the MEDLINE indexing process to the NLM Support Center.