PubTator System

From GM-RKB

A PubTator System is a publicly available online automated annotation system and text mining system.

  • Context:
    • It can automatically extract and annotate various biological entities from PubMed abstracts and full-text articles
    • It can be designed to assist in the biocuration of scientific publications and improve the efficiency of manual annotation for biological databases
    • ...
  • Example(s):
    • Using PubTator to assist in literature review and data extraction for researchers
    • Applying PubTator for large-scale text mining tasks in the life sciences
    • ...
  • Counter-Example(s):
  • See: spaCy, Biomedical NER.


References

2023

  • chat
  • Definitional Sentence: PubTator is an online, publicly available text mining and annotation tool that assists in the biocuration of scientific publications by automatically extracting and annotating various biological entities from PubMed abstracts and full-text articles.
  • Context: PubTator was developed to facilitate the process of biocuration and to improve the efficiency of manual annotation for biological databases. It can recognize and annotate a wide range of biological entities, such as genes, proteins, diseases, chemicals, and species, making it a valuable resource for researchers, curators, and biologists.
  • Features:
    • Automatic recognition and annotation of biological entities
    • Integration with the PubMed database
    • User-friendly web interface for interactive curation
    • Export of annotated data in various formats (e.g., XML, BioC, and JSON)
    • Application programming interface (API) for integration with other tools and services
  • Applications:
    • Assisting biocuration of scientific publications
    • Facilitating literature review and data extraction for researchers
    • Enabling large-scale text mining tasks in the life sciences
    • Supporting the development of new text mining and natural language processing methods
  • See Also: Biocuration, Biological Database
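
The export formats listed under Features can be consumed with a few lines of code. The following is a minimal sketch, assuming a BioC-JSON-like layout in which a document holds `passages`, and each passage holds `annotations` carrying `text`, `infons`, and `locations`. The sample record, its placeholder document id, and the helper name `collect_entities` are illustrative assumptions and are not taken from actual PubTator output.

```python
import json
from collections import defaultdict

# Hand-written sample in a BioC-JSON-like layout.
# Illustrative only: this is NOT real PubTator output, and "0000000" is a placeholder id.
SAMPLE = """
{
  "id": "0000000",
  "passages": [
    {
      "infons": {"type": "title"},
      "offset": 0,
      "text": "BRCA1 mutations in breast cancer",
      "annotations": [
        {"text": "BRCA1",
         "infons": {"type": "Gene", "identifier": "672"},
         "locations": [{"offset": 0, "length": 5}]},
        {"text": "breast cancer",
         "infons": {"type": "Disease", "identifier": "MESH:D001943"},
         "locations": [{"offset": 19, "length": 13}]}
      ]
    }
  ]
}
"""


def collect_entities(document):
    """Group annotated mentions by entity type (hypothetical helper)."""
    by_type = defaultdict(list)
    for passage in document.get("passages", []):
        for ann in passage.get("annotations", []):
            infons = ann.get("infons", {})
            entity_type = infons.get("type", "Unknown")
            by_type[entity_type].append((ann.get("text"), infons.get("identifier")))
    return dict(by_type)


doc = json.loads(SAMPLE)
entities = collect_entities(doc)
for entity_type, mentions in entities.items():
    print(entity_type, mentions)
```

Grouping by entity type mirrors the categories named in the Context bullet (genes, diseases, chemicals, and so on); a real export may carry additional fields that this sketch ignores.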

2019

  • (Wei et al., 2019) ⇒ Chih-Hsuan Wei, Alexis Allot, Robert Leaman, and Zhiyong Lu. (2019). “PubTator Central: Automated Concept Annotation for Biomedical Full Text Articles.” Nucleic Acids Research 47, no. W1.
    • ABSTRACT: PubTator Central (https://www.ncbi.nlm.nih.gov/research/pubtator/) is a web service for viewing and retrieving bioconcept annotations in full text biomedical articles. PubTator Central (PTC) provides automated annotations from state-of-the-art text mining systems for genes/proteins, genetic variants, diseases, chemicals, species and cell lines, all available for immediate download. PTC annotates PubMed (29 million abstracts) and the PMC Text Mining subset (3 million full text articles). The new PTC web interface allows users to build full text document collections and visualize concept annotations in each document. Annotations are downloadable in multiple formats (XML, JSON and tab delimited) via the online interface, a RESTful web service and bulk FTP. Improved concept identification systems and a new disambiguation module based on deep learning increase annotation accuracy, and the new server-side architecture is significantly faster. PTC is synchronized with PubMed and PubMed Central, with new articles added daily. The original PubTator service has served annotated abstracts for ∼300 million requests, enabling third-party research in use cases such as biocuration support, gene prioritization, genetic disease analysis, and literature-based knowledge discovery. We demonstrate the full text results in PTC significantly increase biomedical concept coverage and anticipate this expansion will both enhance existing downstream applications and enable new use cases.
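
The tab-delimited download mentioned in the abstract can be read without special tooling. The sketch below assumes the commonly used PubTator text layout, which the abstract itself does not spell out: title and abstract lines of the form `PMID|t|...` and `PMID|a|...`, followed by annotation lines with tab-separated PMID, start offset, end offset, mention, entity type, and identifier. The names `Annotation` and `parse_pubtator` are hypothetical, and the sample text is hand-written for illustration rather than downloaded from the service.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    pmid: str
    start: int
    end: int
    mention: str
    entity_type: str
    identifier: str


def parse_pubtator(text):
    """Parse PubTator-style text into (passages, annotations).

    Assumed layout: 'PMID|t|title', 'PMID|a|abstract', then tab-separated
    annotation lines. Lines matching neither pattern are skipped.
    """
    passages = {}      # (pmid, 't' or 'a') -> passage text
    annotations = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate documents
        fields = line.split("\t")
        if len(fields) >= 5 and fields[1].isdigit() and fields[2].isdigit():
            identifier = fields[5] if len(fields) > 5 else ""
            annotations.append(
                Annotation(fields[0], int(fields[1]), int(fields[2]),
                           fields[3], fields[4], identifier)
            )
            continue
        parts = line.split("|", 2)
        if len(parts) == 3 and parts[1] in ("t", "a"):
            passages[(parts[0], parts[1])] = parts[2]
    return passages, annotations


# Hand-written sample (illustrative, placeholder PMID).
SAMPLE = (
    "0000000|t|BRCA1 mutations in breast cancer\n"
    "0000000|a|We review BRCA1 variants.\n"
    "0000000\t0\t5\tBRCA1\tGene\t672\n"
    "0000000\t19\t32\tbreast cancer\tDisease\tMESH:D001943\n"
)

passages, annotations = parse_pubtator(SAMPLE)
title = passages[("0000000", "t")]
for ann in annotations:
    # Both sample offsets fall inside the title, so they can be checked against it.
    print(ann.entity_type, ann.mention, title[ann.start:ann.end] == ann.mention)
```

Checking each mention against the text at its offsets is a cheap sanity test before using the annotations downstream; offsets that point past the title would need the abstract appended first.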

2013

  • (Wei et al., 2013) ⇒ Chih-Hsuan Wei, Hung-Yu Kao, and Zhiyong Lu. (2013). “PubTator: A Web-based Text Mining Tool for Assisting Biocuration.” Nucleic Acids Research 41, no. W1.
    • ABSTRACT: Manually curating knowledge from biomedical literature into structured databases is highly expensive and time-consuming, making it difficult to keep pace with the rapid growth of the literature. There is therefore a pressing need to assist biocuration with automated text mining tools. Here, we describe PubTator, a web-based system for assisting biocuration. PubTator is different from the few existing tools by featuring a PubMed-like interface, which many biocurators find familiar, and being equipped with multiple challenge-winning text mining algorithms to ensure the quality of its automatic results. Through a formal evaluation with two external user groups, PubTator was shown to be capable of improving both the efficiency and accuracy of manual curation. PubTator is publicly available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/PubTator/.