SemEHR System
A SemEHR System is a Semantic Search System that can extract and retrieve information from electronic health records.
- Context:
- It was developed by Wu et al. (2018); its source code is available at: https://github.com/CogStack/SemEHR.
- It is composed of 3 subsystems and 2 data storage units:
- a SemEHR Producing Subsystem that extracts clinical notes from EHR by implementing the following pipelines and toolkits:
- CogStack for data retrieval and information extraction;
- Bio-YODIE for Natural Language Processing;
- a Semantic Analyzer and Indexer;
- a SemEHR Continuous Learning Subsystem that collects and analyzes user feedback by using the following software engines:
- a Rule Engine that generates and applies rules for removing unwanted results;
- a Machine Learning Engine that takes user feedback as training data for a Bidirectional Recurrent Neural Network that analyzes the corpus and populates a confidence value for each concept mention;
- a SemEHR Consuming Subsystem that consists of a semantic search engine composed of:
- a Query Processor;
- a Data Analyser;
- a Reasoner;
- an ElasticSearch Cluster that stores the EHR data processed in the SemEHR Producing Subsystem;
- a Study Knowledge Graph that stores the study parameters, search settings, study results and search rules.
- Example(s):
- Counter-Example(s):
- See: Semantic Web, Ontology Search System, Natural Language Processing, Annotation Task, Electronic Health Record.
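- Illustration: The data flow through the three subsystems listed in the Context can be sketched as a toy Python program. This is a minimal sketch under stated assumptions, not SemEHR's actual code: the vocabulary, concept labels, `Mention` class, and all function names are hypothetical, a dictionary lookup stands in for the NLP pipeline, a single hand-written rule stands in for the Rule Engine, and an in-memory dictionary stands in for the search index.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical toy vocabulary standing in for an NLP pipeline's concept lookup.
# The concept labels are illustrative, not real ontology identifiers.
TOY_VOCAB = {"hepatitis c": "CONCEPT_HEPC", "diabetes": "CONCEPT_DIAB"}


@dataclass
class Mention:
    """One concept mention found in one clinical note (hypothetical schema)."""
    patient_id: str
    doc_id: str
    concept: str
    negated: bool


def produce(notes):
    """Producing step: annotate (patient_id, doc_id, text) notes with mentions."""
    mentions = []
    for patient_id, doc_id, text in notes:
        lowered = text.lower()
        for term, concept in TOY_VOCAB.items():
            pos = lowered.find(term)
            if pos >= 0:
                # Crude context detection: the term is directly preceded by "no ".
                negated = lowered[max(0, pos - 3):pos] == "no "
                mentions.append(Mention(patient_id, doc_id, concept, negated))
    return mentions


def apply_rules(mentions, rules):
    """Continuous-learning step: drop mentions rejected by any rule."""
    return [m for m in mentions if not any(rule(m) for rule in rules)]


def build_index(mentions):
    """Index step: map each concept to the set of patients who mention it."""
    index = defaultdict(set)
    for m in mentions:
        index[m.concept].add(m.patient_id)
    return index


def search(index, concept):
    """Consuming step: return the sorted patient IDs for a concept."""
    return sorted(index.get(concept, set()))


notes = [
    ("p1", "d1", "History of hepatitis C."),
    ("p2", "d2", "No diabetes. No hepatitis C reported."),
    ("p3", "d3", "Type 2 diabetes, well controlled."),
]
rules = [lambda m: m.negated]  # hypothetical rule: discard negated mentions
index = build_index(apply_rules(produce(notes), rules))
print(search(index, "CONCEPT_HEPC"))
```

In this sketch the rule removes patient p2, whose note only contains negated mentions, before the index is built, so a search returns only patients with affirmed mentions. The system described above would instead use Bio-YODIE annotations, learned confidence values, and an ElasticSearch Cluster for storage.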
References
2019
- (Github, 2019) ⇒ https://github.com/CogStack/CogStack-SemEHR#intro Retrieved:2019-04-26.
- QUOTE: Built upon off-the-shelf toolkits including a Natural Language Processing (NLP) pipeline (Bio-Yodie) and an enterprise search system (CogStack), SemEHR implements a generic information extraction (IE) and retrieval infrastructure by identifying contextualised mentions of a wide range of biomedical concepts from unstructured clinical notes. Its IE functionality features an adaptive and iterative NLP mechanism where specific requirements and fine-tuning can be fulfilled and realised on a study basis. NLP annotations are further assembled at patient level and extended with clinical and EHR-specific knowledge to populate a panorama for each patient, which comprises a) longitudinal semantic data views and b) structured medical profile(s). The semantic data is serviced via ontology-based search and analytics interfaces to facilitate clinical studies.
2018
- (Wu et al., 2018) ⇒ Honghan Wu, Giulia Toti, Katherine I Morley, Zina M Ibrahim, Amos Folarin, Richard Jackson, Ismail Kartoglu, Asha Agrawal, Clive Stringer, Darren Gale, Genevieve Gorrell, Angus Roberts, Matthew Broadbent, Robert Stewart, and Richard JB Dobson. (2018). “SemEHR: A General-purpose Semantic Search System to Surface Semantic Data from Clinical Notes for Tailored Care, Trial Recruitment, and Clinical Research.” In: Journal of the American Medical Informatics Association, 25(5).
- QUOTE: Unlocking the data contained within both structured and unstructured components of electronic health records (EHRs) has the potential to provide a step change in data available for secondary research use, generation of actionable medical insights, hospital management, and trial recruitment. To achieve this, we implemented SemEHR, an open source semantic search and analytics tool for EHRs.
(...) SemEHR implements a generic information extraction (IE) and retrieval infrastructure by identifying contextualized mentions of a wide range of biomedical concepts within EHRs. Natural language processing annotations are further assembled at the patient level and extended with EHR-specific knowledge to generate a timeline for each patient.
Figure 2. The architecture of SemEHR is composed of 3 subsystems: (1) the producing subsystem (upper part of the figure), creation of SemEHR semantic index by harmonizing, natural language processing, and indexing EHR data; (2) the continuous learning subsystem, addressing study-specific requirements and supporting fine-tuning for separate studies; and (3) the consuming subsystem (lower part), supporting tailored care, patient recruitment, and clinical research by semantic searching and study-based continuous learning.
2017
- (Wu et al., 2017) ⇒ Honghan Wu, Giulia Toti, Katherine I Morley, Zina Ibrahim, Amos Folarin, Ismail Kartoglu, Richard Jackson, Asha Agrawal, Clive Stringer, Darren Gale, Genevieve M Gorrell, Angus Roberts, Matthew Broadbent, Robert Stewart, and Richard J B Dobson. (2017). “SemEHR: Surfacing Semantic Data from Clinical Notes in Electronic Health Records for Tailored Care, Trial Recruitment, and Clinical Research.” In: The Lancet, 390. doi:10.1016/S0140-6736(17)33032-5