TechMiner System
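A TechMiner System is a scholarly corpus mining system that identifies software technologies in research publications and generates an OWL ontology linking each technology to related research entities.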
- See: Rexplore.
References
2016
- (Osborne et al., 2016) ⇒ Francesco Osborne, Helene de Ribaupierre, and Enrico Motta. (2016). “TechMiner: Extracting Technologies from Academic Publications.” In: Proceedings of the 20th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2016).
- QUOTE: In recent years we have seen the emergence of a variety of scholarly datasets. Typically these capture "standard" scholarly entities and their connections, such as authors, affiliations, venues, publications, citations, and others. However, as the repositories grow and the technology improves, researchers are adding new entities to these repositories to develop a richer model of the scholarly domain. In this paper, we introduce TechMiner, a new approach, which combines NLP, machine learning and semantic technologies, for mining technologies from research publications and generating an OWL ontology describing their relationships with other research entities. ...
To address these issues, we have developed TechMiner (TM), a new approach which combines natural language processing (NLP), machine learning and semantic technologies to identify software technologies from research publications. In the resulting OWL representation, each technology is linked to a number of related research entities, such as the authors who introduced it and the relevant topics. ...
The TechMiner (TM) approach was created for automatically identifying technologies from a corpus of metadata about research publications and describing them semantically. It takes as input the IDs, the titles and the abstracts of a number of research papers in the Scopus dataset and a variety of knowledge bases (DBpedia [12], WordNet [15], the Klink-2 Computer Science ontology [16], and others) and returns an OWL ontology describing a number of technologies and their related research entities.
These include: 1) the authors who most published on it, 2) related research areas, 3) the publications in which they appear, and, optionally, 4) the team of authors who introduced the technology and 5) the URI of the related DBpedia entity. The input is usually composed by a set of publications about a certain topic (e.g., Semantic Web, Machine Learning), to retrieve all technologies in that field. However, TM can be used on any set of publications.
Figure 1. The TechMiner architecture.
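The input and output described in the quote can be pictured as two small record types. The following is only a minimal sketch under stated assumptions: the class names, field names, the `http://example.org/techminer#` namespace, and the predicate names are all hypothetical placeholders chosen for illustration, not the vocabulary or data model used by the TechMiner authors.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical namespace: the actual ontology IRIs are not given in the quote.
EX = "http://example.org/techminer#"


@dataclass
class PaperInput:
    """One input record: the ID, title, and abstract of a research paper."""
    paper_id: str
    title: str
    abstract: str


@dataclass
class TechnologyRecord:
    """One output technology together with its related research entities."""
    name: str
    # 1) authors who published most on the technology
    top_authors: List[str] = field(default_factory=list)
    # 2) related research areas
    research_areas: List[str] = field(default_factory=list)
    # 3) IDs of the publications in which the technology appears
    publications: List[str] = field(default_factory=list)
    # 4) optional: the team of authors who introduced the technology
    introduced_by: Optional[List[str]] = None
    # 5) optional: URI of the related DBpedia entity
    dbpedia_uri: Optional[str] = None


def to_triples(tech: TechnologyRecord) -> List[Tuple[str, str, str]]:
    """Flatten a record into (subject, predicate, object) tuples.

    Predicate names are illustrative placeholders (assumptions).
    """
    subject = EX + tech.name.replace(" ", "_")
    triples = [(subject, EX + "hasAuthor", a) for a in tech.top_authors]
    triples += [(subject, EX + "hasResearchArea", r) for r in tech.research_areas]
    triples += [(subject, EX + "appearsIn", p) for p in tech.publications]
    # Optional fields contribute triples only when they are present.
    for author in tech.introduced_by or []:
        triples.append((subject, EX + "introducedBy", author))
    if tech.dbpedia_uri is not None:
        triples.append((subject, EX + "relatedDBpediaEntity", tech.dbpedia_uri))
    return triples
```

Keeping the two optional items (the introducing team and the DBpedia URI) as `Optional` fields mirrors the "optionally" in the quote: a record without them still yields a valid set of triples, and a real pipeline could pass those triples to an RDF/OWL library for serialization.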