2005 AKnowlApproachToCitatExtract

From GM-RKB

Subject Headings: Citation Extraction, INFOMAP, Knowledge-based, Ontology, Literature Mining.

Notes

Cited By

Quotes

Abstract

Integration of the bibliographical information of scholarly publications available on the Internet is an important task in academic research. To accomplish this task, accurate reference metadata extraction for scholarly publications is essential for the integration of information from heterogeneous reference sources. In this paper, we propose a knowledge-based approach to literature mining and focus on reference metadata extraction methods for scholarly publications. We adopt an ontological knowledge representation framework called INFOMAP to automatically extract the reference metadata. The experimental results show that, by using INFOMAP, we can extract author, title, journal, volume, number (issue), year, and page information from different reference styles with a high degree of accuracy. The overall average field accuracy of citation extraction for a Bioinformatics dataset is 97.87% for six reference styles.
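As a rough illustration of what reference metadata extraction and field-level accuracy involve, the following Python sketch pulls author, title, journal, volume, pages, and year out of a single reference style and scores the result against a hand-labeled answer. It is not INFOMAP and not the authors' method: it swaps in a plain regular expression for the ontological knowledge representation. The pattern `JOURNAL_STYLE`, the helper names `extract_fields` and `field_accuracy`, and the reading of "field accuracy" as the fraction of exactly matching fields are all assumptions made for this example.

```python
import re

# Hypothetical pattern for ONE reference style (the journal style used in the
# reference list on this page). This is a plain regex, not INFOMAP.
JOURNAL_STYLE = re.compile(
    r'^(?P<author>.+?), "(?P<title>.+?)", (?P<journal>.+?), '
    r'vol\. (?P<volume>\d+), pp\. (?P<pages>\d+\s*-\s*\d+), (?P<year>\d{4})\.$'
)

FIELDS = ("author", "title", "journal", "volume", "pages", "year")


def extract_fields(reference):
    """Return a dict of extracted fields, or None if the style does not match."""
    match = JOURNAL_STYLE.match(reference.strip())
    if match is None:
        return None
    return {name: match.group(name) for name in FIELDS}


def field_accuracy(predicted, gold):
    """Assumed metric: fraction of gold fields whose predicted value matches exactly."""
    correct = sum(1 for name in gold if predicted.get(name) == gold[name])
    return correct / len(gold)


reference = (
    'G. Chowdhury, "Template mining for information extraction from digital '
    'documents", Library Trends, vol. 48, pp. 182-208, 1999.'
)
fields = extract_fields(reference)
```

A single pattern like this covers only one citation layout; handling several reference styles, as the paper reports doing, would need one pattern per style or a richer representation.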

References

  • R. R. Bouckaert, "Low level information extraction: a Bayesian network based approach", Workshop on Text Learning (TextML-2002), 2002.
  • K. Burnett, K. B. Ng, and S. Park, "A comparison of the two traditions of metadata development", Journal of the American Society for Information Science, vol. 50, pp. 1209-1217, 1999.
  • G. Chowdhury, "Template mining for information extraction from digital documents", Library Trends, vol. 48, pp. 182-208, 1999.
  • T. Davenport, D. DeLong, and M. Beers, "Successful knowledge management projects", Sloan Management Review, vol. 39, pp. 43-57, 1998.
  • Y. Ding, G. Chowdhury, and S. Foo, "Template mining for the extraction of citation from digital documents", Proceedings of the Second Asian Digital Library Conference, Taiwan, 1999. pp. 47-62.
  • Y. Ding and S. Foo, "Ontology research and development. Part I - a review of ontology generation", Journal of Information Science, vol. 28, pp. 123-136, 2002.
  • M. Fleischman, E. Hovy, and A. Echihabi, "Offline Strategies for Online Question Answering: Answering Questions Before They Are Asked", Proceedings of ACL-2003 Conference, 2003. pp. 1-7.
  • C. L. Giles, K. D. Bollacker, and S. Lawrence, "CiteSeer: An Automatic Citation Indexing System", Digital Libraries 98 - The Third ACM Conference on Digital Libraries, 1998. pp. 89-98.
  • A. Goodrum, K. McCain, S. Lawrence, and C. Giles, "Scholarly publishing in the Internet age: a citation analysis of computer science literature", Information Processing & Management, vol. 37, pp. 661-675, 2001.
  • H. Han, C. L. Giles, E. Manavoglu, H. Zha, Z. Zhang, and E. A. Fox, "Automatic document metadata extraction using support vector machines", JCDL '03: Proceedings of the 3rd ACM/IEEE-CS joint conference on Digital libraries, 2003. pp. 37-48.
  • P. Jacso, "The future of citation indexing: An interview with Eugene Garfield", Online, vol. 28, pp. 38-40, 2004.
  • S. Lawrence, C. L. Giles, and K. Bollacker, "Digital libraries and autonomous citation indexing", Computer, vol. 32, pp. 67-71, 1999.
  • A. Maedche and S. Staab, "Ontology learning from text", Natural Language Processing and Information Systems, vol. 1959, pp. 364-364, 2001.
  • F. Peng and A. McCallum, "Accurate Information Extraction from Research Papers using Conditional Random Fields", Proceedings of Human Language Technology Conference and North American Chapter of the Association for Computational Linguistics (HLT-NAACL), 2004. pp. 329-336.
  • K. Seymore, A. McCallum, and R. Rosenfeld, "Learning hidden Markov model structure for information extraction", AAAI-99 Workshop on Machine Learning for Information Extraction, 1999. pp. 37-42.
  • A. Takasu, "Bibliographic attribute extraction from erroneous references based on a statistical model", JCDL '03: Proceedings of the 3rd ACM/IEEE-CS joint conference on Digital libraries, 2003. pp. 49-60.
  • S.-H. Wu, M.-Y. Day, and W.-L. Hsu, "FAQ-centered Organizational Memory", Proceeding of the Knowledge Management and Organizational Memory workshop on the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI-01), 2001. pp. 112-120.
  • S.-H. Wu, T.-H. Tsai, and W.-L. Hsu, "Domain Event Extraction and Representation with Domain Ontology", Proceedings of the IJCAI-03 Workshop on Information Integration on the Web, Acapulco, Mexico, 2003. pp. 33-38.


2005 AKnowlApproachToCitatExtract

Author: Shih-Hung Wu, Wen-Lian Hsu, Min-Yuh Day, Tzong-Han Tsai, Cheng-Lung Sung, Cheng-Wei Lee, Chorng-Shyong Ong
title: A Knowledge-based Approach to Citation Extraction
journal: Proceedings of the Conference on Information Reuse and Integration
titleUrl: http://gra103.aca.ntu.edu.tw/gdoc/94/D90725010a.pdf
doi: 10.1109/IRI-05.2005.1506448
year: 2005