2004 TwoSuperLearnApproachforNameDisambigInAuthorCitations

(Han et al., 2004) ⇒ Hui Han, Lee Giles, Hongyuan Zha, Cheng Li, Kostas Tsioutsiouliklis. (2004). “Two Supervised Learning Approaches for Name Disambiguation in Author Citations.” In: Proceedings of the Fourth ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2004). doi:10.1109/JCDL.2004.1336139

Subject Headings:

Notes

Cited By

~141 http://scholar.google.com/scholar?cites=7534682383075163012

Quotes

Abstract

Due to name abbreviations, identical names, name misspellings, and pseudonyms in publications or bibliographies (citations), an author may have multiple names and multiple authors may share the same name. Such name ambiguity affects the performance of document retrieval, web search, database integration, and may cause improper attribution to authors. This paper investigates two supervised learning approaches to disambiguate authors in the citations. One approach uses the naive Bayes probability model, a generative model; the other uses Support Vector Machines (SVMs) [The Nature of Statistical Learning Theory] and the vector space representation of citations, a discriminative model. Both approaches utilize three types of citation attributes: co-author names, the title of the paper, and the title of the journal or proceeding. We illustrate these two approaches on two types of data, one collected from the web, mainly publication lists from homepages, the other collected from the DBLP citation databases.
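
Illustration (not from the paper): the generative approach named in the abstract can be sketched as a small naive Bayes classifier over the three citation attributes (co-author names, paper title words, venue title words). This is a simplified sketch under stated assumptions, not the authors' model: it pools all attributes into one multinomial with add-one smoothing, and the class `NaiveBayesDisambiguator`, the helper `features`, the author labels, and the toy citations are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def features(citation):
    """Turn one citation dict into a list of attribute-tagged tokens."""
    feats = [("coauthor", name.lower()) for name in citation["coauthors"]]
    feats += [("title", w) for w in citation["title"].lower().split()]
    feats += [("venue", w) for w in citation["venue"].lower().split()]
    return feats


class NaiveBayesDisambiguator:
    """Hypothetical pooled multinomial naive Bayes with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing constant (assumption)
        self.doc_counts = Counter()              # citations seen per author
        self.feat_counts = defaultdict(Counter)  # token counts per author
        self.vocab = set()                       # all tokens seen in training

    def fit(self, citations, author_ids):
        for citation, author in zip(citations, author_ids):
            self.doc_counts[author] += 1
            for f in features(citation):
                self.feat_counts[author][f] += 1
                self.vocab.add(f)
        return self

    def predict(self, citation):
        total_docs = sum(self.doc_counts.values())
        best_author, best_score = None, -math.inf
        for author, n_docs in self.doc_counts.items():
            counts = self.feat_counts[author]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for f in features(citation):
                if f not in self.vocab:
                    continue  # ignore tokens never seen in training
                score += math.log((counts[f] + self.alpha) / denom)
            if score > best_score:
                best_author, best_score = author, score
        return best_author


# Toy, made-up citations for two distinct authors who share the name "J. Smith".
train = [
    ({"coauthors": ["A. Gupta", "B. Chen"],
      "title": "Query optimization in relational databases", "venue": "VLDB"}, "j_smith_1"),
    ({"coauthors": ["B. Chen"],
      "title": "Indexing for database joins", "venue": "SIGMOD"}, "j_smith_1"),
    ({"coauthors": ["C. Rossi"],
      "title": "Protein folding simulation", "venue": "Bioinformatics"}, "j_smith_2"),
    ({"coauthors": ["C. Rossi", "D. Kim"],
      "title": "Gene expression clustering", "venue": "Bioinformatics"}, "j_smith_2"),
]
model = NaiveBayesDisambiguator().fit([c for c, _ in train], [a for _, a in train])

query = {"coauthors": ["B. Chen"], "title": "Join optimization", "venue": "VLDB"}
predicted = model.predict(query)
print(predicted)
```

The discriminative approach in the abstract would instead map each citation to a vector over the same attribute tokens and train an SVM on those vectors; that variant is not sketched here.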

References

	Author	volume	Date Value	title	type	journal	titleUrl	doi	note	year
2004 TwoSuperLearnApproachforNameDisambigInAuthorCitations	Hui Han, Lee Giles, Hongyuan Zha, Cheng Li, Kostas Tsioutsiouliklis			Two Supervised Learning Approaches for Name Disambiguation in Author Citations				10.1109/JCDL.2004.1336139		2004