2010 RelationalRetrievalUsingaCombin

(Lao et al., 2010) ⇒ Ni Lao, and William W. Cohen. (2010). “Relational Retrieval Using a Combination of Path-constrained Random Walks.” In: Machine Learning Journal, 81(1). doi:10.1007/s10994-010-5205-8

Subject Headings: Random Walk with Restart, Entity Relation Graph, Link Prediction, Learned Similarity Measure.

Notes

It is based on an ECML 2010 paper, whose slides are here: http://www.cs.cmu.edu/~nlao/publication/2010/2010.ECML.slides.pdf

Cited By

Quotes

Author Keywords

Entity relation graph; Filtering and recommending; Learning to rank; Random walk; Relational model

Abstract

Scientific literature with rich metadata can be represented as a labeled directed graph. This graph representation enables a number of scientific tasks such as ad hoc retrieval or named entity recognition (NER) to be formulated as typed proximity queries in the graph. One popular proximity measure is called Random Walk with Restart (RWR), and much work has been done on the supervised learning of RWR measures by associating each edge label with a parameter. In this paper, we describe a novel learnable proximity measure which instead uses one weight per edge label sequence: proximity is defined by a weighted combination of simple "path experts", each corresponding to following a particular sequence of labeled edges. Experiments on eight tasks in two subdomains of biology show that the new learning method significantly outperforms the RWR model (both trained and untrained). We also extend the method to support two additional types of experts to model intrinsic properties of entities: query-independent experts, which generalize the PageRank measure, and popular entity experts which allow rankings to be adjusted for particular entities that are especially important.
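The "path expert" idea in the abstract can be illustrated with a minimal sketch. It assumes a toy graph stored as nested dictionaries and a walker that picks uniformly among the out-edges carrying the required label at each step. The names `follow_path`, `combined_proximity`, and `toy_graph`, the edge labels, and the weights are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def follow_path(graph, start, path):
    """Return the distribution over nodes reached from `start` by following
    the edge labels in `path`, one label per step.

    Assumption: at each step the walker chooses uniformly among the
    out-edges that carry the required label; mass on nodes with no such
    edge is dropped.
    """
    dist = {start: 1.0}
    for label in path:
        nxt = defaultdict(float)
        for node, prob in dist.items():
            targets = graph.get(node, {}).get(label, [])
            for target in targets:
                nxt[target] += prob / len(targets)
        dist = dict(nxt)
    return dist

def combined_proximity(graph, start, weighted_paths):
    """Score each node by a weighted sum of path-expert distributions."""
    scores = defaultdict(float)
    for path, weight in weighted_paths:
        for node, prob in follow_path(graph, start, path).items():
            scores[node] += weight * prob
    return dict(scores)

# Hypothetical toy graph: node -> edge label -> list of target nodes.
toy_graph = {
    "author1": {"wrote": ["paper1", "paper2"]},
    "paper1": {"cites": ["paper3"]},
    "paper2": {"cites": ["paper3", "paper4"]},
}

# One weight per edge label sequence (weights chosen by hand here).
toy_weights = [(("wrote", "cites"), 2.0), (("wrote",), 1.0)]
toy_scores = combined_proximity(toy_graph, "author1", toy_weights)
```

In this toy setup, `paper3` is reachable through both of the author's papers, so it collects the most mass under the `("wrote", "cites")` expert and ends up with the highest combined score.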

5 Conclusion and future work

We proposed a novel method for learning a weighted combination of path-constrained random walkers, which is able to discover and leverage complex path features of relational retrieval data. We also evaluated the impact of using query-independent path features and popular entity features, which can model per-entity characteristics. Our experiments on several recommendation and retrieval tasks involving scientific publications show that the proposed method can significantly outperform traditional models based on random walk with restart.
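As a rough illustration of what "learning a weighted combination" could look like, the sketch below fits one weight per path feature with L2-regularized logistic regression trained by plain gradient ascent. This excerpt does not state the training objective or optimizer, so the loss, the hyperparameters, the function name `train_path_weights`, and the toy data are all assumptions made for illustration.

```python
import math

def train_path_weights(features, labels, lr=0.5, epochs=200, l2=0.01):
    """Fit one weight per path feature.

    features: one row per (query, candidate) pair, holding the candidate's
              score under each path expert.
    labels:   1 if the candidate is relevant to the query, else 0.

    Assumption: logistic loss with an L2 penalty, optimized by batch
    gradient ascent on the regularized log-likelihood.
    """
    n_paths = len(features[0])
    weights = [0.0] * n_paths
    for _ in range(epochs):
        grad = [0.0] * n_paths
        for row, label in zip(features, labels):
            z = sum(w * x for w, x in zip(weights, row))
            pred = 1.0 / (1.0 + math.exp(-z))
            for j, x in enumerate(row):
                grad[j] += (label - pred) * x
        for j in range(n_paths):
            weights[j] += lr * (grad[j] / len(features) - l2 * weights[j])
    return weights

# Hypothetical toy data: column 0 is one path expert, column 1 another.
toy_features = [[0.75, 0.0], [0.25, 0.0], [0.0, 0.5], [0.0, 0.5]]
toy_labels = [1, 1, 0, 0]
learned = train_path_weights(toy_features, toy_labels)
```

On this toy data the first path fires only for relevant candidates and the second only for irrelevant ones, so the learned weight for the first path comes out positive and the second negative.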

We are very interested in the generalization from simple relations to hyper-relations, which are mappings from possibly more than one source type. For example, there is much incentive to express the AND relation (Balmin et al. 2004): e.g., consider the task of finding papers that are both written by a certain author and recent. However, model complexity will be a major concern. An efficient structure selection algorithm is very important to make such a system practical.

Furthermore, we are interested in algorithms that introduce new entities and edges to the graph. This can potentially be useful for improving retrieval quality or efficiency. For example, new entities can represent subtopics of research interests, and new links can represent memberships of words, authors, or papers in these subtopics. In this way, a model might be able to replace some of the long paths shown in the experiments with relatively shorter and more effective paths associated with the introduced structures.

References


	Author	volume	Date Value	title	type	journal	titleUrl	doi	note	year
2010 RelationalRetrievalUsingaCombin	William W. Cohen Ni Lao			Relational Retrieval Using a Combination of Path-constrained Random Walks				10.1007/s10994-010-5205-8