2015 PantherFastTopKSimilaritySearch


(Zhang et al., 2015) ⇒ Jing Zhang, Jie Tang, Cong Ma, Hanghang Tong, Yu Jing, and Juanzi Li. (2015). “Panther: Fast Top-k Similarity Search on Large Networks.” In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2015). ISBN:978-1-4503-3664-2 doi:10.1145/2783258.2783267

Subject Headings:

Notes

Cited By

Quotes

Author Keywords

Data mining; random path; social network; vertex similarity

Abstract

Estimating similarity between vertices is a fundamental issue in network analysis across various domains, such as social networks and biological networks. Methods based on common neighbors and structural contexts have received much attention. However, both categories of methods are difficult to scale up to handle large networks (with billions of nodes). In this paper, we propose a sampling method that provably and accurately estimates the similarity between vertices. The algorithm is based on a novel idea of random path. Specifically, given a network, we perform R random walks, each starting from a randomly picked vertex and walking T steps. Theoretically, the algorithm guarantees that the sampling size R = O(2ε⁻² log₂ T) depends on the error-bound ε, the confidence level (1 − δ), and the path length T of each random walk.

We perform extensive empirical study on a Tencent microblogging network of 1,000,000,000 edges. We show that our algorithm can return top-k similar vertices for any vertex in a network 300× faster than the state-of-the-art methods. We also use two applications, identity resolution and structural hole spanner finding, to evaluate the accuracy of the estimated similarities. Our results demonstrate that the proposed algorithm achieves clearly better performance than several alternative methods.
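
The random-path sampling described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract only states that R random walks of T steps are sampled from randomly picked start vertices; the estimator used here (the fraction of sampled paths on which two vertices co-occur) is an assumption made for illustration, and the names `sample_paths`, `estimate_similarity`, and `top_k`, as well as the adjacency-list input format, are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations


def sample_paths(adj, num_paths, path_len, rng):
    """Sample num_paths (R) random walks of path_len (T) steps each.

    adj maps each vertex to a list of its neighbors (assumed input format).
    Each walk starts from a uniformly chosen vertex.
    """
    vertices = list(adj)
    paths = []
    for _ in range(num_paths):
        v = rng.choice(vertices)
        path = [v]
        for _ in range(path_len):
            neighbors = adj[v]
            if not neighbors:  # dead end: stop this walk early
                break
            v = rng.choice(neighbors)
            path.append(v)
        paths.append(path)
    return paths


def estimate_similarity(paths):
    """Assumed estimator: share of sampled paths that contain both vertices."""
    counts = defaultdict(int)
    for path in paths:
        # Count each unordered vertex pair at most once per path.
        for u, v in combinations(sorted(set(path)), 2):
            counts[(u, v)] += 1
    total = len(paths)
    return {pair: c / total for pair, c in counts.items()}


def top_k(sim, query, k):
    """Return up to k (vertex, score) pairs most similar to query."""
    scores = {}
    for (u, v), s in sim.items():
        if u == query:
            scores[v] = s
        elif v == query:
            scores[u] = s
    # Sort by descending score, breaking ties by vertex id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
```

A caller would build `adj`, draw paths with something like `sample_paths(adj, num_paths=R, path_len=T, rng=random.Random(0))`, and pass the result through `estimate_similarity` and `top_k`. The sketch stores every co-occurring pair in memory, which is only workable for small graphs; answering top-k queries at the scale reported in the abstract would need a more economical index over the sampled paths.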


