2012 SummarizationbasedMiningBiparti

From GM-RKB

Subject Headings:

Notes

Cited By

Quotes

Author Keywords

Abstract

How to extract the truly relevant information from a large relational data set? The answer of this paper is a technique integrating graph summarization, graph clustering, link prediction and the discovery of the hidden structure on the basis of data compression. Our novel algorithm SCMiner (for Summarization-Compression Miner) reduces a large bipartite input graph to a highly compact representation which is very useful for different data mining tasks: 1) Clustering: The compact summary graph contains the truly relevant clusters of both types of nodes of a bipartite graph. 2) Link prediction: The compression scheme of SCMiner reveals suspicious edges which are probably erroneous as well as missing edges, i.e. pairs of nodes which should be connected by an edge. 3) Discovery of the hidden structure: Unlike traditional co-clustering methods, the result of SCMiner is not limited to row- and column-clusters. Besides the clusters, the summary graph also contains the essential relationships between both types of clusters and thus reveals the hidden structure of the data. Extensive experiments on synthetic and real data demonstrate that SCMiner outperforms state-of-the-art techniques for clustering and link prediction. Moreover, SCMiner discovers the hidden structure and reports it in an interpretable way to the user. Based on data compression, our technique does not rely on any input parameters which are difficult to estimate.
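
The link-prediction idea in the abstract (a compact summary graph whose mismatches with the data point to suspicious and missing edges) can be illustrated with a minimal sketch. This is not the SCMiner algorithm: the clusters are assumed to be given rather than discovered, the majority rule for creating a super-edge is an assumption made for illustration, and the function name `summarize` and the toy node names are hypothetical.

```python
from itertools import product

def summarize(edges, left_clusters, right_clusters):
    """Build a summary graph plus correction lists for GIVEN clusters.

    Illustrative only: SCMiner finds the clusters itself by compression;
    here they are inputs, and a simple majority rule (an assumption)
    decides whether two clusters are joined by a super-edge.
    """
    edge_set = set(edges)
    super_edges = []   # cluster pairs treated as fully connected
    missing = []       # pairs implied by a super-edge but absent in the data
    suspicious = []    # observed edges not covered by any super-edge
    for (a_name, a_nodes), (b_name, b_nodes) in product(
            left_clusters.items(), right_clusters.items()):
        pairs = list(product(a_nodes, b_nodes))
        present = [p for p in pairs if p in edge_set]
        absent = [p for p in pairs if p not in edge_set]
        if len(present) > len(absent):
            # Dense block: one super-edge is cheaper than listing every edge.
            super_edges.append((a_name, b_name))
            missing.extend(absent)
        else:
            # Sparse block: the few observed edges stand out as exceptions.
            suspicious.extend(present)
    return super_edges, missing, suspicious

# Toy bipartite graph (hypothetical data): left nodes u*, right nodes m*.
edges = [("u1", "m1"), ("u1", "m2"), ("u2", "m1"),
         ("u3", "m3"), ("u4", "m3"), ("u4", "m1")]
left = {"A": ["u1", "u2"], "B": ["u3", "u4"]}
right = {"X": ["m1", "m2"], "Y": ["m3"]}

super_edges, missing, suspicious = summarize(edges, left, right)
print(super_edges)  # cluster-level relationships
print(missing)      # candidate missing edges
print(suspicious)   # candidate erroneous edges
```

In this toy input the two super-edges play the role of the "essential relationships between both types of clusters", while the two correction lists correspond to the missing and suspicious edges that the abstract mentions.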

References


Key: 2012 SummarizationbasedMiningBiparti
Author: Christian Böhm, Claudia Plant, Jing Feng, Xiao He, Bettina Konte
Title: Summarization-based Mining Bipartite Graphs
DOI: 10.1145/2339530.2339725
Year: 2012