2009 HeterogeneousSourceConsensusLea
Latest revision as of 07:26, 22 August 2024
- (Gao et al., 2009) ⇒ Jing Gao, Wei Fan, Yizhou Sun, and Jiawei Han. (2009). “Heterogeneous Source Consensus Learning via Decision Propagation and Negotiation.” In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2009). doi:10.1145/1557019.1557061
Subject Headings: Consensus-based Learning.
Notes
Cited By
- http://scholar.google.com/scholar?q=%22Heterogeneous+source+consensus+learning+via+decision+propagation+and+negotiation%22+2009
- http://portal.acm.org/citation.cfm?doid=1557019.1557061&preflayout=flat#citedby
Quotes
Author Keywords
classification, consensus, ensemble, heterogeneous sources.
Abstract
Nowadays, enormous amounts of data are continuously generated not only in massive scale, but also from different, sometimes conflicting, views. Therefore, it is important to consolidate different concepts for intelligent decision making. For example, to predict the research areas of some people, the best results are usually achieved by combining and consolidating predictions obtained from the publication network, co-authorship network and the textual content of their publications. Multiple supervised and unsupervised hypotheses can be drawn from these information sources, and negotiating their differences and consolidating decisions usually yields a much more accurate model due to the diversity and heterogeneity of these models. In this paper, we address the problem of “consensus learning” among competing hypotheses, which either rely on outside knowledge (supervised learning) or internal structure (unsupervised clustering). We argue that consensus learning is an NP-hard problem and thus propose to solve it by an efficient heuristic method. We construct a belief graph to first propagate predictions from supervised models to the unsupervised, and then negotiate and reach consensus among them. Their final decision is further consolidated by calculating each model's weight based on its degree of consistency with other models. Experiments are conducted on 20 Newsgroups data, Cora research papers, DBLP author-conference network, and Yahoo! Movies datasets, and the results show that the proposed method improves the classification accuracy and the clustering quality measure (NMI) over the best base model by up to 10%. Furthermore, it runs in time proportional to the number of instances, which is very efficient for large scale data sets.
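The consolidation step described above — weighting each model by its degree of consistency with the other models — can be illustrated with a simplified sketch. This is not the paper's belief-graph propagation algorithm (which also handles unsupervised models whose cluster IDs are not aligned with class labels); it is a minimal illustration of consistency-based weighting, assuming several base classifiers that all predict over the same label space. The function names (`agreement`, `consensus_combine`) are invented for this example.

```python
from collections import Counter

def agreement(a, b):
    """Fraction of instances on which two label vectors agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def consensus_combine(predictions):
    """Weight each base model by its mean agreement with the others,
    then take a weighted majority vote per instance.

    `predictions`: list of label lists, one per base model, all over
    the same instances and the same label space.
    """
    m = len(predictions)
    # Consistency weight: average pairwise agreement with the other models.
    weights = []
    for i in range(m):
        w = sum(agreement(predictions[i], predictions[j])
                for j in range(m) if j != i) / (m - 1)
        weights.append(w)
    # Weighted vote per instance.
    n = len(predictions[0])
    consensus = []
    for t in range(n):
        votes = Counter()
        for i in range(m):
            votes[predictions[i][t]] += weights[i]
        consensus.append(votes.most_common(1)[0][0])
    return consensus, weights

preds = [
    ["A", "A", "B", "B", "A"],   # model 1
    ["A", "A", "B", "A", "A"],   # model 2
    ["B", "A", "B", "B", "A"],   # model 3 (least consistent with the rest)
]
labels, weights = consensus_combine(preds)
```

Here model 1 agrees most with the others, so its votes carry the most weight; this mirrors the intuition that a model consistent with its peers is more trustworthy, without requiring any labeled ground truth at combination time.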
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2009 HeterogeneousSourceConsensusLea | Jing Gao, Wei Fan, Yizhou Sun, Jiawei Han | | 2009 | Heterogeneous Source Consensus Learning via Decision Propagation and Negotiation | | KDD-2009 Proceedings | | 10.1145/1557019.1557061 | | 2009 |