2012 HarnessingtheWisdomoftheCrowdsf
- (Zhang et al., 2012) ⇒ Lei Zhang, Linpeng Tang, Ping Luo, Enhong Chen, Limei Jiao, Min Wang, and Guiquan Liu. (2012). “Harnessing the Wisdom of the Crowds for Accurate Web Page Clipping.” In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2012). ISBN:978-1-4503-1462-6 doi:10.1145/2339530.2339621
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%222012%22+Harnessing+the+Wisdom+of+the+Crowds+for+Accurate+Web+Page+Clipping
- http://dl.acm.org/citation.cfm?id=2339530.2339621&preflayout=flat#citedby
Quotes
Author Keywords
Abstract
Clipping Web pages, namely extracting the informative clips (areas) from Web pages, has many applications, such as Web printing and e-reading on small handheld devices. Although many existing methods attempt to address this task, most of them can either work only on certain types of Web pages (e.g., news- and blog-like web pages), or perform semi-automatically where extra user efforts are required in adjusting the outputs. The problem of clipping any types of Web pages accurately in a totally automatic way remains pretty much open. To this end in this study we harness the wisdom of the crowds to provide accurate recommendation of informative clips on any given Web pages. Specifically, we leverage the knowledge on how previous users clip similar Web pages, and this knowledge repository can be represented as a transaction database where each transaction contains the clips selected by a user on a certain Web page. Then, we formulate a new pattern mining problem, mining top-1 qualified pattern, on transaction database for this recommendation. Here, the recommendation considers not only the pattern support but also the pattern occupancy (proposed in this work). High support requires that patterns appear frequently in the database, while high occupancy requires that patterns occupy a large portion of the transactions they appear in. Thus, it leads to both precise and complete recommendation. Additionally, we explore the properties on occupancy to further prune the search space for high-efficient pattern mining. Finally, we show the effectiveness of the proposed algorithm on a human-labeled ground truth dataset consisting of 2000 web pages from 100 major Web sites, and demonstrate its efficiency on large synthetic datasets.
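The interplay of support and occupancy described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not state the exact occupancy formula or how the two measures are combined, so the sketch assumes occupancy is the arithmetic mean of |pattern| / |transaction| over supporting transactions, and assumes a weighted sum with a hypothetical weight `lam`. The function names, the `min_support` threshold, and the toy clip labels are likewise illustrative, and the search is brute force with none of the pruning the paper mentions.

```python
from itertools import combinations


def support(pattern, transactions):
    """Fraction of transactions that contain every item of the pattern."""
    hits = [t for t in transactions if pattern <= t]
    return len(hits) / len(transactions)


def occupancy(pattern, transactions):
    """Assumed definition: mean share of each supporting transaction
    that the pattern covers (|pattern| / |transaction|)."""
    hits = [t for t in transactions if pattern <= t]
    if not hits:
        return 0.0
    return sum(len(pattern) / len(t) for t in hits) / len(hits)


def top1_pattern(transactions, min_support=0.5, lam=1.0):
    """Brute-force search for the pattern maximizing support + lam * occupancy.

    Both the weighted-sum score and `lam` are assumptions for illustration.
    """
    items = sorted(set().union(*transactions))
    best, best_score = None, float("-inf")
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            pattern = frozenset(combo)
            s = support(pattern, transactions)
            if s < min_support:
                continue  # skip patterns that are not frequent enough
            score = s + lam * occupancy(pattern, transactions)
            if score > best_score:
                best, best_score = pattern, score
    return best, best_score


# Toy transaction database: each transaction is the set of clips
# one (hypothetical) user selected on a similar page.
transactions = [
    frozenset({"title", "body"}),
    frozenset({"title", "body", "comments"}),
    frozenset({"title", "body"}),
    frozenset({"title", "sidebar"}),
]

best, score = top1_pattern(transactions)
print(sorted(best), round(score, 4))
```

In this toy database `{"title"}` has the highest support but covers only a small part of each selection, whereas `{"title", "body"}` is slightly less frequent yet covers most of the transactions that contain it, so the combined score prefers it. That is the "precise and complete" trade-off the abstract describes.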
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2012 HarnessingtheWisdomoftheCrowdsf | Enhong Chen, Lei Zhang, Ping Luo, Min Wang, Linpeng Tang, Limei Jiao, Guiquan Liu | | | Harnessing the Wisdom of the Crowds for Accurate Web Page Clipping | | | | 10.1145/2339530.2339621 | | 2012 |