2013 RepresentingDocumentsthroughthe
- (El-Arini et al., 2013) ⇒ Khalid El-Arini, Min Xu, Emily B. Fox, and Carlos Guestrin. (2013). “Representing Documents through their Readers.” In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ISBN:978-1-4503-2174-7 doi:10.1145/2487575.2487596
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%222013%22+Representing+Documents+through+their+Readers
- http://dl.acm.org/citation.cfm?id=2487575.2487596&preflayout=flat#citedby
Quotes
Author Keywords
Abstract
From Twitter to Facebook to Reddit, users have become accustomed to sharing the articles they read with friends or followers on their social networks. While previous work has modeled what these shared stories say about the user who shares them, the converse question remains unexplored: what can we learn about an article from the identities of its likely readers? To address this question, we model the content of news articles and blog posts by attributes of the people who are likely to share them. For example, many Twitter users describe themselves in a short profile, labeling themselves with phrases such as “vegetarian” or “liberal”. By assuming that a user's labels correspond to topics in the articles he shares, we can learn a labeled dictionary from a training corpus of articles shared on Twitter. Thereafter, we can code any new document as a sparse non-negative linear combination of user labels, where we encourage correlated labels to appear together in the output via a structured sparsity penalty.
Finally, we show that our approach yields a novel document representation that can be effectively used in many problem settings, from recommendation to modeling news dynamics. For example, while the top politics stories will change drastically from one month to the next, the “politics” label will still be there to describe them. We evaluate our model on millions of tweeted news articles and blog posts collected between September 2010 and September 2012, demonstrating that our approach is effective.
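The coding step described in the abstract (representing a document as a sparse non-negative linear combination of user labels) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain L1 penalty in place of the paper's structured sparsity penalty, a hand-made toy dictionary rather than a learned one, and a simple projected proximal-gradient solver. The function name `nonneg_sparse_code`, the vocabulary, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def nonneg_sparse_code(x, D, lam=0.1, n_iter=500):
    """Approximately minimise 0.5*||x - D a||^2 + lam*sum(a) subject to a >= 0.

    Assumption: a plain L1 penalty stands in for the paper's structured
    sparsity penalty. Solved by proximal gradient descent with projection.
    """
    # Step size 1/L, where L is the Lipschitz constant of the smooth term's gradient.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        # Gradient step, L1 shrinkage, and projection onto a >= 0 in one line.
        a = np.maximum(a - (grad + lam) / L, 0.0)
    return a

# Hypothetical toy dictionary: rows are words, columns are user labels.
vocab = ["tofu", "recipe", "election", "senate", "tax"]
labels = ["vegetarian", "liberal", "politics"]
D = np.array([
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]) / np.sqrt(2.0)

# A toy document built from two of the labels.
x = 2.0 * D[:, 0] + 1.0 * D[:, 2]

code = nonneg_sparse_code(x, D, lam=0.05)
for label, weight in zip(labels, code):
    print(f"{label}: {weight:.3f}")
```

In this toy setting the document receives positive weight on “vegetarian” and “politics”, while the weight on “liberal” is shrunk to zero even though that label shares a word with “politics”. The paper's structured penalty, which encourages correlated labels to appear together, would behave differently on that last point.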
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2013 RepresentingDocumentsthroughthe | Khalid El-Arini, Carlos Guestrin, Min Xu, Emily B. Fox | | | Representing Documents through their Readers | | | | 10.1145/2487575.2487596 | | 2013 |