2011 SmoothingTechniquesforAdaptiveO
- (Lin et al., 2011) ⇒ Jimmy Lin, Rion Snow, and William Morgan. (2011). “Smoothing Techniques for Adaptive Online Language Models: Topic Tracking in Tweet Streams.” In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2011). ISBN:978-1-4503-0813-7 doi:10.1145/2020408.2020476
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%222011%22+Smoothing+Techniques+for+Adaptive+Online+Language+Models%3A+Topic+Tracking+in+Tweet+Streams
- http://dl.acm.org/citation.cfm?id=2020408.2020476&preflayout=flat#citedby
Quotes
Author Keywords
Abstract
We are interested in the problem of tracking broad topics such as “baseball” and “fashion” in continuous streams of short texts, exemplified by tweets from the microblogging service Twitter. The task is conceived as a language modeling problem where per-topic models are trained using hashtags in the tweet stream, which serve as proxies for topic labels. Simple perplexity-based classifiers are then applied to filter the tweet stream for topics of interest. Within this framework, we evaluate, both intrinsically and extrinsically, smoothing techniques for integrating “foreground” models (to capture recency) and “background” models (to combat sparsity), as well as different techniques for retaining history. Experiments show that unigram language models smoothed using a normalized extension of stupid backoff and a simple queue for history retention performs well on the task.
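The setup described in the abstract can be illustrated with a minimal sketch: a foreground unigram model that keeps only the most recent tweets in a queue, smoothed against a fixed background model, with perplexity used to score a tweet. This is not the paper's method: plain linear interpolation stands in for the "normalized extension of stupid backoff", which the abstract does not specify. All names here (`QueueUnigramModel`, `background_prob`, `perplexity`, `lam`, `max_tweets`) and the add-one background estimate are assumptions made for illustration.

```python
import math
from collections import Counter, deque


class QueueUnigramModel:
    """Hypothetical foreground model: unigram counts over the last `max_tweets` tweets."""

    def __init__(self, max_tweets):
        self.max_tweets = max_tweets
        self.history = deque()   # queue of token lists, oldest first
        self.counts = Counter()
        self.total = 0

    def add(self, tokens):
        """Add a tweet; evict the oldest tweet once the queue is full."""
        self.history.append(tokens)
        self.counts.update(tokens)
        self.total += len(tokens)
        if len(self.history) > self.max_tweets:
            old = self.history.popleft()
            self.counts.subtract(old)
            self.total -= len(old)

    def prob(self, token):
        """Maximum-likelihood unigram probability (0.0 for unseen tokens)."""
        return self.counts[token] / self.total if self.total else 0.0


def background_prob(token, bg_counts, bg_total, vocab_size):
    """Add-one background estimate; the extra +1 reserves mass for unknown tokens (assumption)."""
    return (bg_counts[token] + 1) / (bg_total + vocab_size + 1)


def perplexity(tokens, fg, bg_counts, bg_total, vocab_size, lam=0.5):
    """Perplexity of a tweet under the linearly interpolated model (stand-in smoothing)."""
    if not tokens:
        raise ValueError("cannot score an empty tweet")
    log_sum = 0.0
    for t in tokens:
        p = lam * fg.prob(t) + (1 - lam) * background_prob(t, bg_counts, bg_total, vocab_size)
        log_sum += math.log(p)
    return math.exp(-log_sum / len(tokens))


# Toy usage: a tiny background corpus and a two-tweet foreground window.
bg_counts = Counter("the game was on and the show was on".split())
bg_total = sum(bg_counts.values())
vocab_size = len(bg_counts)

fg = QueueUnigramModel(max_tweets=2)
fg.add("old tweet".split())                # evicted once two newer tweets arrive
fg.add("home run in the game".split())
fg.add("great pitch great game".split())

on_topic = perplexity("great game".split(), fg, bg_counts, bg_total, vocab_size)
off_topic = perplexity("runway show".split(), fg, bg_counts, bg_total, vocab_size)
```

A perplexity-based filter would then accept a tweet for the topic when its perplexity falls below some threshold; how that threshold is chosen is left open here.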
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2011 SmoothingTechniquesforAdaptiveO | Rion Snow, Jimmy Lin, William Morgan | | | Smoothing Techniques for Adaptive Online Language Models: Topic Tracking in Tweet Streams | | | | 10.1145/2020408.2020476 | | 2011 |