Long-Tailed Dataset
A Long-Tailed Dataset is a heavy-tailed dataset that can be represented by a long-tailed distribution; in practice, many of its distinct items each occur only a small number of times.
- Example(s):
- a log of Search Engine Ad Keyword Impressions.
- See: Fat-Tailed Data, Normally-Distributed Data.
References
2015
- (Chamandy et al., 2015) ⇒ Nicholas Chamandy, Omkar Muralidharan, and Stefan Wager. (2015). “Teaching Statistics at Google-Scale.” In: The American Statistician, 69(4).
- QUOTE: But most Google ad keywords are only seen a small number of times. Because of this, when doing statistical analysis of such long-tailed data, the number of units of interest scales with the amount of data collected. This creates computational challenges for traditional statistical methods.
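The effect described in the quote can be illustrated with a small simulation. This is a minimal sketch and not the authors' method: it assumes keyword impressions follow a Zipf-like law (probability proportional to 1/rank) over a hypothetical vocabulary of 50,000 keywords, and the helper names `simulate_keyword_log` and `summarize` are made up for illustration. It counts how many distinct keywords appear as the log grows, and what share of them were seen at most twice.

```python
import random
from collections import Counter


def simulate_keyword_log(num_impressions, vocabulary_size=50_000, seed=0):
    """Draw keyword IDs with probability proportional to 1/rank.

    The Zipf-like weighting and the vocabulary size are assumptions
    chosen for illustration, not values taken from the cited paper.
    """
    rng = random.Random(seed)
    keyword_ids = list(range(1, vocabulary_size + 1))
    weights = [1.0 / rank for rank in keyword_ids]
    return rng.choices(keyword_ids, weights=weights, k=num_impressions)


def summarize(log):
    """Return (number of distinct keywords, share of them seen at most twice)."""
    counts = Counter(log)
    distinct = len(counts)
    rare = sum(1 for c in counts.values() if c <= 2)
    return distinct, rare / distinct


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        distinct, rare_share = summarize(simulate_keyword_log(n))
        print(f"impressions={n:>7}  distinct keywords={distinct:>6}  "
              f"seen at most twice={rare_share:.0%}")
```

Under these assumptions, the number of distinct keywords keeps growing as more impressions are logged, so any per-keyword analysis has more units to handle as the dataset grows.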