2010 LargeLinearClassificationWhenDa
- (Yu et al., 2010) ⇒ Hsiang-Fu Yu, Cho-Jui Hsieh, Kai-Wei Chang, and Chih-Jen Lin. (2010). “Large Linear Classification When Data Cannot Fit in Memory.” In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2010). doi:10.1145/1835804.1835910
Subject Headings:
Notes
- Categories and Subject Descriptors: I.5.2 Pattern Recognition: Design Methodology — Classifier design and evaluation
- General Terms: Algorithms, Performance, Experimentation
Cited By
- http://scholar.google.com/scholar?q=%22Large+linear+classification+when+data+cannot+fit+in+memory%22+2010
- http://portal.acm.org/citation.cfm?id=1835910&preflayout=flat#citedby
Quotes
Author Keywords
Abstract
Recent advances in linear classification have shown that for applications such as document classification, the training can be extremely efficient. However, most of the existing training methods are designed by assuming that data can be stored in the computer memory. These methods cannot be easily applied to data larger than the memory capacity due to the random access to the disk. We propose and analyze a block minimization framework for data larger than the memory size. At each step a block of data is loaded from the disk and handled by certain learning methods. We investigate two implementations of the proposed framework for primal and dual SVMs, respectively. As data cannot fit in memory, many design considerations are very different from those for traditional algorithms. Experiments using data sets 20 times larger than the memory demonstrate the effectiveness of the proposed method.
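The block minimization idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an L1-loss linear SVM solved by dual coordinate descent on each loaded block, plain Python lists as dense feature vectors, and pickle files as the on-disk block format. The names `write_blocks` and `train_block_dual_svm`, and the parameters `outer_iters` and `inner_iters`, are hypothetical. The dual variables are kept in memory here for simplicity, which is also an assumption of this sketch.

```python
import os
import pickle
import random
import tempfile


def write_blocks(X, y, num_blocks, directory):
    """Split (X, y) into num_blocks pickle files on disk; return their paths."""
    paths = []
    n = len(X)
    size = (n + num_blocks - 1) // num_blocks  # ceiling division
    for b in range(num_blocks):
        lo, hi = b * size, min((b + 1) * size, n)
        path = os.path.join(directory, f"block_{b}.pkl")
        with open(path, "wb") as f:
            pickle.dump((X[lo:hi], y[lo:hi]), f)
        paths.append(path)
    return paths


def train_block_dual_svm(paths, dim, C=1.0, outer_iters=20, inner_iters=5, seed=0):
    """Hypothetical sketch: block minimization for an L1-loss dual SVM.

    Each outer iteration visits every block once. A block is loaded from
    disk, a few passes of dual coordinate descent update only that block's
    dual variables, and the shared weight vector w is kept in sync.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    alphas = [None] * len(paths)  # assumption: dual variables fit in memory
    for _ in range(outer_iters):
        for b, path in enumerate(paths):
            with open(path, "rb") as f:
                Xb, yb = pickle.load(f)  # only this block's data is loaded
            if alphas[b] is None:
                alphas[b] = [0.0] * len(Xb)
            a = alphas[b]
            order = list(range(len(Xb)))
            for _ in range(inner_iters):
                rng.shuffle(order)
                for i in order:
                    x, yi = Xb[i], yb[i]
                    q = sum(v * v for v in x)  # x_i . x_i
                    if q == 0.0:
                        continue
                    # Gradient of the dual objective w.r.t. alpha_i.
                    g = yi * sum(wj * xj for wj, xj in zip(w, x)) - 1.0
                    # Newton step on one coordinate, clipped to [0, C].
                    new_a = min(max(a[i] - g / q, 0.0), C)
                    delta = (new_a - a[i]) * yi
                    if delta != 0.0:
                        for j, xj in enumerate(x):
                            w[j] += delta * xj
                        a[i] = new_a
    return w
```

A caller would write the blocks once with `write_blocks(X, y, num_blocks, directory)` and then pass the returned paths to `train_block_dual_svm`. Sequentially reading whole blocks, rather than touching individual instances at random, is what avoids the random disk access the abstract identifies as the obstacle.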
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2010 LargeLinearClassificationWhenDa | Chih-Jen Lin, Hsiang-Fu Yu, Cho-Jui Hsieh, Kai-Wei Chang | | | Large Linear Classification When Data Cannot Fit in Memory | | KDD-2010 Proceedings | | 10.1145/1835804.1835910 | | 2010 |