2001 PrinciplesOfDataMining
- (Hand et al., 2001) ⇒ David J. Hand, Heikki Mannila, and Padhraic Smyth. (2001). “Principles of Data Mining.” In: MIT Press. ISBN:026208290X
Subject Headings: Data Mining Textbook
Notes
- The Preface is available at http://mitpress.mit.edu/books/chapters/026208290Xpref1.pdf
- Chapter 1 is available at http://mitpress.mit.edu/books/chapters/026208290Xchap1.pdf
Cited By
Quotes
Preface
The science of extracting useful information from large data sets or databases is known as data mining. It is a new discipline, lying at the intersection of statistics, machine learning, data management and databases, pattern recognition, artificial intelligence, and other areas. All of these are concerned with certain aspects of data analysis, so they have much in common — but each also has its own distinct flavor, emphasizing particular problems and types of solution. …
This text has a different bias. We have attempted to provide a foundational view of data mining. Rather than discuss specific data mining applications at length (such as, say, collaborative filtering, credit scoring, and fraud detection), we have instead focused on the underlying theory and algorithms that provide the “glue” for such applications. This is not to say that we do not pay attention to the applications. Data mining is fundamentally an applied discipline, and with this in mind we make frequent references to case studies and specific applications where the basic theory can (or has been) applied. In our view a mastery of data mining requires an understanding of both statistical and computational issues. This requirement to master two different areas of expertise presents quite a challenge for student and teacher alike. For the typical computer scientist, the statistics literature is relatively impenetrable: a litany of jargon, implicit assumptions, asymptotic arguments, and lack of details on how the theoretical and mathematical concepts are actually realized in the form of a data analysis algorithm. The situation is effectively reversed for statisticians: the computer science literature on machine learning and data mining is replete with discussions of algorithms, pseudocode, computational efficiency, and so forth, often with little reference to an underlying model or inference procedure. An important point is that both approaches are nonetheless essential when dealing with large data sets. An understanding of both the “mathematical modeling” view, and the “computational algorithm” view are essential to properly grasp the complexities of data mining. …
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2001 PrinciplesOfDataMining | Padhraic Smyth, Heikki Mannila, David J. Hand | | | Principles of Data Mining | | | http://books.google.com/books?id=SdZ-bhVhZGYC | | | 2001 |