Data Mining System
Jump to navigation
Jump to search
A Data Mining System is a data analysis system that can solve a data mining task by implementing a data mining algorithm.
- Context:
- It can (typically) be used by a Data Mining Practitioner.
- It can range from being a Data Mining Package to being a Data Mining Service.
- It can range from being a Structured-data Data Mining System to being a Text-data Data Mining System.
- It can range from being an Industrial Data Mining System to being a Prototype Data Mining System.
- It can range from being a Small-Data Mining System to being a Medium-Data Mining System to being a Large-Data Mining System.
- It can (sometimes) be an Extract-Transform-Load System.
- It can be a Data Management System.
- It can be a Statistical Analysis System.
- It can be a Visualization System.
- It can be a Simulation System.
- …
- Example(s):
- a Python-based Data Mining System, R-based Data Mining System, Scala-based Data Mining System, Java-based Data Mining System, ...
- Data Mining Environments, such as: Weka, RapidMiner, SAS, MATLAB, SPSS, System R, Stata, ...
- Data Mining Tools, such as: scikit.learn, SVMlight, MALLET, SRILM, ...
- Data Mining Services, such as: BigML service, Azure ML service.
- …
- Counter-Example(s):
- See: Data Mining Discipline.
References
- http://en.wikipedia.org/wiki/Data_mining#Software
- http://en.wikipedia.org/wiki/Category:Data_mining_and_machine_learning_software
- http://en.wikipedia.org/wiki/List_of_numerical_analysis_software
2000
- (Witten & Frank, 2000) ⇒ Ian H. Witten, and Eibe Frank. (2000). “Data Mining: Practical Machine Learning Tools and Techniques with Java implementations." Morgan Kaufmann.
- Machine learning systems can use a wide variety of other information about attributes. For instance, dimensional considerations could be used to restrict the search to expressions or comparisons that are dimensionally correct. Circular ordering could affect the kinds of tests that are considered. For example, in a temporal context, tests on a
day
attribute could involvenext day, previous day, next week, same day next week
. Partial orderings, that, generalize/specialization relations, frequently occur in practical situations. Information this kind is often referred to as metadata, data about data. However, the kind of practical schemes currently used for data mining are rarely capable of taking metadata into account, although it is likely that these capabilities will develop rapidly in the future.
- Machine learning systems can use a wide variety of other information about attributes. For instance, dimensional considerations could be used to restrict the search to expressions or comparisons that are dimensionally correct. Circular ordering could affect the kinds of tests that are considered. For example, in a temporal context, tests on a