Discretization Task
A Discretization Task is a data transformation task that maps a Numeric Variable into a Discrete Variable.
- AKA: Binning, Attribute Discretization, Feature Discretization.
- Context:
- It can involve deciding how many Bins to create.
- It can involve deciding where to place the Interval Boundary.
- It can be solved by a Discretization System (that implements a Discretization Algorithm).
- See: Feature Engineering.
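As a minimal sketch of the mapping described above, the following Python snippet assigns each numeric value to a bin index given a list of interval boundaries. The function name `discretize` and the age cut points are hypothetical, chosen only for illustration; the number of boundaries fixes the number of bins, and their positions fix the intervals.

```python
from bisect import bisect_right


def discretize(value, boundaries):
    """Return the index of the bin that `value` falls into.

    `boundaries` is a sorted list of interior cut points; k cut points
    define k + 1 bins. Bin i covers [boundaries[i-1], boundaries[i]).
    """
    return bisect_right(boundaries, value)


# Hypothetical cut points for an "age" variable: 3 boundaries -> 4 bins.
age_boundaries = [18, 40, 65]
ages = [5, 18, 39.5, 40, 70]
bins = [discretize(a, age_boundaries) for a in ages]
print(bins)  # [0, 1, 1, 2, 3]
```

A value equal to a boundary goes to the bin on its right; other conventions (right-closed intervals) are equally valid and only change which side of the cut point a tie lands on.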
References
2011
- (Yang, 2011a) ⇒ Ying Yang. (2011). “Discretization.” In: (Sammut & Webb, 2011) p. 287.
2009
- (Wikipedia, 2009) ⇒ http://en.wikipedia.org/wiki/Discretization
- In mathematics, discretization concerns the process of transferring continuous models and equations into discrete counterparts. This process is usually carried out as a first step toward making them suitable for numerical evaluation and implementation on digital computers. In order to be processed on a digital computer another process named quantization is essential.
- Euler discretization
- Zero-order hold
- Discretization is also related to discrete mathematics, and is an important component of granular computing. In this context, discretization may also refer to modification of variable or category granularity, as when multiple discrete variables are aggregated or multiple discrete categories fused.
- (Wikipedia, 2009) ⇒ http://en.wikipedia.org/wiki/Discretization_of_continuous_features
- In statistics and machine learning, discretization refers to the process of converting continuous features or variables to discretized or nominal features. This can be useful when creating probability mass functions – formally, in density estimation. It is a form of binning, as in making a histogram.
- Typically data is discretized into partitions of K equal lengths (equal intervals) or K% of the total data (equal frequencies). [1]
- Some mechanisms for discretizing continuous data include:
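The equal-interval and equal-frequency schemes mentioned above can be sketched as follows. This is an illustrative implementation, not one taken from the cited article: the function names and the sample data are assumptions, and the equal-frequency version uses a simple rank-based rule that gives exactly equal counts only when there are no ties at the cut points.

```python
def equal_width_boundaries(values, k):
    """Interior cut points that split [min, max] into k equal-length intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_boundaries(values, k):
    """Interior cut points chosen so each bin holds roughly len(values) / k items."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


# Illustrative data with one large outlier.
data = [1, 2, 3, 4, 5, 6, 7, 8, 100]
width_cuts = equal_width_boundaries(data, 3)
freq_cuts = equal_frequency_boundaries(data, 3)
print(width_cuts)  # [34.0, 67.0]
print(freq_cuts)   # [4, 7]
```

With this data the equal-width cut points leave eight of the nine values in the first bin, while the equal-frequency cut points place three values in each bin, which shows why the choice of scheme matters for skewed variables.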
2002
- (Gabor Melli, 2002) ⇒ Gabor Melli. (2002). “PredictionWorks' Data Mining Glossary.” PredictionWorks.
- Automated Discretization: Discretization which sets the number of bins based on the range of a numeric value. Therefore, the user is not required to specify the number of bins. However, certain values may be 'lost' from the decision tree because of automatic binning, which is not the case with intelligent binning. See Binning, Discretization.
- Binning: Choosing the number of bins into which a numeric range is split. For example, if salaries range from $20,000 to $100,000, the values must be binned into some number of groups, probably between eight and twenty. Many data mining products require the user to manually set binning. See Automated Binning, Discretization.
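The salary example above can be worked through in a few lines. The sketch below assumes equal-width bins and picks eight of them (the low end of the "eight and twenty" range in the glossary entry), so each bin spans (100,000 − 20,000) / 8 = 10,000; the function name `salary_bin` and its defaults are hypothetical.

```python
def salary_bin(salary, low=20_000, high=100_000, n_bins=8):
    """Return (bin_index, (lower, upper)) for a salary in [low, high]."""
    if not low <= salary <= high:
        raise ValueError("salary outside the binned range")
    width = (high - low) / n_bins
    # The top value is placed in the last bin so indices stay in 0..n_bins-1.
    index = min(int((salary - low) // width), n_bins - 1)
    return index, (low + index * width, low + (index + 1) * width)


print(salary_bin(20_000))   # (0, (20000.0, 30000.0))
print(salary_bin(47_500))   # (2, (40000.0, 50000.0))
print(salary_bin(100_000))  # (7, (90000.0, 100000.0))
```

Changing `n_bins` is the manual binning decision the glossary entry refers to: more bins preserve more detail, fewer bins leave more values per group.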