sklearn.ensemble.RandomTreesEmbedding
A sklearn.ensemble.RandomTreesEmbedding is a Totally Random Trees Embedding System within the sklearn.ensemble module.
- AKA: RandomTreesEmbedding.
- Context
- Usage:
- 1) Import the Totally Random Trees Embedding System from scikit-learn:
from sklearn.ensemble import RandomTreesEmbedding
- 2) Generate training data or load the observations dataset:
X, y
- 3) Create a Totally Random Trees Embedding System object:
rt = RandomTreesEmbedding([n_estimators=10, max_depth=5, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, ...])
- 4) Choose method(s) (a sketch exercising several of them follows this list):
apply(X), applies trees in the forest to X, returns leaf indices.
decision_path(X), returns the decision path in the forest.
fit(X[, y, sample_weight]), fits estimator.
fit_transform(X[, y, sample_weight]), fits estimator and transforms dataset.
get_params([deep]), gets parameters for this estimator.
set_params(**params), sets the parameters of this estimator.
transform(X), transforms dataset.
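A minimal sketch exercising several of these methods on synthetic data; the parameter values and variable names are illustrative, not prescriptive:
import numpy as np
from sklearn.ensemble import RandomTreesEmbedding
X = np.random.RandomState(0).rand(100, 4)  # synthetic, unlabeled observations
rt = RandomTreesEmbedding(n_estimators=10, max_depth=5, random_state=0)
rt.fit(X)  # unsupervised: no y is needed
leaves = rt.apply(X)  # shape (100, 10): one leaf index per sample per tree
indicator, n_nodes_ptr = rt.decision_path(X)  # sparse node-indicator matrix plus per-tree offsets
print(rt.get_params()["n_estimators"])  # -> 10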
- Example(s):
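A minimal end-to-end sketch following the usage steps above (synthetic data; all names and parameter values are illustrative):
import numpy as np
from sklearn.ensemble import RandomTreesEmbedding  # step 1
X = np.random.RandomState(42).rand(200, 3)  # step 2: observations (y is not needed)
rt = RandomTreesEmbedding(n_estimators=10, max_depth=5, random_state=42)  # step 3
X_sparse = rt.fit_transform(X)  # step 4: sparse one-hot leaf coding
assert (X_sparse.sum(axis=1) == 10).all()  # exactly one active leaf per tree per sample
print(X_sparse.shape)  # (200, n_out), with n_out <= 10 * 2 ** 5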
- Counter-Example(s):
sklearn.ensemble.ExtraTreesRegressor,
sklearn.ensemble.ExtraTreesClassifier,
sklearn.ensemble.AdaBoostClassifier,
sklearn.ensemble.AdaBoostRegressor,
sklearn.ensemble.BaggingClassifier,
sklearn.ensemble.BaggingRegressor,
sklearn.ensemble.GradientBoostingClassifier,
sklearn.ensemble.GradientBoostingRegressor,
sklearn.ensemble.RandomForestClassifier,
sklearn.ensemble.RandomForestRegressor,
sklearn.ensemble.IsolationForest,
sklearn.ensemble.VotingClassifier.
- See: Decision Tree, Decision Tree Ensemble Learning System, Regression System, Regularization Task, Ridge Regression Task, Random Forests System, Regression Algorithm.
References
2017a
- (Scikit Learn, 2017A) ⇒ http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomTreesEmbedding.html Retrieved: 2017-10-29.
- QUOTE:
class sklearn.ensemble.RandomTreesEmbedding(n_estimators=10, max_depth=5, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, sparse_output=True, n_jobs=1, random_state=None, verbose=0, warm_start=False)
An ensemble of totally random trees.
An unsupervised transformation of a dataset to a high-dimensional sparse representation. A datapoint is coded according to which leaf of each tree it is sorted into. Using a one-hot encoding of the leaves, this leads to a binary coding with as many ones as there are trees in the forest.
The dimensionality of the resulting representation is n_out <= n_estimators * max_leaf_nodes. If max_leaf_nodes == None, the number of leaf nodes is at most n_estimators * 2 ** max_depth.
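A small sketch (illustrative data and settings) that checks the dimensionality bounds quoted above via the width of the transformed output:
import numpy as np
from sklearn.ensemble import RandomTreesEmbedding
X = np.random.RandomState(0).rand(500, 5)
# With max_leaf_nodes set: n_out <= n_estimators * max_leaf_nodes.
rt = RandomTreesEmbedding(n_estimators=10, max_leaf_nodes=8, random_state=0)
assert rt.fit_transform(X).shape[1] <= 10 * 8
# With max_leaf_nodes=None: at most n_estimators * 2 ** max_depth leaves.
rt = RandomTreesEmbedding(n_estimators=10, max_depth=5, max_leaf_nodes=None, random_state=0)
assert rt.fit_transform(X).shape[1] <= 10 * 2 ** 5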
2017b
- (Scikit Learn, 2017B) ⇒ http://scikit-learn.org/stable/modules/ensemble.html#random-trees-embedding Retrieved: 2017-10-29.
- QUOTE:
RandomTreesEmbedding implements an unsupervised transformation of the data. Using a forest of completely random trees, RandomTreesEmbedding encodes the data by the indices of the leaves a data point ends up in. This index is then encoded in a one-of-K manner, leading to a high dimensional, sparse binary coding. This coding can be computed very efficiently and can then be used as a basis for other learning tasks. The size and sparsity of the code can be influenced by choosing the number of trees and the maximum depth per tree. For each tree in the ensemble, the coding contains one entry of one. The size of the coding is at most n_estimators * 2 ** max_depth, the maximum number of leaves in the forest. As neighboring data points are more likely to lie within the same leaf of a tree, the transformation performs an implicit, non-parametric density estimation.
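As a sketch of using the coding "as a basis for other learning tasks", the embedding can feed a simple classifier operating on the sparse binary features; the dataset and downstream estimator here (make_circles, BernoulliNB) are illustrative choices, not the only ones:
from sklearn.datasets import make_circles
from sklearn.ensemble import RandomTreesEmbedding
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline
# Two concentric circles: not linearly separable in the raw 2-D space.
X, y = make_circles(factor=0.5, noise=0.05, random_state=0)
model = make_pipeline(
    RandomTreesEmbedding(n_estimators=10, max_depth=3, random_state=0),
    BernoulliNB(),  # operates directly on the sparse binary leaf coding
)
model.fit(X, y)
print(model.score(X, y))  # typically high: leaves capture local neighborhoods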
2007
- (Moosmann et al., 2007) ⇒ Moosmann, F., Triggs, B., & Jurie, F. (2007). Fast discriminative visual codebooks using randomized clustering forests. In Advances in Neural Information Processing Systems (pp. 985-992).
- ABSTRACT: Some of the most effective recent methods for content-based image classification work by extracting dense or sparse local image descriptors, quantizing them according to a coding rule such as k-means vector quantization, accumulating histograms of the resulting “visual word” codes over the image, and classifying these with a conventional classifier such as an SVM. Large numbers of descriptors and large codebooks are needed for good results and this becomes slow using k-means. We introduce Extremely Randomized Clustering Forests – ensembles of randomly created clustering trees – and show that these provide more accurate results, much faster training and testing and good resistance to background clutter in several state-of-the-art image classification tasks.
2006
- (Geurts et al., 2006) ⇒ Geurts, P., Ernst, D., & Wehenkel, L. (2006). Extremely randomized trees. Machine learning, 63(1), 3-42. https://doi.org/10.1007/s10994-006-6226-1
- ABSTRACT: This paper proposes a new tree-based ensemble method for supervised classification and regression problems. It essentially consists of randomizing strongly both attribute and cut-point choice while splitting a tree node. In the extreme case, it builds totally randomized trees whose structures are independent of the output values of the learning sample. The strength of the randomization can be tuned to problem specifics by the appropriate choice of a parameter. We evaluate the robustness of the default choice of this parameter, and we also provide insight on how to adjust it in particular situations. Besides accuracy, the main strength of the resulting algorithm is computational efficiency. A bias/variance analysis of the Extra-Trees algorithm is also provided as well as a geometrical and a kernel characterization of the models induced.