Inductive Bias
An Inductive Bias is a set of assumptions that a Machine Learning System uses to predict outputs for inputs it has not yet encountered.
- AKA: Learning Bias.
- Example(s):
  - Occam's Razor, a preference for the simplest hypothesis consistent with the training data.
  - the prior distribution in a Bayesian analysis.
- Counter-Example(s):
- See: Induction, Learning as Search, Dynamic Bias Selection, Bias Variance Decomposition, Domain-Specific Representation.
References
2023
- (Wikipedia, 2023) ⇒ https://en.wikipedia.org/wiki/Inductive_bias Retrieved:2023-8-9.
- The inductive bias (also known as learning bias) of a learning algorithm is the set of assumptions that the learner uses to predict outputs of given inputs that it has not encountered.[1]
Inductive bias is anything that makes the algorithm learn one pattern instead of another (e.g., the step functions of a decision tree instead of the continuous functions of a linear regression model).
In machine learning, one aims to construct algorithms that are able to learn to predict a certain target output. To achieve this, the learning algorithm is presented with training examples that demonstrate the intended relation of input and output values. The learner is then expected to approximate the correct output, even for examples that were not shown during training. Without additional assumptions, this problem cannot be solved, since unseen situations might have an arbitrary output value. The necessary assumptions about the nature of the target function are subsumed under the phrase inductive bias.[2]
A classical example of an inductive bias is Occam's razor, which assumes that the simplest consistent hypothesis about the target function is actually the best. Here consistent means that the learner's hypothesis yields correct outputs for all of the examples given to the algorithm.
Approaches to a more formal definition of inductive bias are based on mathematical logic. Here, the inductive bias is a logical formula that, together with the training data, logically entails the hypothesis generated by the learner. However, this strict formalism fails in many practical cases, where the inductive bias can only be given as a rough description (e.g. in the case of artificial neural networks), or not at all.
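The step-function versus continuous-function contrast quoted above can be made concrete with a small sketch. Below, two learners are fit to the same synthetic training data using scikit-learn (the dataset and parameters are illustrative); their different inductive biases lead to different predictions on an input outside the training range:

```python
# A minimal sketch with synthetic data: two learners fit the same points,
# but their inductive biases yield different predictions on unseen inputs.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X_train = rng.uniform(0, 5, size=(30, 1))
y_train = 2.0 * X_train.ravel() + rng.normal(0, 0.3, size=30)

# Bias toward piecewise-constant (step) functions:
tree = DecisionTreeRegressor().fit(X_train, y_train)
# Bias toward a single global linear function:
line = LinearRegression().fit(X_train, y_train)

X_unseen = np.array([[8.0]])  # outside the training range [0, 5]
print(tree.predict(X_unseen))  # extrapolates flatly from the nearest leaf (~10)
print(line.predict(X_unseen))  # extrapolates the learned linear trend (~16)
```

Both models are consistent with the training data in the Occam's-razor sense above, yet they commit to different assumptions about inputs they have never seen; that commitment is the inductive bias.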
2017
- (Sammut & Webb, 2017) ⇒ Claude Sammut, and Geoffrey I. Webb (eds.). (2017). "Inductive Bias". In: Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA.
- QUOTE: Most ML algorithms make predictions concerning future data which cannot be deduced from already observed data. The inductive bias of an algorithm is what chooses between different possible future predictions. A strong form of inductive bias is the learner's choice of hypothesis/model space, which is sometimes called declarative bias. In the case of Bayesian analysis, the inductive bias is encapsulated in the prior distribution.
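The closing remark above, that a Bayesian prior encapsulates the inductive bias, can be illustrated with a standard identity: under a zero-mean Gaussian prior on the weights of a linear model, the MAP estimate is the ridge-regression solution. A minimal NumPy sketch (the data and the prior strength alpha are illustrative assumptions):

```python
# Sketch: a zero-mean Gaussian prior on the weights biases learning toward
# small coefficients; its MAP estimate is the ridge solution (synthetic data).
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
w_true = np.array([1.0, 0.0, -1.0])
y = X @ w_true + rng.normal(0, 0.1, size=20)

alpha = 1.0  # prior precision: larger alpha = stronger pull toward w = 0
w_map = np.linalg.solve(X.T @ X + alpha * np.eye(3), X.T @ y)  # ridge / MAP
w_mle = np.linalg.solve(X.T @ X, X.T @ y)  # maximum likelihood: no prior

print(w_mle)  # fits the observed data alone
print(w_map)  # shrunk toward zero by the prior (the inductive bias)
```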
2016
- (Cohen & Shashua, 2016) ⇒ Nadav Cohen, and Amnon Shashua (2016). "Inductive Bias of Deep Convolutional Networks Through Pooling Geometry". arXiv preprint arXiv:1605.06743.
- ABSTRACT: Our formal understanding of the inductive bias that drives the success of convolutional networks on computer vision tasks is limited. In particular, it is unclear what makes hypotheses spaces born from convolution and pooling operations so suitable for natural images. In this paper we study the ability of convolutional networks to model correlations among regions of their input. We theoretically analyze convolutional arithmetic circuits, and empirically validate our findings on other types of convolutional networks as well. Correlations are formalized through the notion of separation rank, which for a given partition of the input, measures how far a function is from being separable. We show that a polynomially sized deep network supports exponentially high separation ranks for certain input partitions, while being limited to polynomial separation ranks for others. The network's pooling geometry effectively determines which input partitions are favored, thus serves as a means for controlling the inductive bias. Contiguous pooling windows as commonly employed in practice favor interleaved partitions over coarse ones, orienting the inductive bias towards the statistics of natural images. Other pooling schemes lead to different preferences, and this allows tailoring the network to data that departs from the usual domain of natural imagery. In addition to analyzing deep networks, we show that shallow ones support only linear separation ranks, and by this gain insight into the benefit of functions brought forth by depth - they are able to efficiently model strong correlation under favored partitions of the input.
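The paper's notion of pooling geometry as a controllable inductive bias can be illustrated with a toy sketch (an illustration only, not the paper's convolutional arithmetic circuits): contiguous pooling windows merge neighboring activations, while a permuted geometry merges distant ones, so the choice of windows determines which input partitions the model treats jointly:

```python
# Toy sketch of pooling geometry as an inductive-bias knob: contiguous
# windows pool neighbors, an interleaved geometry pools distant positions.
import numpy as np

x = np.arange(8, dtype=float)  # a 1-D row of activations: [0, 1, ..., 7]

def pool(values, groups):
    """Max-pool `values` over the given index groups."""
    return np.array([values[list(g)].max() for g in groups])

contiguous = [(0, 1), (2, 3), (4, 5), (6, 7)]   # standard local pooling
interleaved = [(0, 4), (1, 5), (2, 6), (3, 7)]  # pairs positions 4 apart

print(pool(x, contiguous))   # [1. 3. 5. 7.] - mixes neighboring entries
print(pool(x, interleaved))  # [4. 5. 6. 7.] - mixes distant entries
```

Which geometry is preferable depends on the correlation structure of the data, which is the sense in which pooling geometry orients the inductive bias toward, or away from, the statistics of natural images.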
2014
- (Neyshabur, Tomioka & Srebro, 2014) ⇒ Behnam Neyshabur, Ryota Tomioka, and Nathan Srebro (2014). "In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning". arXiv preprint arXiv:1412.6614.
- ABSTRACT: We present experiments demonstrating that some other form of capacity control, different from network size, plays a central role in learning multilayer feed-forward networks. We argue, partially through analogy to matrix factorization, that this is an inductive bias that can help shed light on deep learning.
2011
- (Sammut & Webb, 2011) ⇒ Claude Sammut (editor), and Geoffrey I. Webb (editor). (2011). "Inductive Bias". In: Encyclopedia of Machine Learning, p. 522. Springer, Boston, MA.
1998
- (Caruana, 1998) ⇒ Richard Caruana. (1998). "Multitask Learning". In: Learning to Learn (pp. 95-133). Springer, Boston, MA.
- ABSTRACT: This paper suggests that it may be easier to learn several hard tasks at one time than to learn these same tasks separately. In effect, the information provided by the training signal for each task serves as a domain-specific inductive bias for the other tasks. Frequently the world gives us clusters of related tasks to learn. When it does not, it is often straightforward to create additional tasks. For many domains, acquiring inductive bias by collecting additional teaching signal may be more practical than the traditional approach of codifying domain-specific biases acquired from human expertise. We call this approach Multitask Learning (MTL). Since much of the power of an inductive learner follows directly from its inductive bias, multitask learning may yield more powerful learning. An empirical example of multitask connectionist learning is presented where learning improves by training one network on several related tasks at the same time. Multitask decision tree induction is also outlined.
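A common concrete realization of the idea described above is hard parameter sharing: a single shared trunk with one output head per task, so every task's training signal shapes the shared representation. A minimal PyTorch sketch (the layer sizes, task count, and random data are illustrative assumptions):

```python
# Sketch of hard parameter sharing for multitask learning: the shared trunk
# receives gradients from every task head, so each task's training signal
# acts as an inductive bias on the shared representation (illustrative sizes).
import torch
import torch.nn as nn

class MultitaskNet(nn.Module):
    def __init__(self, in_dim=16, hidden=32, n_tasks=3):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(hidden, 1) for _ in range(n_tasks))

    def forward(self, x):
        h = self.trunk(x)  # representation shared by all tasks
        return [head(h) for head in self.heads]

net = MultitaskNet()
x = torch.randn(8, 16)                           # a batch of 8 examples
targets = [torch.randn(8, 1) for _ in range(3)]  # one target per task
loss = sum(nn.functional.mse_loss(out, t)
           for out, t in zip(net(x), targets))   # summed per-task losses
loss.backward()  # trunk gradients accumulate contributions from all tasks
```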