Software-based Prediction Structure

From GM-RKB

A Software-based Prediction Structure is a prediction artifact that is a software structure (which maps a learning test record to a target value).
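
As a rough illustration of the definition above, the sketch below stores a few fitted parameters in a software structure and uses them to map one test record to a target value. The class name `LinearPredictionStructure`, the field names, and the weights are all hypothetical, and a linear form is only one of many possible prediction structures.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class LinearPredictionStructure:
    """Hypothetical prediction structure: stored parameters plus a predict method."""

    intercept: float
    weights: Mapping[str, float]

    def predict(self, record: Mapping[str, float]) -> float:
        # Map one test record (field name -> value) to a target value.
        return self.intercept + sum(
            weight * record[name] for name, weight in self.weights.items()
        )


# Illustrative parameters; in practice a learning system would produce them.
model = LinearPredictionStructure(intercept=1.0, weights={"x1": 2.0, "x2": -0.5})
prediction = model.predict({"x1": 3.0, "x2": 4.0})  # 1.0 + 2.0*3.0 - 0.5*4.0
print(prediction)
```

The structure here is only the stored parameters and the mapping; how the parameters were learned is outside it.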



References

2014

  • (Wikipedia, 2014) ⇒ http://en.wikipedia.org/wiki/Predictive_Model_Markup_Language#PMML_Components Retrieved:2014-8-12.
    • A PMML file can be described by the following components:
      • Header: contains general information about the PMML document, such as copyright information for the model, its description, and information about the application used to generate the model such as name and version. ...
      • Data Dictionary: contains definitions for all the possible fields used by the model. ...
      • Data Transformations: transformations allow for the mapping of user data into a more desirable form to be used by the mining model. ...
      • Model: contains the definition of the data mining model. E.g., a multi-layered feedforward neural network is represented in PMML by a "NeuralNetwork" element which contains attributes such as:
        • Model Name (attribute modelName)
        • Function Name (attribute functionName)
        • Algorithm Name (attribute algorithmName)
      • Mining Schema: lists all the fields used in the model. It contains specific information about each field, such as:
        • Name (attribute name): must refer to a field in the data dictionary
        • Usage type (attribute usageType): defines the way a field is to be used in the model. Typical values are: active, predicted, and supplementary. Predicted fields are those whose values are predicted by the model.
        • Outlier Treatment (attribute outliers): defines the outlier treatment to be used. In PMML, outliers can be treated as missing values, as extreme values (based on the definition of high and low values for a particular field), or as is.
        • Missing Value Replacement Policy (attribute missingValueReplacement): if this attribute is specified then a missing value is automatically replaced by the given values.
        • Missing Value Treatment (attribute missingValueTreatment): indicates how the missing value replacement was derived (e.g. as value, mean or median).
      • Targets: allows for post-processing of the predicted value in the format of scaling if the output of the model is continuous. Targets can also be used for classification tasks. In this case, the attribute priorProbability specifies a default probability for the corresponding target category. ...
      • Output: this element can be used to name all the desired output fields expected from the model. ...
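
The component list above can be made concrete with a small parsing sketch. It reads a hand-written, heavily abbreviated PMML-like fragment with Python's standard `xml.etree.ElementTree` and reports which field the model predicts. The fragment is an assumption made for illustration: it omits the XML namespace that real PMML files declare, it leaves out the network layers, and it is not validated against the PMML schema. The field names and values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, abbreviated fragment for illustration only (no namespace,
# no network layers, not schema-valid). Field names are hypothetical.
PMML_TEXT = """
<PMML version="4.1">
  <Header description="toy model"/>
  <DataDictionary>
    <DataField name="sepal_length" optype="continuous" dataType="double"/>
    <DataField name="species" optype="categorical" dataType="string"/>
  </DataDictionary>
  <NeuralNetwork modelName="toy_nn" functionName="classification">
    <MiningSchema>
      <MiningField name="sepal_length" usageType="active"
                   missingValueReplacement="5.8"/>
      <MiningField name="species" usageType="predicted"/>
    </MiningSchema>
  </NeuralNetwork>
</PMML>
"""

root = ET.fromstring(PMML_TEXT.strip())

# Data Dictionary: every field the model may refer to.
declared = {field.get("name") for field in root.iter("DataField")}

# Model element and its mining schema: how each field is used.
model = root.find("NeuralNetwork")
usage = {f.get("name"): f.get("usageType") for f in model.iter("MiningField")}

predicted = [name for name, kind in usage.items() if kind == "predicted"]
undeclared = set(usage) - declared  # mining fields missing from the dictionary

print(model.get("modelName"), model.get("functionName"))
print("predicted fields:", predicted)
print("undeclared fields:", sorted(undeclared))
```

The check on `undeclared` mirrors the rule that a mining field's name must refer to a field in the data dictionary.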


2011

  1. Geisser, Seymour (1993). Predictive Inference: An Introduction. New York: Chapman & Hall. ISBN 0-412-03471-9. 

1997

  • (Mitchell, 1997) ⇒ Tom M. Mitchell. (1997). “Machine Learning.” McGraw-Hill.
    • QUOTE: 1.2.2 Choosing the Target Function. The next design choice is to determine exactly what type of knowledge will be learned and how this will be used by the performance program. ... Let us call this target function [math]\displaystyle{ V }[/math] and again use the notation [math]\displaystyle{ V : B \rightarrow \Re }[/math] to denote that [math]\displaystyle{ V }[/math] maps any legal board state from the set [math]\displaystyle{ B }[/math] to some real value. We intend for this target function [math]\displaystyle{ V }[/math] to assign higher scores to better board states ... Thus, we have reduced the learning task in this case to the problem of discovering an operational description of the ideal target function [math]\displaystyle{ V }[/math]. It may be very difficult in general to learn such an operational form of [math]\displaystyle{ V }[/math] perfectly. In fact we often expect learning algorithms to acquire only some approximation to the target function, and for this reason the process of learning the target function is often called function approximation. In the current discussion we will use the symbol [math]\displaystyle{ \hat{V} }[/math] to refer to the function that is actually learned by our program, to distinguish it from the ideal target function [math]\displaystyle{ V }[/math]. ...

      The key point in the above paragraph is that a lazy learner has the option of (implicitly) representing the target function by a combination of many local approximations, whereas an eager learner must commit at training time to a single global approximation. The distinction between eager and lazy learning is thus related to the distinction between global and local approximations to the target function.
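
The contrast between a single global approximation and many local ones can be sketched in a few lines of plain Python. Everything in the example is an assumption chosen for illustration: the toy target function `target` (x squared), the five training points, a least-squares line standing in for the eager learner, and a 2-nearest-neighbour average standing in for the lazy learner. Neither is taken from Mitchell's text.

```python
def target(x: float) -> float:
    """Hypothetical ideal target function, used here only to generate data."""
    return x * x


train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [target(x) for x in train_x]


def fit_eager(xs, ys):
    """Eager learner: commit at training time to one global line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    # Only the two fitted parameters survive; the training data is discarded.
    return lambda x: intercept + slope * x


def fit_lazy(xs, ys, k=2):
    """Lazy learner: store the examples and approximate locally per query."""
    examples = list(zip(xs, ys))

    def predict(x):
        # Build a local approximation from the k stored examples nearest to x.
        nearest = sorted(examples, key=lambda e: abs(e[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


eager = fit_eager(train_x, train_y)
lazy = fit_lazy(train_x, train_y)

query = 2.5
print("target:", target(query))
print("eager (global line):", eager(query))
print("lazy (local average):", lazy(query))
```

Both returned functions are learned approximations of the target function. On this curved toy target the local average happens to track the query point more closely than the single line, but that outcome depends on the data and the query and should not be read as a general ranking of the two approaches.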