Software-based Prediction Structure
A Software-based Prediction Structure is a prediction artifact realized as a software structure (one that maps a test record to a target value).
- AKA: Target Prediction Model.
- Context:
- input: a Data Record.
- output: one or more Predicted Values.
- Optional Structure Output: A Confidence Score (such as a probability value)
- measure:
- Prediction Accuracy (such as F-measure or Root Mean Square Error).
- It can (typically) make use of Predictor Feature Functions.
- It can be represented by a Predictive Modeling Representation Language, such as PMML.
- It can range from being a Number Prediction Structure to being a Rank Prediction Structure to being a Class Prediction Structure (e.g. a sequence tagging model, or a relational predictive model).
- It can range from being a Heuristic Prediction Structure to being a Trained Prediction Structure.
- It can range from being a Model-based Prediction Structure to being an Instance-based Prediction Structure.
- It can range from being a Memory-based Prediction Structure to being a File-based Prediction Structure.
- It can range from being a Simple Predictive Function to being a Composite Predictive Model (such as an ensemble model or a joint predictive model).
- It can range from being an Ad-hoc Predictive Model to being a Productionized Predictive Model.
- It can range, depending on the Target Data Type, from being a Predictive Relation (if the output set is binary), to being a Predictive Classifier (if the output set is categorical), to being a Predictive Ranker, to being a Predictive Estimator.
- It can range from being a Champion Predictive Function to being a Challenger Predictive Function.
- It can range from being an Overfitted Predictive Model to being an Underfitted Predictive Model.
- It can range from being a Small Predictive Model to being a Large Predictive Model.
- It can have a Domain/Coverage that defines/restricts the type of Testing Records it accepts.
- It can be used to predict Future States.
- It can be created by a Predictive Model Creation Task.
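The input/output contract described in the context above (a data record in, one or more predicted values out, with an optional confidence score) can be sketched as a minimal software interface. This is an illustrative sketch only; all class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

@dataclass
class Prediction:
    """Output of a prediction structure: a value plus an optional confidence."""
    value: Any                           # the predicted target value
    confidence: Optional[float] = None   # e.g. a probability

class PredictionStructure:
    """A minimal software structure mapping a data record to a prediction."""
    def predict(self, record: Mapping[str, Any]) -> Prediction:
        raise NotImplementedError

# A trivial heuristic instance (illustrative only, thresholds made up):
class ThresholdSpamPredictor(PredictionStructure):
    def predict(self, record):
        score = record.get("spam_word_count", 0) / max(record.get("word_count", 1), 1)
        return Prediction(value=score > 0.2, confidence=min(score, 1.0))

p = ThresholdSpamPredictor().predict({"spam_word_count": 5, "word_count": 20})
```

A trained (rather than heuristic) prediction structure would replace the hardcoded threshold with parameters produced by a predictive model creation task.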
- Example(s):
- pCTR(), pEPC(), ...
- a Financial Delinquency Predictive Model.
- a Predictive Timeseries Model / Forecasting Model.
- an Item Relevance Scoring Model.
- a Spam Filtering Model.
- a Single Model-based Predictive Model, such as a: Decision Tree Structure, a Linear Predictor, a logistic regression-based predictive model, or a kNN-based predictive model (depending on which predictive modeling algorithm is used).
- a scikit-learn Model, such as a scikit-learn Model File.
- a PMML-based Model File.
- …
- Counter-Example(s):
- See: Predictive Operation, Decision Function, Non-Monotonic Reasoning, Justified Belief, Fact, Probability Function.
References
2016
- Jason Brownlee. (2016). “Deploy Your Predictive Model To Production.” In: Machine Learning Process, 2016-09-30.
- QUOTE: Not all predictive models are at Google-scale. Sometimes you develop a small predictive model that you want to put in your software. ...
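Brownlee's point about small models can be illustrated by embedding learned coefficients directly in application code rather than shipping a serialized model artifact. The weights below are hypothetical placeholders, not output of any real training run.

```python
import math

# Hypothetical coefficients from an offline-trained logistic regression.
# For a small enough model, embedding the weights directly in software
# can be simpler than deploying a serialized model file.
WEIGHTS = {"intercept": -1.5, "num_links": 0.8, "has_attachment": 1.2}

def predict_spam_probability(num_links: int, has_attachment: bool) -> float:
    z = (WEIGHTS["intercept"]
         + WEIGHTS["num_links"] * num_links
         + WEIGHTS["has_attachment"] * float(has_attachment))
    return 1.0 / (1.0 + math.exp(-z))  # logistic sigmoid

prob = predict_spam_probability(num_links=3, has_attachment=True)
```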
2014
- (Wikipedia, 2014) ⇒ http://en.wikipedia.org/wiki/Predictive_Model_Markup_Language#PMML_Components Retrieved:2014-8-12.
- A PMML file can be described by the following components:
- Header: contains general information about the PMML document, such as copyright information for the model, its description, and information about the application used to generate the model such as name and version. ...
- Data Dictionary: contains definitions for all the possible fields used by the model. ...
- Data Transformations: transformations allow for the mapping of user data into a more desirable form to be used by the mining model. ...
- Model: contains the definition of the data mining model. E.g., A multi-layered feedforward neural network is represented in PMML by a "NeuralNetwork" element which contains attributes such as:
- Model Name (attribute modelName)
- Function Name (attribute functionName)
- Algorithm Name (attribute algorithmName)
- …
- Name (attribute name): must refer to a field in the data dictionary
- Usage type (attribute usageType): defines the way a field is to be used in the model. Typical values are: active, predicted, and supplementary. Predicted fields are those whose values are predicted by the model.
- Outlier Treatment (attribute outliers): defines the outlier treatment to be used. In PMML, outliers can be treated as missing values, as extreme values (based on the definition of high and low values for a particular field), or as is.
- Missing Value Replacement Policy (attribute missingValueReplacement): if this attribute is specified then a missing value is automatically replaced by the given values.
- Missing Value Treatment (attribute missingValueTreatment): indicates how the missing value replacement was derived (e.g. as value, mean or median).
- Targets: allows for post-processing of the predicted value in the format of scaling if the output of the model is continuous. Targets can also be used for classification tasks. In this case, the attribute priorProbability specifies a default probability for the corresponding target category. ...
- Output: this element can be used to name all the desired output fields expected from the model. ...
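The components listed above can be seen in a toy, schema-incomplete PMML fragment, parsed here with Python's standard XML library. The field and model names are made up for illustration and the fragment omits required elements such as a MiningSchema.

```python
import xml.etree.ElementTree as ET

# Toy PMML-like fragment (not schema-valid) showing a Header, a
# DataDictionary, and a model element carrying modelName/functionName.
PMML = """
<PMML version="4.2">
  <Header copyright="example" description="toy model"/>
  <DataDictionary numberOfFields="2">
    <DataField name="x1" optype="continuous" dataType="double"/>
    <DataField name="y" optype="categorical" dataType="string"/>
  </DataDictionary>
  <RegressionModel modelName="toy" functionName="regression"/>
</PMML>
"""

root = ET.fromstring(PMML)
fields = [f.get("name") for f in root.find("DataDictionary")]
model = root.find("RegressionModel")
```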
2013
- http://scikit-learn.org/stable/modules/model_persistence.html#model-persistence
- QUOTE: After training a scikit-learn model, it is desirable to have a way to persist the model for future use without having to retrain.
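The persistence idea in the scikit-learn quote can be sketched with the stdlib pickle module (the scikit-learn documentation also discusses joblib for large numpy-backed estimators). Here a plain dict stands in for a fitted estimator so the sketch stays dependency-free.

```python
import os
import pickle
import tempfile

# A plain dict standing in for a fitted model's learned parameters.
model = {"coef": [0.5, -1.2], "intercept": 0.1}

# Persist the model to a file...
with tempfile.NamedTemporaryFile(suffix=".pkl", delete=False) as f:
    pickle.dump(model, f)
    path = f.name

# ...and restore it later without retraining.
with open(path, "rb") as f:
    restored = pickle.load(f)
os.remove(path)
```

Note that unpickling executes arbitrary code, so persisted models should only be loaded from trusted sources.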
2011
- http://en.wikipedia.org/wiki/Predictive_modelling
- Predictive modelling is the process by which a model is created or chosen to try to best predict the probability of an outcome.[1] In many cases the model is chosen on the basis of detection theory to try to guess the probability of an outcome given a set amount of input data, for example given an email determining how likely that it is spam.
Models can use one or more classifiers in trying to determine the probability of a set of data belonging to another set, say spam or 'ham'.
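The spam/'ham' example above can be sketched with a toy Naive Bayes word model. All training counts and priors below are made up, and the smoothing is simple add-one over the combined vocabulary.

```python
import math
from collections import Counter

# Made-up per-class word counts standing in for a trained model.
spam_counts = Counter({"free": 4, "money": 3, "meeting": 1})
ham_counts = Counter({"meeting": 5, "report": 4, "free": 1})
prior_spam, prior_ham = 0.5, 0.5

def log_likelihood(words, counts):
    """Sum of log P(word | class) with add-one smoothing."""
    total = sum(counts.values())
    vocab = set(spam_counts) | set(ham_counts)
    return sum(math.log((counts[w] + 1) / (total + len(vocab))) for w in words)

def p_spam(words):
    """P(spam | words) via Bayes' rule over the two classes."""
    ls = math.log(prior_spam) + log_likelihood(words, spam_counts)
    lh = math.log(prior_ham) + log_likelihood(words, ham_counts)
    return 1.0 / (1.0 + math.exp(lh - ls))

prob = p_spam(["free", "money"])
```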
- ↑ Geisser, Seymour (1993). Predictive Inference: An Introduction. New York: Chapman & Hall. ISBN 0-412-03471-9.
1999
- (Zaiane, 1999) ⇒ Osmar Zaiane. (1999). “Glossary of Data Mining Terms." University of Alberta, Computing Science CMPUT-690: Principles of Knowledge Discovery in Databases.
- QUOTE: Predictive Model: A structure and process for predicting the values of specified variables in a dataset.
1997
- (Mitchell, 1997) ⇒ Tom M. Mitchell. (1997). “Machine Learning." McGraw-Hill.
- QUOTE: 1.2.2 Choosing the Target Function. The next design choice is to determine exactly what type of knowledge will be learned and how this will be used by the performance program. ... Let us call this target function [math]\displaystyle{ V }[/math] and again use the notation [math]\displaystyle{ V }[/math] : [math]\displaystyle{ B }[/math] → R to denote that [math]\displaystyle{ V }[/math] maps any legal board state from the set [math]\displaystyle{ B }[/math] to some real value. We intend for this target function [math]\displaystyle{ V }[/math] to assign higher scores to better board states ... Thus, we have reduced the learning task in this case to the problem of discovering an operational description of the ideal target function [math]\displaystyle{ V }[/math]. It may be very difficult in general to learn such an operational form of [math]\displaystyle{ V }[/math] perfectly. In fact we often expect learning algorithms to acquire only some approximation to the target function, and for this reason the process of learning the target function is often called function approximation. In the current discussion we will use the symbol [math]\displaystyle{ \hat{V} }[/math] to refer to the function that is actually learned by our program, to distinguish it from the ideal target function [math]\displaystyle{ V }[/math]. ...
The key point in the above paragraph is that a lazy learner has the option of (implicitly) representing the target function by a combination of many local approximations, whereas an eager learner must commit at training time to a single global approximation. The distinction between eager and lazy learning is thus related to the distinction between global and local approximations to the target function.
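Mitchell's checkers example represents the learned approximation as a linear combination of board features. The sketch below follows that form; the specific feature values and weights are arbitrary placeholders, not figures from the book.

```python
# Linear representation of a learned target-function approximation,
# in the style of Mitchell's (1997) checkers example:
#   V_hat(b) = w0 + sum_i w_i * x_i(b)
def v_hat(features, weights):
    """Evaluate the linear approximation for one board state."""
    w0, *ws = weights
    return w0 + sum(w * x for w, x in zip(ws, features))

# x1..x3 might be, e.g., counts of black pieces, red pieces, black kings
# (toy values); the weights are placeholders a learner would adjust.
score = v_hat([12, 11, 0], [0.5, 1.0, -1.0, 2.0])
```

Training such a structure amounts to adjusting the weights so that higher scores are assigned to better board states.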