Stochastic Decision Process
A Stochastic Decision Process is a decision process whose state evolves as a stochastic process, so that the outcome of each chosen action is uncertain.
- AKA: Sequential Planning Under Uncertainty.
- Context:
- It can range from being a Discrete-Time Stochastic Decision Process to being a Continuous-Time Stochastic Decision Process.
- It can range from being a Finite Stochastic Decision Process to being an Infinite Stochastic Decision Process.
- Example(s):
- Counter-Example(s):
- See: Markovian Stochastic Decision Process.
References
2015
- (Bäuerle & Riess, 2015) ⇒ Nicole Bäuerle, and Viola Riess. (2015). “On Markov Decision Processes.” In: SIAM News Journal, June 01, 2015.
- QUOTE: Sequential planning under uncertainty is a basic optimization problem that arises in many different settings, ranging from artificial intelligence to operations research. In a generic system, we have an agent who chooses among different actions and then receives a reward, after which the system moves on in a stochastic way. Usually the aim is to maximize the expected (discounted) reward of the system over a finite or, in certain cases, as described below, an infinite time horizon.
To obtain a tractable problem, it is often assumed that the transition law of the underlying state process is Markovian, i.e., that only the current state has an influence on future states. Such a situation leads to a Markov decision process (MDP); textbooks on MDPs include [1, 3, 5, 7]. MDPs differ from general stochastic control problems in that the actions are taken at discrete time points, rather than continuously. Stochastic shortest-path problems, consumption and investment of money, allocation of resources, production planning, and harvesting problems are a few examples of MDPs.
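The setting described in the quote (an agent picks an action, receives a reward, and the system then moves on stochastically, with the aim of maximizing expected discounted reward over a finite horizon) can be sketched with backward induction on a small Markov decision process. This is a minimal illustration, not a method taken from the cited article: the two states ("low", "high"), the two actions ("wait", "invest"), all probabilities and rewards, and the function name `backward_induction` are hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy model; every number below is illustrative, not from the source.
# TRANSITIONS[state][action] = list of (probability, next_state, reward)
Outcome = Tuple[float, str, float]
Transitions = Dict[str, Dict[str, List[Outcome]]]

TRANSITIONS: Transitions = {
    "low": {
        "wait": [(1.0, "low", 0.0)],
        "invest": [(0.5, "high", -1.0), (0.5, "low", -1.0)],
    },
    "high": {
        "wait": [(0.8, "high", 2.0), (0.2, "low", 2.0)],
        "invest": [(1.0, "high", 1.0)],
    },
}


def backward_induction(
    transitions: Transitions, horizon: int, discount: float
) -> Tuple[Dict[str, float], List[Dict[str, str]]]:
    """Finite-horizon dynamic programming for a Markovian decision process.

    Returns the optimal expected discounted reward per starting state and a
    policy list where policy[t][state] is the action to take at stage t.
    """
    # With zero stages left, no further reward can be collected.
    values = {state: 0.0 for state in transitions}
    policy: List[Dict[str, str]] = []

    # Work backwards from the last decision stage to the first.
    for _ in range(horizon):
        new_values: Dict[str, float] = {}
        decision: Dict[str, str] = {}
        for state, actions in transitions.items():
            best_action, best_value = "", float("-inf")
            for action, outcomes in actions.items():
                # Expected immediate reward plus discounted future value.
                q = sum(p * (r + discount * values[nxt]) for p, nxt, r in outcomes)
                if q > best_value:
                    best_action, best_value = action, q
            new_values[state] = best_value
            decision[state] = best_action
        values = new_values
        # Stages computed later lie earlier in time, so prepend them.
        policy.insert(0, decision)

    return values, policy


if __name__ == "__main__":
    values, policy = backward_induction(TRANSITIONS, horizon=3, discount=0.9)
    print(values)
    print(policy)
```

The Markov assumption mentioned in the quote is what makes this recursion valid: the value of a state depends only on that state and the number of remaining stages, not on the path taken to reach it.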
1982
- (Pliska, 1982) ⇒ Stanley R. Pliska. (1982). “A Discrete Time Stochastic Decision Model.” In: Advances in Filtering and Optimal Stochastic Control, pp. 290-304. Springer Berlin Heidelberg.