Decision Epoch
- A Decision Epoch is a point in time at which a decision is made.
- Context:
- It can be defined in a Markov Decision Process represented by [math]\displaystyle{ (T, S, A_s, p_t(.|s,a), r_t(s,a)) }[/math] where
- [math]\displaystyle{ T = 1,\cdots, N }[/math] are the decision epochs, the set of points in time at which decisions are made
- [math]\displaystyle{ S }[/math] is a finite set of states,
- [math]\displaystyle{ A_s }[/math] is a finite set of actions (alternatively, [math]\displaystyle{ A_s }[/math] is the finite set of actions available from state [math]\displaystyle{ s }[/math]),
- [math]\displaystyle{ p_t(.|s,a) }[/math] are the transition probabilities, the probability that the system occupies each state at the next decision epoch, conditional on taking action [math]\displaystyle{ a }[/math] in state [math]\displaystyle{ s }[/math] at time (decision epoch) [math]\displaystyle{ t }[/math].
- [math]\displaystyle{ r_t(s,a) }[/math] is the reward function, the immediate result of taking action [math]\displaystyle{ a }[/math] in state [math]\displaystyle{ s }[/math] at time (decision epoch) [math]\displaystyle{ t }[/math].
- See: Markov Decision Process, Decision Maker, Finite State Space, Finite Action Space.
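The role of decision epochs in the tuple above can be sketched with a small finite-horizon MDP solved by backward induction, where a decision is made at each epoch [math]\displaystyle{ t = 1, \cdots, N }[/math]. This is an illustrative sketch only: the states, actions, probabilities, and rewards below are invented for the example, not taken from the references.

```python
# Illustrative finite-horizon MDP (T, S, A_s, p_t(.|s,a), r_t(s,a)).
# All names and numbers are hypothetical; p and r are time-homogeneous here.
N = 3                                                  # decision epochs T = 1, ..., N
S = ["low", "high"]                                    # finite state space
A = {"low": ["wait", "act"], "high": ["wait", "act"]}  # actions available per state

def p(next_s, s, a):
    """Transition probability p(next_s | s, a)."""
    if a == "act":
        return 0.8 if next_s == "high" else 0.2
    return 0.8 if next_s == s else 0.2

def r(s, a):
    """Immediate reward r(s, a): being in "high" pays 1, acting costs 0.3."""
    return (1.0 if s == "high" else 0.0) - (0.3 if a == "act" else 0.0)

# Backward induction: at each decision epoch t = N, ..., 1 the decision maker
# picks the action maximizing immediate reward plus expected value-to-go.
V = {s: 0.0 for s in S}            # terminal value after the last epoch
policy = {}
for t in range(N, 0, -1):
    V_new, policy[t] = {}, {}
    for s in S:
        q = {a: r(s, a) + sum(p(s2, s, a) * V[s2] for s2 in S) for a in A[s]}
        policy[t][s] = max(q, key=q.get)
        V_new[s] = q[policy[t][s]]
    V = V_new

print(policy[1], V)   # optimal action per state at the first decision epoch
```

Because the horizon is finite, the optimal action can differ across decision epochs even with time-homogeneous dynamics: here acting is worthwhile at early epochs (there is time to collect the reward) but not at the last one.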
References
2011
- (Sammut & Webb, 2011) ⇒ Claude Sammut (editor), and Geoffrey I. Webb (editor). (2011). “Decision Epoch.” In: Encyclopedia of Machine Learning, p. 261.
- Decision Epoch - In a Markov decision process, decision epochs are sequences of times at which the decision-maker is required to make a decision. In a discrete time Markov decision process, decision epochs occur at regular, fixed intervals, whereas in a continuous time Markov decision process (or semi-Markov decision process), they may occur at randomly distributed intervals.
2010
- (Alagoz et al., 2010) ⇒ Alagoz, O., Hsu, H., Schaefer, A. J., & Roberts, M. S. (2010). Markov decision processes: a tool for sequential decision making under uncertainty. Medical Decision Making, 30(4), 474-483. doi: 10.1177/0272989X09353194
- QUOTE: The basic definition of a discrete-time MDP contains 5 components, described using a standard notation.[1] For comparison, Table 1 lists the components of an MDP and provides the corresponding structure in a standard Markov process model. [math]\displaystyle{ T = 1,\cdots, N }[/math] are the decision epochs, the set of points in time at which decisions are made (such as days or hours); [math]\displaystyle{ S }[/math] is the state space, the set of all possible values of dynamic information relevant to the decision process; for any state [math]\displaystyle{ s \in S,\; A_s }[/math] is the action space, the set of possible actions that the decision maker can take at state [math]\displaystyle{ s; p_t(.|s,a) }[/math] are the transition probabilities, the probabilities that determine the state of the system in the next decision epoch, which are conditional on the state and action at the current decision epoch; and [math]\displaystyle{ r_t(s,a) }[/math] is the reward function, the immediate result of taking action a at state [math]\displaystyle{ s.\;(T, S, A_s, p_t(.|s,a), r_t(s,a)) }[/math] collectively define an MDP.
- ↑ Sandikci B, Maillart LM, Schaefer AJ, Alagoz O, Roberts MS. Estimating the patient's price of privacy in liver transplantation. Oper Res. 2008;56(6):1393–410.