Adaptive Real-Time Dynamic Programming (ARTDP) Algorithm
An Adaptive Real-Time Dynamic Programming (ARTDP) Algorithm is an online algorithm based on real-time dynamic programming that uses agent behavior to guide its computation and learns a model of the environment from agent-environment interaction.
- AKA: ARTDP.
- Context:
- It is based on reinforcement learning.
- Example(s):
- …
- Counter-Example(s):
- See: Anytime Algorithm; Approximate Dynamic Programming; Reinforcement Learning; System Identification.
References
2017
- (Barto, 2017) ⇒ Barto A.G. (2017) Adaptive Real-Time Dynamic Programming. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA
- QUOTE: Adaptive Real-Time Dynamic Programming (ARTDP) is an algorithm that allows an agent to improve its behavior while interacting over time with an incompletely known dynamic environment. It can also be viewed as a heuristic search algorithm for finding shortest paths in incompletely known stochastic domains. ARTDP is based on Dynamic Programming (DP), but unlike conventional DP, which consists of off-line algorithms, ARTDP is an on-line algorithm because it uses agent behavior to guide its computation. ARTDP is adaptive because it does not need a complete and accurate model of the environment but learns a model from data collected during agent-environment interaction. When a good model is available, Real-Time Dynamic Programming (RTDP) is applicable, which is ARTDP without the model-learning component.
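The two ingredients named in the quote, computation guided by agent behavior and a model learned from interaction, can be illustrated with a minimal Python sketch. It is not Barto's formulation: the function names (`artdp`, `chain_step`), the unit step cost, the epsilon-greedy exploration, the maximum-likelihood count model, and the five-state chain environment are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def artdp(step, states, actions, start, goal,
          episodes=200, max_steps=100, epsilon=0.1, seed=0):
    """Illustrative ARTDP-style loop for a stochastic shortest-path task.

    Assumptions (not from the source): every step costs 1, the model is a
    table of observed transition counts, and exploration is epsilon-greedy.
    `step(state, action, rng)` is the environment, unknown to the agent.
    """
    rng = random.Random(seed)
    # Learned model: (state, action) -> {next_state: observation count}
    counts = defaultdict(lambda: defaultdict(int))
    # Cost-to-go estimates, initialised optimistically at zero
    cost_to_go = {s: 0.0 for s in states}

    def q_value(s, a):
        seen = counts[(s, a)]
        total = sum(seen.values())
        if total == 0:
            return 0.0  # untried actions look free, which encourages trying them
        expected = sum(n * cost_to_go[s2] for s2, n in seen.items()) / total
        return 1.0 + expected  # unit step cost plus expected cost-to-go

    for _ in range(episodes):
        s = start
        for _ in range(max_steps):
            if s == goal:
                break
            # Back up only the state the agent currently occupies,
            # using the model estimated so far.
            qs = {a: q_value(s, a) for a in actions}
            cost_to_go[s] = min(qs.values())
            # Mostly greedy action choice with occasional random exploration.
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                a = min(actions, key=qs.get)
            s2 = step(s, a, rng)
            counts[(s, a)][s2] += 1  # model learning from the observed transition
            s = s2
    return cost_to_go, counts


def chain_step(s, a, rng):
    """Hypothetical environment: states 0..4, 'right' succeeds with probability 0.8."""
    if a == "right":
        return min(s + 1, 4) if rng.random() < 0.8 else s
    return max(s - 1, 0)


values, model = artdp(chain_step, states=range(5),
                      actions=["right", "left"], start=0, goal=4)
print({s: round(v, 2) for s, v in values.items()})
```

Replacing the learned `counts` table with known transition probabilities would turn this sketch into a plain RTDP-style loop, which matches the quote's description of RTDP as ARTDP without the model-learning component.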