1990 IntegratedArchitectureforLearni
- (Sutton, 1990) ⇒ Richard S. Sutton. (1990). “Integrated Architecture for Learning, Planning, and Reacting based on Approximating Dynamic Programming.” In: [[Proceedings of the seventh international conference (1990) on Machine learning]]. ISBN:1-55860-141-4
Subject Headings: Reinforcement Learning.
Notes
Cited By
- http://scholar.google.com/scholar?q=%221990%22+Integrated+Architecture+for+Learning%2C+Planning%2C+and+Reacting+based+on+Approximating+Dynamic+Programming
- http://dl.acm.org/citation.cfm?id=101883.102055&preflayout=flat#citedby
Quotes
Abstract
This paper extends previous work with Dyna, a class of architectures for intelligent systems based on approximating dynamic programming methods. Dyna architectures integrate trial-and-error (reinforcement) learning and execution-time planning into a single process operating alternately on the world and on a learned model of the world. In this paper, I present and show results for two Dyna architectures. The Dyna-PI architecture is based on dynamic programming's policy iteration method and can be related to existing AI ideas such as evaluation functions and universal plans (reactive systems). Using a navigation task, results are shown for a simple Dyna-PI system that simultaneously learns by trial and error, learns a world model, and plans optimal routes using the evolving world model. The Dyna-Q architecture is based on Watkins's Q-learning, a new kind of reinforcement learning. Dyna-Q uses a less familiar set of data structures than does Dyna-PI, but is arguably simpler to implement and use. We show that Dyna-Q architectures are easy to adapt for use in changing environments.
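The loop the abstract describes for Dyna-Q (act in the world, learn from the real transition, record it in a world model, then replay transitions drawn from that model) can be illustrated with a minimal tabular sketch. This is not the paper's implementation: the function names `dyna_q` and `chain_step`, the six-state chain task, the deterministic model, and all parameter values are assumptions chosen only to keep the example short.

```python
import random
from collections import defaultdict


def chain_step(state, action, goal=5):
    """Hypothetical toy world: states 0..goal on a line, actions -1 / +1.

    Reaching `goal` ends the episode with reward 1; all other steps give 0.
    """
    nxt = min(max(state + action, 0), goal)
    done = nxt == goal
    return nxt, (1.0 if done else 0.0), done


def dyna_q(step, actions, start, episodes=50, planning_steps=10,
           alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Dyna-Q sketch; `step(s, a)` returns (next_state, reward, done)."""
    rng = random.Random(seed)
    q = defaultdict(float)  # action values, keyed by (state, action)
    model = {}              # learned model: (s, a) -> (reward, next_state, done)

    def greedy(s):
        best = max(q[(s, a)] for a in actions)
        return rng.choice([a for a in actions if q[(s, a)] == best])

    def q_update(s, a, r, s2, done):
        # One-step Q-learning backup toward r + gamma * max_b Q(s2, b).
        target = r if done else r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(episodes):
        s, done = start, False
        while not done:
            # Epsilon-greedy action selection in the real world.
            a = rng.choice(actions) if rng.random() < epsilon else greedy(s)
            s2, r, done = step(s, a)
            q_update(s, a, r, s2, done)   # learn from real experience
            model[(s, a)] = (r, s2, done)  # update the world model
            # Planning: replay transitions sampled from the learned model.
            for _ in range(planning_steps):
                ps, pa = rng.choice(list(model))
                pr, ps2, pdone = model[(ps, pa)]
                q_update(ps, pa, pr, ps2, pdone)
            s = s2
    return q


if __name__ == "__main__":
    q = dyna_q(chain_step, actions=[-1, 1], start=0)
    # Greedy action per non-goal state after training.
    print({s: max([-1, 1], key=lambda a: q[(s, a)]) for s in range(5)})
```

Setting `planning_steps=0` reduces the sketch to plain one-step Q-learning, which makes the contribution of model-based replay easy to compare on the same task.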
References
| | Author | title | year |
|---|---|---|---|
| 1990 IntegratedArchitectureforLearni | Richard S. Sutton | Integrated Architecture for Learning, Planning, and Reacting based on Approximating Dynamic Programming | 1990 |