Associative Reinforcement Learning Algorithm
An Associative Reinforcement Learning Algorithm is a Reinforcement Learning Algorithm that applies concepts from associative learning.
- AKA: Associative Bandit Problem, Bandit Problem with Side Information, Bandit Problem with Side Observations, One-step Reinforcement Learning.
- Example(s):
- …
- Counter-Example(s):
- a k-Armed Bandit Algorithm,
- an Average-Reward Reinforcement Learning Algorithm,
- a Bayesian Reinforcement Learning Algorithm,
- a Deep Reinforcement Learning Algorithm,
- a Gaussian Process Reinforcement Learning Algorithm,
- a Hierarchical Reinforcement Learning Algorithm,
- an Instance-Based Reinforcement Learning Algorithm,
- a Least Squares Reinforcement Learning Algorithm,
- a One-Step Reinforcement Learning Algorithm,
- a Q-Learning Algorithm,
- a Relational Reinforcement Learning Algorithm.
- See: Active Learning Algorithm, Machine Learning Algorithm, Deep Learning Algorithm.
References
2017
- (Strehl, 2017) ⇒ Alexander L. Strehl (2017) Associative Reinforcement Learning. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA.
- QUOTE: The associative reinforcement-learning problem is a specific instance of the reinforcement learning problem whose solution requires generalization and exploration but not temporal credit assignment. In associative reinforcement learning, an action (also called an arm) must be chosen from a fixed set of actions during successive timesteps and from this choice a real-valued reward or payoff results. On each timestep, an input vector is provided that along with the action determines, often probabilistically, the reward. The goal is to maximize the expected long-term reward over a finite or infinite horizon. It is typically assumed that the action choices do not affect the sequence of input vectors. However, even if this assumption is not asserted, learning algorithms are not required to infer or model the relationship between input vectors from one timestep to the next. Requiring a learning algorithm to discover and reason about this underlying process results in the full reinforcement learning problem.
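The interaction loop described in the quote above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method taken from Strehl (2017): the environment (a hidden linear reward model per action with Gaussian noise), the epsilon-greedy exploration rule, the stochastic-gradient update, and every name and parameter (`run_associative_bandit`, `epsilon`, `lr`, `noise`) are hypothetical choices made for the example. It shows the two ingredients the quote names, generalization across input vectors and exploration across actions, with no temporal credit assignment because each reward depends only on the current input and action.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def run_associative_bandit(n_steps=20000, n_actions=3, dim=2,
                           epsilon=0.1, lr=0.1, noise=0.1, seed=0):
    """Epsilon-greedy learner on a toy associative (contextual) bandit.

    Assumption: the expected reward of action a on input x is
    dot(true_w[a], x). This linear environment is hypothetical.
    """
    rng = random.Random(seed)
    # Hidden environment parameters, unknown to the learner.
    true_w = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
              for _ in range(n_actions)]
    # Learner's per-action estimates of the reward weights.
    est_w = [[0.0] * dim for _ in range(n_actions)]
    total = 0.0
    for _ in range(n_steps):
        # Input vector for this timestep; it does not depend on past actions.
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        preds = [dot(w, x) for w in est_w]
        if rng.random() < epsilon:
            # Exploration: try a uniformly random action.
            action = rng.randrange(n_actions)
        else:
            # Exploitation: pick the action with the highest predicted reward.
            action = max(range(n_actions), key=lambda a: preds[a])
        # Reward is determined probabilistically by the input and the action.
        reward = dot(true_w[action], x) + rng.gauss(0.0, noise)
        total += reward
        # Generalization: one gradient step on the squared prediction error
        # of the chosen action only (the other rewards are never observed).
        error = reward - preds[action]
        for i in range(dim):
            est_w[action][i] += lr * error * x[i]
    return true_w, est_w, total / n_steps


if __name__ == "__main__":
    true_w, est_w, avg_reward = run_associative_bandit()
    print("average reward per step:", round(avg_reward, 3))
    for a, (t, e) in enumerate(zip(true_w, est_w)):
        print("action", a, "true", t, "estimated", e)
```

Because the learner never models how one input vector leads to the next, the sketch stays within the associative setting; adding such a model would move it toward the full reinforcement learning problem mentioned at the end of the quote.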
2013
- (Alonso & Mondragón, 2013) ⇒ Eduardo Alonso, and Esther Mondragón. (2013). “Associative Reinforcement Learning.” ...
- ABSTRACT: In this position paper we propose to enhance learning algorithms, reinforcement learning in particular, for agents and for multi-agent systems, with the introduction of concepts and mechanisms borrowed from associative learning theory. It is argued that existing algorithms are limited in that they adopt a very restricted view of what “learning” is, partly due to the constraints imposed by the Markov assumption upon which they are built. Interestingly, psychological theories of associative learning account for a wide range of social behaviours, making it an ideal framework to model learning in single agent scenarios as well as in multi-agent domains.