Learn or Exploit for Adversary Induced Markov Decision Process (LoE-AIM) Algorithm
A Learn or Exploit for Adversary Induced Markov Decision Process (LoE-AIM) Algorithm is a Multi-Agent Learning (MAL) Algorithm that can be implemented by a LoE-AIM System to solve a LoE-AIM Task.
- AKA: LoE-AIM Algorithm.
- Example(s):
- Counter-Example(s):
- Adapt When Everybody is Stationary Otherwise Move to Equilibrium (AWESOME) Algorithm,
- Enhanced Cooperative Multi-Agent Learning Algorithm (ECMLA),
- Replicator Dynamics with a Variable Learning Rate (ReDVaLeR) Algorithm,
- Weighted Policy Learner (WPL) Algorithm,
- Win or Learn Fast (WoLF) Algorithm.
- See: Game Theory, Machine Learning System, Q-Learning, Reinforcement Learning, Nash Equilibrium.
References
- (Chakraborty & Sons, 2008) ⇒ Doran Chakraborty, and Peter Stone. (y2008). “Online Multiagent Learning Against Memory Bounded Adversaries.” In: Proceedings of the 2008th European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I. ISBN:3-540-87478-X, 978-3-540-87478-2 doi:10.1007/978-3-540-87479-9_32
- QUOTE: The traditional agenda in Multiagent Learning (MAL) has been to develop learners that guarantee convergence to an equilibrium in self-play or that converge to playing the best response against an opponent using one of a fixed set of known targeted strategies. This paper introduces an algorithm called Learn or Exploit for Adversary Induced Markov Decision Process (LoE-AIM) that targets optimality against any learning opponent that can be treated as a memory bounded adversary. LoE-AIM makes no prior assumptions about the opponent and is tailored to optimally exploit any adversary which induces a Markov decision process in the state space of joint histories. LoE-AIM either explores and gathers new information about the opponent or converges to the best response to the partially learned opponent strategy in repeated play.
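The following is a minimal illustrative sketch in Python of the learn-or-exploit idea described in the quote, not the authors' implementation: against a memory-bounded adversary whose next action depends only on the last k joint actions, repeated play induces an MDP over bounded joint histories, so a model-based learner can explore under-visited history states and otherwise play the greedy best response to its partially learned model. The constants, the visit-count switch, and all function names below are illustrative assumptions, not details from Chakraborty & Stone (2008).

```python
import random
from collections import defaultdict

# Illustrative sketch only: against a memory-K adversary, the "state" is
# the last K joint actions, so repeated play becomes an MDP that a
# model-based learner can estimate and then exploit.

K = 2                 # assumed adversary memory bound (illustrative)
ACTIONS = [0, 1]      # our actions in a 2-action repeated game
GAMMA = 0.95          # discount factor (illustrative choice)

def loe_aim(opponent, payoff, episodes=5000, epsilon=0.1):
    """Learn-or-exploit loop: explore under-visited history states to
    refine the opponent model, otherwise play greedily against it."""
    q = defaultdict(float)     # Q-values over (history state, action)
    counts = defaultdict(int)  # visit counts drive the learn/exploit switch
    history = tuple()          # last K joint actions: (our action, theirs)
    for _ in range(episodes):
        state = history[-K:]
        counts[state] += 1
        if counts[state] < 1 / epsilon:
            # Learn phase: state still poorly sampled, explore uniformly.
            a = random.choice(ACTIONS)
        else:
            # Exploit phase: best response to the partially learned model.
            a = max(ACTIONS, key=lambda x: q[(state, x)])
        b = opponent(state)    # adversary reacts to the bounded history
        r = payoff(a, b)
        next_state = (history + ((a, b),))[-K:]
        # One-step Q-learning update on the adversary-induced MDP.
        best_next = max(q[(next_state, x)] for x in ACTIONS)
        q[(state, a)] += 0.1 * (r + GAMMA * best_next - q[(state, a)])
        history = next_state
    return q
```

For example, a memory-1 opponent such as tit-for-tat, `opponent = lambda h: h[-1][0] if h else 0`, can be plugged in together with any `payoff(a, b)` matrix to watch the learner first sample the joint-history states and then settle into exploiting the dynamics that opponent induces.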