Multi-Agent Learning (MAL) Algorithm
A Multi-Agent Learning (MAL) Algorithm is a Multi-Agent Algorithm that can be implemented by a multi-agent learning system to solve a multi-agent learning task.
- Context:
- It can range from being a simple Multi-Agent Learning Algorithm to being a Multi-Agent Reinforcement Learning Algorithm.
- It can range from being a Cooperative Multi-Agent Learning Algorithm to being a Competitive Multi-Agent Learning Algorithm.
- Example(s):
- Adapt When Everybody is Stationary Otherwise Move to Equilibrium (AWESOME) Algorithm,
- Enhanced Cooperative Multi-Agent Learning Algorithm (ECMLA) Algorithm,
- Learn or Exploit for Adversary Induced Markov Decision Process (LoE-AIM) Algorithm,
- Replicator Dynamics with a Variable Learning Rate (ReDVaLeR) Algorithm,
- Weighted Policy Learner (WPL) Algorithm,
- Win or Learn Fast (WoLF) Algorithm (a minimal sketch of the WoLF principle appears after this definition block).
- …
- Counter-Example(s):
- See: Game Theory, Nash Equilibrium, Machine Learning System, Q-Learning, Reinforcement Learning, Multi-Agent Learning Testbed (MALT).
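The following is a minimal sketch of the WoLF (Win or Learn Fast) principle from the example list above, restricted to a single-state repeated matrix game. The class name `WoLFPHC` and the hyperparameter values are illustrative assumptions, not the published algorithm's exact constants; the key idea shown is the use of a small policy step when "winning" and a larger one when "losing".

```python
import numpy as np

class WoLFPHC:
    """Sketch of single-state WoLF policy hill-climbing.

    "Winning" means the current policy scores better against the
    learned Q-values than the historical average policy does; the
    learner then moves its policy cautiously, and fast otherwise.
    """

    def __init__(self, n_actions, alpha=0.1, delta_win=0.01, delta_lose=0.04):
        self.q = np.zeros(n_actions)                        # action-value estimates
        self.pi = np.full(n_actions, 1.0 / n_actions)       # current mixed policy
        self.avg_pi = np.full(n_actions, 1.0 / n_actions)   # average policy so far
        self.count = 0
        self.alpha, self.delta_win, self.delta_lose = alpha, delta_win, delta_lose

    def act(self, rng):
        return rng.choice(len(self.pi), p=self.pi)

    def update(self, action, reward):
        # Q-update for a repeated (single-state) game: no next-state term.
        self.q[action] += self.alpha * (reward - self.q[action])

        # Incrementally track the average policy.
        self.count += 1
        self.avg_pi += (self.pi - self.avg_pi) / self.count

        # WoLF step-size choice: learn fast when losing.
        winning = self.pi @ self.q > self.avg_pi @ self.q
        delta = self.delta_win if winning else self.delta_lose

        # Shift probability mass toward the greedy action while
        # keeping pi a valid probability distribution.
        best = int(np.argmax(self.q))
        for a in range(len(self.pi)):
            if a == best:
                continue
            step = min(delta / (len(self.pi) - 1), self.pi[a])
            self.pi[a] -= step
            self.pi[best] += step
```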
References
2017
- (Shoham & Powers, 2017) ⇒ Yoav Shoham, and Rob Powers (2017). “Multi-Agent Learning Algorithms”. In: (Sammut & Webb, 2017)
- QUOTE: Multi-agent learning (MAL) refers to settings in which multiple agents learn simultaneously. Usually defined in a game theoretic setting, specifically in repeated games or stochastic games, the key feature that distinguishes MAL from single-agent learning is that in the former the learning of one agent impacts the learning of others. As a result, neither the problem definition for multi-agent learning nor the algorithms offered follow in a straightforward way from the single-agent case. In this second of two entries on the subject, we focus on algorithms.
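The quote's key point, that one agent's learning impacts the others', can be seen in even the simplest simultaneous-learning setup. The sketch below pits two independent epsilon-greedy Q-learners against each other in matching pennies; the game choice, payoffs, and hyperparameters are illustrative assumptions. Because each learner is part of the other's environment, neither faces a stationary problem.

```python
import numpy as np

# Matching pennies: the row player wins (+1) on a match, loses (-1) otherwise.
# Illustrative setup only; any two-player matrix game would make the point.
PAYOFF_ROW = np.array([[1, -1], [-1, 1]])

rng = np.random.default_rng(0)
q_row, q_col = np.zeros(2), np.zeros(2)
alpha, epsilon = 0.1, 0.1

for t in range(10_000):
    a_row = rng.integers(2) if rng.random() < epsilon else int(np.argmax(q_row))
    a_col = rng.integers(2) if rng.random() < epsilon else int(np.argmax(q_col))
    r_row = PAYOFF_ROW[a_row, a_col]
    # Zero-sum game: the column player receives the negated payoff.
    q_row[a_row] += alpha * (r_row - q_row[a_row])
    q_col[a_col] += alpha * (-r_row - q_col[a_col])

# Each agent's value estimates keep chasing the opponent's shifting policy;
# the unique Nash equilibrium here is to mix uniformly over both actions.
print(q_row, q_col)
```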
2014
- (Zawadzki et al., 2014) ⇒ Erik Zawadzki, Asher Lipson, and Kevin Leyton-Brown. (2014). “Empirically Evaluating Multiagent Learning Algorithms.” arXiv:1401.8074
- QUOTE: We have developed a new suite of tools for running multiagent experiments: the MultiAgent Learning Testbed (MALT). These tools are designed to facilitate larger and more comprehensive experiments by removing the need to build one-off experimental code. MALT also provides baseline implementations of many MAL algorithms, hopefully eliminating or reducing differences between algorithm implementations and increasing the reproducibility of results. Using this test suite, we ran an experiment unprecedented in size. We analyzed the results according to a variety of performance metrics including reward, maximin distance, regret, and several notions of equilibrium convergence. We confirmed several pieces of conventional wisdom, but also discovered some surprising results. For example, we found that single-agent Q-learning outperformed many more complicated and more modern MAL algorithms.
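Among the metrics the quote lists, external regret is the easiest to state concretely: the gap between the reward the best fixed action would have earned in hindsight and the reward the agent actually earned. The helper below is a hypothetical sketch of that computation, not part of MALT's API.

```python
import numpy as np

def external_regret(counterfactual_rewards, actions_taken):
    """Hypothetical helper (not from MALT): compute external regret.

    counterfactual_rewards: array of shape (T, n_actions) giving the reward
    each of the agent's actions *would have* received at each step, given
    what the other agents actually played.
    actions_taken: length-T sequence of the actions the agent actually chose.
    """
    history = np.asarray(counterfactual_rewards)
    earned = history[np.arange(len(actions_taken)), actions_taken].sum()
    best_fixed = history.sum(axis=0).max()   # best single action in hindsight
    return best_fixed - earned
```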
2007
- (Conitzer & Sandholm, 2007) ⇒ Vincent Conitzer, and Tuomas Sandholm. (2007). “AWESOME: A General Multiagent Learning Algorithm That Converges in Self-play and Learns a Best Response Against Stationary Opponents.” In: Machine Learning Journal, 67(1-2). doi:10.1007/s10994-006-0143-1
- QUOTE: Two minimal desirable properties of a good multiagent learning algorithm are
- Learning to play optimally against stationary opponents (or even opponents that eventually become stationary) [1].
- Convergence to a Nash equilibrium in self-play (that is, when all the agents use the same learning algorithm).
- ↑ This property has sometimes been called rationality (Bowling & Veloso, 2002), but we avoid that term because it has an established, different meaning in economics.
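The first of the two properties above has a simple intuition: against a stationary opponent, the empirical frequency of observed opponent actions converges to that opponent's mixed strategy, so best-responding to the empirical estimate converges to optimal play. The sketch below illustrates only this property (in the style of fictitious play); it is not the AWESOME algorithm itself, and the function name and arguments are assumptions for illustration.

```python
import numpy as np

def best_response_to_stationary(payoff, opponent_actions):
    """Best-respond to the empirical estimate of a stationary opponent.

    payoff: (n_my_actions, n_opp_actions) matrix of my rewards.
    opponent_actions: nonempty sequence of observed opponent action indices.
    """
    counts = np.bincount(opponent_actions, minlength=payoff.shape[1])
    freq = counts / counts.sum()   # empirical opponent mixed strategy
    expected = payoff @ freq       # expected reward of each of my actions
    return int(np.argmax(expected))
```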
2005
- (Hoen et al., 2005) ⇒ Pieter Jan 't Hoen, Karl Tuyls, Liviu Panait, Sean Luke, and J. A. La Poutré. (2005). “An Overview of Cooperative and Competitive Multiagent Learning.” In: Proceedings of the First International Conference on Learning and Adaption in Multi-Agent Systems. ISBN:3-540-33053-4, 978-3-540-33053-0 doi:10.1007/11691839_1