2009 MonteCarloSamplingforRegretMini

From GM-RKB

Jump to navigation Jump to search

(Lanctot et al., 2009) ⇒ Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. (2009). “Monte Carlo Sampling for Regret Minimization in Extensive Games.” In: Proceedings of the 22nd International Conference on Neural Information Processing Systems. ISBN:978-1-61567-911-9

Subject Headings: Counterfactual Regret Minimization (CFR).

Notes

Cited By

Quotes

Abstract

Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome sampling. In this paper, we describe a general family of domain-independent CFR sample-based algorithms called Monte Carlo counterfactual regret minimization (MCCFR) of which the original and poker-specific versions are special cases. We start by showing that MCCFR performs the same regret updates as CFR on expectation. Then, we introduce two sampling schemes: outcome sampling and external sampling, showing that both have bounded overall regret with high probability. Thus, they can compute an approximate equilibrium using self-play. Finally, we prove a new tighter bound on the regret for the original CFR algorithm and relate this new bound to MCCFR's bounds. We show empirically that, although the sample-based algorithms require more iterations, their lower cost per iteration can lead to dramatically faster convergence in various games.

References

;

	Author	volume	Date Value	title	type	journal	titleUrl	doi	note	year
2009 MonteCarloSamplingforRegretMini	Martin Zinkevich Marc Lanctot Michael Bowling Kevin Waugh			Monte Carlo Sampling for Regret Minimization in Extensive Games						2009

Retrieved from "http://www.gabormelli.com/RKB/index.php?title=2009_MonteCarloSamplingforRegretMini&oldid=882360"

Facts

... more about "2009 MonteCarloSamplingforRegretMini"

Marc Lanctot +, Kevin Waugh +, Martin Zinkevich + and Michael Bowling +

Monte Carlo Sampling for Regret Minimization in Extensive Games +

2009 +