Counterfactual Regret Minimization (CFR) Algorithm
A Counterfactual Regret Minimization (CFR) Algorithm is a regret minimization algorithm that can be implemented by a counterfactual regret minimization system to solve a counterfactual regret minimization task (enables strategy optimization through regret-based learning).
- AKA: CFR Algorithm, CFR, Counterfactual Regret Minimization.
- Context:
- Task Input: game description, player information
- Task Output: strategy profile, regret values
- Task Performance Measure: convergence rate, exploitability, memory efficiency
- ...
- It can typically achieve Nash Equilibrium through iterative self-play.
- It can typically minimize Regret Value through counterfactual value updating.
- It can typically handle Imperfect Information through information set abstraction.
- It can typically compute Strategy Profile through regret matching (see the sketch following this Context list).
- It can typically maintain Action Probability through strategy updating.
- ...
- It can often support Game Solving through recursive tree traversal.
- It can often optimize Mixed Strategy through cumulative regret minimization.
- It can often handle Partial Observability through sampling-based approximation.
- It can often process Sequential Decision through history-aware computation.
- ...
- It can range from being a Vanilla CFR Implementation to being a Discounted CFR Implementation, depending on its update mechanism.
- It can range from being a Table-Based CFR to being a Deep CFR, depending on its value representation.
- ...
- It can integrate with Neural Network for function approximation.
- It can support Monte Carlo Sampling for performance optimization.
- It can connect to Game Engine for strategy deployment.
- ...
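The following is a minimal sketch of the regret-matching and counterfactual-regret-update steps referenced in the Context items above. It assumes a single information set whose cumulative regrets are held in a NumPy vector; the names `get_strategy` and `update_regrets` are illustrative, not from any specific CFR library, and full CFR additionally weights `action_values` by the opponents' and chance's reach probabilities, a weighting this sketch treats as already applied.

```python
import numpy as np

def get_strategy(regret_sum: np.ndarray) -> np.ndarray:
    """Regret matching: normalize positive cumulative regrets into
    action probabilities; fall back to uniform when none is positive."""
    positive = np.maximum(regret_sum, 0.0)
    total = positive.sum()
    if total > 0.0:
        return positive / total
    return np.full(len(regret_sum), 1.0 / len(regret_sum))

def update_regrets(regret_sum: np.ndarray,
                   action_values: np.ndarray,
                   strategy: np.ndarray) -> np.ndarray:
    """Counterfactual regret update: each action's cumulative regret
    grows by how much that action's value exceeds the expected value
    of the current strategy at this information set."""
    node_value = float(np.dot(strategy, action_values))
    return regret_sum + (action_values - node_value)

# One iteration at a 3-action information set (values are made up).
regrets = np.array([2.0, -1.0, 3.0])
strategy = get_strategy(regrets)    # -> [0.4, 0.0, 0.6]
values = np.array([1.0, 0.5, 2.0])  # hypothetical counterfactual values
regrets = update_regrets(regrets, values, strategy)
print(strategy, regrets)
```

In full CFR these two steps run at every information set during a recursive game-tree traversal, and it is the average strategy over iterations, not the final one, that converges toward a Nash equilibrium in two-player zero-sum games. Variants differ mainly in the accumulation step: CFR+, for example, clamps cumulative regrets at zero after each update, while Discounted CFR down-weights regrets from earlier iterations.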
- Examples:
- CFR Variants, such as:
- Vanilla CFRs, such as: the original full-traversal Vanilla CFR Implementation.
- Modern CFRs, such as: Monte Carlo CFR (MCCFR), CFR+, Discounted CFR, and Deep CFR.
- CFR Applications, such as: Poker Game Solving Systems for imperfect information games such as heads-up limit Texas hold'em.
- ...
- Counter-Examples:
- Policy Gradient Algorithm, which lacks counterfactual value computation.
- Q-Learning Algorithm, which lacks explicit regret minimization.
- Fictitious Play Algorithm, which in its standard normal-form formulation lacks information set handling.
- See: Regret Minimization Algorithm, Counterfactual Regret Minimization System, Counterfactual Regret Minimization Task, Strategy Optimization, Regret-Based Learning, Game Theory Algorithm, Nash Equilibrium, Imperfect Information Game.
References
2009
- (Lanctot et al., 2009) ⇒ Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. (2009). “Monte Carlo Sampling for Regret Minimization in Extensive Games.” In: Proceedings of the 22nd International Conference on Neural Information Processing Systems. ISBN:978-1-61567-911-9
- QUOTE: Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome sampling. In this paper, we describe a general family of domain-independent CFR sample-based algorithms called Monte Carlo counterfactual regret minimization (MCCFR) of which the original and poker-specific versions are special cases.
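To make the quoted sampling idea concrete, here is a minimal sketch contrasting Vanilla CFR's full enumeration at a chance node with an MCCFR-style chance-sampled traversal. The toy `cfr` function and the `(value, probability)` outcome representation are hypothetical placeholders for a game-tree API, not code from the paper; MCCFR's outcome-sampling and external-sampling variants extend the same idea to players' action nodes.

```python
import random

# Toy stand-ins for a game-tree API (hypothetical, for illustration):
# a chance node is a list of (child_value, probability) pairs, and the
# recursive CFR call on a child simply returns its value here.
def cfr(child_value, reach_probs):
    return child_value

def traverse_chance_vanilla(outcomes, reach_probs):
    # Vanilla CFR: enumerate every chance outcome and weight each
    # child's counterfactual value by the outcome's probability.
    return sum(prob * cfr(child, reach_probs) for child, prob in outcomes)

def traverse_chance_sampled(outcomes, reach_probs):
    # MCCFR-style chance sampling: draw one outcome from the chance
    # distribution and recurse only into that child; the estimate
    # equals the vanilla sum in expectation, trading exact values
    # for much cheaper iterations.
    children, probs = zip(*outcomes)
    child = random.choices(children, weights=probs, k=1)[0]
    return cfr(child, reach_probs)

outcomes = [(1.0, 0.5), (0.0, 0.3), (-2.0, 0.2)]
print(traverse_chance_vanilla(outcomes, None))   # exact: 0.1
samples = [traverse_chance_sampled(outcomes, None) for _ in range(10000)]
print(sum(samples) / len(samples))               # close to 0.1 on average
```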