Counterfactual Regret Minimization (CFR) Algorithm
A Counterfactual Regret Minimization (CFR) Algorithm is a regret minimization algorithm that can be implemented by a counterfactual regret minimization system to solve a counterfactual regret minimization task (enables strategy optimization through regret-based learning).
- AKA: CFR Algorithm, CFR, Counterfactual Regret Minimization.
- Context:
- Task Input: game description, player information
- Task Output: strategy profile, regret values
- Task Performance Measure: convergence rate, exploitability, memory efficiency
- ...
- It can typically achieve Nash Equilibrium through iterative self-play.
- It can typically minimize Regret Value through counterfactual value updating.
- It can typically handle Imperfect Information through information set abstraction.
- It can typically compute Strategy Profile through regret matching (see the sketch following this Context list).
- It can typically maintain Action Probability through strategy updating.
- ...
- It can often support Game Solving through recursive tree traversal.
- It can often optimize Mixed Strategy through cumulative regret minimization.
- It can often handle Partial Observability through sampling-based approximation.
- It can often process Sequential Decision through history-aware computation.
- ...
- It can range from being a Vanilla CFR Implementation to being a Discounted CFR Implementation, depending on its update mechanism.
- It can range from being a Table-Based CFR to being a Deep CFR, depending on its value representation.
- ...
- It can integrate with Neural Network for function approximation.
- It can support Monte Carlo Sampling for performance optimization.
- It can connect to Game Engine for strategy deployment.
- ...
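The following is a minimal sketch of the regret-matching and counterfactual-regret-update steps referenced in the Context items above. It assumes a single information set whose cumulative regrets are held in a NumPy vector; the names `get_strategy` and `update_regrets` are illustrative, not from any specific CFR library, and full CFR additionally weights `action_values` by the opponents' and chance's reach probabilities, a weighting this sketch treats as already applied.

```python
import numpy as np

def get_strategy(regret_sum: np.ndarray) -> np.ndarray:
    """Regret matching: normalize positive cumulative regrets into
    action probabilities; fall back to uniform when none is positive."""
    positive = np.maximum(regret_sum, 0.0)
    total = positive.sum()
    if total > 0.0:
        return positive / total
    return np.full(len(regret_sum), 1.0 / len(regret_sum))

def update_regrets(regret_sum: np.ndarray,
                   action_values: np.ndarray,
                   strategy: np.ndarray) -> np.ndarray:
    """Counterfactual regret update: each action's cumulative regret
    grows by how much that action's value exceeds the expected value
    of the current strategy at this information set."""
    node_value = float(np.dot(strategy, action_values))
    return regret_sum + (action_values - node_value)

# One iteration at a 3-action information set (values are made up).
regrets = np.array([2.0, -1.0, 3.0])
strategy = get_strategy(regrets)    # -> [0.4, 0.0, 0.6]
values = np.array([1.0, 0.5, 2.0])  # hypothetical counterfactual values
regrets = update_regrets(regrets, values, strategy)
print(strategy, regrets)
```

In full CFR these two steps run at every information set during a recursive game-tree traversal, and it is the average strategy over iterations, not the final one, that converges toward a Nash equilibrium in two-player zero-sum games. Variants differ mainly in the accumulation step: CFR+, for example, clamps cumulative regrets at zero after each update, while Discounted CFR down-weights regrets from earlier iterations.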
- Examples:
- CFR Variants, such as:
- Vanilla CFRs, such as: the original full-traversal Vanilla CFR Implementation.
- Modern CFRs, such as: Monte Carlo CFR (MCCFR), CFR+, Discounted CFR, and Deep CFR.
- CFR Applications, such as: Poker Game Solving Systems for imperfect information games such as heads-up limit Texas hold'em.
- ...
- Counter-Examples:
- Policy Gradient Algorithm, which lacks counterfactual value computation.
- Q-Learning Algorithm, which lacks explicit regret minimization.
- Fictitious Play Algorithm, which in its standard normal-form formulation lacks information set handling.
- See: Regret Minimization Algorithm, Counterfactual Regret Minimization System, Counterfactual Regret Minimization Task, Strategy Optimization, Regret-Based Learning, Game Theory Algorithm, Nash Equilibrium, Imperfect Information Game.
References
2009
- (Lanctot et al., 2009) ⇒ Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. (2009). “Monte Carlo Sampling for Regret Minimization in Extensive Games.” In: Proceedings of the 22nd International Conference on Neural Information Processing Systems. ISBN:978-1-61567-911-9
- QUOTE: Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome sampling. In this paper, we describe a general family of domain-independent CFR sample-based algorithms called Monte Carlo counterfactual regret minimization (MCCFR) of which the original and poker-specific versions are special cases.
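To make the quoted sampling idea concrete, here is a minimal sketch contrasting Vanilla CFR's full enumeration at a chance node with an MCCFR-style chance-sampled traversal. The toy `cfr` function and the `(value, probability)` outcome representation are hypothetical placeholders for a game-tree API, not code from the paper; MCCFR's outcome-sampling and external-sampling variants extend the same idea to players' action nodes.

```python
import random

# Toy stand-ins for a game-tree API (hypothetical, for illustration):
# a chance node is a list of (child_value, probability) pairs, and the
# recursive CFR call on a child simply returns its value here.
def cfr(child_value, reach_probs):
    return child_value

def traverse_chance_vanilla(outcomes, reach_probs):
    # Vanilla CFR: enumerate every chance outcome and weight each
    # child's counterfactual value by the outcome's probability.
    return sum(prob * cfr(child, reach_probs) for child, prob in outcomes)

def traverse_chance_sampled(outcomes, reach_probs):
    # MCCFR-style chance sampling: draw one outcome from the chance
    # distribution and recurse only into that child; the estimate
    # equals the vanilla sum in expectation, trading exact values
    # for much cheaper iterations.
    children, probs = zip(*outcomes)
    child = random.choices(children, weights=probs, k=1)[0]
    return cfr(child, reach_probs)

outcomes = [(1.0, 0.5), (0.0, 0.3), (-2.0, 0.2)]
print(traverse_chance_vanilla(outcomes, None))   # exact: 0.1
samples = [traverse_chance_sampled(outcomes, None) for _ in range(10000)]
print(sum(samples) / len(samples))               # close to 0.1 on average
```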