TD-Gammon

Context:
- It is a game-playing program that is based on the Backgammon Board Game.
- It implements a Shallow Search Algorithm to determine its next move.
Example(s):
- ...
- …
Counter-Example(s):

See: Machine Learning System, Artificial Neural Network, Artificial Intelligence, Symbolic Artificial Intelligence Program, Question Answering System, Chatbot.

References

(Wikipedia, 2019) ⇒ https://en.wikipedia.org/wiki/TD-Gammon Retrieved:2019-11-9.
- TD-Gammon is a computer backgammon program developed in 1992 by Gerald Tesauro at IBM's Thomas J. Watson Research Center. Its name comes from the fact that it is an artificial neural net trained by a form of temporal-difference learning, specifically TD-lambda.
  TD-Gammon achieved a level of play just slightly below that of the top human backgammon players of the time. It explored strategies that humans had not pursued and led to advances in the theory of correct backgammon play.
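
  The TD-lambda training mentioned above can be illustrated with a minimal sketch. It assumes a linear value function over hand-made feature vectors instead of TD-Gammon's neural network, no discounting, and a reward that arrives only at the end of a game. The names `td_lambda_episode`, `dot`, `alpha`, and `lam` are hypothetical and chosen only for this illustration.

```python
def dot(w, x):
    """Linear value estimate: inner product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))


def td_lambda_episode(features, final_outcome, weights, alpha=0.1, lam=0.7):
    """Run TD(lambda) updates over one game and return the new weights.

    features      -- list of feature vectors, one per position, in play order
    final_outcome -- result observed at the end of the game (e.g. 1.0 = win)
    weights       -- initial weight vector (not modified in place)
    alpha, lam    -- learning rate and trace-decay parameter (assumed values)
    """
    w = list(weights)
    trace = [0.0] * len(w)  # eligibility trace, one entry per weight
    for t, x_t in enumerate(features):
        v_t = dot(w, x_t)
        # The target is the next prediction, or the real outcome at the end.
        if t + 1 < len(features):
            target = dot(w, features[t + 1])
        else:
            target = final_outcome
        # For a linear model the gradient of the value w.r.t. w is x_t.
        trace = [lam * e + xi for e, xi in zip(trace, x_t)]
        delta = target - v_t  # temporal-difference error
        w = [wi + alpha * delta * e for wi, e in zip(w, trace)]
    return w


# Two-position toy game that ends in a win.
new_w = td_lambda_episode([[1.0, 0.0], [0.0, 1.0]], 1.0, [0.0, 0.0],
                          alpha=0.5, lam=0.5)
```

  In this toy run the final error also adjusts the weight of the earlier position, scaled by `lam`, which is the credit-assignment effect the eligibility trace provides.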

(Sammut & Webb, 2017) ⇒ Claude Sammut, and Geoffrey I. Webb. (2017). "TD-Gammon". In: (Sammut & Webb, 2017). DOI:10.1007/978-1-4899-7687-1_813.
- QUOTE: TD-Gammon is a world-champion strength backgammon program developed by Gerald Tesauro. Its development relied heavily on machine learning techniques, in particular on Temporal-Difference Learning. Contrary to successful game programs in domains such as chess, which can easily out-search their human opponents but still trail their ability of estimating the positional merits of the current board configuration, TD-Gammon was able to excel in backgammon for the same reasons that humans play well: its grasp of the positional strengths and weaknesses was excellent. In 1998, it lost a 100-game competition against the world champion with only 8 points. Its sometimes unconventional but very solid evaluation of certain opening strategies had a strong impact on the backgammon community and was soon adapted by professional players.
  
  (...)
  TD-Gammon is a conventional game-playing program that uses very shallow search (the first versions only searched one ply) for determining its move. Candidate moves are evaluated by a Neural Network, which is trained by TD(λ), a well-known algorithm for Temporal-Difference Learning (Tesauro 1992, ...)
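
  The one-ply search described in the quote can be sketched as follows. This is only an illustration under stated assumptions: `choose_move_one_ply`, `legal_moves`, `apply_move`, and `evaluate` are hypothetical names, the evaluation function stands in for the trained neural network, and the toy game below is not backgammon (in backgammon the legal moves would also depend on the dice roll).

```python
def choose_move_one_ply(position, legal_moves, apply_move, evaluate):
    """Pick the move whose resulting position scores highest.

    legal_moves(position)      -- iterable of candidate moves (hypothetical)
    apply_move(position, move) -- position reached after the move (hypothetical)
    evaluate(position)         -- learned value estimate; higher is better
    Returns (best_move, best_value), or (None, -inf) if there are no moves.
    """
    best_move, best_value = None, float("-inf")
    for move in legal_moves(position):
        value = evaluate(apply_move(position, move))
        if value > best_value:
            best_move, best_value = move, value
    return best_move, best_value


# Toy stand-in game: a position is an integer, a move adds 1, 2, or 3,
# and the evaluation prefers positions close to 10.
def toy_moves(position):
    return [1, 2, 3]


def toy_apply(position, move):
    return position + move


def toy_eval(position):
    return -abs(10 - position)


best = choose_move_one_ply(8, toy_moves, toy_apply, toy_eval)
```

  Because only the immediate successor positions are scored, the quality of play rests almost entirely on the evaluation function, which matches the quote's point about positional judgment rather than deep search.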