Can reinforcement learning agents develop and benefit from conventions when cooperating with each other in an imperfect-information game?
This question led to my course project in the “Advancing AI through Cognitive Science” course at the NYU Center for Data Science. In summary, I applied theory-of-mind modeling to the Hanabi challenge [1] and observed an improvement.
Applying Theory of Mind to the Hanabi Challenge
Hanabi is a card game created in 2010 which can be understood as cooperative solitaire.