David Taralla

PhD Student in Computer Science & Engineering, ULg


I have a real passion for video game design and development, mostly MOBAs like League of Legends and RTS games like Warcraft and Starcraft.

This passion led me to where I am now and continues to drive my choices, particularly in research. Thanks to Prof. Damien Ernst, I was able to enter the world of research and apply my work to games like these.

I mainly work on designing algorithms that automatically find the best strategy for solving complex problems. « My » complex problems are today's video games, which can be seen as highly challenging environments for automatically discovering intelligent agent strategies.

However, I do not restrict myself to applying my research to video games only. Of course, I try to make my methods work on video games first, but an even more important goal is for this research to be applicable to any real-life problem that shows the same features (in terms of sequential decision making, observability, stochasticity, the number of agents in play, etc.). Active power network management, for instance, is another area to which I want my methods to be applicable.

Learning Artificial Intelligence in Large-Scale Video Games — A First Case Study with Hearthstone: Heroes of Warcraft
2014 - 2015

I successfully completed my master's thesis, on learning artificial intelligence in large-scale video games, under the supervision of R. Fonteneau and D. Ernst.

My work is divided into two phases:

  1. Creating a simulator for the game Hearthstone: HoW — a 2-player, partially observable stochastic game — on which to test AI algorithms,
  2. Designing an autonomous agent for this game that plays at the level of a beginner.

The game I chose to implement as a testbed is Hearthstone: HoW, a popular collectible card game developed by Blizzard. The implementation requirements were flexibility, extensibility, modularity of the code, and portability. The simulator is written in C++/Qt. The source code is available for future AI research on my GitHub repositories. Feel free to fork it or contact me with any questions!

Action classification based on supervised learning is attempted in order to build a prototype of an autonomous agent that plays Hearthstone: HoW. The agent can neither be trained by simulating side-trajectories in the course of a game, nor know in advance what consequences any possible action has on the game. Indeed, Hearthstone: HoW presents so many dependencies among its elements that it would be intractable to implement an « undo » mechanic allowing an algorithm to backtrack once a trajectory has been simulated from a given state. Moreover, no assumptions are made about the nature of the actions that can occur in a Hearthstone: HoW game, which makes the theory developed applicable to other large-scale video games.

Browse my master's thesis on ORBi »

Browse my slides on ORBi »


Over the past twenty years, video games have become more and more complex thanks to the emergence of new computing technologies. The challenges players face now involve the simultaneous consideration of many game environment variables: players usually wander in rich 3D environments, can choose among numerous actions at any time, and every action has combinatorial consequences. However, the artificial intelligence (AI) featured in those games is often not complex enough to feel natural (human). Today's game AI is still hard-coded most of the time, but as game environments become increasingly complex, hard-coding it becomes exponentially more difficult.

To circumvent this issue and come up with rich autonomous agents in large-scale video games, many research works have already tried, and succeeded, in making video game AI learn instead of being taught. This thesis does its bit towards this goal.

In this work, supervised learning classification based on extremely randomized trees is attempted as a solution to the problem of selecting an action among the set of available ones in a given state. In particular, we place ourselves in the context where no assumptions are made about the kind of actions available and where actions cannot be simulated to find out what consequences they have on the game. This approach is tested on the collectible card game Hearthstone: HoW, for which an easily extensible simulator was built. Encouraging results were obtained when pitting Nora, the resulting Mage agent, against random and scripted (medium-level) Mage players. Furthermore, besides quantitative results, a qualitative experiment showed that the agent successfully learned to exhibit a board control behavior without having been explicitly taught to do so.
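
As a rough illustration of the action-selection idea described above, here is a minimal sketch in Python. It is not the thesis implementation (the simulator itself is written in C++/Qt): the feature encoding, the good/bad labels, the toy data, and all function names are assumptions made for this example. The trees use the simplest totally randomized variant, with one random feature and one random cut-point per node.

```python
import random
from collections import Counter


def build_tree(rows, labels, rng):
    """Grow one randomized tree: random feature, random cut-point per node."""
    if len(set(labels)) == 1:
        return Counter(labels)  # pure leaf: store class counts
    # Only features that still vary inside this node can be split on.
    varying = [f for f in range(len(rows[0]))
               if min(r[f] for r in rows) < max(r[f] for r in rows)]
    if not varying:
        return Counter(labels)  # identical inputs with mixed labels
    f = rng.choice(varying)
    cut = rng.uniform(min(r[f] for r in rows), max(r[f] for r in rows))
    left = [i for i, r in enumerate(rows) if r[f] < cut]
    right = [i for i, r in enumerate(rows) if r[f] >= cut]
    if not left or not right:
        return Counter(labels)  # degenerate cut: stop here
    return (f, cut,
            build_tree([rows[i] for i in left], [labels[i] for i in left], rng),
            build_tree([rows[i] for i in right], [labels[i] for i in right], rng))


def good_probability(tree, x):
    """Fraction of 'good' (label 1) training samples in the leaf reached by x."""
    while isinstance(tree, tuple):
        f, cut, left, right = tree
        tree = left if x[f] < cut else right
    return tree[1] / sum(tree.values())


def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    return [build_tree(rows, labels, rng) for _ in range(n_trees)]


def select_action(forest, state, actions):
    """Pick the available action the ensemble rates most likely to be good."""
    def score(action):
        x = state + action
        return sum(good_probability(t, x) for t in forest) / len(forest)
    return max(actions, key=score)


# Hypothetical encoding: state = [own minions, enemy minions],
# action = [kills an enemy minion (0/1), damage dealt to the enemy hero].
samples = [
    ([1, 2, 1, 0], 1), ([1, 2, 0, 3], 0),
    ([2, 1, 1, 0], 1), ([2, 1, 0, 2], 0),
    ([0, 3, 1, 0], 1), ([0, 3, 0, 4], 0),
]
forest = fit_forest([x for x, _ in samples], [y for _, y in samples])
best = select_action(forest, [1, 2], [[0, 3], [1, 0]])
print(best)
```

Note that the agent only ranks the actions currently available; it never simulates what an action would do, which matches the setting described above.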

A feature-based approach for best arm identification in the case of the Monte Carlo search algorithm discovery for one-player games
Summer 2013

I carried out a two-month research internship in the Reinforcement Learning and Artificial Intelligence Group of the University of Alberta (Canada). During my stay at this university, I worked with Prof. Csaba Szepesvári on the study and design of Monte-Carlo Tree Search (MCTS) algorithms. MCTS algorithms make decisions in sequential decision processes by iteratively building a search tree based on an analysis of the most promising moves. The work carried out was based on a scientific paper written by researchers from the Systems and Modelling Research Unit of the University of Liège, which proposes a technique for automatically discovering MCTS algorithms for single-player games [Maes2013].

In summary, this technique works in three steps:

  1. First, it introduces a grammar that induces a rich space of candidate MCTS algorithms.
  2. Second, it defines a distribution over training problems.
  3. Third, it searches in the rich space of candidate MCTS algorithms for the one that performs best on the distribution of training problems.
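
The three steps above can be sketched on a toy problem. The Python example below is only a stand-in under stated assumptions: the candidates are tiny hill-climbers rather than MCTS algorithms, the "grammar" is a plain Cartesian product of two hypothetical components, and step 3 is done by exhaustive evaluation, which is not the optimization method used in [Maes2013]. All names and numbers are made up for the illustration.

```python
import itertools
import random

# Step 1: a tiny "grammar". A candidate is (start position, step size).
STARTS = {"low": 0, "mid": 10, "high": 20}
STEPS = (1, 2, 5)
candidates = list(itertools.product(STARTS, STEPS))


def run(candidate, target, moves=4):
    """Run one candidate hill-climber; return its reward (higher is better)."""
    start, step = candidate
    x = STARTS[start]
    for _ in range(moves):
        # Move to the best neighbour only if it is strictly closer to the target.
        neighbour = min((x - step, x + step), key=lambda v: abs(v - target))
        if abs(neighbour - target) < abs(x - target):
            x = neighbour
    return -abs(x - target)


# Step 2: a distribution over training problems (here, targets in 8..12).
rng = random.Random(0)
training_problems = [rng.randint(8, 12) for _ in range(50)]


# Step 3: search the candidate space for the best average performance.
def average_reward(candidate):
    total = sum(run(candidate, t) for t in training_problems)
    return total / len(training_problems)


best = max(candidates, key=average_reward)
print(best)
```

With only nine candidates, evaluating every one on every training problem is affordable. In a rich space of candidate algorithms it is not, which is why the third step is the expensive one.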

Simulation results have shown that this simple technique could lead to an MCTS algorithm that significantly outperforms those published in the literature, such as Upper Confidence bounds applied to Trees (UCT) [Kocsis2006] or nested Monte-Carlo search [Cazenave2009]. During my internship, I worked on improving the third step of this approach. More specifically, I developed a new algorithm based on contextual bandits for rapidly identifying high-performing MCTS algorithms inside this search space. The algorithm is expected to perform better than the optimization method used in [Maes2013]. The beginnings of a theoretical analysis of this algorithm were also developed during this internship.
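
To give a feel for what best arm identification means in this setting, here is a minimal Python sketch of sequential halving, a standard feature-free scheme. It is not the contextual-bandit algorithm developed during the internship and it uses no features. It assumes that each arm stands for one candidate algorithm and that pulling an arm means evaluating that candidate on one random training problem. The win rates are invented for the example.

```python
import math
import random


def sequential_halving(arms, budget):
    """Return the index of the arm that survives repeated halving."""
    survivors = list(range(len(arms)))
    rounds = max(1, math.ceil(math.log2(len(arms))))
    for _ in range(rounds):
        # Split this round's share of the budget evenly among survivors.
        pulls = max(1, budget // (len(survivors) * rounds))
        means = {i: sum(arms[i]() for _ in range(pulls)) / pulls
                 for i in survivors}
        # Keep the better half, judged on this round's samples only.
        survivors.sort(key=means.get, reverse=True)
        survivors = survivors[:math.ceil(len(survivors) / 2)]
    return survivors[0]


rng = random.Random(0)
# Hypothetical win rates of eight candidate algorithms.
win_rates = [0.2, 0.3, 0.4, 0.9, 0.5, 0.1, 0.35, 0.45]
arms = [lambda p=p: 1.0 if rng.random() < p else 0.0 for p in win_rates]
best = sequential_halving(arms, budget=2400)
print(best)
```

The design choice illustrated here is that weak candidates are discarded early, so most of the evaluation budget goes to the few candidates that remain plausible winners.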

Browse my internship report on ORBi »

Browse my slides on ORBi »


[Maes2013] – F. Maes, D. Lupien St-Pierre and D. Ernst. Monte Carlo search algorithm discovery for single-player games. IEEE Transactions on Computational Intelligence and AI in Games, September 2013, Volume 5, Issue 3, pp. 201–213.
[Kocsis2006] – L. Kocsis and C. Szepesvári. Bandit based Monte Carlo planning. 17th European Conference on Machine Learning, 2006, pp. 282–293.
[Cazenave2009] – T. Cazenave. Nested Monte Carlo search. 21st International Joint Conference on Artificial Intelligence, 2009, pp. 456–461.

Bachelor in Computer Science & Engineering

Passionate about video game development and design, I began studying computer science at the ULg in 2009.
At the end of the second year of my bachelor's degree, I wanted something more and decided to head towards a master's in engineering; I thus needed to obtain the title of civil engineer.
To do so, during the third year, I took additional courses while completing my bachelor's degree in computer science. The next year, I caught up on various fields of knowledge such as thermodynamics and electrical engineering, which earned me the title of civil engineer in one year instead of three.


Internship at the RLAI lab

During the summer of 2013, I did an internship at the University of Alberta, under the supervision of Prof. Damien Ernst (ULg) and Prof. Csaba Szepesvári (UoA).
I joined the Reinforcement Learning and Artificial Intelligence laboratory for two months, working on « A feature-based approach for best arm identification in the case of the Monte Carlo search algorithm discovery for one-player games » (more about this here).

Browse my internship report on ORBi »

Browse my slides on ORBi »


Master in Computer Science & Engineering

After the internship, I continued working closely with Prof. Ernst, who introduced me to Raphael Fonteneau, a postdoctoral researcher at the ULg. I worked under their supervision for my master's thesis (see above).