Gautier Hamon

I'm a PhD student in the Inria FLOWERS team (France), working on open-ended skill acquisition and major transitions in simulated environments. This includes work on reinforcement learning and meta-reinforcement learning, evolutionary simulations, and self-organization in cellular automata. At the beginning of the year, I did a three-month visit to Ricard Solé's complex systems lab.

I have a machine learning background. I obtained my MSc MVA (Mathematics, Vision, Learning) from Institut Polytechnique de Paris/ENS Paris-Saclay with highest honors, as well as my Master of Engineering (MEng) from Télécom Paris (GPA 4.0).

Email  /  CV  /  Scholar  /  Twitter  /  Github


Research

Research interests

  • Reinforcement learning: meta-RL, continual learning, open-ended skill acquisition, multi-agent RL (MARL)
  • Artificial life: cellular automata, open-ended evolution, major transitions, evolutionary algorithms

Publications

    Flow-Lenia: Towards open-ended evolution in cellular automata through mass conservation and parameter localization
    Erwan Plantec, Gautier Hamon, Mayalen Etcheverry, Pierre-Yves Oudeyer, Clément Moulin-Frier, Bert Wang-Chak Chan
    ALIFE, 2023 (Best Paper Award!)
    Project page / Paper

    Introducing mass conservation and multi-species simulation in a continuous cellular automaton, leading to intrinsic evolution in the system.
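
    To make the mechanism concrete, here is a toy mass-conserving update in JAX (my own minimal sketch, not the Flow-Lenia rule itself): each cell redistributes its mass to its neighbours instead of growing or shrinking in place, so the total mass is preserved exactly.

    import jax
    import jax.numpy as jnp

    def mass_conserving_step(mass, affinity):
        # Toy 1D mass-conserving update (illustrative, not the paper's scheme):
        # each cell splits its mass between its two neighbours with a softmax
        # over their affinities, so sum(mass) is preserved by construction.
        left = jnp.roll(affinity, 1)    # affinity of each cell's left neighbour
        right = jnp.roll(affinity, -1)  # affinity of each cell's right neighbour
        w = jax.nn.softmax(jnp.stack([left, right]), axis=0)
        out_left, out_right = mass * w[0], mass * w[1]
        # Each cell collects the mass its neighbours sent towards it.
        return jnp.roll(out_left, -1) + jnp.roll(out_right, 1)

    mass = jnp.array([0.0, 1.0, 0.0, 0.0])
    affinity = jnp.array([0.5, 0.0, 2.0, 0.0])
    assert jnp.allclose(mass_conserving_step(mass, affinity).sum(), mass.sum())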

    Discovering Sensorimotor Agency in Cellular Automata using Diversity Search
    Gautier Hamon, Mayalen Etcheverry, Clément Moulin-Frier, Bert Wang-Chak Chan, Pierre-Yves Oudeyer
    arXiv, 2023
    Blogpost / Paper

    In this paper, we leverage recent advances in machine learning, combining algorithms for diversity search, curriculum learning, and gradient descent, to automate the search in cellular automata for parameters leading to the self-organization of localized structures that move around, react coherently to external obstacles, and maintain their integrity: primitive forms of sensorimotor agency. The emerging macro-agents do not possess a central controlling "brain"; rather, the simple constituents self-organize into coherent entities capable of sensorimotor behavior. They also display impressive generalization capabilities as well as emergent behaviors.
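
    A minimal sketch of the kind of goal-directed diversity search involved (an IMGEP-style loop; rollout_ca and behavior are hypothetical stand-ins for the CA simulation and the behavior descriptor, and the paper's actual pipeline also uses curriculum learning and gradient descent):

    import jax
    import jax.numpy as jnp

    def diversity_search(key, init_params, rollout_ca, behavior, n_iters=100, sigma=0.1):
        # Goal-directed diversity search (sketch): sample a target behavior,
        # mutate the closest known parameter set, keep everything discovered.
        archive_p = [init_params]
        archive_b = [behavior(rollout_ca(init_params))]
        for _ in range(n_iters):
            key, k_goal, k_mut = jax.random.split(key, 3)
            goal = jax.random.uniform(k_goal, archive_b[0].shape)  # target behavior
            dists = jnp.array([jnp.linalg.norm(b - goal) for b in archive_b])
            parent = archive_p[int(jnp.argmin(dists))]             # closest solution so far
            child = parent + sigma * jax.random.normal(k_mut, parent.shape)
            archive_p.append(child)                                # grow the archive
            archive_b.append(behavior(rollout_ca(child)))
        return archive_p, archive_b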

    Emergence of Collective Open-Ended Exploration from Decentralized Meta-Reinforcement Learning
    Richard Bornemann*, Gautier Hamon*, Eleni Nisioti, Clément Moulin-Frier
    ALOE Workshop at NeurIPS, 2023
    Project page / Paper

    We introduce a novel environment with an open-ended, procedurally generated task space, which dynamically combines multiple subtasks sampled from five diverse task types to form a vast distribution of task trees. We show that decentralized agents trained in our environment exhibit strong generalization abilities when confronted with novel objects at test time. Additionally, despite never being forced to cooperate during training, the agents learn collective exploration strategies which allow them to solve novel tasks never encountered during training. We further find that the agents' learned collective exploration strategies extend to an open-ended task setting, allowing them to solve task trees of twice the depth seen during training.
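
    For intuition, a task tree from such a distribution could be sampled roughly as follows (a hedged sketch with made-up task-type names, not the environment's actual generator):

    import random

    TASK_TYPES = ["collect", "craft", "place", "give", "activate"]  # hypothetical names

    def sample_task_tree(depth, max_branching=2):
        # A task is a tree: solving a node first requires solving all of its
        # prerequisite children, sampled recursively from the task types.
        node = {"task": random.choice(TASK_TYPES), "children": []}
        if depth > 1:
            n_children = random.randint(1, max_branching)
            node["children"] = [sample_task_tree(depth - 1, max_branching)
                                for _ in range(n_children)]
        return node

    tree = sample_task_tree(depth=3)  # one task from the (vast) distribution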

    Eco-evolutionary Dynamics of Non-episodic Neuroevolution in Large Multi-agent Environments
    Gautier Hamon, Eleni Nisioti, Clément Moulin-Frier
    GECCO, 2023
    Project page / Paper

    In this work, we present a method for continuously evolving adaptive agents without any environment or population reset. The environment is a large grid world with complex spatiotemporal resource generation, containing many agents that are each controlled by an evolvable recurrent neural network and locally reproduce based on their internal physiology. The entire system is implemented in JAX, allowing very fast simulation on a GPU. We show that neuroevolution (NE) can operate in an ecologically valid, non-episodic multi-agent setting, finding sustainable collective foraging strategies in the presence of a complex interplay between ecological and evolutionary dynamics.
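
    The flavor of a non-episodic update can be sketched as follows (a simplification under my own assumptions, e.g. offspring land in random slots rather than near their parent; the actual system is more involved):

    import jax
    import jax.numpy as jnp

    def reproduction_step(key, genomes, energy, threshold=1.0, sigma=0.01):
        # One non-episodic update (sketch): no generations and no reset; agents
        # whose internal energy exceeds a threshold pay a reproduction cost and
        # place a mutated copy of their genome into a random population slot.
        n = genomes.shape[0]
        k_slot, k_mut = jax.random.split(key)
        parents = energy > threshold
        slots = jax.random.randint(k_slot, (n,), 0, n)       # offspring locations
        children = genomes + sigma * jax.random.normal(k_mut, genomes.shape)
        new_genomes = genomes.at[slots].set(
            jnp.where(parents[:, None], children, genomes[slots]))
        new_energy = jnp.where(parents, energy - threshold, energy)
        return new_genomes, new_energy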

    Evolving Reservoirs for Meta Reinforcement Learning
    Corentin Léger*, Gautier Hamon*, Eleni Nisioti, Xavier Hinaut, Clément Moulin-Frier
    EvoApps (part of EvoStar), 2024
    arXiv

    We adopt a computational framework based on meta-reinforcement learning as a model of the interplay between evolution and development. At the evolutionary scale, we evolve reservoirs, a family of recurrent neural networks that differ from conventional networks in that one optimizes not the synaptic weights but hyperparameters controlling macro-level properties of the resulting network architecture. At the developmental scale, we employ these evolved reservoirs to facilitate the learning of a behavioral policy through reinforcement learning (RL). Within an RL agent, a reservoir encodes the environment state before providing it to an action policy. We evaluate our approach on several 2D and 3D simulated environments. Our results show that the evolution of reservoirs can improve the learning of diverse challenging tasks.
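
    A minimal reservoir sketch (a standard echo-state-network construction, assumed here as an illustration rather than the paper's exact architecture):

    import jax
    import jax.numpy as jnp

    def make_reservoir(key, n_units, n_in, spectral_radius, input_scaling):
        # Evolution tunes macro-level hyperparameters (spectral_radius,
        # input_scaling, ...); the individual weights stay random and fixed.
        k_w, k_in = jax.random.split(key)
        W = jax.random.normal(k_w, (n_units, n_units))
        W = W * spectral_radius / jnp.max(jnp.abs(jnp.linalg.eigvals(W)))
        W_in = input_scaling * jax.random.normal(k_in, (n_units, n_in))
        return W, W_in

    def reservoir_step(h, obs, W, W_in, leak=0.5):
        # Leaky echo-state update: h summarizes the observation history and is
        # what the RL policy network reads instead of the raw observation.
        return (1.0 - leak) * h + leak * jnp.tanh(W @ h + W_in @ obs)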

    Autotelic Reinforcement Learning in Multi-Agent Environments
    Elias Masquil*, Eleni Nisioti*, Gautier Hamon*, Clément Moulin-Frier
    CoLLAs (Conference on Lifelong Learning Agents), 2023
    Paper

    In the intrinsically motivated skills acquisition problem, the agent is set in an environment without any pre-defined goals and needs to acquire an open-ended repertoire of skills. To do so, the agent needs to be autotelic (from the Greek auto (self) and telos (end goal)): it needs to generate goals and learn to achieve them following its own intrinsic motivation rather than external supervision. Multi-agent environments pose an additional challenge for autotelic agents: to discover and master goals that require cooperation, agents must pursue them simultaneously, but they have low chances of doing so if they sample them independently. In this work, we propose a new learning paradigm for modelling such settings, the Decentralized Intrinsically Motivated Skills Acquisition Problem (Dec-IMSAP), and employ it to solve cooperative navigation tasks. First, we show that agents setting their goals independently fail to master the full diversity of goals. Then, we show that a sufficient condition for achieving this is to ensure that the group aligns its goals, i.e., the agents pursue the same cooperative goal. Finally, we introduce the Goal-coordination game, a fully decentralized emergent communication algorithm, where goal alignment emerges from the maximization of individual rewards in multi-goal cooperative environments, and show that it reaches performance equal to a centralized training baseline that guarantees aligned goals.
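
    A small sketch of why independent goal sampling fails (illustration only; the numbers are made up, and the actual algorithm aligns goals via learned communication, not a shared random key):

    import jax

    n_agents, n_goals = 3, 10
    key = jax.random.PRNGKey(0)

    # Independent goal sampling: the chance that all agents happen to pursue
    # the same cooperative goal is n_goals ** (1 - n_agents), i.e. 1% here.
    keys = jax.random.split(key, n_agents)
    independent = [int(jax.random.randint(k, (), 0, n_goals)) for k in keys]

    # Aligned sampling: a shared source of randomness guarantees a common goal.
    # The Goal-coordination game must achieve this alignment in a fully
    # decentralized way, through emergent communication rather than a shared key.
    shared_goal = int(jax.random.randint(key, (), 0, n_goals))
    aligned = [shared_goal] * n_agents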

    Additional open-source code

    TransformerXL for RL (PPO) in JAX
    Gautier Hamon
    Github / Source code

    JAX implementation of the paper "Stabilizing Transformers for Reinforcement Learning". The code follows the PureJaxRL template to be as clear as possible and can take any gymnax environment. The code has been tested on Craftax, where it set a new state of the art, being the first to obtain advanced achievements (without much fine-tuning). Training a 5M-parameter transformer on Craftax for 1e9 steps takes 6 hours on an A100. On XLand-MiniGrid (where environment parallelization is even more advantageous), training with a 5M-parameter transformer on 16,000 parallel environments reaches a speed of 250,000 steps per second on a single A100 (so 1e9 steps in a little more than an hour).
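
    For illustration, the standard gymnax pattern for stepping many environments in parallel with vmap (a generic sketch, not this repo's actual training loop):

    import jax
    import gymnax

    # Vectorize environment reset/step across many instances: this is the
    # parallelization pattern behind the throughput numbers quoted above.
    env, env_params = gymnax.make("CartPole-v1")
    n_envs = 1024
    keys = jax.random.split(jax.random.PRNGKey(0), n_envs)
    obs, state = jax.vmap(env.reset, in_axes=(0, None))(keys, env_params)
    actions = jax.vmap(lambda k: env.action_space(env_params).sample(k))(keys)
    obs, state, reward, done, info = jax.vmap(env.step, in_axes=(0, 0, 0, None))(
        keys, state, actions, env_params)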


    Template from here