Exploring Evolution Strategies for Reinforcement Learning in the Obstacle Tower Environment

Kuypers, Julian

Please use this identifier to cite or link to this record: http://hdl.handle.net/10362/127522

Title:	Exploring Evolution Strategies for Reinforcement Learning in the Obstacle Tower Environment
Author:	Kuypers, Julian
Advisors:	Castelli, Mauro; Bakurov, Illya Olegovich
Keywords:	deep reinforcement learning; evolution strategies; genetic algorithm; Obstacle Tower; Obstacle Tower Challenge; Unity; neuroevolution; reinforcement learning benchmark
Defense date:	2-Nov-2021
Abstract:	In 2017, OpenAI demonstrated that it was possible to train an Artificial Intelligence (AI) agent using Evolution Strategies (ES), and that the results rivaled standard Reinforcement Learning (RL) techniques on modern benchmarks. Their research showed that Evolution Strategies is a viable alternative to traditional Reinforcement Learning techniques, and that it avoids many of Reinforcement Learning’s inconveniences, notably the use of backpropagation. The Obstacle Tower environment aims to set a new Reinforcement Learning benchmark by challenging AI agents to traverse procedurally generated 3-dimensional levels using a real-time 3-dimensional physics system. The environment tests an agent’s ability to generalize by requiring it to optimize aspects that are common in many Reinforcement Learning environments, but rarely combined in the same environment: vision, planning, and control. In this research, the original implementation of OpenAI’s Evolution Strategies algorithm was applied for the first time to the Obstacle Tower environment to assess how well it performs in a more complex environment, where the agent’s generalization ability is critical. Additionally, in the interest of exploring Evolution Strategies in this environment, common Genetic Algorithm (GA) selection and mutation techniques were developed and applied to try to improve the performance of the original Evolution Strategies implementation. Crossover techniques were not explored during this research, as they are rarely applied in Evolution Strategies. The results show that although the basic implementation of Evolution Strategies does not perform well in the complex Obstacle Tower environment, its performance can be improved by applying different evolution methods borrowed from Genetic Algorithms, which belong to the same family of algorithms as Evolution Strategies.
Description:	Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
URI:	http://hdl.handle.net/10362/127522
Degree:	Master's in Advanced Analytics (Mestrado em Métodos Analíticos Avançados)
Appears in collections:	NIMS - Dissertações de Mestrado em Ciência de Dados e Métodos Analíticos Avançados (Data Science and Advanced Analytics)

Files in this record:

File	Description	Size	Format
TAA0105.pdf		2.22 MB	Adobe PDF
