Course details

ECTS: 2

Teaching hours: 21

Course content:

Introduction to Reinforcement Learning
Multi-armed bandit problem
Finite Markov decision processes
Dynamic programming
Sample-based Learning Methods (Monte Carlo methods, Temporal-Difference learning)
Prediction and Control with Function Approximation

Skills to be acquired:

Build a Reinforcement Learning system for sequential decision-making. Understand how to formalize a task as a Reinforcement Learning problem, and how to begin implementing a solution.
Understand RL algorithms (Temporal-Difference learning, Monte Carlo, Q-learning, Policy Gradients, etc.).

Assessment method:

Project