ECTS: 2
Contact hours: 21
Course content:
1/ Introduction to reinforcement learning
2/ Theoretical formalism: Markov Decision Processes (MDPs), value function (Bellman equation and Hamilton–Jacobi–Bellman equation), etc.
3/ Common strategies illustrated with the “multi-armed bandit” example
4/ Deep reinforcement learning strategies: Q-learning, DQN
5/ Deep reinforcement learning strategies: SARSA and variants
6/ Deep reinforcement learning strategies: Actor–Critic and variants
7/ Various Python implementations
8/ Ethical perspectives, the alignment problem, recent approaches and applications
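As an illustration of item 3/, here is a minimal sketch of an epsilon-greedy strategy on a multi-armed bandit. It is not taken from the course material: the function name run_epsilon_greedy, its parameters (true_means, steps, epsilon, noise, seed) and the Gaussian reward model are assumptions chosen for this example only.

```python
import random


def run_epsilon_greedy(true_means, steps=5000, epsilon=0.1, noise=1.0, seed=0):
    """Play a Gaussian multi-armed bandit with an epsilon-greedy policy.

    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean of observed rewards per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate
            # (ties are broken in favour of the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Assumed reward model: Gaussian around the arm's true mean.
        reward = rng.gauss(true_means[arm], noise)

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return counts, estimates


counts, estimates = run_epsilon_greedy([0.1, 0.5, 0.9])
print("pulls per arm:", counts)
print("estimated means:", [round(e, 2) for e in estimates])
```

Setting epsilon=0.0 gives a purely greedy player; with noise-free rewards it locks onto the first arm it tries, which is one way to see why some exploration is needed.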
Skills to acquire:
Introduction to reinforcement learning and deep reinforcement learning, with an empirical machine learning perspective: main algorithms, practical implementations (gymnasium)
Bibliography, recommended reading:
https://turinici.com