Course Name:
Reinforcement Learning (IT354)
Credits (L-T-P):
Course Content:
Introduction to Reinforcement Learning, Markov Processes, Markov Reward Processes (MRPs), Markov Decision Processes (MDPs), MDP Policies, Policy Evaluation, Policy Improvement, Policy Iteration, Value operators, Model-free learning: Q-learning, SARSA, Scaling up: RL with function approximation, Imitation learning in large spaces, Policy search, Exploration/Exploitation, Meta-Learning, Batch Reinforcement Learning, Bandit problems and online learning, Solution methods: dynamic programming, Monte Carlo learning, Temporal difference learning, Eligibility traces, Value function approximation, Models and planning, Case studies: successful examples of RL systems, Frontiers of RL research