
Eight weeks, 8-10 hours per week


Online Learning with recorded live sessions
$1199.00 + applicable taxes
View available bundles and discounts

Reinforcement learning (RL) is a foundational approach in modern artificial intelligence (AI) that enables systems to learn optimal behaviours through experience and interaction with the environment, rather than by following fixed rules. It is used to build AI that can evaluate choices, adapt over time, and make smart decisions. Whether that means mastering a game, controlling a robot, or guiding autonomous systems in dynamic environments, RL powers some of the most advanced applications in artificial intelligence today.

This course introduces learners to the foundational principles and practical implementation of RL. You’ll explore key concepts such as Markov Decision Processes, value functions, policy optimization, and deep reinforcement learning techniques. Through hands-on assignments, including building a game-playing agent, you will gain experience applying RL algorithms to solve complex problems. 

As part of the AI Certificate, this course builds on your machine learning knowledge and complements other advanced topics like Large Language Models (LLMs), offering a deeper understanding of how intelligent agents learn and adapt in dynamic environments. It lays the groundwork for understanding cutting-edge techniques such as Reinforcement Learning from Human Feedback (RLHF), increasingly used in training sophisticated AI systems.

What you will learn

By the end of this course, you’ll have a strong conceptual and practical foundation in reinforcement learning, including:

  • Applying core reinforcement learning principles and agent-based problem solving to real decision-making problems 
  • Modeling environments using Markov Decision Processes (MDPs) to represent states, actions, rewards, and long-term outcomes, a common framework used in industry
  • Using value functions (V and Q functions) and policy functions in learning
  • Implementing essential algorithms such as temporal-difference learning and Q-learning 
  • Using modern deep reinforcement learning techniques (e.g., DQN, PPO, Actor-Critic) to tackle more complex problems using neural networks
  • Understanding how reinforcement learning is applied to language models, including Reinforcement Learning from Human Feedback (RLHF) and Group Sequence Policy Optimization (GSPO)
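
To give a feel for the tabular methods named above, here is a minimal Q-learning sketch in plain Python. It is an illustration only, not course material: the five-state corridor environment, the `step` and `train_q_learning` helpers, and all hyperparameter values are assumptions made up for this example.

```python
import random

# Hypothetical toy MDP: a corridor of states 0..4, where state 4 is the goal.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right
GOAL = N_STATES - 1


def step(state, action):
    """Deterministic transition; returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = next_state == GOAL
    reward = 1.0 if done else 0.0  # reward only on reaching the goal
    return next_state, reward, done


def train_q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when the estimates are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # TD target: bootstrap from the best next action unless terminal.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train_q_learning()
# Greedy action in each non-terminal state after training.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(GOAL)]
print(greedy_policy)
```

In this toy setting the greedy policy should settle on moving right in every non-terminal state, since that is the shortest path to the only reward. Deep RL methods such as DQN replace the table `q` with a neural network, but the temporal-difference update has the same shape.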

 

Skills you’ll gain 

  • Designing and training reinforcement learning agents that improve their behaviour through interaction and feedback
  • Designing AI systems that learn to optimize behaviour in changing environments
  • Applying reinforcement learning algorithms in Python with hands-on experience implementing and testing models
  • Using frameworks like PyTorch for deep reinforcement learning (RL) workflows
  • Integrating RL methods with other AI components in larger systems
  • Applying core RL algorithms to game environments and interactive systems 
  • Understanding how RL is used to fine-tune language models and autonomous agents 
  • Explaining reinforcement learning concepts and results to technical and non-technical stakeholders, supporting collaboration and informed decision-making

 

Course format 

  • Commitment: Eight weeks, 8-10 hours per week 
  • Prerequisite: Completion of the Machine Learning course or equivalent experience 
    • This course is designed for learners with intermediate to advanced programming and machine learning experience, including familiarity with Python and basic ML concepts. 
  • Project: Build a reinforcement learning agent that learns to play a game 
  • Delivery: Hybrid, with instructor-led live sessions and hands-on assignments

Course author 


Larry Simon, MBA

Founder and Managing Director, Inflection Group

Larry Simon is an entrepreneur, management consultant, and angel investor, specializing in IT strategy and data analytics. He has over 30 years of experience advising startups, global corporations, and government institutions. He is the founder and managing director of Inflection Group. Before that, he was a partner with Ernst & Young Consulting, where he served as CTO and as national director of its strategy and delivery centres.

He has previously served on the faculty of the Rotman School of Management, as the head judge of the Canadian Information Productivity Awards (CIPA), and as a councillor of the Institute of Certified Management Consultants of Ontario. Larry holds an MBA from the University of Toronto and a B.Math (Computer Science) from the University of Waterloo.

Related courses

  • Machine Learning
  • Language Models
  • Agentic AI

You may also be interested in:

  • AI and Business Strategy
  • Python I
  • Python 2: Data Science and AI Applications
  • Data Science Certificate
  • Machine Learning Practitioner Certificate
  • Foundations of Large Language Models
  • Cybersecurity, Networking, and Cloud Computing Courses


Questions? Let's chat!

Office hours: Monday to Friday, 8:30 a.m. - 4:30 p.m. ET

  +1 (519) 888-4773

  watspeed@uwaterloo.ca
