Introduction to Reinforcement Learning
Chapter 1: Foundations of Reinforcement Learning
What is Reinforcement Learning?
States, Actions, and Rewards
Policies: Mapping States to Actions
The RL Workflow: Interaction Loops
Types of RL Tasks: Episodic vs Continuing
Comparing RL with Other Learning Types
Setting up Your Python Environment for RL
Chapter 2: Markov Decision Processes (MDPs)
Modeling Sequential Decision Making
Formal Definition of an MDP
State Transition Probabilities
Return: Cumulative Future Rewards
Discounting Future Rewards
Policies and Value Functions (Vπ, Qπ)
Chapter 3: Estimating Value Functions
The Bellman Expectation Equation
The Bellman Optimality Equation
Solving Bellman Equations (Overview)
Dynamic Programming: Policy Iteration
Dynamic Programming: Value Iteration
Limitations of Dynamic Programming
Chapter 4: Monte Carlo Methods
Learning from Complete Episodes
Monte Carlo Prediction: Estimating Vπ
Monte Carlo Control: Estimating Qπ
On-Policy vs Off-Policy Learning
MC Control without Exploring Starts
On-Policy First-Visit MC Control Implementation
Off-Policy MC Prediction and Control (Overview)
Practice: Implementing MC Prediction
Chapter 5: Temporal-Difference Learning
Learning from Incomplete Episodes
TD(0) Prediction: Estimating Vπ
Advantages of TD Learning over MC
SARSA: On-Policy TD Control
Q-Learning: Off-Policy TD Control
Comparing SARSA and Q-Learning
Hands-on Practical: Implementing Q-Learning
Chapter 6: Function Approximation in RL
Handling Large State Spaces
Value Function Approximation (VFA)
Feature Vectors for State Representation
Gradient Descent for Parameter Learning
Using Neural Networks for VFA
Practice: Applying Linear VFA
Chapter 7: Introduction to Deep Q-Networks (DQN)
Combining Q-Learning with Deep Learning
Challenges with Neural Networks in RL
Experience Replay Mechanism
Fixed Q-Targets (Target Networks)
The DQN Algorithm Structure
Architectural Considerations for DQNs
Hands-on Practical: Building a Basic DQN
Chapter 8: Introduction to Policy Gradient Methods
Learning Policies Directly
Policy Gradient Theorem (Concept)
Baselines for Variance Reduction
Actor-Critic Methods Overview
Comparing Value-Based and Policy-Based Methods
Practice: Implementing REINFORCE