Reinforcement Learning Toolbox™ provides an app, functions, and a Simulink® block for training policies using reinforcement learning algorithms, including DQN, PPO, SAC, and DDPG. You can use these policies to implement controllers and decision-making algorithms for complex applications such as resource allocation, robotics, and autonomous systems.
The toolbox lets you represent policies and value functions using deep neural networks or look-up tables and train them through interactions with environments modeled in MATLAB® or Simulink. You can evaluate the single- or multi-agent reinforcement learning algorithms provided in the toolbox or develop your own. You can experiment with hyperparameter settings, monitor training progress, and simulate trained agents either interactively through the app or programmatically. To improve training performance, you can run simulations in parallel on multiple CPUs, GPUs, computer clusters, and the cloud (with Parallel Computing Toolbox™ and MATLAB Parallel Server™).
You can import existing policies from deep learning frameworks such as TensorFlow™ Keras and PyTorch through the ONNX™ model format (with Deep Learning Toolbox™). You can generate optimized C, C++, and CUDA® code to deploy trained policies on microcontrollers and GPUs. The toolbox includes reference examples to help you get started.
Get Started:
Reinforcement Learning Algorithms
Create agents using deep Q-network (DQN), deep deterministic policy gradient (DDPG), proximal policy optimization (PPO), and other built-in algorithms. Use templates to develop custom agents for training policies.
Reinforcement Learning Designer App
Interactively design, train, and simulate reinforcement learning agents. Export trained agents to MATLAB for further use and deployment.
Policy and Value Function Representation Using Deep Neural Networks
For complex systems with large state-action spaces, define deep neural network policies programmatically, using layers from Deep Learning Toolbox, or interactively, with Deep Network Designer. Alternatively, use the default network architecture suggested by the toolbox. Initialize the policy using imitation learning to accelerate training. Import and export ONNX models for interoperability with other deep learning frameworks.
Single- and Multi-Agent Reinforcement Learning in Simulink
Create and train reinforcement learning agents in Simulink with the RL Agent block. Train multiple agents simultaneously (multi-agent reinforcement learning) in Simulink using multiple instances of the RL Agent block.
Simulink and Simscape Environments
Use Simulink and Simscape™ to create a model of an environment. Specify the observation, action, and reward signals within the model.
MATLAB Environments
Use MATLAB functions and classes to model an environment. Specify observation, action, and reward variables within the MATLAB file.
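As a rough illustration of what "observation, action, and reward variables" look like in a code-based environment, here is a minimal sketch. It is written in Python purely as a language-neutral example and does not use or mirror the toolbox API; the `CorridorEnv` class, its `reset` and `step` methods, and the reward values are all hypothetical choices made for this sketch.

```python
class CorridorEnv:
    """Hypothetical 1-D corridor: start in cell 0, the goal is the last cell."""

    ACTIONS = (-1, 1)  # action index 0 moves left, index 1 moves right

    def __init__(self, length=5):
        self.length = length
        self.position = 0  # the observation is simply the current cell index

    def reset(self):
        """Return the environment to its initial state and report the observation."""
        self.position = 0
        return self.position

    def step(self, action_index):
        """Apply one action and return (observation, reward, done)."""
        move = self.ACTIONS[action_index]
        # Clamp the position so the agent cannot leave the corridor.
        self.position = min(max(self.position + move, 0), self.length - 1)
        done = self.position == self.length - 1
        # Assumed reward shaping: small step penalty, bonus on reaching the goal.
        reward = 1.0 if done else -0.1
        return self.position, reward, done


# Usage: always move right until the episode ends.
env = CorridorEnv(length=5)
observation = env.reset()
total_reward = 0.0
done = False
while not done:
    observation, reward, done = env.step(1)
    total_reward += reward
```

The point of the sketch is the separation of concerns: the environment owns the state and the reward rule, while the agent only sees the observation and the reward that each step returns.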
Distributed Computing and Multicore Acceleration
Speed up training by running parallel simulations on multicore computers, cloud resources, or compute clusters using Parallel Computing Toolbox and MATLAB Parallel Server.
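The idea behind parallel simulation can be sketched as follows: episodes are independent of one another, so they can be handed to a pool of workers and collected afterwards. This Python sketch is an illustration only and is not how the toolbox implements it; `run_episode` and `run_episodes` are hypothetical names, and the toy random walk stands in for a real simulation. A thread pool is used here to keep the sketch portable; for CPU-bound simulations in Python, passing `concurrent.futures.ProcessPoolExecutor` as `executor_cls` would be the assumption to try for true multicore execution.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_episode(seed, max_steps=50):
    """Simulate one episode of a toy random walk and return its total reward."""
    rng = random.Random(seed)  # per-episode generator keeps results reproducible
    position, total_reward = 0, 0.0
    for _ in range(max_steps):
        position += rng.choice((-1, 1))
        total_reward += -0.01 * abs(position)  # penalize drifting from the origin
    return total_reward


def run_episodes(seeds, workers=4, executor_cls=ThreadPoolExecutor):
    """Run independent episodes concurrently; results keep the order of `seeds`."""
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(run_episode, seeds))


seeds = list(range(8))
parallel_returns = run_episodes(seeds)
sequential_returns = [run_episode(seed) for seed in seeds]
```

Because each episode draws from its own seeded generator, the concurrent run returns the same values as the sequential one, which makes the speed-up easy to verify without changing results.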
GPU Acceleration
Speed up deep neural network training and inference with high-performance NVIDIA® GPUs. Use MATLAB with Parallel Computing Toolbox and most CUDA-enabled NVIDIA GPUs that have compute capability 3.0 or higher.
Code Generation
Use GPU Coder™ to generate optimized CUDA code from MATLAB code representing trained policies. Use MATLAB Coder™ to generate C/C++ code to deploy policies.
MATLAB Compiler Support
Use MATLAB Compiler™ and MATLAB Compiler SDK™ to deploy trained policies as standalone applications, C/C++ shared libraries, Microsoft® .NET assemblies, Java® classes, and Python® packages.
Getting Started
See how to develop reinforcement learning policies for problems such as inverting a simple pendulum, navigating a grid world, balancing a cart-pole system, and solving generic Markov decision processes.
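To give a feel for the grid-world problem, here is a minimal sketch of tabular Q-learning, where the value function is stored in a look-up table. It is written in Python as a language-neutral illustration and is not one of the toolbox reference examples; the 3-by-4 grid, the reward values, the hyperparameters, and the `step`, `train`, and `greedy_path` helpers are all assumptions made for this sketch.

```python
import random

ROWS, COLS = 3, 4
GOAL = (2, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action_index):
    """Move within the grid (walls clamp the move) and return (state, reward, done)."""
    d_row, d_col = ACTIONS[action_index]
    row = min(max(state[0] + d_row, 0), ROWS - 1)
    col = min(max(state[1] + d_col, 0), COLS - 1)
    next_state = (row, col)
    done = next_state == GOAL
    reward = 10.0 if done else -1.0  # assumed rewards: step cost, goal bonus
    return next_state, reward, done


def train(episodes=1000, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Learn a Q-table with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(ROWS) for c in range(COLS)}
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action unless terminal.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_path(q, start=(0, 0), max_steps=20):
    """Follow the highest-valued action from `start` and record the visited cells."""
    state, path = start, [start]
    for _ in range(max_steps):
        action = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


q_table = train()
path = greedy_path(q_table)
```

After training, `path` holds the cells visited when the agent acts greedily on the learned table, starting from the top-left corner. A table works here because the grid has only 12 states; larger state-action spaces are where neural network representations come in.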
Automated Driving
Design reinforcement learning policies for automated driving applications such as adaptive cruise control, lane keeping assistance, and automatic parking.
Robotics
Design reinforcement learning policies for robotics applications.
Tuning, Calibration, and Scheduling
Design reinforcement learning policies for tuning, calibration, and scheduling applications.
Product Resources:
Reinforcement Learning Video Series
Watch the videos in this series to learn more about reinforcement learning.