Autonomous Driving with Reinforcement Learning in CARLA

This project trains an autonomous driving agent in the CARLA simulator using reinforcement learning. Using the Soft Actor-Critic (SAC) algorithm combined with DrQ image augmentation, the agent learns to navigate between random points on multiple maps while handling traffic lights, varying weather, and dynamic conditions. The project addresses common challenges in RL for driving, such as sparse rewards and visual robustness, through a dense reward function and a progressive curriculum learning strategy. The result is a modular, configurable system capable of learning safe and efficient driving behaviour directly from camera images and vector observations.

Figure: The CARLA simulator while the agent is being tested

Project Objectives

The main objective of this project was to develop and train a reinforcement learning agent capable of autonomous navigation in the CARLA simulator. Specifically, the project aimed to implement a custom Gymnasium-compatible environment, train the agent in it with the Soft Actor-Critic algorithm, design a dense multi-component reward function to guide the agent effectively, and incorporate DrQ-style image augmentation to improve robustness to weather variations. Additional goals included building a curriculum learning system that gradually increases task difficulty, so that the agent masters basic driving skills before tackling more complex scenarios such as traffic light compliance and longer routes. The project also sought to create a stable, resumable training pipeline with configurable parameters for experimentation.
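As a rough illustration of the curriculum idea described above, the sketch below maps the number of training steps to a difficulty stage. It is only a sketch under stated assumptions: the `Stage` class, the `stage_for` helper, the step thresholds, and the particular maps, weather presets, and episode lengths are hypothetical values chosen for illustration, not the project's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One curriculum stage; all fields and values are illustrative assumptions."""
    start_step: int          # training step at which this stage begins
    maps: tuple              # CARLA towns sampled in this stage
    weathers: tuple          # weather presets sampled in this stage
    max_episode_steps: int   # episode length cap
    traffic_lights: bool     # whether traffic light compliance is enforced


# Hypothetical schedule: start simple, then widen maps, weather and rules.
STAGES = (
    Stage(0, ("Town01",), ("ClearNoon",), 500, False),
    Stage(500_000, ("Town01", "Town02"), ("ClearNoon", "CloudyNoon"), 1000, False),
    Stage(1_500_000, ("Town01", "Town02", "Town03"),
          ("ClearNoon", "CloudyNoon", "HardRainNoon"), 2000, True),
)


def stage_for(step: int) -> Stage:
    """Return the latest stage whose start_step has been reached."""
    current = STAGES[0]
    for stage in STAGES:
        if step >= stage.start_step:
            current = stage
    return current


if __name__ == "__main__":
    for step in (0, 600_000, 2_000_000):
        s = stage_for(step)
        print(step, s.maps, s.max_episode_steps, s.traffic_lights)
```

In a setup like this, the environment would query `stage_for` at each reset and sample a map and weather preset from the returned stage, so difficulty rises with training progress rather than with wall-clock time.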

Project Outcomes

The project produced a fully functional custom CARLA environment integrated with Stable-Baselines3, supporting both image and vector observations. A dense reward system was implemented with components for waypoint progress, speed compliance, lane keeping, smoothness, collision avoidance, and traffic light behaviour. Curriculum learning was applied across maps, weather conditions, episode length, and traffic light activation, allowing the agent to build skills progressively. Custom extensions were developed to enable DrQ augmentation within the SAC algorithm and replay buffer. Key challenges addressed included traffic manager instability, accurate traffic light detection, and preventing dead steps during training. The final result is an agent that can drive efficiently from one random point to another on a variety of maps and in harsh weather conditions.
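To make the idea of a dense multi-component reward more concrete, here is a minimal sketch that combines the six components listed above into a single per-step value. The `StepInfo` fields, the `dense_reward` function, and every weight and penalty are assumptions made for illustration; the project's actual terms and coefficients are not stated here.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Per-step measurements; all field names are hypothetical."""
    progress_m: float        # metres advanced along the route this step
    speed_kmh: float         # current vehicle speed
    speed_limit_kmh: float   # speed limit at the current location
    lane_offset_m: float     # lateral distance from the lane centre
    steer_delta: float       # change in steering command since last step
    collided: bool           # True if a collision was registered
    ran_red_light: bool      # True if the vehicle crossed on red


def dense_reward(s: StepInfo) -> float:
    """Sum illustrative reward terms; every weight is an assumed value."""
    reward = 1.0 * s.progress_m                       # waypoint progress
    overspeed = max(0.0, s.speed_kmh - s.speed_limit_kmh)
    reward -= 0.05 * overspeed                        # speed compliance
    reward -= 0.5 * abs(s.lane_offset_m)              # lane keeping
    reward -= 0.1 * abs(s.steer_delta)                # smoothness
    if s.collided:
        reward -= 50.0                                # collision penalty
    if s.ran_red_light:
        reward -= 20.0                                # traffic light penalty
    return reward


if __name__ == "__main__":
    clean = StepInfo(0.5, 30.0, 50.0, 0.0, 0.0, False, False)
    crash = StepInfo(0.5, 30.0, 50.0, 0.0, 0.0, True, False)
    print(dense_reward(clean), dense_reward(crash))
```

Because every step yields some signal, the agent receives feedback long before it reaches the goal, which is the point of shaping the reward densely instead of rewarding only arrival.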

Thesis: Reinforcement Learning for Autonomous Driving in the CARLA simulator

The thesis for this project explores how an agent can learn to navigate between random points on multiple maps using the Soft Actor-Critic (SAC) algorithm, while processing both camera images and vector-based observations such as goal direction and traffic light states.

The research focused on addressing two major challenges in applying reinforcement learning to driving: the sparse reward problem in long-horizon tasks and the agent’s sensitivity to visual changes caused by weather conditions. To tackle these, the thesis implemented a dense, multi-component reward function and a curriculum-learning strategy that gradually increased difficulty across maps, weather, episode length, and traffic light compliance. DrQ-style random shift image augmentation was also integrated to improve the agent’s robustness to visual noise.
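The random shift augmentation mentioned above can be sketched in a few lines of NumPy: each image is padded at its borders by replicating edge pixels and then cropped back to its original size at a random offset. This is a minimal sketch, not the project's implementation; the `random_shift` function name, the `(N, C, H, W)` batch layout, and the default padding of 4 pixels are assumptions made for the example.

```python
import numpy as np


def random_shift(images: np.ndarray, pad: int = 4, rng=None) -> np.ndarray:
    """Randomly shift each image in a (N, C, H, W) batch by up to `pad` pixels.

    Borders are filled by replicating edge pixels, then a crop of the
    original size is taken at an independent random offset per image.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, _, h, w = images.shape
    padded = np.pad(
        images, ((0, 0), (0, 0), (pad, pad), (pad, pad)), mode="edge"
    )
    out = np.empty_like(images)
    for i in range(n):
        top = int(rng.integers(0, 2 * pad + 1))    # vertical offset
        left = int(rng.integers(0, 2 * pad + 1))   # horizontal offset
        out[i] = padded[i, :, top:top + h, left:left + w]
    return out


if __name__ == "__main__":
    batch = np.zeros((8, 3, 84, 84), dtype=np.uint8)
    shifted = random_shift(batch, pad=4, rng=np.random.default_rng(0))
    print(shifted.shape, shifted.dtype)
```

In an off-policy setup such as SAC, an augmentation like this would typically be applied to image observations as they are sampled from the replay buffer, so that each gradient update sees a slightly different view of the same stored frame.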

The work demonstrates a practical end-to-end approach to autonomous driving in simulation, combining off-policy reinforcement learning, reward shaping, and visual augmentation. The resulting system is modular, configurable, and capable of stable training over millions of steps.

Roan Knight
BSc (Hons) Creative Computing

I’m Roan Knight, a final-year Creative Computing student at IADT. My final-year project explores autonomous driving in the CARLA simulator using reinforcement learning, specifically Soft Actor-Critic combined with DrQ augmentation. I focused on creating a stable agent through curriculum learning and dense reward design, and on solving practical simulator challenges such as traffic light detection and visual robustness under varying weather conditions.
