Event Details

Optimizing UAV Trajectory for Maximum Sum Rate Using Proximal Policy Optimization

Presenter: Yawen Li
Supervisor: Professor Hong-Chuan Yang

Date: Fri, August 16, 2024
Time: 12:00 pm
Place: Remote via Zoom

ABSTRACT

Speaker: Yawen Li

Abstract: This seminar presents a study on optimizing the trajectory of an unmanned aerial vehicle (UAV) with reinforcement learning (RL), specifically Proximal Policy Optimization (PPO). The objective is to maximize the communication sum rate between the UAV and ground users, and to this end the trajectory design problem is formulated as a Markov decision process (MDP). The study introduces an action elimination technique that improves the learning efficiency of RL agents by preventing them from selecting actions that do not contribute to the mission's success. This technique proved crucial in helping agents achieve higher rewards and reach their destinations on time, because it avoids unnecessary exploration. The research also examines how different reward functions affect the learning dynamics and performance of the RL agents. PPO shows a marked preference for cumulative rewards, reflecting its design to optimize long-term return. A significant part of the work was devoted to tuning PPO hyperparameters, including the learning rate, clipping ratio, and buffer size. This tuning improved the performance of the PPO agent and offered insight into how sensitive RL algorithms are to their hyperparameters. The study acknowledges limitations, including simplified environmental factors and a restriction to two-dimensional trajectories. Suggested future work includes integrating more complex environment models and considering three-dimensional trajectory planning to better address real-world applicability.
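
The abstract does not spell out how action elimination is implemented. As a rough illustration only, the Python sketch below assumes a UAV moving on a square grid with a fixed deadline, and removes any move that would leave the grid or make the destination unreachable in the remaining time steps. The action set, the Manhattan-distance rule, and the names `feasible_actions` and `masked_policy` are hypothetical and are not taken from the study.

```python
import math

# Hypothetical 2-D action set: (dx, dy) displacement per time step.
ACTIONS = {
    "north": (0, 1),
    "south": (0, -1),
    "east": (1, 0),
    "west": (-1, 0),
    "hover": (0, 0),
}


def feasible_actions(pos, dest, steps_left, grid_size):
    """Return the actions that survive elimination (assumed rule).

    An action is kept only if the UAV stays on the grid and can still
    reach `dest` in the steps that remain after taking the action,
    measured by Manhattan distance.
    """
    allowed = []
    for name, (dx, dy) in ACTIONS.items():
        nx, ny = pos[0] + dx, pos[1] + dy
        if not (0 <= nx < grid_size and 0 <= ny < grid_size):
            continue  # move would leave the service area
        dist = abs(dest[0] - nx) + abs(dest[1] - ny)
        if dist <= steps_left - 1:  # one step is consumed by this action
            allowed.append(name)
    return allowed


def masked_policy(logits, allowed):
    """Softmax restricted to the allowed actions.

    Eliminated actions receive probability 0. `allowed` must be
    non-empty, which holds whenever the destination was reachable
    before the current step.
    """
    peak = max(logits[a] for a in allowed)  # subtract max for stability
    weights = {a: math.exp(logits[a] - peak) for a in allowed}
    total = sum(weights.values())
    return {a: weights.get(a, 0.0) / total for a in ACTIONS}
```

In a sketch like this, the policy network's raw scores would be passed through `masked_policy` before sampling, so the agent never spends exploration on moves that cannot lead to an on-time arrival.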