Creator |
Language |
Publisher |
Date Issued |
Journal Title |
Volume |
Issue |
Start Page |
End Page |
Publication Type |
Access Rights |
Crossref DOI |
Rights |
Abstract |
This article preprocesses environmental information and uses it as input for the Proximal Policy Optimization (PPO) algorithm. The algorithm is trained directly on a model vehicle in a real environment, allowing it to control the distance between the vehicle and surrounding objects. Training converges after approximately 200 episodes, demonstrating that the PPO algorithm can, to some extent, tolerate the uncertainty, noise, and interference of a real training environment. Furthermore, tests of the trained model in different scenarios show that even when the input information is preprocessed and does not provide a comprehensive view of the environment, the PPO algorithm can still achieve its control objectives and accomplish challenging tasks.
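The article itself is not reproduced here; purely as a rough illustration of the pipeline the abstract describes (preprocess raw observations, feed them to PPO, train on the environment directly), the following is a minimal sketch. It assumes a Gym-style environment interface and the stable-baselines3 PPO implementation; the PreprocessObs wrapper, its clipping/normalization scheme, the stand-in Pendulum-v1 environment, and all hyperparameters are hypothetical and not taken from the article.

    # Minimal sketch: preprocess observations, then train PPO on them.
    # Assumptions (not from the article): gymnasium env, stable-baselines3,
    # and the specific clip-and-rescale preprocessing shown here.
    import gymnasium as gym
    import numpy as np
    from stable_baselines3 import PPO

    class PreprocessObs(gym.ObservationWrapper):
        """Hypothetical preprocessing: clip raw readings into
        [0, max_range] and rescale them to [0, 1]."""

        def __init__(self, env, max_range=5.0):
            super().__init__(env)
            self.max_range = max_range
            shape = env.observation_space.shape
            self.observation_space = gym.spaces.Box(
                low=np.zeros(shape, dtype=np.float32),
                high=np.ones(shape, dtype=np.float32),
                dtype=np.float32,
            )

        def observation(self, obs):
            return np.clip(obs, 0.0, self.max_range) / self.max_range

    # "Pendulum-v1" stands in for the model-vehicle environment,
    # which is not publicly available.
    env = PreprocessObs(gym.make("Pendulum-v1"))
    model = PPO("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=100_000)  # on the order of a few hundred episodes

Wrapping preprocessing in an ObservationWrapper keeps the transformation separate from the policy, mirroring the abstract's point that the agent learns from a processed, non-comprehensive view of the environment.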