Multi constraint optimal intelligent gliding guidance via reinforcement learning



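Cite this article:	ZHU Jianwen, ZHAO Changjian, LI Xiaoping, BAO Weimin. Multi constraint optimal intelligent gliding guidance via reinforcement learning[J]. Journal of National University of Defense Technology, 2022, 44(4): 116-124.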

Authors:	ZHU Jianwen, ZHAO Changjian, LI Xiaoping, BAO Weimin

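Affiliations:	School of Aerospace Science and Technology, Xidian University, Xi'an 710126; China Academy of Launch Vehicle Technology, Beijing 100076; School of Aerospace Science and Technology, Xidian University, Xi'an 710126; China Aerospace Science and Technology Corporation, Beijing 100048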

Funding:	National Natural Science Foundation of China (61703409); China Postdoctoral Science Foundation (2019M66364)

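Abstract:	To improve the autonomy of gliding guidance in complex flight missions, a multi-constraint intelligent gliding guidance strategy based on optimal guidance and reinforcement learning is proposed. Three-dimensional optimal guidance is introduced to satisfy the terminal longitude, latitude, altitude, and flight-path-angle constraints. A velocity control strategy based on lateral sinusoidal maneuvering is proposed, and an analytical method for predicting the terminal velocity under maneuvering flight is studied. Because the maneuver amplitude used in velocity control cannot be determined offline, an intelligent parameter-tuning method based on reinforcement learning is studied. The method designs the state space from the terminal velocity and the action space from the maneuver amplitude, designs a reward function that combines the terminal velocity error with the gliding guidance task, and uses Q-Learning to adjust the maneuver amplitude intelligently. Simulation results show that the intelligent gliding guidance method satisfies the multiple terminal constraints with high accuracy and effectively improves autonomous decision-making in complex missions.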
Keywords:	gliding flight; optimal guidance; intelligent parameter tuning; reinforcement learning; Q-Learning
Received:	2020/10/13
Multi constraint optimal intelligent gliding guidance via reinforcement learning

ZHU Jianwen, ZHAO Changjian, LI Xiaoping, BAO Weimin. Multi constraint optimal intelligent gliding guidance via reinforcement learning[J]. Journal of National University of Defense Technology, 2022, 44(4): 116-124.

Authors:	ZHU Jianwen, ZHAO Changjian, LI Xiaoping, BAO Weimin

Abstract:	To improve the autonomy of gliding guidance in complex flight missions, a multi-constraint intelligent gliding guidance strategy based on optimal guidance and reinforcement learning (RL) is proposed. Three-dimensional optimal guidance is introduced to satisfy the terminal latitude, longitude, altitude, and flight-path-angle constraints. A velocity control strategy based on a lateral sinusoidal maneuver is proposed, and an analytical terminal velocity prediction method that accounts for maneuvering flight is studied. Because the maneuver amplitude used in velocity control cannot be determined offline, an intelligent parameter adjustment method based on RL is studied. The method defines the state space in terms of terminal velocity and the action space in terms of maneuver amplitude. It also constructs a reward function that combines the terminal velocity error with the gliding guidance task, and uses Q-Learning to adjust the maneuver amplitude intelligently. The simulation results show that the intelligent gliding guidance method satisfies the terminal constraints with high accuracy and effectively improves autonomous decision-making in complex missions.

Keywords:	gliding flight; optimal guidance; intelligent parameter adjustment; reinforcement learning; Q-Learning
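
The parameter-adjustment idea in the abstract (state from terminal velocity, action from maneuver amplitude, reward from terminal velocity error, learned with Q-Learning) can be illustrated with a minimal tabular sketch. This is not the paper's implementation: the linear terminal-velocity surrogate, all numeric constants, the discretization, and every function name below are assumptions chosen only to make the example runnable, and the reward here uses the velocity error alone rather than the paper's combined reward.

```python
import random

# All constants below are hypothetical placeholders, not values from the paper.
V_NO_MANEUVER = 2500.0    # m/s, assumed terminal velocity with zero maneuver amplitude
LOSS_PER_LEVEL = 100.0    # m/s of speed assumed lost per amplitude level
V_TARGET = 2000.0         # m/s, assumed desired terminal velocity
AMP_MIN, AMP_MAX = 0, 10  # discrete amplitude levels (arbitrary units)
ACTIONS = (-1, 0, 1)      # decrease, hold, or increase the amplitude level


def predict_terminal_velocity(amp):
    """Toy linear surrogate standing in for the analytical velocity prediction."""
    return V_NO_MANEUVER - LOSS_PER_LEVEL * amp


def state_of(amp):
    """State: predicted terminal velocity error, discretized into 100 m/s bins."""
    return int(round((predict_terminal_velocity(amp) - V_TARGET) / 100.0))


def step(amp, action):
    """Apply an amplitude change; the reward penalizes the absolute velocity error."""
    new_amp = min(AMP_MAX, max(AMP_MIN, amp + action))
    reward = -abs(predict_terminal_velocity(new_amp) - V_TARGET) / 100.0
    return new_amp, reward


def train(episodes=2000, steps=20, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-Learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) to an estimated value; missing entries count as 0
    for _ in range(episodes):
        amp = rng.randint(AMP_MIN, AMP_MAX)  # random initial amplitude level
        for _ in range(steps):
            s = state_of(amp)
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q.get((s, x), 0.0))
            amp, r = step(amp, a)
            s_next = state_of(amp)
            best_next = max(q.get((s_next, x), 0.0) for x in ACTIONS)
            old = q.get((s, a), 0.0)
            # Standard Q-Learning update toward r + gamma * max_a' Q(s', a').
            q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return q


def greedy_rollout(q, amp=0, steps=20):
    """Follow the learned greedy policy and return the final amplitude level."""
    for _ in range(steps):
        s = state_of(amp)
        a = max(ACTIONS, key=lambda x: q.get((s, x), 0.0))
        amp, _ = step(amp, a)
    return amp


if __name__ == "__main__":
    q_table = train()
    final_amp = greedy_rollout(q_table, amp=0)
    print(final_amp, predict_terminal_velocity(final_amp))
```

In this toy setting the greedy policy steers the amplitude toward the level whose predicted terminal velocity matches the target. A real guidance loop would replace the surrogate with the analytical prediction described in the abstract and would fold the guidance-task terms into the reward.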

