Research on the Method of Target Selecting Policy Based on the Markov Decision Process (基于马尔科夫决策的目标选择策略)


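Cite this article:	LEI Ting (雷霆), ZHU Cheng (朱承), ZHANG Weiming (张维明). Research on the Method of Target Selecting Policy Based on the Markov Decision Process (基于马尔科夫决策的目标选择策略)[J]. Journal of National University of Defense Technology (国防科技大学学报), 2014, 36(2): 161-167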

Authors:	LEI Ting (雷霆), ZHU Cheng (朱承), ZHANG Weiming (张维明)

Affiliation:	Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology (listed for all three authors)

Funding:	National Natural Science Foundation of China (71001105, 91024006)

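Abstract:	Target selection is one of the key elements of military planning. A Markov decision approach is used to solve the multi-stage target selection problem in which targets have complex interrelations. An and-or tree describes how the states of each layer of the target system influence one another, and, taking overall failure of the target system as the objective, a multi-stage strike target selection model based on a discrete-time MDP is built. Building on the LRTDP algorithm, a heuristic is proposed that evaluates the current state by estimating the possible resource consumption and failure probability along the evolution from the current target-system state to the system-failure state. The heuristic effectively excludes intermediate states in the search space that cannot reach the system-failure objective, which compresses the huge state space that results from the complex relations among targets. Experiments verify the effectiveness of the method; the results show that it is intuitive and practical and offers some reference value for strike decisions on targets with complex interrelations.
Keywords:	target selection; target system; and-or tree; discrete-time Markov decision process
Received:	2013-07-16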
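
The and-or tree mentioned in the abstract can be illustrated with a minimal sketch. The record does not give the paper's actual tree or its node semantics, so everything here is an assumption made for illustration: the node names, the `Node` class, and the convention that an "and" node fails only when all of its children have failed while an "or" node fails as soon as any child fails.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class Node:
    """One node of a hypothetical and-or tree over a target system."""
    name: str
    kind: str = "leaf"  # "leaf", "and", or "or"
    children: List["Node"] = field(default_factory=list)


def is_failed(node: Node, destroyed: FrozenSet[str]) -> bool:
    """Propagate leaf-level destruction up to this node.

    Assumed semantics (not taken from the paper):
      - a leaf has failed if its name is in `destroyed`;
      - an "and" node fails only if ALL children have failed;
      - an "or" node fails if ANY child has failed.
    """
    if node.kind == "leaf":
        return node.name in destroyed
    results = [is_failed(child, destroyed) for child in node.children]
    if node.kind == "and":
        return all(results)
    if node.kind == "or":
        return any(results)
    raise ValueError(f"unknown node kind: {node.kind}")


# Hypothetical target system: it fails if the command post is destroyed,
# or if both redundant radars are destroyed.
radars = Node("radars", "and", [Node("radar_a"), Node("radar_b")])
system = Node("system", "or", [Node("command"), radars])

print(is_failed(system, frozenset({"radar_a"})))
print(is_failed(system, frozenset({"radar_a", "radar_b"})))
print(is_failed(system, frozenset({"command"})))
```

Under these assumed semantics, destroying a single redundant radar leaves the system working, while destroying both radars or the command post alone brings it down.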
Research on the Method of Target Selecting Policy Based on the Markov Decision Process

LEI Ting,ZHU Cheng,ZHANG Weiming. Research on the Method of Target Selecting Policy Based on the Markov Decision Process[J]. Journal of National University of Defense Technology, 2014, 36(2): 161-167

Authors:	LEI Ting, ZHU Cheng, ZHANG Weiming

Affiliation:	Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology; Institute of Military Operation Research, Academy of the Military Science

Abstract:	Target selection is an important aspect of military operational planning. A Markov Decision Process (MDP) method is used to solve the multi-phase target selection problem with complex relations among targets. First, an and-or tree is used to describe the relations among the layers of the target system of systems (TSoS), and a Discrete Time Markov Decision Process (DTMDP) model is proposed for target selection, with the objective of neutralizing the TSoS. Second, a heuristic based on the LRTDP algorithm is proposed to estimate the value of the current TSoS state by considering the potential resource consumption and failure probability along the evolution from the current state to the lapse (failed) state of the TSoS. The heuristic effectively excludes intermediate states that cannot reach the lapse state, which reduces the huge search space caused by the complex relations among targets. Finally, a case study is presented to validate the method. The results show that the method is intuitive and practical, and that it facilitates target selection decisions when there are complex relations among the targets.

Keywords:	Target system of systems; And-Or tree; Markov Decision Process
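
The idea of excluding states that can no longer reach the failure state can be sketched on a toy multi-stage strike problem. This is not the paper's LRTDP-based heuristic: it is a plain memoized recursion with an optimistic dead-end check, and the targets, hit probabilities, failure rule, and function names are all hypothetical. Each strike is assumed to consume one unit of a fixed budget.

```python
from functools import lru_cache
from itertools import combinations

# Hypothetical per-strike success probabilities (not from the paper).
HIT_PROB = {"command": 0.5, "radar_a": 0.8, "radar_b": 0.8}
TARGETS = tuple(sorted(HIT_PROB))


def system_failed(destroyed):
    """Assumed rule: the system fails once the command post AND at
    least one radar have been destroyed."""
    return "command" in destroyed and bool({"radar_a", "radar_b"} & destroyed)


def can_still_fail(destroyed, budget):
    """Optimistic check: could the failure state be reached if every
    remaining strike succeeded? If not, the state is a dead end."""
    remaining = [t for t in TARGETS if t not in destroyed]
    for k in range(min(budget, len(remaining)) + 1):
        for extra in combinations(remaining, k):
            if system_failed(destroyed | set(extra)):
                return True
    return False


@lru_cache(maxsize=None)
def best_value(destroyed, budget):
    """Return (max probability of reaching failure, best next target)."""
    if system_failed(destroyed):
        return 1.0, None
    if budget == 0 or not can_still_fail(destroyed, budget):
        return 0.0, None  # pruned: no expansion of hopeless states
    best = (0.0, None)
    for target in TARGETS:
        if target in destroyed:
            continue
        p = HIT_PROB[target]
        hit, _ = best_value(destroyed | frozenset([target]), budget - 1)
        miss, _ = best_value(destroyed, budget - 1)
        value = p * hit + (1 - p) * miss
        if value > best[0]:
            best = (value, target)
    return best


for budget in (1, 2, 3):
    print(budget, best_value(frozenset(), budget))
```

With a budget of one strike, the check shows that no single success can bring the assumed system down, so the state is given value 0 without expanding any action. In a larger model the same kind of test would cut off whole subtrees of the search.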
This article is indexed in CNKI and other databases.
