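Research on MDP Modeling for User Preference Extraction
Cite this article: HUANG Haiqing, ZHANG Ping, ZHANG Xiwen. Research on MDP Modeling for User Preference Extraction[J]. Journal of National University of Defense Technology, 2006, 28(6): 81-85
Authors: HUANG Haiqing  ZHANG Ping  ZHANG Xiwen
Affiliations: School of Telecommunication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876; Central Military Representative Office, Second Research Institute of the Ministry of Aerospace, Beijing 100854
Funding: National 863 High-Tech Program (2003AA12331004)
Abstract: By combining a Markov decision process (MDP) with an intelligent reinforcement learning algorithm, this paper presents a technical framework for a model that evaluates users' service preferences in heterogeneous wireless network environments. The framework provides a basic mathematical description for studying how user requirements are perceived, quantified, and adapted in dynamic environments, and it offers a new line of research on evaluating user experience and on adapting services to the service environment. Simulation results show that the constructed MDP model can learn user preferences under multi-state conditions and intelligently select services according to user requirements.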

Keywords: utility theory  user preference  Markov decision process  reinforcement learning
Article ID: 1001-2486(2006)06-0081-05
Received: 2006-06-25
Revised: 2006-06-25

Modeling of User Preference Based on MDP
HUANG HaiQing, ZHANG Ping and ZHANG Xiwen. Modeling of User Preference Based on MDP[J]. Journal of National University of Defense Technology, 2006, 28(6): 81-85
Authors:HUANG HaiQing  ZHANG Ping  ZHANG Xiwen
Affiliation: 1. School of Telecommunication Engineering, Beijing Univ. of Posts and Telecommunications, Beijing 100876, China; 2. The 2nd Institute of China Aerospace Science & Industry, Beijing 100854, China
Abstract: A technical architecture for a user preference model is presented, and the problem is formulated as a Markov decision process (MDP) combined with an adaptive reinforcement learning algorithm. We provide a candidate solution for dynamic user modeling that aims to satisfy the user's expected preferences when information is minimal or missing. The work is also an exploration of how to evaluate user experience when selecting service providers. Simulations of the user models show that the MDP model is effective for learning user preferences with multi-state profiles.
Keywords:utility theory  user preference  Markov decision process  reinforcement learning
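
The abstract's core idea, an MDP whose service-selection policy is learned by reinforcement learning, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: tabular Q-learning is swapped in as one possible reinforcement learning algorithm, and the states (network contexts), actions (services), utility values, and function names are hypothetical.

```python
import random

# Hypothetical network contexts (MDP states) and services (MDP actions).
STATES = ["wlan", "cellular"]
ACTIONS = ["video", "voice", "text"]

# Assumed hidden user utility for each (state, action) pair.
# It is used only to simulate user feedback; the learner never reads it directly.
TRUE_UTILITY = {
    ("wlan", "video"): 1.0, ("wlan", "voice"): 0.5, ("wlan", "text"): 0.2,
    ("cellular", "video"): 0.1, ("cellular", "voice"): 0.8, ("cellular", "text"): 0.4,
}


def q_learning(steps=5000, alpha=0.1, gamma=0.5, epsilon=0.2, seed=0):
    """Learn Q-values for service selection from noisy utility feedback."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = rng.choice(STATES)
    for _ in range(steps):
        # Epsilon-greedy: explore occasionally, otherwise pick the best-known service.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        # Simulated user feedback: hidden utility plus small Gaussian noise.
        reward = TRUE_UTILITY[(state, action)] + rng.gauss(0.0, 0.05)
        # Assumption: the network context changes independently of the chosen service.
        next_state = rng.choice(STATES)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q


def greedy_policy(q):
    """Return the highest-valued service for each network context."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

In this toy setting the learned policy should recover the service with the highest assumed utility in each context, which mirrors in miniature the abstract's claim that the model learns user preferences across multiple states.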
This article is indexed in CNKI, Wanfang Data, and other databases.