Similar Documents
A total of 20 similar documents were found.
1.
In this paper we consider computation techniques associated with the optimization of large-scale Markov decision processes. Markov decision processes and the successive approximation procedure of White are described. A procedure is then discussed for scaling continuous-time and renewal processes so that they are amenable to the White procedure. The effect of the scale-factor value on the convergence rate of the procedure is examined, and insights into proper scale-factor selection are given.
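A minimal sketch of the kind of successive approximation (value iteration) procedure referred to above, assuming a small discounted MDP with known transition and reward data; all numbers are illustrative, and a continuous-time or renewal process would first have to be rescaled into such a discrete-time chain (e.g., by uniformization with a suitable scale factor), which is the scaling step the abstract refers to.

```python
import numpy as np

def value_iteration(P, R, beta, tol=1e-8, max_iter=10_000):
    """Successive approximation for a discounted MDP.

    P    : (A, S, S) array, P[a, i, j] = transition probability under action a
    R    : (A, S) array,    R[a, i]    = expected one-step reward under action a
    beta : discount factor in (0, 1)
    Returns the optimal value vector and a greedy (optimal) policy.
    """
    v = np.zeros(P.shape[1])
    for _ in range(max_iter):
        Q = R + beta * (P @ v)      # Q[a, i] = R[a, i] + beta * sum_j P[a, i, j] * v[j]
        v_new = Q.max(axis=0)
        if np.max(np.abs(v_new - v)) < tol:
            v = v_new
            break
        v = v_new
    return v, Q.argmax(axis=0)

# Two states, two actions; the numbers are made up for illustration.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])
R = np.array([[1.0, 0.0],
              [2.0, -0.5]])
v_opt, policy = value_iteration(P, R, beta=0.95)
print(v_opt, policy)
```

A larger discount factor (or, equivalently, a poorly chosen scale factor after rescaling a continuous-time process) slows the geometric convergence of this iteration, which is the effect the paper studies.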

2.
In this article we consider a Markov decision process subject to constraints that result from observability restrictions. We assume that the state of the underlying Markov process is unobservable; the states are grouped so that the group to which a state belongs is observable. Thus, we seek an optimal decision rule that depends on the observable groups rather than on the individual states, meaning that the same decision applies to all states in the same group. We prove that a deterministic optimal policy exists for the finite horizon, and we develop an algorithm to compute policies minimizing the total expected discounted cost over a finite horizon. © 1997 John Wiley & Sons, Inc. Naval Research Logistics 44: 439–456, 1997

3.
A search is conducted for a target moving in discrete time among a finite number of cells according to a known Markov process. The searcher must choose one cell in which to search in each time period. The set of cells available for search depends upon the cell chosen in the last time period. The problem is to find a search path, i.e., a sequence of search cells, that either maximizes the probability of detection or minimizes the mean number of time periods required for detection. The search problem is modelled as a partially observable Markov decision process and several approximate solution procedures are proposed. © 1995 John Wiley & Sons, Inc.
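A minimal sketch of the belief recursion underlying such a moving-target search model, together with a myopic (greedy) path heuristic. The motion matrix, neighborhood structure, and detection probability below are illustrative assumptions, and the greedy rule is only one of many possible approximations, not necessarily one of the procedures proposed in the paper.

```python
import numpy as np

def belief_update(b, M, searched_cell, p_detect):
    """Belief over the target's cell after a failed search and one target move.

    b : current belief over cells (sums to 1)
    M : target motion matrix, M[i, j] = P(target moves from cell i to cell j)
    p_detect : probability of detecting the target when searching its cell
    """
    b = b.copy()
    b[searched_cell] *= 1.0 - p_detect   # account for the failed search
    b /= b.sum()                         # condition on non-detection
    return b @ M                         # target takes one Markov step

def greedy_search_path(b0, M, neighbors, start_cell, horizon, p_detect=0.8):
    """Myopic heuristic: always search the reachable cell with the highest
    current belief. Returns the path and the overall detection probability."""
    b, cell, path, p_miss = b0.copy(), start_cell, [], 1.0
    for _ in range(horizon):
        cell = max(neighbors[cell], key=lambda c: b[c])
        path.append(cell)
        p_miss *= 1.0 - p_detect * b[cell]
        b = belief_update(b, M, cell, p_detect)
    return path, 1.0 - p_miss

# Three cells in a line; the searcher may stay or move to an adjacent cell.
M = np.array([[0.6, 0.4, 0.0],
              [0.3, 0.4, 0.3],
              [0.0, 0.4, 0.6]])
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
path, p_total = greedy_search_path(np.array([1/3, 1/3, 1/3]), M,
                                   neighbors, start_cell=1, horizon=6)
print(path, round(p_total, 3))
```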

4.
We propose a novel simulation‐based approach for solving two‐stage stochastic programs with recourse and endogenous (decision dependent) uncertainty. The proposed augmented nested sampling approach recasts the stochastic optimization problem as a simulation problem by treating the decision variables as random. The optimal decision is obtained via the mode of the augmented probability model. We illustrate our methodology on a newsvendor problem with stock‐dependent uncertain demand both in single and multi‐item (news‐stand) cases. We provide performance comparisons with Markov chain Monte Carlo and traditional Monte Carlo simulation‐based optimization schemes. Finally, we conclude with directions for future research.  相似文献   
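As a point of comparison for the simulation-based optimization idea, here is a plain Monte Carlo sketch of a single-item newsvendor with stock-dependent (endogenous) demand: the expected profit of each candidate order quantity is estimated by simulation and the best candidate is kept. This is a brute-force baseline under made-up demand and cost assumptions, not the authors' augmented nested sampling procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulated_profit(q, n_samples=20_000, price=10.0, cost=6.0, salvage=1.0):
    """Average profit of order quantity q when demand depends on the stock
    level itself (endogenous uncertainty): here, mean demand grows with q.
    The demand model and economic parameters are illustrative assumptions."""
    demand = rng.poisson(lam=20.0 + 0.3 * q, size=n_samples)  # stock-dependent demand
    sales = np.minimum(q, demand)
    leftover = q - sales
    return np.mean(price * sales + salvage * leftover - cost * q)

# Brute-force simulation-based optimization over candidate order quantities.
candidates = np.arange(0, 101)
profits = np.array([simulated_profit(q) for q in candidates])
q_star = candidates[profits.argmax()]
print(f"best order quantity ~ {q_star}, estimated profit ~ {profits.max():.2f}")
```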

5.
In this paper, we consider the multiple criteria decision‐making problem of partitioning alternatives into acceptable and unacceptable sets. We develop interactive procedures for the cases when the underlying utility function of the decision maker is linear, quasiconcave, and general monotone. We present an application of the procedures to the problem of admitting students to the master's degree program at the Industrial Engineering Department, Middle East Technical University. © 2001 John Wiley & Sons, Inc. Naval Research Logistics 48: 592–606, 2001.

6.
In this paper, a condition-based maintenance model for a multi-unit production system is proposed and analyzed using Markov renewal theory. The units of the system are subject to gradual deterioration, and the gradual deterioration process of each unit is described by a three-state continuous time homogeneous Markov chain with two working states and a failure state. The production rate of the system is influenced by the deterioration process and the demand is constant. The states of the units are observable through regular inspections and the decision to perform maintenance depends on the number of units in each state. The objective is to obtain the steady-state characteristics and the formula for the long-run average cost for the controlled system. The optimal policy is obtained using a dynamic programming algorithm. The result is validated using a semi-Markov decision process formulation and the policy iteration algorithm. Moreover, an analytical expression is obtained for the calculation of the mean time to initiate maintenance using the first passage time theory.

7.
In the framework of a discrete Markov decision process with state-information lag, this article suggests a method for selecting an optimal policy using the control-limit rule. Properties sufficient for an optimal decision rule to be contained in the class of control-limit rules are also studied. The degradation in expected reward relative to the perfect-information process provides a measure of the potential value of improving the information system.

8.
This paper considers the maintenance of aircraft engine components that are subject to stress. We model the deterioration process by means of the cumulative jump process representation of crack growth. However, because in many cases cracks are not easily observable, maintenance decisions must be made on the basis of other information. We incorporate stress information collected via sensors into the scheduling decision process by means of a partially observable Markov decision process model. Using this model, we demonstrate the optimality of structured maintenance policies, which support practical maintenance schedules. © 1998 John Wiley & Sons, Inc. Naval Research Logistics 45: 335–352, 1998

9.
The process of a surface ship searching for a submarine can be approximated as a Markov process. This paper studies the state-transition process and the detection-probability model of random search for ship CGF (computer-generated forces) searching for submarines in a submarine-training simulation system. The ship CGF's search for the submarine is divided into several independent phases, a Markov decision programming model of the search process is established, and ship search strategies for various initial search states are proposed. Simulation results give the ship CGF's set of optimal search strategies and the corresponding detection probabilities, verifying the effectiveness of the Markov decision programming model.

10.
In this article we consider a continuous-time Markov decision process with a denumerable state space and nonzero terminal rewards. We first establish the necessary and sufficient optimality condition without any restriction on the cost functions. The necessary condition is derived through the Pontryagin maximum principle and the sufficient condition, by the inherent structure of the problem. We introduce a dynamic programming approximation algorithm for the finite-horizon problem. As the time between discrete points decreases, the optimal policy of the discretized problem converges to that of the continuous-time problem in the sense of weak convergence. For the infinite-horizon problem, a successive approximation method is introduced as an alternative to a policy iteration method.

11.
The general forecasting procedure of the Markov model is analyzed, and a combat-effectiveness prediction model based on a Markov decision support system is established. Drawing on the characteristics of a tank platoon performing different combat missions, an application example of combat-effectiveness prediction for a tank platoon is given. The prediction results show that the model can correctly forecast the combat situation, helping commanders deploy their forces better and bring the tank platoon's combat effectiveness into full play.

12.
The Markov Property of the Dynamic Weapon-Target Assignment Problem
Dynamic weapon-target assignment (WTA) is an important theoretical problem in military operations research and a pressing practical problem in operational command and decision-making. Building on a description and analysis of the dynamic WTA problem, stochastic-process theory is used to prove that the dynamic WTA process has the Markov property. An analytical expression for the state-transition probabilities of the resulting Markov decision process is given, and its state characteristics are briefly analyzed. The results provide a theoretical and methodological basis for research on dynamic WTA and related problems.

13.
We consider the decision-making problem of dynamically scheduling the production of a single make-to-stock (MTS) product in connection with the product's concurrent sales in a spot market and a long-term supply channel. The spot market is run by a business-to-business (B2B) online exchange, whereas the long-term channel is established by a structured contract. The product's price in the spot market is exogenous, evolves as a continuous-time Markov chain, and affects demand, which arrives sequentially as a Markov-modulated Poisson process (MMPP). The manufacturer is obliged to fulfill demand in the long-term channel, but is able to rein in sales in the spot market. This is a significant strategic decision for a manufacturer entering a favorable contract, and the profitability of the contract must be evaluated under optimal performance; the present problem therefore arises as a prerequisite to exploring contracting strategies. We show that the optimal strategy for coordinating production and sales is characterized by spot-price-dependent base-stock and sell-down thresholds. Moreover, we exploit the structural properties of the optimal strategy to devise an efficient algorithm. © 2010 Wiley Periodicals, Inc. Naval Research Logistics, 2010
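To make the threshold structure concrete, here is a minimal sketch of how a spot-price-dependent base-stock / sell-down rule of the kind characterized above would be applied once the thresholds are known. The threshold values and the two-price-state setting are illustrative assumptions; in the paper the thresholds are derived from the underlying Markov decision process.

```python
def production_and_sales_decision(inventory, price_state, base_stock, sell_down):
    """Apply a spot-price-dependent base-stock / sell-down rule.

    base_stock[p] : produce whenever inventory is below this level in price state p
    sell_down[p]  : sell on the spot market only while inventory exceeds this level
    """
    produce = inventory < base_stock[price_state]
    sell_spot = inventory > sell_down[price_state]
    return produce, sell_spot

# Two spot-price states with made-up thresholds: when the spot price is high,
# build a larger buffer and be more willing to sell on the spot market.
base_stock = {"low": 5, "high": 8}
sell_down = {"low": 6, "high": 2}
print(production_and_sales_decision(4, "high", base_stock, sell_down))  # (True, True)
print(production_and_sales_decision(4, "low", base_stock, sell_down))   # (True, False)
```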

14.
The Federal Aviation Administration (FAA) and the airline community within the United States have adopted a new paradigm for air traffic flow management, called Collaborative Decision Making (CDM). A principal goal of CDM is shared decision‐making responsibility between the FAA and airlines, so as to increase airline control over decisions that involve economic tradeoffs. So far, CDM has primarily led to enhancements in the implementation of Ground Delay Programs, by changing procedures for allocating slots to airlines and exchanging slots between airlines. In this paper, we discuss how these procedures may be formalized through appropriately defined optimization models. In addition, we describe how inter‐airline slot exchanges may be viewed as a bartering process, in which each “round” of bartering requires the solution of an optimization problem. We compare the resulting optimization problem with the current procedure for exchanging slots and discuss possibilities for increased decision‐making capabilities by the airlines. © 2005 Wiley Periodicals, Inc. Naval Research Logistics, 2006

15.
HMM-Based Anomaly Detection of Program Behavior for Intrusion Detection Systems (IDS)
A new method for program-behavior anomaly detection based on hidden Markov models (HMMs) is proposed. The method uses system-call sequences and describes program behavior with an HMM: behavior patterns are classified by their frequency of occurrence, and the pattern classes are associated with the states of the HMM. Because the observation sets corresponding to different states are disjoint, a sequence-matching method with low computational cost is used for model training, which reduces training time considerably compared with the classical Baum-Welch algorithm. In view of the special meaning of the states and the characteristics of program behavior, the window-smoothed occurrence probability of the state sequence is used as the decision statistic. Experiments show that the method achieves high detection accuracy and is more efficient than comparable methods.
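To illustrate the general idea of scoring system-call windows under an HMM, below is a minimal sketch based on the standard scaled forward algorithm applied to a sliding window; windows with low scores would be flagged as anomalous. The model parameters and trace are made up, and the paper's own scheme differs in detail (it trains the model by a cheaper sequence-matching method and smooths state-sequence probabilities rather than raw likelihoods), so this is only the generic variant.

```python
import numpy as np

def window_log_likelihood(obs, pi, A, B, window=10):
    """Per-window log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward algorithm.

    pi : initial state distribution, shape (S,)
    A  : state-transition matrix,    shape (S, S)
    B  : emission matrix,            shape (S, V)
    obs: sequence of integer symbols in [0, V)
    """
    scores = []
    for start in range(len(obs) - window + 1):
        w = obs[start:start + window]
        alpha = pi * B[:, w[0]]              # forward initialization
        log_p = np.log(alpha.sum())
        alpha = alpha / alpha.sum()
        for o in w[1:]:
            alpha = (alpha @ A) * B[:, o]    # forward recursion
            c = alpha.sum()                  # scaling constant
            log_p += np.log(c)
            alpha = alpha / c
        scores.append(log_p)
    return np.array(scores)

# Toy model: 2 hidden states, 3 observable system-call symbols (all numbers made up).
pi = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
trace = np.array([0, 0, 1, 0, 2, 2, 2, 1, 0, 0, 0, 1, 2, 0])
print(window_log_likelihood(trace, pi, A, B, window=5))
```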

16.
The structure and capability of the decision-making organization in an automated artillery command system are studied, and a fairly complete Petri-net model of the command decision-making system for commanders at each level is established. Colored Petri nets are used to model and analyze the different operating modes of an artillery-group-level automated command system under different mission environments. Decision delay is introduced as an important performance measure reflecting the capability of the command decision-making system, and Markov-chain analysis is then applied to the Petri-net model to obtain quantitative results, providing a theoretical basis for the design and analysis of decision-making organizational structures in automated command systems.

17.
In this paper we study strategies for better utilizing the network capacity of Internet Service Providers (ISPs) when they are faced with stochastic and dynamic arrivals and departures of customers attempting to log on or log off, respectively. We propose a method in which, depending on the number of modems available and the arrival and departure rates of different classes of customers, a decision is made whether to accept or reject a log-on request. The problem is formulated as a continuous-time Markov decision process for which optimal policies can be readily derived using techniques such as value iteration. This decision maximizes the discounted value to the ISP while improving service levels for higher-class customers. The methodology is similar to yield-management techniques successfully used by airlines, hotels, etc. However, there are sufficient differences, such as the absence of a predefined time horizon or reservations, that make this model interesting to pursue and challenging. This work was completed in collaboration with one of the largest ISPs in Connecticut. The problem is topical, and approaches such as those proposed here are sought by users. © 2001 John Wiley & Sons, Inc. Naval Research Logistics 48: 348–362, 2001
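A minimal sketch of the kind of value-iteration computation such a formulation allows, assuming two customer classes, a single pool of C modems, and a uniformized discrete-time version of the continuous-time chain; the rates, rewards, and discount factor are illustrative assumptions rather than figures from the paper.

```python
import numpy as np

def admission_value_iteration(C, lam, reward, mu, gamma=0.99, iters=5000):
    """Value iteration on the uniformized chain of a two-class admission-control
    sketch: C modems, class-k log-on requests arrive at rate lam[k] and pay
    reward[k] if accepted, and each active session ends at rate mu. On every
    arrival the ISP may accept or reject the request."""
    Lam = sum(lam) + C * mu                              # uniformization constant
    v = np.zeros(C + 1)                                  # value by number of active sessions
    for _ in range(iters):
        v_new = np.empty_like(v)
        for n in range(C + 1):
            depart = (n * mu / Lam) * gamma * v[n - 1] if n > 0 else 0.0
            stay = ((Lam - sum(lam) - n * mu) / Lam) * gamma * v[n]
            arrive = 0.0
            for k in range(len(lam)):
                accept = reward[k] + gamma * v[n + 1] if n < C else -np.inf
                reject = gamma * v[n]
                arrive += (lam[k] / Lam) * max(accept, reject)
            v_new[n] = depart + stay + arrive
        v = v_new
    # Accept a class-k request in state n iff accepting is at least as good as rejecting.
    policy = {k: [n for n in range(C)
                  if reward[k] + gamma * v[n + 1] >= gamma * v[n]]
              for k in range(len(lam))}
    return v, policy

v, policy = admission_value_iteration(C=8, lam=[3.0, 1.0], reward=[1.0, 10.0], mu=0.5)
print(policy)  # low-value log-ons may be rejected when few modems remain
```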

18.
This paper considers the problem of computing, by iterative methods, optimal policies for Markov decision processes. The policies computed are optimal for all sufficiently small interest rates.

19.
To address the state-space explosion that arises when the Markov method is used to analyze the reliability of phased-mission systems (PMS), a hierarchical modeling approach is adopted: a top-level system binary decision diagram (BDD) model and bottom-level component Markov models of PMS mission reliability are established. By analyzing isomorphic and redundant nodes in the BDD, a merging strategy for isomorphic nodes and a deletion strategy for redundant nodes are proposed for the construction of the top-level model. These node-compression strategies yield a simplified model and improve the efficiency of model construction and storage. Based on a PMS component-ordering rule, a recursive solution method for the hierarchical model is given whose computational complexity is linear in the total number of nodes of the top-level model. A numerical example compares the number of model nodes before and after node compression, as well as the results of the hierarchical method and the Markov method, verifying the correctness and effectiveness of the simplified hierarchical model.

20.
In this article we consider the optimal control of an M[X]/M/s queue, s ≥ 1. In addition to Poisson bulk arrivals, we incorporate a reneging function. Subject to control are an admission price p and the service rate μ; thus, through p, a balking response is induced. When i customers are present, a cost h(i, μ, p) per unit time is incurred, discounted continuously. The problem is formulated as a continuous-time Markov decision process, and conditions are given under which the optimal admission price and optimal service rate are each nondecreasing functions of i. In Section 4 we indicate how the infinite state space may be truncated to a finite state space for computational purposes.

