Similar Literature
20 similar documents were retrieved.
1.
In this article we consider a Markov decision process subject to the constraints that result from some observability restrictions. We assume that the state of the Markov process under consideration is unobservable. The states are grouped so that the group that a state belongs to is observable. So, we want to find an optimal decision rule depending on the observable groups instead of the states. This means that the same decision applies to all the states in the same group. We prove that a deterministic optimal policy exists for the finite horizon. An algorithm is developed to compute policies minimizing the total expected discounted cost over a finite horizon. © 1997 John Wiley & Sons, Inc. Naval Research Logistics 44: 439–456, 1997
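As a rough illustration of the kind of computation involved, the sketch below runs backward induction while forcing one action per observable group; the weighting used to average states inside a group is an assumption of this sketch (the paper's actual finite-horizon algorithm is more involved), and all names and data are hypothetical.

```python
import numpy as np

def group_restricted_backward_induction(P, c, groups, weights, beta, T):
    """Backward induction for a finite-horizon discounted-cost MDP in which
    the same action must be used in every state of an observable group.

    P[a][i, j] : transition probability from state i to j under action a
    c[a][i]    : one-stage cost of action a in state i
    groups     : list of lists of state indices (a partition of the states)
    weights    : array; weights[i] is the assumed weight of state i when
                 averaging within its group (an assumption of this sketch)
    beta, T    : discount factor and horizon length
    """
    n_states, n_actions = P[0].shape[0], len(P)
    V = np.zeros(n_states)                       # zero terminal cost
    policy = np.zeros((T, len(groups)), dtype=int)
    for t in reversed(range(T)):
        Q = np.array([c[a] + beta * P[a] @ V for a in range(n_actions)])
        V_new = np.empty(n_states)
        for g, members in enumerate(groups):
            # one action per group: minimize the weighted average cost-to-go
            w = weights[members] / weights[members].sum()
            a_star = int(np.argmin(Q[:, members] @ w))
            policy[t, g] = a_star
            V_new[members] = Q[a_star, members]
        V = V_new
    return policy, V
```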

2.
We study stochastic clearing systems with a discrete-time Markovian input process, and an output mechanism that intermittently and instantaneously clears the system partially or completely. The decision to clear the system depends on both quantities and delays of outstanding inputs. Clearing the system incurs a fixed cost, and outstanding inputs are charged a delay penalty, which is a general increasing function of the quantities and delays of individual inputs. By recording the quantities and delays of outstanding inputs in a sequence, we model the clearing system as a tree-structured Markov decision process over both a finite and infinite horizon. We show that the optimal clearing policies, under realistic conditions, are of the on-off type or the threshold type. Based on the characterization of the optimal policies, we develop efficient algorithms to compute parameters of the optimal policies for such complex clearing systems for the first time. We conduct a numerical analysis on the impact of the nonlinear delay penalty cost function, the comparison of the optimal policy and the classical hybrid policy (i.e., quantity and age thresholds), and the impact of the state of the input process. Our experiments demonstrate that (a) the classical linear approximation of the cost function can lead to significant performance differences; (b) the classical hybrid policy may perform poorly (as compared to the optimal policies); and (c) taking the state of the input process into account yields a significant improvement in system performance.
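The threshold structure can be illustrated with a small simulation: the hypothetical policy below clears everything whenever the current delay penalty of the outstanding inputs exceeds a threshold. It is only a sketch of a threshold-type rule, not the paper's optimal-policy computation, and the arrival and penalty functions shown are made up.

```python
import random

def simulate_threshold_clearing(T, arrival, penalty, clear_cost, threshold, seed=0):
    """Simulate a simple threshold-type clearing policy (illustrative only):
    clear all outstanding inputs whenever their accumulated delay penalty
    exceeds `threshold`.

    arrival()      -> quantity arriving this period (0 means no arrival)
    penalty(q, d)  -> per-period delay penalty of an input of quantity q, age d
    """
    random.seed(seed)
    outstanding = []              # list of [quantity, age] pairs
    total_cost = 0.0
    for _ in range(T):
        q = arrival()
        if q > 0:
            outstanding.append([q, 0])
        for item in outstanding:
            item[1] += 1          # age every outstanding input by one period
        period_penalty = sum(penalty(q, d) for q, d in outstanding)
        if period_penalty > threshold:
            total_cost += clear_cost
            outstanding = []      # complete clearing
        else:
            total_cost += period_penalty
    return total_cost / T         # average cost per period

# example usage with hypothetical parameters
avg = simulate_threshold_clearing(
    T=10_000,
    arrival=lambda: random.choice([0, 1, 2]),
    penalty=lambda q, d: 0.1 * q * d**1.5,   # nonlinear delay penalty
    clear_cost=5.0,
    threshold=4.0,
)
```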

3.
A system receives shocks at random points of time. Each shock causes a random amount of damage which accumulates over time. The system fails when the accumulated damage exceeds a fixed threshold. Upon failure the system is replaced by a new one. The damage process is controlled by means of a maintenance policy. There are M possible maintenance actions. If maintenance action m is employed, the cumulative damage decreases at rate r_m. Replacement costs and maintenance costs are considered. The objective is to determine an optimal maintenance policy under the following optimality criteria: (1) long-run average cost; (2) total expected discounted cost over an infinite horizon. For a diffusion approximation, we show that the optimal maintenance expenditure rate is monotonically increasing in the cumulative damage level.

4.
In the finite-horizon stochastic (s, S) inventory model with periodic review, the parameters of the optimal policy generally vary with the length of the horizon. A stationary policy, however, is easier to implement and may be easier to calculate. This paper studies optimal stationary policies for a finite horizon and relates them to the overall optimal policies through their connection to optimal stationary policies for an infinite horizon.
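For concreteness, a minimal sketch of evaluating a given stationary (s, S) policy over a finite horizon by backward recursion is shown below; the cost parameters and the truncation of backorder levels are assumptions of the sketch, not part of the paper.

```python
import numpy as np

def evaluate_sS_policy(s, S, demand_pmf, T, K, c, h, p):
    """Expected total cost of a stationary (s, S) policy over a finite horizon,
    computed by backward recursion on the inventory level (illustrative sketch).

    demand_pmf : dict {demand: probability}
    K, c       : fixed and per-unit ordering costs
    h, p       : per-unit holding and backorder costs
    Inventory levels are truncated to [-max_b, S] for tractability.
    """
    max_b = 2 * S + 10                      # crude truncation of backorders
    levels = np.arange(-max_b, S + 1)
    idx = {x: i for i, x in enumerate(levels)}
    V = np.zeros(len(levels))               # zero terminal cost

    for _ in range(T):
        V_new = np.zeros_like(V)
        for x in levels:
            y = S if x <= s else x          # order up to S when at or below s
            order_cost = K + c * (y - x) if y > x else 0.0
            exp_cost = 0.0
            for d, prob in demand_pmf.items():
                nxt = max(y - d, -max_b)
                stage = h * max(y - d, 0) + p * max(d - y, 0)
                exp_cost += prob * (stage + V[idx[nxt]])
            V_new[idx[x]] = order_cost + exp_cost
        V = V_new
    return V[idx[0]]                        # expected cost starting empty
```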

5.
We consider state-age-dependent replacement policies for a multistate deteriorating system. We assume that operating cost rates and replacement costs are both functions of the underlying states. Replacement times and sojourn times in different states are all state-dependent random variables. The optimization criterion is to minimize the expected long-run cost rate. A policy-improvement algorithm to derive the optimal policy is presented. We show that under reasonable assumptions, the optimal replacement policies have monotonic properties. In particular, when the failure-rate functions are nonincreasing, or when all the replacement costs and the expected replacement times are independent of state, we show that the optimal policies are only state dependent. Examples are given to illustrate the structure of the optimal policies in the special case when the sojourn-time distributions are Weibull. © 1994 John Wiley & Sons, Inc.

6.
We consider a finite horizon periodic review, single product inventory system with a fixed setup cost and two stochastic demand classes that differ in their backordering costs. In each period, one must decide whether and how much to order, and how much demand of the lower class should be satisfied. We show that the optimal ordering policy can be characterized as a state dependent (s,S) policy, and the rationing structure is partially obtained based on the subconvexity of the cost function. We then propose a simple heuristic rationing policy, which is easy to implement and close to optimal in extensive numerical experiments. We further study the case when the first demand class is deterministic and must be satisfied immediately. We show the optimality of the state dependent (s,S) ordering policy, and obtain additional rationing structural properties. Based on these properties, the optimal ordering and rationing policy for any state can be generated by finding the optimal policy of only a finite set of states, and for each state in this set, the optimal policy is obtained simply by choosing a policy from at most two alternatives. An efficient algorithm is then proposed. © 2010 Wiley Periodicals, Inc. Naval Research Logistics, 2010

7.
This paper discusses the properties of the inventory and advertising policy minimizing the expected discounted cost over a finite horizon in a dynamic nonstationary inventory model with random demand which is influenced by the level of promotion or goodwill. Attention is focused on the relation between the fluctuations over time of the optimal policies and the variations over time of the factors involved, i.e., demand distributions and various costs. The optimal policies are proved to be monotone in the various factors. Also, three types of fluctuations over time of the optimal policies are discussed according to which factor varies over time. For example, if over a finite interval, the random demand increases (stochastically) from one period to the next, reaches a maximum and then decreases, then the optimal inventory level will do the same. Moreover, the period of maximum demand never precedes that of maximum inventory. The optimal advertising behaves in the opposite way, and its minimum will occur at the same time as the maximum of the inventory. The model has a linear inventory ordering cost and instantaneous delivery of stocks; many results, however, are extended to models with a convex ordering cost or a delivery time lag.

8.
We seek dynamic server assignment policies in finite-capacity queueing systems with flexible and collaborative servers, which involve an assembly and/or a disassembly operation. The objective is to maximize the steady-state throughput. We completely characterize the optimal policy for a Markovian system with two servers, two feeder stations, and instantaneous assembly and disassembly operations. This optimal policy allocates one server per station unless one of the stations is blocked, in which case both servers work at the unblocked station. For Markovian systems with three stations and instantaneous assembly and/or disassembly operations, we consider similar policies that move a server away from his/her “primary” station only when that station is blocked or starving. We determine the optimal assignment of each server whose primary station is blocked or starving in systems with three stations and zero buffers, by formulating the problem as a Markov decision process. Using this optimal assignment, we develop heuristic policies for systems with three or more stations and positive buffers, and show by means of a numerical study that these policies provide near-optimal throughput. Furthermore, our numerical study shows that these policies developed for assembly-type systems also work well in tandem systems. © 2008 Wiley Periodicals, Inc. Naval Research Logistics, 2008

9.
We consider the following replacement model in reliability theory. A technical system with random lifetime is replaced upon failure. Preventive replacements can be carried out before failure. The time for such a replacement depends on the observation of a random state parameter and is therefore in general a random time. Different costs for preventive and failure replacements are introduced which may depend on the age of the working system. The optimization criterion followed here to find an optimal replacement time is to minimize the total expected discounted costs. The optimal replacement policy depends on the observation of the state of the system. Results of the theory of stochastic processes are used to obtain the optimal strategy for different information levels. Several examples based on a two-component parallel system with possibly dependent component lifetimes show how the optimal replacement policy depends on the different information levels and on the degree of dependence of the components. © 1992 John Wiley & Sons, Inc.

10.
Infinite-horizon, countable-state, continuous-time Markovian decision models are solved by formulating them as a pair of infinite linear-programming problems. Expected discounted and average returns are considered as criterion functions. For both criterion functions, the existence of deterministic optimal stationary policies is established by solving the associated infinite linear-programming problems. Computational procedures for finite state and action sets are discussed by considering associated finite linear-programming problems.
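A finite-state analogue of this formulation can be written down directly; the sketch below solves a finite discounted-cost MDP with the standard LP whose solution is the optimal value function, and recovers a deterministic stationary policy by acting greedily. The countable-state infinite LPs of the paper are not reproduced here, and all inputs are assumed.

```python
import numpy as np
from scipy.optimize import linprog

def discounted_mdp_lp(P, c, beta):
    """Solve a finite-state, finite-action discounted-cost MDP by linear
    programming (a finite analogue of the infinite LPs in the abstract).

    P[a][i, j] : transition probabilities, c[a][i] : one-stage costs.
    The LP maximizes sum_i V_i subject to
        V_i <= c[a][i] + beta * sum_j P[a][i, j] * V_j   for all i, a;
    its solution is the optimal value function, and a deterministic
    stationary optimal policy attains the bound in each state.
    """
    n = P[0].shape[0]
    A_ub, b_ub = [], []
    for a in range(len(P)):
        A_ub.append(np.eye(n) - beta * P[a])   # V_i - beta * P[a][i,:] @ V <= c[a][i]
        b_ub.append(c[a])
    res = linprog(c=-np.ones(n),               # maximize sum(V)
                  A_ub=np.vstack(A_ub), b_ub=np.concatenate(b_ub),
                  bounds=[(None, None)] * n, method="highs")
    V = res.x
    Q = np.array([c[a] + beta * P[a] @ V for a in range(len(P))])
    policy = Q.argmin(axis=0)                  # greedy with respect to V
    return V, policy
```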

11.
The parallel machine replacement problem consists of finding a minimum cost replacement policy for a finite population of economically interdependent machines. In this paper, we formulate a stochastic version of the problem and analyze the structure of optimal policies under general classes of replacement cost functions. We prove that for problems with arbitrary cost functions, there can be optimal policies where a machine is replaced only if all machines in worse states are replaced (Worse Cluster Replacement Rule). We then show that, for problems with replacement cost functions exhibiting nonincreasing marginal costs, there are optimal policies such that, in any stage, machines in the same state are either all kept or all replaced (No-Splitting Rule). We also present an example that shows that economies of scale in replacement costs do not guarantee optimal policies that satisfy the No-Splitting Rule. These results lead to the fundamental insight that replacement decisions are driven by marginal costs, and not by economies of scale as suggested in the literature. Finally, we describe how the optimal policy structure, i.e., the No-Splitting and Worse Cluster Replacement Rules, can be used to reduce the computational effort required to obtain optimal replacement policies. © 2005 Wiley Periodicals, Inc. Naval Research Logistics, 2005
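The computational saving from the Worse Cluster Replacement Rule can be illustrated by enumerating only the replacement decisions that respect it; the sketch below does exactly that for machines grouped by state, and is an illustration rather than the paper's full stochastic model.

```python
def worse_cluster_actions(counts):
    """Enumerate replacement decisions satisfying the Worse Cluster Replacement
    Rule: a machine in state k is replaced only if every machine in a worse
    state is also replaced.  counts[k] = number of machines in state k, where a
    larger index k means a worse state.  Each action r is a tuple with r[k] the
    number of machines replaced in state k.  The rule shrinks the action space
    from prod_k (counts[k] + 1) joint decisions to sum_k counts[k] + 1.
    """
    M = len(counts)
    actions = set()
    for b in range(M):                      # best state in which we replace
        tail = [counts[k] if k > b else 0 for k in range(M)]
        for m in range(counts[b] + 1):      # machines replaced in state b
            actions.add(tuple(tail[:b] + [m] + tail[b + 1:]))
    return sorted(actions)

# example: 2 machines in the best state, 1 in the middle state, 2 in the worst
print(worse_cluster_actions([2, 1, 2]))     # 6 decisions instead of 18
```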

12.
The individual and social optimum control policies for entry to an M/M/1 queue serving several classes of customers have been shown to be control-limit policies. The technique of policy iteration provides the social optimum policy for such a queue in a straightforward manner. In this article, the problem of finding the optimal control policy for the M/Ek/1 system is solved, thereby expanding the potential applicability of the solutions developed. The Markovian nature of the queueing system is preserved by considering the service as having k sequential phases, each with independent, identically distributed, exponential service times, through which a customer must pass to be serviced. The optimal policy derived by policy iteration for such a system is likely to be difficult to use because it requires knowledge of the number of phases rather than customers in the system when an arrival occurs. To circumvent this difficulty, a heuristic is used to find a good usable (implementable) solution. In addition, a mixed-integer program is developed which yields the optimal implementable solution when solved.
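For reference, a minimal Howard-style policy iteration for a finite discounted-cost MDP is sketched below (the average-reward variant used for queue control follows the same evaluate/improve pattern). Applied to the M/Ek/1 entry-control problem, the states would be phase counts, which is exactly what makes the resulting policy hard to implement, as the abstract notes. The routine is generic and all data passed to it are assumed.

```python
import numpy as np

def policy_iteration(P, c, beta):
    """Howard-style policy iteration for a finite discounted-cost MDP.

    P[a][i, j] : transition probabilities, c[a][i] : one-stage costs.
    """
    n, nA = P[0].shape[0], len(P)
    policy = np.zeros(n, dtype=int)
    while True:
        # policy evaluation: solve (I - beta * P_pi) V = c_pi
        P_pi = np.array([P[policy[i]][i] for i in range(n)])
        c_pi = np.array([c[policy[i]][i] for i in range(n)])
        V = np.linalg.solve(np.eye(n) - beta * P_pi, c_pi)
        # policy improvement: act greedily with respect to V
        Q = np.array([c[a] + beta * P[a] @ V for a in range(nA)])
        new_policy = Q.argmin(axis=0)
        if np.array_equal(new_policy, policy):
            return policy, V
        policy = new_policy
```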

13.
A single-component system is assumed to progress through a finite number of increasingly bad levels of deterioration. The deterioration level is indexed by i (0 ≤ i ≤ n): the system starts in state 0 when new and is definitely replaced upon reaching the worthless state n. It is assumed that the transition times are directly monitored and the admissible class of strategies allows substitution of a new component only at such transition times. The durations in various deterioration levels are dependent random variables with exponential marginal distributions and a particularly convenient joint distribution. Strategies are chosen to maximize the average reward per unit time. For some reward functions (with the reward rate depending on the state and the duration in this state), the knowledge of the previous state duration provides useful information about the rate of deterioration.

14.
We study joint preventive maintenance (PM) and production policies for an unreliable production-inventory system in which maintenance/repair times are non-negligible and stochastic. A joint policy decides (a) whether or not to perform PM and (b) if PM is not performed, then how much to produce. We consider a discrete-time system, formulating the problem as a Markov decision process (MDP) model. The focus of the work is on the structural properties of optimal joint policies, given the system state comprising the system's age and the inventory level. Although our analysis indicates that the structure of optimal joint policies is very complex in general, we are able to characterize several properties regarding PM and production, including optimal production/maintenance actions under backlogging and high inventory levels, and conditions under which the PM portion of the joint policy has a control-limit structure. In further special cases, such as when PM set-up costs are negligible compared to PM times, we are able to establish some additional structural properties. © 2005 Wiley Periodicals, Inc. Naval Research Logistics, 2005.

15.
The purpose of this paper is to investigate and optimize policies which can be used to terminate a two-state stochastic process with a random lifetime. Such a policy consists of a schedule of times at which termination attempts should be made. Conditions are given which reduce the difficulty of finding the optimal policy by eliminating constraints and some boundary points from consideration. Finally, a bound for the optimal policy is derived for a case where some restrictions are imposed on the model.

16.
We consider a denumerable state Markovian sequential control process. It is well known that when we consider the expected total discounted income as a criterion, there exists a nonrandomized stationary policy that is optimal. It is also well known that when we consider the expected average income as a criterion, an optimal nonrandomized stationary policy exists when a certain system of equations has a solution. The problem considered here is: if there exist two optimal nonrandomized stationary policies, will a randomization of these two policies be optimal? It is shown that in the discounted case the answer is always yes, but in the average income case, the answer is yes only under certain additional conditions.

17.
18.
This paper deals with a periodic review inventory system in which a constant proportion of stock issued to meet demand each period feeds back into the inventory after a fixed number of periods. Various applications of the model are discussed, including blood bank management and the control of reparable item inventories. We assume that on-hand inventory is subject to proportional decay. Demands in successive periods are assumed to be independent identically distributed random variables. The functional equation defining an optimal policy is formulated and a myopic base stock approximation is developed. This myopic policy is shown to be optimal for the case where the feedback delay is equal to one period. Both cost and ordering decision comparisons for optimal and myopic policies are carried out numerically for a delay time of two periods over a wide range of input parameter values.
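A generic myopic (newsvendor-style) base-stock calculation is sketched below for orientation; the paper's myopic policy additionally accounts for proportional decay and the delayed feedback of issued stock, which this sketch does not model, and the demand and cost figures in the example are hypothetical.

```python
def myopic_base_stock(demand_pmf, h, p):
    """Generic myopic (newsvendor-style) base-stock level: the smallest y with
    P(D <= y) >= p / (p + h), for holding cost h and backorder cost p per unit.
    The paper's version also handles decay and feedback, not modeled here.
    """
    ratio = p / (p + h)
    cum = 0.0
    for d in sorted(demand_pmf):
        cum += demand_pmf[d]
        if cum >= ratio:
            return d
    return max(demand_pmf)

# hypothetical example: uniform demand on {0,...,9}, holding 1, backorder 4
S = myopic_base_stock({d: 0.1 for d in range(10)}, h=1.0, p=4.0)
```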

19.
We present a computationally efficient procedure to determine control policies for an infinite horizon Markov decision process with restricted observations. The optimal policy for the system with restricted observations is a function of the observation process and not the unobservable states of the system. Thus, the policy is stationary with respect to the partitioned state space. The algorithm we propose addresses the undiscounted average cost case. The algorithm combines a local search with a modified version of Howard's policy iteration method (Dynamic Programming and Markov Processes, MIT Press, Cambridge, MA, 1960). We demonstrate empirically that the algorithm finds the optimal deterministic policy for over 96% of the problem instances generated. For large scale problem instances, we demonstrate that the average cost associated with the local optimal policy is lower than the average cost associated with an integer rounded policy produced by the algorithm of Serin and Kulkarni (Math Methods Oper Res 61 (2005), 311–328). © 2008 Wiley Periodicals, Inc. Naval Research Logistics 2009
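The building block of such a procedure is evaluating the long-run average cost of a deterministic policy that is constant on each observation group; a minimal sketch of that evaluation (plus a brute-force search usable only on tiny examples) is shown below. It is not the paper's algorithm, and all inputs are assumed.

```python
import numpy as np
from itertools import product

def average_cost(P, c, policy_by_group, group_of):
    """Long-run average cost of a deterministic policy that depends only on the
    observable group of the state.

    P[a][i, j] : transition probabilities, c[a][i] : one-stage costs,
    group_of[i] : index of the observable group containing state i,
    policy_by_group[g] : action used in every state of group g.
    Assumes the induced chain has a unique stationary distribution.
    """
    n = P[0].shape[0]
    acts = [policy_by_group[group_of[i]] for i in range(n)]
    P_pi = np.array([P[acts[i]][i] for i in range(n)])
    c_pi = np.array([c[acts[i]][i] for i in range(n)])
    # stationary distribution: solve pi P = pi together with sum(pi) = 1
    A = np.vstack([P_pi.T - np.eye(n), np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    pi = np.linalg.lstsq(A, b, rcond=None)[0]
    return float(pi @ c_pi)

def brute_force_best_group_policy(P, c, group_of, n_actions):
    """Exhaustive search over group-dependent deterministic policies (only
    practical for tiny examples; the paper's algorithm avoids this)."""
    n_groups = max(group_of) + 1
    return min(product(range(n_actions), repeat=n_groups),
               key=lambda pol: average_cost(P, c, pol, group_of))
```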

20.
This paper investigates the problem of choosing between two simple hypotheses, H0 and H1, in terms of independent, identically distributed random variables, when observations can be taken in groups. At any stage in the decision process it must be decided whether to stop and take action now or to continue, in which case the size of the next group of observations must be decided upon. The problem is to find an optimal procedure incorporating a stopping rule, a group-size (batch) rule, and a terminal action rule. It is proven, in general, that the optimal stopping and terminal action rule is of the sequential probability ratio type (SPRT). Fixed stopping rules of the SPRT type are studied and an iterative procedure of the policy improvement type, both with and without a value determination step, is developed. It is shown, for the general situation, that both the average risk and the scheduling rule converge to the optima. Also, six suboptimal scheduling rules are considered with respect to the average risks they achieve. Numerical results are presented to illustrate the effectiveness of the procedures.
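The SPRT component can be sketched with Wald's classical boundaries; the snippet below only reproduces the stopping/terminal-action part, not the paper's group-size scheduling rules, and the Bernoulli example parameters are made up.

```python
import math

def sprt_decision(log_lr, alpha=0.05, beta=0.10):
    """Sequential probability ratio test with Wald's approximate boundaries:
    continue while B < log_lr < A, accept H1 if log_lr >= A, accept H0 if
    log_lr <= B.  `log_lr` is the accumulated log-likelihood ratio of the
    observations taken so far (in groups or one at a time).
    """
    A = math.log((1 - beta) / alpha)
    B = math.log(beta / (1 - alpha))
    if log_lr >= A:
        return "accept H1"
    if log_lr <= B:
        return "accept H0"
    return "continue"      # a batch rule would then choose the next group size

# example: i.i.d. Bernoulli observations, H0: p = 0.3 vs H1: p = 0.5
def bernoulli_log_lr(observations, p0=0.3, p1=0.5):
    return sum(math.log(p1 / p0) if x else math.log((1 - p1) / (1 - p0))
               for x in observations)
```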
