Markov decision processes with restricted observations: Finite horizon case

1997-08-01
In this article we consider a Markov decision process subject to constraints that result from observability restrictions. We assume that the state of the Markov process under consideration is unobservable. The states are partitioned into groups, and only the group to which the current state belongs is observable. We therefore seek an optimal decision rule that depends on the observable groups rather than on the states, which means that the same decision applies to all states in the same group. We prove that a deterministic optimal policy exists for the finite horizon case. An algorithm is developed to compute policies that minimize the total expected discounted cost over a finite horizon. © 1997 John Wiley & Sons, Inc.
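
The setting in the abstract can be illustrated with a small sketch. It is not the algorithm developed in the article: it simply enumerates every deterministic group-based policy on a toy instance and evaluates each one by backward recursion. All names and numbers (`P`, `c`, `group`, `mu0`, `beta`) are hypothetical, and the sketch assumes that the decision at each stage depends only on the stage and the currently observed group, that the initial state distribution is known, and that the terminal cost is zero.

```python
from itertools import product


def evaluate(policy, P, c, group, mu0, beta):
    """Expected total discounted cost of a group-based policy.

    policy[t][g] is the action applied at stage t to every state in group g.
    P[a][s][s2] is a transition probability, c[s][a] a one-stage cost,
    group[s] the observable group of state s, mu0 the initial distribution.
    """
    n = len(group)
    v = [0.0] * n  # assumption: zero terminal cost
    for rule in reversed(policy):  # backward recursion over the stages
        v = [
            c[s][rule[group[s]]]
            + beta * sum(P[rule[group[s]]][s][s2] * v[s2] for s2 in range(n))
            for s in range(n)
        ]
    return sum(mu0[s] * v[s] for s in range(n))


def brute_force(P, c, group, mu0, beta, horizon):
    """Enumerate all deterministic group-based policies; return (cost, policy).

    Exponential in the horizon, so only usable on tiny instances.
    """
    n_actions = len(P)
    n_groups = max(group) + 1
    rules = list(product(range(n_actions), repeat=n_groups))
    best = None
    for policy in product(rules, repeat=horizon):
        cost = evaluate(policy, P, c, group, mu0, beta)
        if best is None or cost < best[0]:
            best = (cost, policy)
    return best


# Hypothetical toy instance: 3 states, 2 actions, states 0 and 1 share group 0.
group = [0, 0, 1]
P = [
    [[0.9, 0.1, 0.0], [0.0, 0.5, 0.5], [0.2, 0.0, 0.8]],  # action 0
    [[0.1, 0.0, 0.9], [0.6, 0.4, 0.0], [0.0, 0.3, 0.7]],  # action 1
]
c = [[1, 4], [3, 1], [2, 2]]  # c[s][a]
mu0 = [0.5, 0.5, 0.0]
beta = 0.9

cost, policy = brute_force(P, c, group, mu0, beta, horizon=2)
print(cost, policy)
```

In this toy instance, states 0 and 1 prefer different actions in a single stage (costs 1 and 1 respectively under their own best action), but because they share a group the same action must be applied to both, which is the restriction the abstract describes.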

Citation Formats
Y. Y. Serin and Z. M. Avşar, “Markov decision processes with restricted observations: Finite horizon case,” Naval Research Logistics, pp. 439–456, 1997. [Online]. Available: https://hdl.handle.net/11511/36042.