A time aggregation approach to Markov decision processes

2002-6-1  Abstract. We propose a time aggregation approach for the solution of infinite horizon average cost Markov decision processes via policy iteration. In this approach, policy update is only carried out when the process visits a subset of the state space. As in state aggregation, this approach leads to a reduced state space, which may lead to a substantial reduction in computational and storage requirements.
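The construction behind this idea is the embedded ("time-aggregated") chain on the chosen subset: the original chain is cut into segments at visits to the subset, and each segment is collapsed to one aggregated transition with an aggregated cost and an expected length. A minimal numerical sketch of that identity, assuming a toy 4-state chain under a single fixed policy (all numbers are illustrative, not from the paper):

```python
import numpy as np

# Toy chain under one fixed policy; states {0, 1} form the subset S0
# where decisions are made, states {2, 3} are visited in between.
P = np.array([[0.1, 0.2, 0.3, 0.4],
              [0.3, 0.1, 0.4, 0.2],
              [0.2, 0.3, 0.1, 0.4],
              [0.4, 0.2, 0.3, 0.1]])
c = np.array([1.0, 2.0, 0.5, 3.0])   # one-step costs
S0, S1 = [0, 1], [2, 3]

def stationary(P):
    """Stationary distribution of an irreducible stochastic matrix."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    return np.linalg.lstsq(A, b, rcond=None)[0]

# Average cost of the original (unaggregated) chain.
pi = stationary(P)
g_full = pi @ c

# Embedded chain on S0: collapse each excursion through S1.
A, B = P[np.ix_(S0, S0)], P[np.ix_(S0, S1)]
C, D = P[np.ix_(S1, S0)], P[np.ix_(S1, S1)]
M = np.linalg.inv(np.eye(len(S1)) - D)     # expected visit counts inside S1
P_emb = A + B @ M @ C                      # one step of P_emb = one segment
k   = c[S0] + B @ M @ c[S1]                # expected cost per segment
tau = 1.0   + B @ M @ np.ones(len(S1))     # expected segment length

nu = stationary(P_emb)
g_agg = (nu @ k) / (nu @ tau)              # semi-Markov average-cost ratio
print(np.isclose(g_full, g_agg))  # True
```

The final check is the renewal-reward identity that time aggregation exploits: the original chain's average cost equals expected segment cost over expected segment length under the embedded chain's stationary distribution, so policy iteration can work on the small embedded model.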

A time aggregation approach to Markov decision processes

The time aggregation approach (Cao, Ren, Bhatnagar, Fu, & Marcus, 2002) affords MDPs state reduction by dividing the process into time segments according to certain state subsets. Performance gradient estimation for Markov processes with time aggregation, using the stochastic recursive method and likelihood ratio, was presented in Zhang and Ho.

(PDF) A time aggregation approach to Markov decision processes

A time aggregation approach to Markov decision processes. Automatica, 2002. Shalabh Bhatnagar.

A time aggregation approach to Markov decision processes

The time aggregation approach from MDP (Markov decision process) theory is applied to handle the problem, after which policy iteration can be implemented.

A time aggregation approach to Markov decision processes

@misc{Cao02atime,
  author = {Xi-ren Cao and Zhiyuan Ren and Shalabh Bhatnagar and Michael Fu and Steven Marcus},
  title  = {A time aggregation approach to Markov decision processes},
  year   = {2002}
}

A time aggregation approach to Markov decision

BibTeX @MISC{Cao02atime, author = {Xi-ren Cao and Zhiyuan Ren and Shalabh Bhatnagar and Michael Fu and Steven Marcus}, title = { A time aggregation approach to Markov decision processes }, year = {2002}}

A time aggregation approach to Markov decision processes

@article{Cao2002ATA,
  title   = {A time aggregation approach to Markov decision processes},
  author  = {X. Cao and Z. Ren and S. Bhatnagar and M. Fu and S. Marcus},
  journal = {Automatica},
  year    = {2002},
  volume  = {38},
  pages   = {929--943}
}

A unified approach to time-aggregated Markov decision processes

A unified approach to time-aggregated Markov decision processes. Automatica 67 (May 2016): 77–84. The paper presents a unified approach to time-aggregated MDPs with an average cost criterion.

Bounded Aggregation for Continuous Time Markov Decision Processes

Markov decision processes suffer from two problems, namely the state space explosion, which may lead to long computation times, and the memoryless property of states, which limits modeling power with respect to real systems. The paper combines existing state aggregation and optimization methods for a new aggregation-based optimization approach.

(PDF) Standard Dynamic Programming Applied to Time Aggregated Markov Decision Processes

In this note we address the time aggregation approach to ergodic finite state Markov decision processes with uncontrollable states, and propose a formulation of the time aggregation approach that can be solved by standard dynamic programming.

The control of a two-level Markov decision process by time aggregation

Builds on Cao et al., "A time aggregation approach to Markov decision processes" (Automatica, 38(6), 929–943); the lower-level MDPs are solved by embedded Markov chains.

CiteSeerX — Search Results — Time aggregated Markov decision processes

Snippets cover "actor-critic" and policy-iteration architectures (e.g., the Policy Gradient Theorem): "We consider the standard reinforcement learning framework (see, e.g., Sutton and Barto, 1998), in which a learning agent interacts with a Markov decision process (MDP)," with a state, action, and reward at each time t ∈ {0, 1, 2, …}.

[2201.06827] A sojourn-based approach to semi-Markov Reinforcement Learning

2022-1-19  In this paper we introduce a new approach to discrete-time semi-Markov decision processes based on the sojourn time process. Different characterizations of discrete-time semi-Markov processes are exploited, and decision processes are constructed by means of these characterizations.
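A sojourn-time characterization like the one described can be turned into a sampler directly: hold in each state for a random sojourn drawn from that state's sojourn distribution, then move according to the embedded jump chain. A hedged sketch (the two-state chain, `simulate_smp`, and the sojourn laws are invented for illustration, not taken from the paper):

```python
import random

def simulate_smp(P, sojourn, s0, n_jumps, rng=None):
    """Sample a discrete-time semi-Markov path: at each jump, draw a
    sojourn length from the current state's distribution, hold there,
    then move by the embedded jump chain P (state -> [(next, prob)])."""
    rng = rng or random.Random(0)
    path, s = [], s0
    for _ in range(n_jumps):
        path.extend([s] * sojourn[s](rng))   # hold for a random sojourn
        r, acc = rng.random(), 0.0           # then jump via P[s]
        for nxt, p in P[s]:
            acc += p
            if r < acc:
                s = nxt
                break
    return path

# Toy example (assumed numbers): two states that alternate.
P = {0: [(1, 1.0)], 1: [(0, 1.0)]}
sojourn = {0: lambda rng: 1 + rng.randrange(3),   # 1..3 steps in state 0
           1: lambda rng: 2}                      # always 2 steps in state 1
path = simulate_smp(P, sojourn, 0, 5)
```

The point of the sojourn-based view is visible here: the process is fully specified by the jump chain plus per-state sojourn distributions, with no Markov assumption on the holding times themselves.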

Time aggregated Markov decision processes via standard dynamic programming

This note addresses the time aggregation approach to ergodic finite state Markov decision processes with uncontrollable states. The proposed approach simplifies the iterative search for the optimal solution by eliminating the need to define an equivalent parametric function, and results in a problem that can be solved by simpler, standard MDP algorithms.

State aggregation in Markov decision processes — Semantic Scholar

We study state aggregation for Markov decision processes (MDPs) with the long-run average-cost optimality criterion. The aggregation is based on the definition of an (ε_p, ε_f)-lumpable partition of the state space, in which the difference between the control effect of any control action on any two states belonging to the same subset of the partition is bounded.
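A simplified, single-policy version of such a lumpability check can be sketched as follows. Here `is_eps_lumpable`, the example matrix, and the split of tolerances into a transition bound `eps_p` and a cost bound `eps_f` are assumptions for illustration; the paper's (ε_p, ε_f) definition additionally quantifies over all control actions.

```python
import numpy as np

def aggregated_rows(P, blocks):
    """Row i -> probability of jumping from state i into each block."""
    return np.stack([P[:, b].sum(axis=1) for b in blocks], axis=1)

def is_eps_lumpable(P, c, blocks, eps_p, eps_f):
    """Single-policy check: within every block, block-to-block transition
    rows differ by at most eps_p and one-step costs by at most eps_f."""
    Q = aggregated_rows(P, blocks)
    for b in blocks:
        if np.abs(Q[b] - Q[b[0]]).max() > eps_p:   # transition closeness
            return False
        if np.abs(c[b] - c[b[0]]).max() > eps_f:   # cost closeness
            return False
    return True

# States 0 and 1 behave identically w.r.t. the partition, so the
# partition {{0, 1}, {2}} is exactly (0, 0)-lumpable here.
P = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.5, 0.0],
              [0.0, 0.0, 1.0]])
c = np.array([1.0, 1.0, 2.0])
print(is_eps_lumpable(P, c, [[0, 1], [2]], 0.0, 0.0))  # True
```

With eps_p = eps_f = 0 this reduces to exact (strong) lumpability; positive tolerances admit approximate aggregation at the price of a bounded loss in control effect.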

A two-phase time aggregation algorithm for average cost

This paper introduces a two-phase approach to solve average cost Markov decision processes, which is based on state space embedding or time aggregation. In the first phase, time aggregation is applied for policy evaluation in a prescribed subset of the state space, and a novel result is applied to expand the evaluation to the whole state space.

[PDF] Solving average cost Markov decision processes by

This paper introduces a two-phase approach to solve average cost Markov decision processes, based on state space embedding or time aggregation; a novel result is applied to expand the policy evaluation to the whole state space.

Adaptive aggregation for reinforcement learning in average reward Markov decision processes

2012-1-24  We present an algorithm that aggregates online while learning to behave optimally in an average reward Markov decision process. The algorithm is based on the reinforcement learning algorithm UCRL and uses confidence intervals for aggregating the state space. We derive bounds on the regret our algorithm suffers with respect to an optimal policy.