Inicio  /  Applied Sciences  /  Vol: 9 Par: 7 (2019)  /  Artículo
ARTÍCULO
TITULO

Multi-Agent Planning under Uncertainty with Monte Carlo Q-Value Function

Jian Zhang    
Yaozong Pan    
Ruili Wang    
Yuqiang Fang and Haitao Yang    

Resumen

Decentralized partially observable Markov decision processes (Dec-POMDPs) are general multi-agent models for planning under uncertainty, but are intractable to solve. Doubly exponential growth of the search space as the horizon increases makes a brute-force search impossible. Heuristic methods can guide the search towards the right direction quickly and have been successful in different domains. In this paper, we propose a new Q-value function representation?Monte Carlo Q-value function QMC Q MC , which is proved to be an upper bound of the optimal Q-value function Q* Q * . We introduce two Monte Carlo tree search enhancements?heavy playout for a simulation policy and adaptive samples?to speed up computation of QMC Q MC . Then, we present a clustering and expansion with Monte-Carlo algorithm (CEMC)?an offline planning algorithm using QMC Q MC as Q-value function, which is based on the generalized multi-agent A* with incremental clustering and expansion (GMAA*-ICE or ICE). CEMC calculates Q-value functions as required, without computing and storing all Q-value functions. An extended policy pruning strategy is used in CEMC. Finally, we present empirical results demonstrating that CEMC outperforms the best heuristic algorithm with a compact Q-value presentation in term of runtime for the same horizon, and has less memory usage for larger problems.

 Artículos similares

       
 
David Rajaratnam, Torsten Schaub, Philipp Wanko, Kai Chen, Sirui Liu and Tran Cao Son    
A warehouse delivery problem consists of a set of robots that undertake delivery jobs within a warehouse. Items are moved around the warehouse in response to events. A solution to a warehouse delivery problem is a collision-free schedule of robot movemen... ver más
Revista: Algorithms

 
Okan Asik, Fatma Basak Aydemir and Hüseyin Levent Akin    
The number of agents exponentially increases the complexity of a cooperative multi-agent planning problem. Decoupled planning is one of the viable approaches to reduce this complexity. By integrating decoupled planning with Monte Carlo Tree Search, we pr... ver más
Revista: Applied Sciences

 
Shao Xuan Seah and Sutthiphong Srigrarom    
This paper explores the use of deep reinforcement learning in solving the multi-agent aircraft traffic planning (individual paths) and collision avoidance problem for a multiple UAS, such as that for a cargo drone network. Specifically, the Deep Q-Networ... ver más
Revista: Aerospace

 
Qing Yang, Bingyu Song, Yingguo Chen, Lei He and Pei Wang    
With the improvement of satellite autonomy, multi-satellite cooperative mission planning has become an important application. This requires multiple satellites to interact with each other via inter-satellite links to reach a consistent mission planning s... ver más
Revista: Algorithms

 
Bingyu Song, Yingwu Chen, Qing Yang, Yahui Zuo, Shilong Xu and Yuning Chen    
The multi-satellite on-board observation planning (MSOOP) is a variant of the multi-agent task allocation problem (MATAP). MSOOP is used to complete the observation task allocation in a fully cooperative mode to maximize the profits of the whole system. ... ver más
Revista: Algorithms