REVISTA
Applied Sciences

TODAS

Inicio / Applied Sciences / Vol: 9 Par: 7 (2019) / Artículo

ARTÍCULO

TITULO

Multi-Agent Planning under Uncertainty with Monte Carlo Q-Value Function

Jian Zhang

Yaozong Pan

Ruili Wang

Yuqiang Fang and Haitao Yang

Resumen

Decentralized partially observable Markov decision processes (Dec-POMDPs) are general multi-agent models for planning under uncertainty, but are intractable to solve. Doubly exponential growth of the search space as the horizon increases makes a brute-force search impossible. Heuristic methods can guide the search towards the right direction quickly and have been successful in different domains. In this paper, we propose a new Q-value function representation?Monte Carlo Q-value function QMC Q MC , which is proved to be an upper bound of the optimal Q-value function Q* Q * . We introduce two Monte Carlo tree search enhancements?heavy playout for a simulation policy and adaptive samples?to speed up computation of QMC Q MC . Then, we present a clustering and expansion with Monte-Carlo algorithm (CEMC)?an offline planning algorithm using QMC Q MC as Q-value function, which is based on the generalized multi-agent A* with incremental clustering and expansion (GMAA*-ICE or ICE). CEMC calculates Q-value functions as required, without computing and storing all Q-value functions. An extended policy pruning strategy is used in CEMC. Finally, we present empirical results demonstrating that CEMC outperforms the best heuristic algorithm with a compact Q-value presentation in term of runtime for the same horizon, and has less memory usage for larger problems.

Palabras claves

multi-agent - uncertainty - Monte Carlo - Q-value function - Dec-POMDP

Acceso

PÁGINAS

pp. 0 - 0

NÚMERO

Volumen: 9 Parte: 7 (2019)

MATERIAS

INGENIERÍA Y CONSTRUCCIÓN CIVIL
TECNOLOGÍA

REVISTAS SIMILARES

Information
Aerospace
Journal of Marine Science and Engineering

DOI

https://doi.org/10.3390/app9071430

Artículos similares

Solving an Industrial-Scale Warehouse Delivery Problem with Answer Set Programming Modulo Difference Constraints

Acceso

David Rajaratnam, Torsten Schaub, Philipp Wanko, Kai Chen, Sirui Liu and Tran Cao Son

A warehouse delivery problem consists of a set of robots that undertake delivery jobs within a warehouse. Items are moved around the warehouse in response to events. A solution to a warehouse delivery problem is a collision-free schedule of robot movemen... ver más

Revista: Algorithms

Decoupled Monte Carlo Tree Search for Cooperative Multi-Agent Planning

Acceso

Okan Asik, Fatma Basak Aydemir and Hüseyin Levent Akin

The number of agents exponentially increases the complexity of a cooperative multi-agent planning problem. Decoupled planning is one of the viable approaches to reduce this complexity. By integrating decoupled planning with Monte Carlo Tree Search, we pr... ver más

Revista: Applied Sciences

Multiple UAS Traffic Planning Based on Deep Q-Network with Hindsight Experience Replay and Economic Considerations

Acceso

Shao Xuan Seah and Sutthiphong Srigrarom

This paper explores the use of deep reinforcement learning in solving the multi-agent aircraft traffic planning (individual paths) and collision avoidance problem for a multiple UAS, such as that for a cargo drone network. Specifically, the Deep Q-Networ... ver más

Revista: Aerospace

A Distributed Autonomous Mission Planning Method for the Low-Orbit Imaging Constellation

Acceso

Qing Yang, Bingyu Song, Yingguo Chen, Lei He and Pei Wang

With the improvement of satellite autonomy, multi-satellite cooperative mission planning has become an important application. This requires multiple satellites to interact with each other via inter-satellite links to reach a consistent mission planning s... ver más

Revista: Algorithms

On-Board Decentralized Observation Planning for LEO Satellite Constellations

Acceso

Bingyu Song, Yingwu Chen, Qing Yang, Yahui Zuo, Shilong Xu and Yuning Chen

The multi-satellite on-board observation planning (MSOOP) is a variant of the multi-agent task allocation problem (MATAP). MSOOP is used to complete the observation task allocation in a fully cooperative mode to maximize the profits of the whole system. ... ver más

Revista: Algorithms

Revistas destacadas

Acceso directo a los números publicados en la revista Infrastructures

Infrastructures

Acceso directo a los números publicados en la revista Informed Infraestructure

Informed Infraestructure

Acceso directo a los números publicados en la revista BiT

Acceso directo a los números publicados en la revista Revista de la Construcción

Revista de la Construcción

Ver todas las revistas