JOURNAL
Algorithms

ARTICLE

TITLE

Learning State-Specific Action Masks for Reinforcement Learning

Ziyi Wang, Xinran Li, Luoyang Sun, Haifeng Zhang, Hualin Liu and Jun Wang

Abstract

Efficient yet sufficient exploration remains a critical challenge in reinforcement learning (RL), especially for Markov Decision Processes (MDPs) with vast action spaces. Previous approaches have commonly involved projecting the original action space into a latent space or employing environmental action masks to reduce the action possibilities. Nevertheless, these methods often lack interpretability or rely on expert knowledge. In this study, we introduce a novel method for automatically reducing the action space in environments with discrete action spaces while preserving interpretability. The proposed approach learns state-specific masks with a dual purpose: (1) eliminating actions with minimal influence on the MDP and (2) aggregating actions with identical behavioral consequences within the MDP. Specifically, we introduce a novel concept called Bisimulation Metrics on Actions by States (BMAS) to quantify the behavioral consequences of actions within the MDP and design a dedicated mask model to ensure their binary nature. Crucially, we present a practical learning procedure for training the mask model, leveraging transition data collected by any RL policy. Our method is designed to be plug-and-play and adaptable to all RL policies, and to validate its effectiveness, an integration into two prominent RL algorithms, DQN and PPO, is performed. Experimental results obtained from Maze, Atari, and μRTS2 reveal a substantial acceleration in the RL learning process and noteworthy performance improvements facilitated by the introduced approach.
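
As a rough illustration of how a binary, state-specific action mask can plug into value-based and policy-gradient agents such as DQN and PPO, the sketch below applies a given mask at action-selection time. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `masked_greedy_action` and `masked_softmax` are hypothetical, and the mask is supplied by hand here, whereas the paper learns it from transition data using BMAS.

```python
import math


def masked_greedy_action(q_values, mask):
    """Pick the highest-valued action among those the mask keeps (DQN-style).

    q_values: list of floats, one per discrete action.
    mask: list of 0/1 flags for the current state (1 = keep, 0 = masked out).
    Both names are illustrative assumptions, not taken from the paper.
    """
    if len(q_values) != len(mask):
        raise ValueError("q_values and mask must have the same length")
    allowed = [i for i, keep in enumerate(mask) if keep]
    if not allowed:
        raise ValueError("mask removes every action")
    return max(allowed, key=lambda i: q_values[i])


def masked_softmax(logits, mask):
    """Softmax restricted to kept actions (PPO-style policy head).

    Masked actions receive probability exactly 0, so they are never sampled.
    """
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    kept = [x for x, keep in zip(logits, mask) if keep]
    if not kept:
        raise ValueError("mask removes every action")
    shift = max(kept)  # subtract the max kept logit for numerical stability
    exps = [math.exp(x - shift) if keep else 0.0 for x, keep in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# Action 1 has the largest value but is masked out in this state.
best = masked_greedy_action([1.0, 5.0, 3.0, 2.0], [1, 0, 1, 1])
probs = masked_softmax([0.0, 0.0, 0.0, 0.0], [1, 0, 1, 0])
```

Because the mask is applied only where actions are chosen, the underlying Q-network or policy network needs no structural change, which is one plausible reading of the "plug-and-play" property described in the abstract.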

Keywords

reinforcement learning - exploration efficiency - space reduction

ISSUE

Volume: 17 Issue: 2 (2024)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

SIMILAR JOURNALS

Algorithms
Applied Sciences
Information

DOI

https://doi.org/10.3390/a17020060

Similar articles

Spatio-Temporal Behavior Detection in Field Manual Labor Based on Improved SlowFast Architecture

Mingxin Zou, Yanqing Zhou, Xinhua Jiang, Julin Gao, Xiaofang Yu and Xuelei Ma

Field manual labor behavior recognition is an important task that applies deep learning algorithms to industrial equipment for capturing and analyzing people's behavior during field labor. In this study, we propose a field manual labor behavior recogniti...

Journal: Applied Sciences

Vision-Based Hand Rotation Recognition Technique with Ground-Truth Dataset

Hui-Jun Kim, Jung-Soon Kim and Sung-Hee Kim

The existing question-and-answer screening test has a limitation in that test accuracy varies due to a high learning effect and based on the inspector's competency, which can have consequences for rapid-onset cognitive-related diseases. To solve this pro...

Journal: Applied Sciences

IoT-Assisted Automatic Driver Drowsiness Detection through Facial Movement Analysis Using Deep Learning and a U-Net-Based Architecture

Shiplu Das, Sanjoy Pratihar, Buddhadeb Pradhan, Rutvij H. Jhaveri and Francesco Benedetto

The main purpose of a detection system is to ascertain the state of an individual's eyes, whether they are open and alert or closed, and then alert them to their level of fatigue. As a result of this, they will refrain from approaching an accident site. ...

Journal: Information

Lunar Rover Collaborated Path Planning with Artificial Potential Field-Based Heuristic on Deep Reinforcement Learning

Siyao Lu, Rui Xu, Zhaoyu Li, Bang Wang and Zhijun Zhao

The International Lunar Research Station, to be established around 2030, will equip lunar rovers with robotic arms as constructors. Construction requires lunar soil and lunar rovers, for which rovers must go toward different waypoints without encounterin...

Journal: Aerospace

Aircraft Upset Recovery Strategy and Pilot Assistance System Based on Reinforcement Learning

Jin Wang, Peng Zhao, Zhe Zhang, Ting Yue, Hailiang Liu and Lixin Wang

The upset state is an unexpected flight state, which is characterized by an unintentional deviation from normal operating parameters. It is difficult for the pilot to recover the aircraft from the upset state accurately and quickly. In this paper, an ups...

Journal: Aerospace
