Abstract
The rapid growth of traffic demand has become a major challenge for urban traffic development, and continuous optimization of signal control systems is an important way to relieve traffic pressure in cities. In recent years, with the impressive progress of deep reinforcement learning (DRL), DRL approaches have started to be applied to traffic signal control. Unlike traditional signal control methods, agents trained with DRL receive feedback from the environment and use it to continuously improve their policies. However, because current research in this field focuses mainly on agent performance, data efficiency during training is largely overlooked, even though trial and error is very costly in traffic signal control tasks. In this paper, we propose a DRL approach built on a traffic inference model. The inference model uses future information provided by upstream intersections, together with data collected from the environment, to continuously learn the changing patterns of the traffic environment and infer how it will evolve. In the proposed algorithm, the agent interacts with the inference model instead of the real environment. Through comprehensive experiments on realistic datasets, we demonstrate that the proposed algorithm outperforms existing methods in both data efficiency and control performance.
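As a rough illustration of the training scheme sketched above, the following is a minimal Python sketch of a Dyna-style loop in which a learned traffic inference model substitutes for the real environment during most agent updates. It is not the paper's implementation: the class and method names (TrafficInferenceModel, agent.act/update, env.step), the linear dynamics model, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

class TrafficInferenceModel:
    """Stand-in dynamics model (assumed, not the paper's): a least-squares
    linear map from [state, upstream_info, action] to [next_state, reward]."""
    def __init__(self):
        self.W = None

    def fit(self, inputs, targets):
        # inputs: (n, d_in); targets: (n, d_state + 1) = next state + reward
        self.W, *_ = np.linalg.lstsq(inputs, targets, rcond=None)

    def predict(self, x):
        y = x @ self.W
        return y[:-1], float(y[-1])   # predicted next state, predicted reward

def dyna_style_training(agent, env, model, real_steps=200, model_rollouts=10):
    """Alternate scarce real-environment steps with cheap model rollouts,
    so the agent improves mostly against the inference model."""
    data_in, data_out = [], []
    state, upstream = env.reset()
    for _ in range(real_steps):
        action = agent.act(state)
        next_state, next_upstream, reward, done = env.step(action)
        # Log the real transition, including upstream-intersection features,
        # and refit the inference model on everything observed so far.
        data_in.append(np.concatenate([state, upstream, [action]]))
        data_out.append(np.concatenate([next_state, [reward]]))
        model.fit(np.array(data_in), np.array(data_out))
        # Policy updates come from inferred transitions, which is where the
        # data-efficiency gain over model-free training would come from.
        s, u = state, upstream
        for _ in range(model_rollouts):
            a = agent.act(s)
            s_next, r = model.predict(np.concatenate([s, u, [a]]))
            agent.update(s, a, r, s_next)
            s = s_next
        state, upstream = (env.reset() if done else (next_state, next_upstream))
```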