Resumen
Opportunistic networks are highly stochastic networks supported by sporadic encounters between mobile devices. To route data efficiently, opportunistic-routing algorithms must capitalize on devices? movement and data transmission patterns. This work proposes a routing method based on reinforcement learning, specifically Q-learning. As usual in routing algorithms, the objective is to select the best candidate devices to put forward once an encounter occurs. However, there is also the possibility of not forwarding if we know that a better candidate might be encountered in the future. This decision is not usually considered in learning schemes because there is no obvious way to represent the temporal evolution of the network. We propose a novel, distributed, and online method that allows learning both the network?s connectivity and its temporal evolution with the help of a temporal graph. This algorithm allows learning to skip forwarding opportunities to capitalize on future encounters. We show that explicitly representing the action for deferring forwarding increases the algorithm?s performance. The algorithm?s scalability is discussed and shown to perform well in a network of considerable size.