Resumen
Order scheduling is of a great significance in the internet and communication industries. With the rapid development of the communication industry and the increasing variety of user demands, the number of work orders for communication operators has grown exponentially. Most of the research that tries to solve the order scheduling problem has focused on improving assignment rules based on real-time performance. However, these traditional methods face challenges such as poor real-time performance, high human resource consumption, and low efficiency. Therefore, it is crucial to solve multi-objective problems in order to obtain a robust order scheduling policy to meet the multiple requirements of order scheduling in real problems. The priority dispatching rule (PDR) is a heuristic method that is widely used in real-world scheduling systems In this paper, we propose an approach to automatically optimize the Priority Dispatching Rule (PDR) using a deep multiple-objective reinforcement learning agent and to optimize the weighted vector with a convex hull to obtain the most objective and efficient weights. The convex hull method is employed to calculate the maximal linearly scalarized value, enabling us to determine the optimal weight vector objectively and achieve a balanced optimization of each objective rather than relying on subjective weight settings based on personal experience. Experimental results on multiple datasets demonstrate that our proposed algorithm achieves competitive performance compared to existing state-of-the-art order scheduling algorithms.