Abstract
To fully exploit the potential of edge devices, it is common to split a neural network into multiple pieces and distribute them among the available edge devices, which then perform inference cooperatively. Until now, the problem of partitioning a deep neural network (DNN) so as to achieve optimal distributed inference performance has not been adequately addressed. This paper proposes a novel layer-based DNN partitioning approach to obtain an optimal distributed deployment solution. To ensure the applicability of the resulting deployment scheme, this work formulates the partitioning problem as a constrained optimization problem and puts forward an improved genetic algorithm (GA) to solve it. Compared with the basic GA, the proposed algorithm achieves a running time approximately one to three times shorter while producing a better deployment.
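As a rough illustration of the formulation, the following minimal sketch treats layer-to-device assignment as a constrained optimization problem and searches it with a plain, basic GA. It is not the improved algorithm proposed in this paper. The layer costs, device parameters, bandwidth, penalty weight, and all function names are hypothetical values chosen only for the example, and the memory constraint is handled with a simple penalty term, which is an assumption rather than the paper's method.

```python
import random

# Hypothetical per-layer compute cost (MFLOPs), memory (MB), and output size (MB).
LAYER_FLOPS = [40, 80, 120, 60, 30, 10]
LAYER_MEM = [4, 8, 16, 8, 4, 2]
LAYER_OUT = [2.0, 1.0, 0.5, 0.5, 0.2, 0.1]
# Hypothetical devices: speed (MFLOPs per ms) and memory capacity (MB).
DEV_SPEED = [10.0, 20.0, 5.0]
DEV_MEM = [20, 24, 16]
BANDWIDTH = 1.0   # MB per ms between any two devices (assumed uniform)
PENALTY = 1000.0  # cost added per MB of memory overflow (assumed)


def latency(assign):
    """End-to-end latency in ms when layer i runs on device assign[i]."""
    total = 0.0
    for i, dev in enumerate(assign):
        total += LAYER_FLOPS[i] / DEV_SPEED[dev]
        # Pay a transfer cost when the next layer lives on another device.
        if i + 1 < len(assign) and assign[i + 1] != dev:
            total += LAYER_OUT[i] / BANDWIDTH
    return total


def memory_violation(assign):
    """Total MB by which the assignment exceeds device capacities."""
    used = [0] * len(DEV_MEM)
    for i, dev in enumerate(assign):
        used[dev] += LAYER_MEM[i]
    return sum(max(0, u - cap) for u, cap in zip(used, DEV_MEM))


def fitness(assign):
    """Penalized objective: lower is better."""
    return latency(assign) + PENALTY * memory_violation(assign)


def run_ga(pop_size=40, generations=60, mutation_rate=0.1, seed=0):
    """Basic GA: tournament selection, one-point crossover, elitism."""
    rng = random.Random(seed)
    n_layers, n_dev = len(LAYER_FLOPS), len(DEV_SPEED)
    pop = [[rng.randrange(n_dev) for _ in range(n_layers)]
           for _ in range(pop_size)]
    best = min(pop, key=fitness)
    for _ in range(generations):
        new_pop = [best[:]]  # keep the best individual unchanged
        while len(new_pop) < pop_size:
            p1 = min(rng.sample(pop, 3), key=fitness)
            p2 = min(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n_layers)
            child = p1[:cut] + p2[cut:]
            # Reassign each gene to a random device with small probability.
            child = [rng.randrange(n_dev) if rng.random() < mutation_rate
                     else gene for gene in child]
            new_pop.append(child)
        pop = new_pop
        best = min(pop, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    assignment, cost = run_ga()
    print("layer -> device:", assignment)
    print("penalized cost:", cost)
```

Calling `run_ga()` returns a device index per layer together with its penalized cost. A returned assignment is feasible only if `memory_violation` is zero for it, so that check should be made before treating the result as a deployment.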