Abstract
Graphics processing units (GPUs) are many-core processors that outperform central processing units (CPUs) on data-parallel, throughput-oriented applications with high arithmetic intensity. They can therefore considerably reduce the execution time of an algorithm by performing a large number of calculations in parallel. On the other hand, improper use of the GPU may cause a significant loss of performance. This study examines the impact of GPU resource allocation on GPU performance. Our aim is to provide insights into parallelization strategies in CUDA and to propose strategies for utilizing GPU resources effectively. We investigate the parallelization of the 2-opt and 3-opt local search heuristics for solving the travelling salesman problem. We perform an extensive experimental study on instances of various sizes and attempt to determine the configuration that accelerates the computation the most. We also compare the performance of the GPU against that of the CPU. In addition, we revise the 3-opt parallelization strategy presented in the literature.