Abstract
Leaked IoT botnet source code has facilitated the proliferation of IoT botnet variants, some of which are equipped with new capabilities and may be difficult to detect. Although solutions for the automated analysis of IoT botnet samples are available, identifying new variants remains challenging because malware analysts must interpret the analysis results manually. To overcome this challenge, we propose an approach for automated behaviour-based clustering of IoT botnet samples that aims to enable automatic identification of IoT botnet variants equipped with new capabilities. In the proposed approach, the behaviour of each sample is captured in a sandbox and represented as a behaviour profile describing the actions the sample performed. The behaviour profiles are vectorised using TF-IDF and clustered using the DBSCAN algorithm. The approach was evaluated on a collection of samples captured from IoT botnets propagating on the Internet. The evaluation shows that the approach enables accurate automatic identification of IoT botnet variants equipped with new capabilities, which will help security researchers investigate those capabilities and apply their findings to improve solutions for detecting and preventing IoT botnet infections.
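The vectorise-then-cluster step can be illustrated with a minimal, dependency-free sketch. It is not the implementation evaluated in this work: the action tokens, the smoothed IDF formula, the cosine distance, and the values of `eps` and `min_samples` are all assumptions chosen for illustration, and the six behaviour profiles are invented toy data.

```python
import math
from collections import Counter


def tfidf_vectors(profiles):
    """Turn lists of action tokens into L2-normalised TF-IDF vectors (dicts)."""
    n = len(profiles)
    # Document frequency: number of profiles in which each token appears.
    df = Counter(tok for p in profiles for tok in set(p))
    vectors = []
    for p in profiles:
        tf = Counter(p)
        # Smoothed IDF (an assumed variant): ln((1 + n) / (1 + df)) + 1.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine_distance(a, b):
    """Cosine distance between two unit-length sparse vectors."""
    return 1.0 - sum(w * b.get(t, 0.0) for t, w in a.items())


def dbscan(vectors, eps, min_samples):
    """Return one cluster label per vector; -1 marks noise."""
    n = len(vectors)
    # Each point counts as its own neighbour.
    neighbours = [
        [j for j in range(n) if cosine_distance(vectors[i], vectors[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # j is a core point: keep expanding
    return labels


# Hypothetical behaviour profiles: one list of observed actions per sample.
profiles = [
    ["scan_telnet", "bruteforce_login", "connect_c2", "udp_flood"],
    ["scan_telnet", "bruteforce_login", "connect_c2", "syn_flood"],
    ["scan_telnet", "bruteforce_login", "connect_c2", "udp_flood", "syn_flood"],
    ["exploit_http", "download_payload", "kill_rivals", "connect_c2"],
    ["exploit_http", "download_payload", "kill_rivals", "connect_c2", "mine_crypto"],
    ["wipe_storage", "rewrite_firewall"],
]

labels = dbscan(tfidf_vectors(profiles), eps=0.5, min_samples=2)
print(labels)
```

On this toy data the first three profiles fall into one cluster, the next two into a second, and the last profile is labelled as noise because it shares no actions with any other sample. Because DBSCAN does not require the number of clusters to be fixed in advance, samples whose behaviour differs from every existing group are not forced into one.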