Abstract
Multispectral sensors mounted on unmanned aerial vehicles (UAVs) can assist in collecting morphological and physiological information from several crops. This approach, also known as high-throughput phenotyping, combined with data processing by machine learning (ML) algorithms, can provide fast, accurate, and large-scale discrimination of genotypes in the field, which is crucial for improving the efficiency of breeding programs. Despite their importance, studies aimed at accurately classifying sorghum hybrids using spectral variables as input sets in ML models are still scarce in the literature. Against this backdrop, this study aimed: (I) to discriminate sorghum hybrids based on canopy reflectance in different spectral bands (SB) and vegetation indices (VIs); (II) to evaluate the performance of ML algorithms in classifying sorghum hybrids; (III) to identify the best input dataset for the algorithms. A field experiment was carried out in the 2022 crop season in a randomized block design with three replications and six sorghum hybrids. At 60 days after crop emergence, a flight was carried out over the experimental area using a senseFly eBee real-time kinematic (RTK) UAV. The SB acquired by the sensor were: blue (475 nm, B_475), green (550 nm, G_550), red (660 nm, R_660), red edge (735 nm, RE_735), and near-infrared (NIR; 790 nm, NIR_790). VIs were calculated from the acquired SB. Data were submitted to ML classification analysis, in which three input settings (SB only, VIs only, and SB + VIs) and six algorithms were tested: artificial neural networks (ANN), support vector machine (SVM), J48 decision trees (J48), random forest (RF), REPTree (DT), and logistic regression (LR, a conventional technique used as a control). The hybrids differed in spectral signature, which made it possible to differentiate them using SB and VIs.
The ANN algorithm performed best for the three accuracy metrics tested, regardless of the input used. In this case, using SB alone is feasible because the data can be analyzed quickly and practically, with no need to calculate VIs. RF was more accurate when VIs were used as input. VIs alone provided the best performance for all the algorithms, and SB + VIs provided good performance for all the algorithms except RF. Processing the multispectral data with ML techniques made it possible to accurately differentiate the hybrids, most notably with ANN using only SB as inputs and RF using VIs as inputs (above 55% correct classification (CC), above 0.4 for kappa, and around 0.6 for F-score).
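As an illustration of the quantities named above, the following is a minimal sketch, not the authors' pipeline. It assumes NDVI and NDRE as example VIs (the abstract does not list which indices were used) and a macro-averaged F-score (the averaging scheme is not stated). All function names are hypothetical.

```python
from collections import Counter


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR_790 and R_660 reflectance."""
    return (nir - red) / (nir + red)


def ndre(nir, red_edge):
    """Normalized difference red edge index from NIR_790 and RE_735 reflectance."""
    return (nir - red_edge) / (nir + red_edge)


def correct_classification(y_true, y_pred):
    """Percentage of samples whose predicted hybrid matches the true hybrid (CC)."""
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * hits / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for agreement expected by chance.

    Undefined (division by zero) when chance agreement equals 1,
    e.g. when every label and every prediction is the same class.
    """
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1.0 - expected)


def macro_f_score(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Under these assumptions, each plot would contribute one feature vector (the five SB, the derived VIs, or both), and the three metrics would be computed on held-out predictions for each algorithm and input setting.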