Applied Sciences, Vol. 12, Issue 22 (2022)
Title

Hybrid Inductive Model of Differentially and Co-Expressed Gene Expression Profile Extraction Based on the Joint Use of Clustering Technique and Convolutional Neural Network

Sergii Babichev    
Lyudmyla Yasinska-Damri    
Igor Liakh and Jiří Škvor

Abstract

The development of hybrid models that process gene expression data to select differentially expressed and mutually correlated genes is one of the current directions in modern bioinformatics. Solving this problem can, on the one hand, improve the effectiveness of existing systems that diagnose complex diseases from gene expression data and, on the other hand, increase the efficiency of gene regulatory network reconstruction through a more careful, disease-specific selection of genes. In this research, we propose a stepwise procedure for forming subsets of mutually correlated and differentially expressed gene expression profiles (GEPs). First, we select informative GEPs according to statistical and entropy criteria, which are combined using the Harrington desirability function. Then, we perform cluster analysis using the self-organizing tree algorithm (SOTA) and the spectral clustering algorithm, both implemented within the framework of objective clustering inductive technology. The result of this step is a set of clusters containing co-expressed and differentially expressed GEPs. The model was validated using a one-dimensional two-layer convolutional neural network (CNN). The simulation results show that the proposed model is highly efficient: the clusters of GEPs formed on the basis of the clustering quality criteria allowed us to identify the investigated objects with high accuracy. Moreover, the simulation results show that the hybrid inductive model based on the spectral clustering algorithm is more effective than the one based on the SOTA clustering algorithm, in terms of both the complexity of the resulting optimal cluster structure and the classification accuracy of the objects whose attributes are the selected gene expression data.
The proposed hybrid inductive model increases objectivity in forming subsets of differentially expressed and co-expressed gene expression profiles for their further application in various disease diagnosis systems and in gene regulatory network reconstruction.
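
The first step of the procedure, combining several criteria into a single informativeness score with the Harrington desirability function, can be illustrated with a minimal sketch. The abstract does not specify the criteria or their scaling, so everything here beyond the standard one-sided form d = exp(-exp(-y)) and the geometric-mean aggregation is an assumption: the function names, the linear mapping of raw values onto the dimensionless range [-2, 5], and the toy criterion values (a variance-like criterion where higher is better and an entropy-like criterion where lower is better) are hypothetical.

```python
import math


def harrington_desirability(y):
    """One-sided Harrington desirability: maps a dimensionless score y to (0, 1)."""
    return math.exp(-math.exp(-y))


def linear_scale(value, worst, best, y_worst=-2.0, y_best=5.0):
    """Map a raw criterion value linearly onto the dimensionless y scale.

    `worst` and `best` are the raw values treated as least and most desirable;
    `worst` may be larger than `best` for criteria that should be minimized.
    The default y range is an assumption, not taken from the paper.
    """
    y = y_worst + (value - worst) * (y_best - y_worst) / (best - worst)
    # Clamp so the desirability never underflows to zero.
    return max(y_worst, min(y_best, y))


def generalized_desirability(criteria):
    """Geometric mean of the partial desirabilities of (value, worst, best) triples."""
    partials = [
        harrington_desirability(linear_scale(value, worst, best))
        for value, worst, best in criteria
    ]
    log_mean = sum(math.log(d) for d in partials) / len(partials)
    return math.exp(log_mean)


# Toy, illustrative values only: (variance-like criterion, entropy-like criterion).
informative_profile = [(4.0, 0.0, 5.0), (1.0, 3.0, 0.0)]
noisy_profile = [(0.5, 0.0, 5.0), (2.8, 3.0, 0.0)]

d_informative = generalized_desirability(informative_profile)
d_noisy = generalized_desirability(noisy_profile)
print(d_informative > d_noisy)
```

In a sketch like this, profiles whose generalized desirability falls below a chosen threshold would be discarded before clustering; the threshold itself would have to be taken from the full paper.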
