Capturing Protein Domain Structure and Function Using Self-Supervision on Domain Architectures

Damianos P. Melidis and Wolfgang Nejdl

Abstract

Protein sequence embeddings have been shown to improve the prediction of biological properties of unseen proteins. However, biological metadata do not exist for each individual amino acid, so the quality of each learned embedding vector cannot be measured separately. Consequently, current sequence embeddings cannot be intrinsically evaluated, in a quantitative manner, on how much biological information they capture. Our approach, dom2vec, addresses this drawback by learning vector representations for protein domains rather than for individual amino acids, because biological metadata do exist for each domain. To perform a reliable quantitative intrinsic evaluation in terms of biological knowledge, we selected the metadata related to the most distinctive biological characteristics of a domain: its structure, enzymatic function, and molecular function. Notably, dom2vec obtains an adequate level of performance in the intrinsic assessment; we can therefore draw an analogy between the local linguistic features of natural languages and the domain structure and function information in domain architectures. Moreover, we demonstrate the applicability of dom2vec to protein prediction tasks by comparing it with state-of-the-art sequence embeddings in three downstream tasks. We show that dom2vec outperforms sequence embeddings for toxin and enzymatic function prediction and is comparable with sequence embeddings in cellular location prediction.
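
The analogy drawn above, with a domain architecture read as a sentence and each domain as a word, can be illustrated with a minimal sketch. It assumes a skip-gram style context window, which is one common way to train word embeddings; the abstract itself does not specify the training objective. The domain identifiers, the `skipgram_pairs` helper, and the window size are hypothetical and chosen only for illustration.

```python
from collections import Counter


def skipgram_pairs(architectures, window=2):
    """Collect (target, context) domain pairs from each domain architecture.

    Each architecture is an ordered list of domain identifiers, treated
    like a sentence of words (an assumption made for this sketch).
    """
    pairs = []
    for arch in architectures:
        for i, target in enumerate(arch):
            lo = max(0, i - window)
            hi = min(len(arch), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # a domain is not its own context
                    pairs.append((target, arch[j]))
    return pairs


# Hypothetical architectures; real ones would come from a domain
# annotation resource, which is not shown here.
architectures = [
    ["DOM_A", "DOM_B", "DOM_C"],
    ["DOM_B", "DOM_C"],
]

pairs = skipgram_pairs(architectures, window=1)
cooccurrence = Counter(pairs)
print(cooccurrence.most_common(2))
```

Pairs such as these are what an embedding model would consume: domains that repeatedly share a context window are pushed toward nearby vectors, which is why per-domain metadata can then be used to check what the vectors have captured.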

Keywords

protein domain architectures - word embeddings - quantitative quality assessment - SCOPe secondary structure class - enzymatic commission class

ISSUE

Volume: 14 Issue: 1 (2021)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

DOI

https://doi.org/10.3390/a14010028
