Title of article :
O-GlcNAcylation Prediction: An Unattained Objective
Author/Authors :
Mauri, Theo Univ. Lille, CNRS; UMR8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, Lille, F-59000, France , Bouaouiche, Laurence Menu Normandy University - UNIROUEN, Laboratoire Glyco-MEV EA4358, Rouen, France , Bardor, Muriel Normandy University - UNIROUEN, Laboratoire Glyco-MEV EA4358, Rouen, France , Lefebvre, Tony Univ. Lille, CNRS; UMR8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, Lille, F-59000, France , Lensink, Marc F Univ. Lille, CNRS; UMR8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, Lille, F-59000, France , Brysbaert, Guillaume Univ. Lille, CNRS; UMR8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, Lille, F-59000, France
Pages :
16
From page :
1
To page :
16
Abstract :
Background: O-GlcNAcylation is an essential post-translational modification (PTM) in mammalian cells. It consists of the addition of an N-acetylglucosamine (GlcNAc) residue onto serines or threonines by an O-GlcNAc transferase (OGT). Inhibition of OGT is lethal, and misregulation of this PTM can lead to diverse pathologies including diabetes, Alzheimer’s disease and cancers. Knowing the location of O-GlcNAcylation sites and being able to predict them accurately are therefore of prime importance to a better understanding of this process and its related pathologies. Purpose: Here, we present an evaluation of the current predictors of O-GlcNAcylation sites based on a newly built dataset and an investigation to improve predictions. Methods: Several datasets of experimentally proven O-GlcNAcylated sites were combined, and the resulting meta-dataset was used to evaluate three prediction tools. We further defined a set of new features, following an analysis of the primary to tertiary structures of experimentally proven O-GlcNAcylated sites, in order to improve predictions using different types of machine learning techniques. Results: Our results show that currently available algorithms fail to predict O-GlcNAcylated sites with a precision exceeding 9%. Our efforts to improve the precision with new features using machine learning techniques succeed for equal proportions of O-GlcNAcylated and non-O-GlcNAcylated sites but fail, like the other tools, for real-life proportions where ~1.4% of S/T are O-GlcNAcylated. Conclusion: Present-day algorithms for O-GlcNAcylation prediction narrowly outperform random prediction. The inclusion of additional features, in combination with machine learning algorithms, does not enhance these predictions, emphasizing a pressing need for further development. We hypothesize that the improvement of prediction algorithms requires characterization of OGT’s partners.
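The gap the abstract reports between balanced and real-life proportions can be illustrated with Bayes' rule. The sketch below is not the authors' evaluation code: the function name and the 80% sensitivity and specificity figures are hypothetical, chosen only to show how precision depends on the fraction of positive sites. Only the ~1.4% prevalence comes from the abstract.

```python
def precision_at_prevalence(sensitivity, specificity, prevalence):
    """Expected precision (positive predictive value) via Bayes' rule."""
    true_pos = sensitivity * prevalence                    # P(predicted +, actual +)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # P(predicted +, actual -)
    return true_pos / (true_pos + false_pos)


# Hypothetical classifier quality; these values are NOT taken from the paper.
SENS, SPEC = 0.80, 0.80

# Balanced evaluation set: half of the S/T sites are O-GlcNAcylated.
balanced = precision_at_prevalence(SENS, SPEC, 0.5)

# Real-life proportion quoted in the abstract: ~1.4% of S/T sites.
real_life = precision_at_prevalence(SENS, SPEC, 0.014)

# A coin-flip predictor: its expected precision equals the prevalence.
random_baseline = precision_at_prevalence(0.5, 0.5, 0.014)

print(f"balanced set:    {balanced:.3f}")
print(f"real-life ratio: {real_life:.3f}")
print(f"random baseline: {random_baseline:.3f}")
```

Under these assumed figures, the same classifier reaches a precision of 0.8 on a balanced set but only about 5% at a 1.4% prevalence, while a random predictor sits at 1.4%. This is consistent in spirit with the abstract's observation that results which look good at equal proportions do not carry over to real-life proportions.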
Farsi abstract :
No Farsi abstract available
Keywords :
machine learning , glycosylation , O-GlcNAc , post-translational modification , dataset , OGT
Journal title :
Advances and Applications in Bioinformatics and Chemistry: AABC
Serial Year :
2021
Full Text URL :
Record number :
2625553