Summary: Topology determination is one of the most important intermediate steps towards building the atomic structure of a protein from its medium-resolution cryo-electron microscopy (cryo-EM) map. The main goal of topology determination is to identify the correct matches (i.e. assignment and direction) between the secondary structure elements (α-helices and β-sheets) detected in the protein sequence and those detected in the cryo-EM density map. Despite many recent advances in molecular biology technologies, the problem remains challenging. To address it, this article proposes a Linear Programming-based Topology Determination method (LPTD) that solves the secondary structure topology problem in three-dimensional geometrical space. By modeling the protein's sequence using highly reliable extracted features and a distance-based scoring function, the secondary structure matching problem is transformed into a complete weighted bipartite graph matching problem. An algorithm based on linear programming is then developed as a decision-making strategy to extract the true (native) topology from among all possible topologies. The proposed automatic framework is verified using 12 experimental and 15 simulated α-β proteins. Results demonstrate that LPTD is highly efficient and fast: for 77% of cases in the data set, the native topology is detected as the first-ranked topology in less than 2 seconds. Moreover, the method successfully handles large, complex proteins with as many as 65 secondary structure elements, a number that has not been solved with current tools/methods.
Availability: The LPTD package (source code and data) is publicly available at https://github.com/B-Behkamal/LPTD. Two test samples and instructions for using the graphical user interface (GUI) are provided in the shared readme file.
Supplementary information: Supplementary data will be available at Bioinformatics online.
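
As a rough illustration of the bipartite matching formulation described in the summary, the sketch below ranks all assignments of sequence elements to density elements by total cost. It is not the LPTD implementation: the function name `rank_topologies` and the cost values are hypothetical, the direction of each match is omitted, and exhaustive enumeration stands in for the linear-programming solver, which is only practical for a toy instance of this size.

```python
from itertools import permutations


def rank_topologies(cost):
    """Rank every one-to-one assignment by total cost (lowest first).

    cost[i][j] is a hypothetical distance-based score for matching
    sequence element i to density-map element j. Brute force is used
    here only for illustration; it replaces the LP step of the paper.
    """
    n_seq = len(cost)
    n_map = len(cost[0])
    ranked = []
    # Each permutation assigns sequence element i to map element perm[i].
    for perm in permutations(range(n_map), n_seq):
        total = sum(cost[i][perm[i]] for i in range(n_seq))
        ranked.append((total, perm))
    ranked.sort()
    return ranked


# Made-up 3 x 3 cost matrix (rows: sequence elements, columns: map elements).
cost = [
    [1.0, 4.0, 6.0],
    [5.0, 1.5, 4.5],
    [6.5, 5.0, 2.0],
]

ranked = rank_topologies(cost)
best_cost, best_assignment = ranked[0]
print("first-ranked assignment:", best_assignment, "cost:", best_cost)
```

The number of candidate assignments grows factorially with the number of secondary structure elements, which is why an optimization-based formulation is needed for proteins of realistic size.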

LPTD: a novel linear programming-based topology determination method for cryo-EM maps / Behkamal, Bahareh; Naghibzadeh, Mahmoud; Pagnani, Andrea; Reza Saberi, Mohammad; Al Nasr, Kamal. - In: BIOINFORMATICS. - ISSN 1367-4803. - ELECTRONIC. - (2022). [10.1093/bioinformatics/btac170]

LPTD: a novel linear programming-based topology determination method for cryo-EM maps

Andrea Pagnani
2022


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2962036