Document type:
Report and critical book review
Title:
LINGO-DL: a text-based approach for molecular similarity searching
Author(s):
Pupin, Maude [Author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Abdo, Ammar [Author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Hodeidah University, Hodeidah, Yemen
Journal title:
Journal of Computer-Aided Molecular Design
Pages:
657-665
Publisher:
Springer Verlag
Publication date:
2021-05
ISSN:
0920-654X
HAL discipline(s):
Computer Science [cs]/Bioinformatics [q-bio.QM]
Chemistry/Cheminformatics
Abstract: [en]
Line notations of chemical structures are more compact than graphs and connection tables, so they are useful for storing and transferring large numbers of molecular structures. The simplified molecular input line entry system (SMILES) representation is the most extensively used, as it is much easier to use and comprehend than the alternatives, and it can be generated automatically from connection tables. A SMILES string represents and encodes the structure of a molecule. It has been used by an existing method, LINGO, to calculate molecular similarities and predict structure-related properties. The LINGO method decomposes a canonical SMILES into a set of four-character substrings referred to as LINGOs. The purpose of the LINGO method is to measure the similarity between a pair of molecules by comparing the LINGOs that occur in each molecule. This paper introduces an alternative version of the LINGO method, called LINGO-DL, which uses LINGOs of different lengths. LINGO-DL is based on the fragmentation of canonical SMILES into substrings of three different lengths rather than the single length used in the LINGO method. Retrospective virtual screening experiments with the MDDR, DUD, and MUV datasets show that LINGO-DL outperforms the LINGO method, especially when the active molecules being sought have a high degree of structural heterogeneity.
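The fragmentation and comparison steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made here for illustration only: the function names (`lingos`, `lingos_multi`, `tanimoto`), the use of a multiset Tanimoto coefficient as the similarity measure, the pooling of all substrings into one multiset, and the three lengths `(3, 4, 5)`, which the abstract does not specify. Any SMILES preprocessing the published methods may apply is omitted.

```python
from collections import Counter


def lingos(smiles: str, q: int = 4) -> Counter:
    """Count every overlapping substring of length q in a SMILES string."""
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def lingos_multi(smiles: str, lengths=(3, 4, 5)) -> Counter:
    """Pool substrings of several lengths into one multiset.

    The lengths and the pooling strategy are assumptions for this sketch;
    the abstract only says that three different lengths are used.
    """
    pooled = Counter()
    for q in lengths:
        pooled.update(lingos(smiles, q))
    return pooled


def tanimoto(a: Counter, b: Counter) -> float:
    """Multiset Tanimoto: shared counts divided by combined counts.

    This is an assumed similarity measure, chosen for simplicity.
    """
    shared = sum((a & b).values())  # element-wise minimum of counts
    total = sum((a | b).values())   # element-wise maximum of counts
    return shared / total if total else 0.0


if __name__ == "__main__":
    benzene = "c1ccccc1"
    toluene = "Cc1ccccc1"
    print(tanimoto(lingos(benzene), lingos(toluene)))
    print(tanimoto(lingos_multi(benzene), lingos_multi(toluene)))
```

Under these assumptions, shorter substrings give very short SMILES strings some fragments to compare even when they contain no four-character substring at all, while longer substrings reward larger shared patterns.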
Language:
English
Popular science:
No
Collections:
Source: