Document type:
Journal article: Original article
DOI:
PMID:
Permanent URL:
Title:
Distribution-Based Similarity Measures Applied to Laboratory Results Matching.
Author(s):
Courtois, M. [Author]
Filiot, A. [Author]
Ficheur, Gregoire [Author]
METRICS : Evaluation des technologies de santé et des pratiques médicales - ULR 2694
Journal title:
Studies in Health Technology and Informatics
Journal short title:
Stud Health Technol Inform
Number:
287
Pages:
94-98
Publication date:
2021
ISSN:
1879-8365
HAL discipline(s):
Life Sciences [q-bio]
Abstract in English: [en]
The use of international laboratory terminologies inside hospital information systems is required to conduct data reuse analyses through inter-hospital databases. While most terminology matching techniques performing semantic interoperability are language-based, another strategy is to use distribution matching that performs terms matching based on the statistical similarity. In this work, our objective is to design and assess a structured framework to perform distribution matching on concepts described by continuous variables. We propose a framework that combines distribution matching and machine learning techniques. Using a training sample consisting of correct and incorrect correspondences between different terminologies, a match probability score is built. For each term, best candidates are returned and sorted in decreasing order using the probability given by the model. Searching 101 terms from Lille University Hospital among the same list of concepts in MIMIC-III, the model returned the correct match in the top 5 candidates for 96 of them (95%). Using this open-source framework with a top-k suggestions system could make the expert validation of terminologies alignment easier.
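Illustration (not part of the deposited record): the top-k suggestion idea in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' framework. It ranks candidates by a single distribution distance, the two-sample Kolmogorov-Smirnov statistic, whereas the abstract describes a trained model that outputs a match probability. The function names, the term names, and the synthetic values are all hypothetical.

```python
import random
from bisect import bisect_right


def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the
    empirical CDFs of a and b (0 = identical samples, 1 = disjoint ranges)."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    # The gap between two step functions can only change at a sample point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def top_k_candidates(source_values, target_terms, k=5):
    """Return the k target terms whose values are closest in distribution
    to source_values, as (name, distance) pairs, best candidate first."""
    scored = [
        (name, ks_distance(source_values, values))
        for name, values in target_terms.items()
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Synthetic, made-up laboratory values (assumption: no real data is used).
rng = random.Random(0)
target_terms = {
    "sodium": [rng.gauss(140, 3) for _ in range(500)],
    "potassium": [rng.gauss(4.2, 0.5) for _ in range(500)],
    "glucose": [rng.gauss(5.5, 1.5) for _ in range(500)],
}
source_values = [rng.gauss(4.1, 0.5) for _ in range(500)]

ranking = top_k_candidates(source_values, target_terms, k=2)
for name, distance in ranking:
    print(f"{name}: KS distance {distance:.3f}")
```

In a setting closer to the one the abstract describes, several such distances would presumably serve as input features to a classifier trained on correct and incorrect correspondences, and candidates would be sorted by predicted match probability rather than by raw distance.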
Language:
English
Audience:
International
Popular science:
No
Institution(s):
Université de Lille
CHU Lille
Deposit date:
2023-11-15T05:31:57Z
2024-01-11T12:30:50Z
Files
- SHTI-287-SHTI210823.pdf
- Not specified
- Open access