Document type :
Scientific journal article: Original article
DOI :
PMID :
Permalink :
Title :
Distribution-Based Similarity Measures Applied to Laboratory Results Matching.
Author(s) :
Courtois, M. [Author]
Filiot, A. [Author]
Ficheur, Gregoire [Author]
METRICS : Evaluation des technologies de santé et des pratiques médicales - ULR 2694
Journal title :
Studies in Health Technology and Informatics
Abbreviated title :
Stud Health Technol Inform
Volume number :
287
Pages :
94-98
Publication date :
2021
ISSN :
1879-8365
HAL domain(s) :
Sciences du Vivant [q-bio]
English abstract : [en]
The use of international laboratory terminologies inside hospital information systems is required to conduct data reuse analyses through inter-hospital databases. While most terminology matching techniques that support semantic interoperability are language-based, another strategy is distribution matching, which matches terms based on their statistical similarity. In this work, our objective is to design and assess a structured framework to perform distribution matching on concepts described by continuous variables. We propose a framework that combines distribution matching and machine learning techniques. Using a training sample consisting of correct and incorrect correspondences between different terminologies, a match probability score is built. For each term, the best candidates are returned and sorted in decreasing order of the probability given by the model. Searching 101 terms from Lille University Hospital among the same list of concepts in MIMIC-III, the model returned the correct match in the top 5 candidates for 96 of them (95%). Using this open-source framework with a top-k suggestions system could make the expert validation of terminology alignment easier.
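Illustrative sketch (not part of the record) :
The abstract does not say which similarity measures or which model the authors use, so the following is only a minimal sketch of the general idea under stated assumptions: a two-sample Kolmogorov-Smirnov statistic and a scaled quantile distance stand in for the distribution-based features, a small gradient-descent logistic regression stands in for the machine learning step, and all analyte names and values are synthetic. Every function name here is hypothetical.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

def quantile_distance(a, b, n_quantiles=99):
    """Mean absolute difference between matched quantiles of the two samples."""
    q = np.linspace(0.01, 0.99, n_quantiles)
    return float(np.mean(np.abs(np.quantile(a, q) - np.quantile(b, q))))

def features(a, b):
    # Assumption: two hand-picked similarity features; the paper may use others.
    scale = np.std(np.concatenate([a, b])) + 1e-12  # makes the distance unit-free
    return np.array([ks_statistic(a, b), quantile_distance(a, b) / scale])

def fit_logistic(X, y, lr=0.5, n_iter=2000):
    """Plain gradient-descent logistic regression; the bias is the last weight."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def match_probability(w, a, b):
    z = np.append(features(a, b), 1.0) @ w
    return float(1.0 / (1.0 + np.exp(-z)))

def top_k(w, query, candidates, k=5):
    """Return the k candidate terms with the highest match probability."""
    scored = [(name, match_probability(w, query, values))
              for name, values in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]

# Synthetic (mean, sd) pairs, for illustration only; not real reference ranges.
params = {"sodium": (140, 3), "potassium": (4.2, 0.4), "hemoglobin": (13.5, 1.2),
          "platelets": (250, 50), "chloride": (102, 3), "calcium": (2.4, 0.1)}
rng = np.random.default_rng(0)
source_a = {name: rng.normal(m, s, 500) for name, (m, s) in params.items()}
source_b = {name: rng.normal(m, s, 500) for name, (m, s) in params.items()}

# Training sample: correct (label 1) and incorrect (label 0) correspondences.
train_terms = ["sodium", "potassium", "hemoglobin", "platelets"]
X = np.array([features(source_a[i], source_b[j])
              for i in train_terms for j in train_terms])
y = np.array([1.0 if i == j else 0.0 for i in train_terms for j in train_terms])
w = fit_logistic(X, y)

# Query a term that was held out of training and list the best 3 candidates.
ranking = top_k(w, source_a["chloride"], source_b, k=3)
for name, prob in ranking:
    print(f"{name}: {prob:.3f}")
```

The ranked list is what an expert would review: the model only proposes candidates, and the final alignment decision stays with the reviewer.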
Language :
English
Audience :
International
Popular science :
No
Administrative institution(s) :
Université de Lille
CHU Lille
Submission date :
2023-11-15T05:31:57Z
2024-01-11T12:30:50Z
Files
- SHTI-287-SHTI210823.pdf
- Not specified
- Open access
Except where otherwise noted, this item's license is described as Attribution-NonCommercial 3.0 United States