Interactive Inference of Join Queries
Type de document :
Communication dans un congrès avec actes
Titre :
Interactive Inference of Join Queries
Auteur(s) :
Bonifati, Angela [Auteur]
Laboratoire d'Informatique Fondamentale de Lille [LIFL]
Linking Dynamic Data [LINKS]
Ciucanu, Radu [Auteur correspondant]
Laboratoire d'Informatique Fondamentale de Lille [LIFL]
Linking Dynamic Data [LINKS]
Staworko, Slawomir [Auteur]
Laboratoire d'Informatique Fondamentale de Lille [LIFL]
Linking Dynamic Data [LINKS]
Laboratoire d'Informatique Fondamentale de Lille [LIFL]
Linking Dynamic Data [LINKS]
Ciucanu, Radu [Auteur correspondant]
Laboratoire d'Informatique Fondamentale de Lille [LIFL]
Linking Dynamic Data [LINKS]
Staworko, Slawomir [Auteur]
Laboratoire d'Informatique Fondamentale de Lille [LIFL]
Linking Dynamic Data [LINKS]
Titre de la manifestation scientifique :
Gestion de Données - Principes, Technologies et Applications (BDA)
Ville :
Grenoble-Autrans
Pays :
France
Date de début de la manifestation scientifique :
2014-10-14
Discipline(s) HAL :
Informatique [cs]/Base de données [cs.DB]
Résumé en anglais : [en]
We investigate the problem of inferring join queries from user interactions. The user is presented with a set of candidate tuples and is asked to label them as positive or negative depending on whether or not she would ...
Lire la suite >We investigate the problem of inferring join queries from user interactions. The user is presented with a set of candidate tuples and is asked to label them as positive or negative depending on whether or not she would like the tuples as part of the join result. The goal is to quickly infer an arbitrary n-ary join predicate across two relations by keeping the number of user interactions as minimal as possible. We assume no prior knowledge of the integrity constraints between the involved relations. This kind of scenario occurs in several application settings, such as data integration, reverse engineering of database queries, and constraint inference. In such scenarios, the database instances may be too big to be skimmed. We explore the search space by using a set of strategies that let us prune what we call "uninformative" tuples, and directly present to the user the informative ones i.e., those that allow to quickly find the goal query that the user has in mind. In this paper, we focus on the inference of joins with equality predicates and we show that for such joins deciding whether a tuple is uninformative can be done in polynomial time. Next, we propose several strategies for presenting tuples to the user in a given order that lets minimize the number of interactions. We show the efficiency and scalability of our approach through an experimental study on both benchmark and synthetic datasets. Finally, we prove that adding projection to our queries makes the problem intractable.Lire moins >
Lire la suite >We investigate the problem of inferring join queries from user interactions. The user is presented with a set of candidate tuples and is asked to label them as positive or negative depending on whether or not she would like the tuples as part of the join result. The goal is to quickly infer an arbitrary n-ary join predicate across two relations by keeping the number of user interactions as minimal as possible. We assume no prior knowledge of the integrity constraints between the involved relations. This kind of scenario occurs in several application settings, such as data integration, reverse engineering of database queries, and constraint inference. In such scenarios, the database instances may be too big to be skimmed. We explore the search space by using a set of strategies that let us prune what we call "uninformative" tuples, and directly present to the user the informative ones i.e., those that allow to quickly find the goal query that the user has in mind. In this paper, we focus on the inference of joins with equality predicates and we show that for such joins deciding whether a tuple is uninformative can be done in polynomial time. Next, we propose several strategies for presenting tuples to the user in a given order that lets minimize the number of interactions. We show the efficiency and scalability of our approach through an experimental study on both benchmark and synthetic datasets. Finally, we prove that adding projection to our queries makes the problem intractable.Lire moins >
Langue :
Anglais
Comité de lecture :
Oui
Audience :
Nationale
Vulgarisation :
Non
Collections :
Source :
Fichiers
- https://hal.inria.fr/hal-01052764/document
- Accès libre
- Accéder au document
- https://hal.inria.fr/hal-01052764/document
- Accès libre
- Accéder au document
- document
- Accès libre
- Accéder au document
- ciucanu-bda14a.pdf
- Accès libre
- Accéder au document
- document
- Accès libre
- Accéder au document
- ciucanu-bda14a.pdf
- Accès libre
- Accéder au document