ERBlox: combining matching dependencies with machine learning for entity resolution
From MaRDI portal
Publication:518610
DOI: 10.1016/J.IJAR.2017.01.003
zbMATH Open: 1404.68093
arXiv: 1508.06013
OpenAlex: W2964226158
MaRDI QID: Q518610
Nikolaos Vasiloglou, Zeinab Bahmani, Leopoldo Bertossi
Publication date: 29 March 2017
Published in: International Journal of Approximate Reasoning
Abstract: Entity resolution (ER), an important and common data cleaning problem, is about detecting duplicate data representations of the same external entities and merging them into single representations. Relatively recently, declarative rules called matching dependencies (MDs) have been proposed for specifying similarity conditions under which attribute values in database records are merged. In this work we show the process and the benefits of integrating three components of ER: (a) classifiers for duplicate/non-duplicate record pairs, built using machine learning (ML) techniques; (b) MDs for supporting both the blocking phase of ML and the merge itself; and (c) the use of the declarative language LogiQL (an extended form of Datalog supported by the LogicBlox platform) for data processing and for the specification and enforcement of MDs.
Full work available at URL: https://arxiv.org/abs/1508.06013
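The three stages named in the abstract (blocking, duplicate classification, and MD-style merging) can be illustrated with a minimal Python sketch. This is not the ERBlox implementation, which the abstract says is written in LogiQL; the records, field names, blocking key, similarity threshold, and the "keep the longer value" matching function are all assumptions made for illustration, and a fixed threshold stands in for a trained classifier.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy records; field names and values are illustrative only.
records = [
    {"id": 1, "title": "Entity Resolution with Matching Dependencies",
     "venue": "IJAR"},
    {"id": 2, "title": "Entity resolution with matching dependencies.",
     "venue": "Int. J. Approx. Reasoning"},
    {"id": 3, "title": "Kernel Methods and Machine Learning",
     "venue": "CUP"},
]


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (assumed similarity relation)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def block_key(record):
    """Crude blocking key (an assumption): first three letters of the title."""
    return record["title"][:3].lower()


def candidate_pairs(recs):
    """Group records by blocking key and yield pairs only within each block."""
    blocks = {}
    for r in recs:
        blocks.setdefault(block_key(r), []).append(r)
    for group in blocks.values():
        yield from combinations(group, 2)


def is_duplicate(r1, r2, threshold=0.9):
    """Threshold rule standing in for a trained duplicate/non-duplicate classifier."""
    return similarity(r1["title"], r2["title"]) >= threshold


def merge_value(v1, v2):
    """Toy matching function: keep the longer value (ties keep the first)."""
    return v1 if len(v1) >= len(v2) else v2


def resolve(recs):
    """MD-like rule: if two titles are similar, identify their title and venue values."""
    merged = {r["id"]: dict(r) for r in recs}
    for r1, r2 in candidate_pairs(recs):
        if is_duplicate(r1, r2):
            for attr in ("title", "venue"):
                value = merge_value(merged[r1["id"]][attr],
                                    merged[r2["id"]][attr])
                merged[r1["id"]][attr] = value
                merged[r2["id"]][attr] = value
    return list(merged.values())


if __name__ == "__main__":
    for row in resolve(records):
        print(row)
```

In this sketch, blocking limits comparisons to records sharing a key, so only records 1 and 2 are compared; the merge step then gives both records the same title and venue values, loosely mirroring how an MD identifies attribute values once a similarity condition holds.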
Cites Work
- Nearest neighbor pattern classification
- Bridging logic and kernel machines
- Data cleaning and query answering with matching dependencies and matching functions
- Foundations of Rule Learning
- Data Quality and Record Linkage Techniques
- Kernel Methods and Machine Learning