Algebraic data integration
Publication:4577809
DOI: 10.1017/S0956796817000168; zbMATH Open: 1475.68069; arXiv: 1503.03571; OpenAlex: W2964296895; MaRDI QID: Q4577809; FDO: Q4577809
Ryan Wisnesky, Patrick Schultz
Publication date: 3 August 2018
Published in: Journal of Functional Programming
Abstract: In this paper we develop an algebraic approach to data integration by combining techniques from functional programming, category theory, and database theory. In our formalism, database schemas and instances are algebraic (multi-sorted equational) theories of a certain form. Schemas denote categories, and instances denote their initial (term) algebras. The instances on a schema $S$ form a category, $S\text{-Inst}$, and a morphism of schemas $F : S \to T$ induces three adjoint data migration functors: $\Sigma_F : S\text{-Inst} \to T\text{-Inst}$, defined by substitution along $F$, which has a right adjoint $\Delta_F : T\text{-Inst} \to S\text{-Inst}$, which in turn has a right adjoint $\Pi_F : S\text{-Inst} \to T\text{-Inst}$. We present a query language based on for/where/return syntax where each query denotes a sequence of data migration functors; a pushout-based design pattern for performing data integration using our formalism; and describe the implementation of our formalism in a tool we call AQL.
Full work available at URL: https://arxiv.org/abs/1503.03571
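Illustration (not part of the record): the middle functor in the abstract, Delta_F, can be sketched in a few lines of Python. This is a minimal sketch under simplifying assumptions: an instance is modelled as plain sets of row ids with one lookup table per edge, whereas the paper treats instances as equational theories with term algebras; attributes, types, and path equations are ignored. The schemas, the mapping, and every name below (`delta`, `Emp`, `Dept`, `worksIn`, `manager`, `boss`) are hypothetical and do not come from the paper or from AQL.

```python
from typing import Dict, List, Set, Tuple

Rows = Dict[str, Set[str]]        # entity name -> set of row ids
Fns = Dict[str, Dict[str, str]]   # edge name -> (row id -> row id)


def delta(
    node_map: Dict[str, str],             # F on entities: S entity -> T entity
    edge_map: Dict[str, List[str]],       # F on edges: S edge -> path of T edges
    s_edges: Dict[str, Tuple[str, str]],  # S edge -> (source, target) entity
    rows: Rows,                           # the T-instance: rows per T entity
    fns: Fns,                             # the T-instance: table per T edge
) -> Tuple[Rows, Fns]:
    """Pull a T-instance back along F, giving an S-instance (toy Delta_F)."""
    # Each S entity receives a copy of the rows of its image under F.
    new_rows = {s: set(rows[t]) for s, t in node_map.items()}
    new_fns: Fns = {}
    for edge, (src, _dst) in s_edges.items():
        table = {}
        for x in new_rows[src]:
            y = x
            # Each S edge acts by following its image path in T, edge by edge.
            for t_edge in edge_map[edge]:
                y = fns[t_edge][y]
            table[x] = y
        new_fns[edge] = table
    return new_rows, new_fns


# Hypothetical target schema T: Emp --worksIn--> Dept --manager--> Emp
t_rows = {"Emp": {"e1", "e2", "e3"}, "Dept": {"d1", "d2"}}
t_fns = {
    "worksIn": {"e1": "d1", "e2": "d1", "e3": "d2"},
    "manager": {"d1": "e1", "d2": "e3"},
}

# Hypothetical source schema S: a single entity E with one edge boss : E -> E.
s_edges = {"boss": ("E", "E")}
node_map = {"E": "Emp"}                       # F sends E to Emp
edge_map = {"boss": ["worksIn", "manager"]}   # F sends boss to a path in T

s_rows, s_fns = delta(node_map, edge_map, s_edges, t_rows, t_fns)
```

In this toy case `s_fns["boss"]` sends each employee to the manager of their department. The adjoints Sigma_F and Pi_F are not sketched here; they need more machinery than a direct lookup.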
Recommendations
- Algebra of Data Reconciliation
- Data based algorithmic algebra
- An algebraic theory for data linkage
- Algebraic databases
- Algebraic approach to data processing. Techniques and applications
- Data refinement and algebraic structure
- Logic Programming and Nonmonotonic Reasoning
- Data algebra and its application in database design
Database theory (68P15); Theories (e.g., algebraic theories), structure, and semantics (18C10); Functional programming and lambda calculus (68N18)
Cites Work
- Title not available
- A mathematical introduction to logic
- Title not available
- Data exchange: semantics and query answering
- Term Rewriting and All That
- Title not available
- Interactive theorem proving and program development. Coq'Art: the calculus of inductive constructions. Foreword by Gérard Huet and Christine Paulin-Mohring.
- Fast Decision Procedures Based on Congruence Closure
- Database queries and constraints via lifting problems
- Title not available
- What Is a Derived Signature Morphism?
- Composing Hidden Information Modules over Inclusive Institutions
- Title not available
- The Knuth-Bendix Completion Procedure and Thue Systems
- Algebraic specification of modules and their basic interconnections
- Entity-relationship-attribute designs and sketches
- Functorial data migration
- Algebraic Databases
- Computing left Kan extensions
- A database of categories
- Title not available
- Recent Trends in Algebraic Development Techniques
- Title not available
Cited In (10)
- Fast left Kan extensions using the chase
- Persistent obstruction theory for a model category of measures with applications to data merging
- Title not available
- Title not available
- Extending Maximal Completion (Invited Talk)
- Title not available
- Title not available
- Institutions for SQL database schemas and datasets
- Certified equational reasoning via ordered completion
- Adjunctions and data collections