An enriched category theory of language: from syntax to semantics
From MaRDI portal
Publication:2153136
Categories admitting limits (complete categories), functors preserving limits, completions (18A35); Topoi (18B25); Functor categories, comma categories (18A25); Limits and colimits (products, sums, directed limits, pushouts, fiber products, equalizers, kernels, ends and coends, etc.) (18A30); Enriched categories (over closed or monoidal categories) (18D20); Preorders, orders, domains and lattices (viewed as categories) (18B35)
Abstract: State-of-the-art language models return a natural language text continuation from any piece of input text. This ability to generate coherent text extensions implies significant sophistication, including a knowledge of grammar and semantics. In this paper, we propose a mathematical framework for passing from probability distributions on extensions of given texts, such as the ones learned by today's large language models, to an enriched category containing semantic information. Roughly speaking, we model probability distributions on texts as a category enriched over the unit interval. Objects of this category are expressions in language, and hom objects are conditional probabilities that one expression is an extension of another. This category is syntactical: it describes what goes with what. Then, via the Yoneda embedding, we pass to the enriched category of unit interval-valued copresheaves on this syntactical category. This category of enriched copresheaves is semantic: it is where we find meaning, logical operations such as entailment, and the building blocks for more elaborate semantic concepts.
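The syntactical category described in the abstract can be illustrated with a small sketch. The following Python example is not taken from the paper: the toy corpus, its counts, and the names `CORPUS`, `mass`, `hom`, `representable`, and `is_enriched_category` are all hypothetical, and "extension" is assumed here to mean "prefix extension". It also assumes that the unit interval is given its usual multiplication as monoidal product, so that the enriched-category axioms become hom(x, x) = 1 and hom(x, y) · hom(y, z) ≤ hom(x, z).

```python
from fractions import Fraction

# Hypothetical toy corpus: complete texts with made-up counts (an assumption,
# standing in for a distribution learned by a language model).
CORPUS = {
    "red ruby": 3,
    "red rose": 2,
    "blue sky": 5,
}
TOTAL = sum(CORPUS.values())


def mass(prefix: str) -> Fraction:
    """Probability that a text drawn from the toy corpus starts with `prefix`."""
    count = sum(c for text, c in CORPUS.items() if text.startswith(prefix))
    return Fraction(count, TOTAL)


def hom(x: str, y: str) -> Fraction:
    """Hom object from x to y: conditional probability that y extends x.

    Returns 0 when y is not an extension of x (or x never occurs).
    """
    if not y.startswith(x) or mass(x) == 0:
        return Fraction(0)
    return mass(y) / mass(x)


def representable(x: str, objects) -> dict:
    """Representable copresheaf at x: the assignment y -> hom(x, y)."""
    return {y: hom(x, y) for y in objects}


def is_enriched_category(objects) -> bool:
    """Check the [0, 1]-enriched axioms on a finite set of objects."""
    identities = all(hom(x, x) == 1 for x in objects)
    composition = all(
        hom(x, y) * hom(y, z) <= hom(x, z)
        for x in objects
        for y in objects
        for z in objects
    )
    return identities and composition


OBJECTS = ["red", "red r", "red ruby", "red rose", "blue", "blue sky"]

if __name__ == "__main__":
    print(hom("red", "red ruby"))
    print(is_enriched_category(OBJECTS))
    print(representable("red", OBJECTS))
```

In this sketch the composition inequality holds with equality along chains of extensions and trivially (with a zero on the left) otherwise. The dictionary returned by `representable("red", OBJECTS)` is one way to picture the image of "red" under the Yoneda embedding mentioned in the abstract, restricted to a finite set of objects.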
Recommendations
- Language modeling with reduced densities
- scientific article; zbMATH DE number 7453969
- A generalised quantifier theory of natural language in categorical compositional distributional semantics with bialgebras
- Ambiguity and incomplete information in categorical models of language
- Open system categorical quantum semantics in natural language processing
- Lambek vs. Lambek: functorial vector space semantics and string diagrams for Lambek calculus
- Compositionality for recursive neural networks
- Generalized relations in linguistics and cognition
- Categorical vector space semantics for Lambek calculus with a relevant modality (extended abstract)
- Coherent diagrammatic reasoning in compositional distributional semantics
Cites work
- scientific article; zbMATH DE number 4057746
- scientific article; zbMATH DE number 3751225
- scientific article; zbMATH DE number 1944711
- scientific article; zbMATH DE number 5593534
- Adjointness in Foundations
- An Invitation to Applied Category Theory
- Basic Category Theory
- Brief introduction to tropical geometry
- Categorical homotopy theory
- Category theory in context
- From frequency to meaning: vector space models of semantics
- Introduction to tropical algebraic geometry
- Language modeling with reduced densities
- Metric spaces, generalized logic, and closed categories
- Semantic unification. A sheaf theoretic approach to natural language
- Sheaves in geometry and logic: a first introduction to topos theory
- The tropical Grassmannian
- Tight spans, Isbell completions and semi-tropical modules
- Tropical convexity
- Tropical mathematics
Cited in (7)
- A generalised quantifier theory of natural language in categorical compositional distributional semantics with bialgebras
- scientific article; zbMATH DE number 1568791
- A Study of Entanglement in a Categorical Framework of Natural Language
- A category theory approach to the semiotics of machine learning
- Semantic unification. A sheaf theoretic approach to natural language
- Language modeling with reduced densities
- scientific article; zbMATH DE number 1848279