Portal/TA1/data guidelines: Difference between revisions

Latest revision as of 10:43, 11 March 2024

Data Formats and Serialization

Mission

Our mission is to broaden the accessibility, usability, and findability of mathematical research results so that the data can be used across different research groups and software systems.

Our View on Data

Software and data formats are in a state of flux, which is quite the opposite of mathematical objects. Mathematical data should be made to stand the test of time. Our mission is to provide guidelines and a data format that can be adapted to as many technologies as possible as well as to new technologies.

The Data Format

We have written a paper about our file format, currently available at https://arxiv.org/abs/2309.00465. Our approach to data serialization is bottom-up: we started by implementing (de)serialization in one specific software system, OSCAR, and will expand to other software systems when specific use cases arise.
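To make the idea of (de)serializing a mathematical object concrete, here is a minimal Python sketch of a round trip through JSON. It is an illustration only and does not reproduce the file format described in the paper: the `_type` tag, the name `HypotheticalUnivariatePolynomial`, the field layout, and both function names are assumptions made for this example. Coefficients are written as strings so that arbitrarily large integers survive the round trip exactly.

```python
import json

# Hypothetical type tag, invented for this sketch (not taken from the paper).
POLY_TYPE = "HypotheticalUnivariatePolynomial"


def save_polynomial(coeffs, variable="x"):
    """Serialize integer coefficients (lowest degree first) to a JSON string."""
    document = {
        "_type": POLY_TYPE,
        "variable": variable,
        # Strings avoid any precision loss in JSON readers that parse
        # numbers as floating point.
        "data": [str(c) for c in coeffs],
    }
    return json.dumps(document)


def load_polynomial(text):
    """Rebuild (variable, coefficients) from a string made by save_polynomial."""
    document = json.loads(text)
    if document.get("_type") != POLY_TYPE:
        raise ValueError("unexpected type tag: %r" % document.get("_type"))
    return document["variable"], [int(s) for s in document["data"]]


if __name__ == "__main__":
    # 1 - 3x^2 + 2^100 x^3, round-tripped through text
    text = save_polynomial([1, 0, -3, 2**100])
    print(load_polynomial(text))
```

Storing an explicit type tag next to the data is what lets a different program, or a later version of the same one, decide how to interpret the payload without guessing.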

Use cases

OSCAR currently takes advantage of our data format in a couple of ways.

OSCAR stores some standard mathematical constructions in our format; these can be found in the OSCAR GitHub repository at https://github.com/oscar-system/Oscar.jl/tree/master/data. Storing these constructions and loading them on demand offers a remarkable speedup.

OSCAR also uses the data format for inter-process communication, although this is still in its testing phase. OSCAR is not thread-safe and hence cannot share memory with another OSCAR process. This issue is circumvented by combining the provided OSCAR (de)serialization with Julia's Distributed standard package (https://github.com/JuliaLang/Distributed.jl) for parallel computation.
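The inter-process use case can be sketched in a language-neutral way: two processes that share no memory exchange objects as serialized text. The Python example below is only an analogy for that pattern and is not OSCAR's or Julia's Distributed API. The worker program, the `_type` tags, and the function name `sum_in_other_process` are all hypothetical, and exact rationals from Python's `fractions` module stand in for mathematical objects.

```python
import json
import subprocess
import sys
from fractions import Fraction

# Source of the worker process. It shares no memory with the parent:
# its input arrives as serialized text on stdin and its result leaves
# as serialized text on stdout.
WORKER_SOURCE = r"""
import json, sys
from fractions import Fraction

task = json.load(sys.stdin)
entries = [Fraction(s) for s in task["data"]]
total = sum(entries, Fraction(0))
json.dump({"_type": "HypotheticalRational", "data": str(total)}, sys.stdout)
"""


def sum_in_other_process(values):
    """Send exact rationals to a separate process and return their exact sum."""
    # Serialize: each Fraction becomes a string such as "1/3".
    payload = json.dumps(
        {"_type": "HypotheticalRationalVector", "data": [str(v) for v in values]}
    )
    result = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=payload,
        capture_output=True,
        text=True,
        check=True,
    )
    # Deserialize the worker's reply back into an exact rational.
    reply = json.loads(result.stdout)
    return Fraction(reply["data"])


if __name__ == "__main__":
    print(sum_in_other_process([Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]))
```

The design point is that the serialized form is the only contract between the two processes, so neither side needs access to the other's in-memory representation.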

@@ Line 2: / Line 2: @@
 === Mission ===
-Our mission is to broaden the accessibility, usability and find-ability of mathematical research results so that the data can be used amongst different research groups and software systems. Our first goal is to establish a standardised data format for storing abstract mathematical objects and provide guidelines for the serialization of such objects.
+Our mission is to broaden the accessibility, usability, and findability of mathematical research results so that the data can be used across different research groups and software systems.
 === Our View on Data ===
 Software and data formats are in a state of flux, which is quite the opposite of mathematical objects. Mathematical data should be made to stand the test of time. Our mission is to provide guidelines and a data format that can be adapted to as many technologies as possible as well as to new technologies.
-=== Current Developments and OSCAR ===
+=== The Data Format ===
-We are currently developing methods and a data format for the serialization of mathematical objects in OSCAR. OSCAR is an open source computer algebra systems that provides functionality for working with objects such as groups, rings, fields, linear and commutative algebra, number theory, algebraic and polyhedral geometry. OSCAR is in active development, more information can be found [https://oscar-system.github.io/Oscar.jl/stable/ here].
+We have written a paper about our file format, currently available [https://arxiv.org/abs/2309.00465 here]. Our approach to data serialization is bottom-up: we started by implementing (de)serialization in one specific software system, [https://oscar-system.github.io/Oscar.jl/stable/ OSCAR], and will expand to other software systems when specific use cases arise.
-== Examples ==
+==== Use cases ====
-Here are some links to jupyter notebooks we have created to demonstrate how one could work with OSCAR and use it to store mathematical objects as JSON files. Cloning the notebooks and running them locally will produce JSON files in the same directory as the notebook.
+OSCAR currently takes advantage of our data format in a couple of ways.
-# [https://lab.mardi4nfdi.de/vecchia/serialization-examples/-/blob/main/polynomials.ipynb Polynomials]
+* OSCAR stores some standard mathematical constructions in our format; these can be found in the OSCAR GitHub repository [https://github.com/oscar-system/Oscar.jl/tree/master/data here]. Storing these constructions and loading them on demand offers a remarkable speedup.
+* OSCAR also uses the data format for inter-process communication, although this is still in its testing phase. OSCAR is not thread-safe and hence cannot share memory with another OSCAR process. This issue is circumvented by combining the provided OSCAR (de)serialization with Julia's [https://github.com/JuliaLang/Distributed.jl Distributed] standard package for parallel computation.