Portal/TA1/data guidelines: Difference between revisions

From MaRDI portal
== Data Formats and Serialization ==
=== Mission ===
Our mission is to broaden the accessibility, usability and findability of mathematical research results so that the data can be used across different research groups and software systems.
=== Our View on Data ===
Software and data formats are in a constant state of flux, whereas mathematical objects do not change. Mathematical data should therefore be stored in a way that stands the test of time. We aim to provide guidelines and a data format that can be adapted to as many existing technologies as possible, as well as to new ones.
=== The Data Format ===
We have written a paper about our file format, currently available [https://arxiv.org/abs/2309.00465 here]. Our approach to data serialization is bottom-up: we started by implementing (de)serialization in one specific software system, namely [https://oscar-system.github.io/Oscar.jl/stable/ OSCAR], and expand to other software systems when specific use cases arise.
==== Use cases ====
OSCAR currently takes advantage of our data format in a couple of ways.
* OSCAR stores some standard mathematical constructions in our format. These can be found in the OSCAR GitHub repo [https://github.com/oscar-system/Oscar.jl/tree/master/data here]. Storing these constructions and loading them on demand offers a remarkable speedup.
* Although this is still in its testing phase, OSCAR uses the data format for inter-process communication. OSCAR is not thread-safe and hence cannot share memory with another OSCAR process. This issue is circumvented by combining the provided OSCAR (de)serialization with Julia's [https://github.com/JuliaLang/Distributed.jl Distributed] standard package for parallel computation.
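
The idea behind the second use case, processes that exchange serialized data instead of sharing memory, can be illustrated with a minimal, language-neutral sketch. It is written in Python rather than Julia and does not use OSCAR or the Distributed package. The type tag, the field names <code>type</code> and <code>data</code>, and the helper names are all hypothetical and chosen for illustration only; the actual file format is the one described in the paper.

```python
import json
import subprocess
import sys
from fractions import Fraction

# Hypothetical type tag, NOT the schema of the real format.
TYPE_TAG = "RationalVector"


def serialize(values):
    """Encode a list of Fractions as a JSON string with a type tag."""
    return json.dumps({"type": TYPE_TAG, "data": [str(v) for v in values]})


def deserialize(text):
    """Decode a JSON string produced by serialize back into Fractions."""
    payload = json.loads(text)
    if payload["type"] != TYPE_TAG:
        raise ValueError(f"unexpected type tag: {payload['type']!r}")
    return [Fraction(s) for s in payload["data"]]


# Source code of the worker process. It shares no memory with the parent:
# it reads serialized input from stdin and writes a serialized result to stdout.
WORKER = """
import json, sys
from fractions import Fraction
payload = json.load(sys.stdin)
total = sum((Fraction(s) for s in payload["data"]), Fraction(0))
json.dump({"type": payload["type"], "data": [str(total)]}, sys.stdout)
"""


def sum_in_worker(values):
    """Send values to a separate process and return the exact sum it computes."""
    result = subprocess.run(
        [sys.executable, "-c", WORKER],
        input=serialize(values),
        capture_output=True,
        text=True,
        check=True,
    )
    return deserialize(result.stdout)[0]


if __name__ == "__main__":
    print(sum_in_worker([Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]))
```

Because only text crosses the process boundary, the worker could in principle be a different program or a different software system, as long as both sides agree on the format.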

Latest revision as of 10:43, 11 March 2024
