Project:MilestonesMeeting/20240223
From MaRDI portal
Latest revision as of 14:38, 26 February 2024
MaRDI TA5 Milestones Meeting 23.02.2024 @ ZIB
Goals of the meeting
- We have an idea about how to reach the official milestones
- Everybody is aware of the personal milestones
- Some (all) technical points are discussed / solved
- We have a plan about how to have better documentation
Agenda
- Welcome
- Mission clarification
- What IS our mission in TA5? Connect papers/software AND data-sets?
- --> Make it easy to access and find the data produced by MaRDI TAs 1-4
- Milestone Planning
- What are our 2024 goals for MaRDI?
- --> Bring in content from TAs 1-4
- What are the official 2024 milestones?
- Who is doing what? (-> Personal Milestone Planning) PART1 - Presentations
- Open (technical) topics (see below)
- Documentation
- How to improve internal documentation?
- --> If possible, update the upstream documentation (e.g. MediaWiki) - and link it from our Wiki
- --> Use Rim as a test-person to check whether all needed information is documented on our Wiki
- How to improve documentation for external?
- --> Collect technical questions from other TAs and create documentation that answers them
- --> Start with a FAQ-like document (potentially linking to more detailed documentation from there)
- Outreach (to other SFBs, Math+, Libraries, ...)
- --> Connect better with: Math+, LifeDocs (Christoph Lehrenfeld), TU Darmstadt Library (Jens Freund)
(Technical) Topics to discuss
- How to define items? --> Create a property ("mardi-profile") for each item that can be used to identify an item's type (software, formula, publication, ...)
- How to define profile types?
- --> see Project:Profile types
- Formulae
- Which properties to use?
- --> Same as DLMF
- Papers
- Current way of selecting papers in SPARQL queries by "has zbMath ID"? --> solved through the new "mardi-profile" property
- How to link from a paper, as in "cites software"? / "uses dataset"?
- --> use https://www.wikidata.org/wiki/Property:P4510 to link software to a publication if this software was used in the publication
- --> info available in swMath
- How to link to a paper, as in "This data-set / software was used in this paper" (Now: in software-item we use "is described in" and in )
- --> use P4510 in the reverse direction
- Datasets
- Which properties to use?
- --> Larissa made a first draft; compatibility should be checked with Zenodo items; then implement it
- How to link to a paper, as in "was used in paper"? (Is this necessary?)
- --> as before for software
- Software items (How can we query all of them - "instance of X" - what is X?)
- "instance of software" is violating the WikiData hierarchy? (Software is quite high-level)
- --> Solved by using the new "mardi-profile" property
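A minimal sketch of how the "mardi-profile" decision could be used to select all items of one type in a SPARQL query. The identifiers `P0000` and `Q0000` are placeholders, not real portal IDs, and the sketch assumes that "mardi-profile" is an item-valued property and that the query service predefines the `wd:` and `wdt:` prefixes.

```python
# Placeholder identifiers (assumptions): replace with the real IDs from the portal.
MARDI_PROFILE_PROPERTY = "P0000"  # hypothetical ID of the "mardi-profile" property
SOFTWARE_PROFILE_ITEM = "Q0000"   # hypothetical ID of the "software" profile type


def build_profile_query(profile_item: str, limit: int = 10,
                        profile_property: str = MARDI_PROFILE_PROPERTY) -> str:
    """Return a SPARQL query selecting items whose profile equals profile_item."""
    if limit < 1:
        raise ValueError("limit must be positive")
    return (
        "SELECT ?item WHERE {\n"
        f"  ?item wdt:{profile_property} wd:{profile_item} .\n"
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    # Print a query for (hypothetical) software items; paste it into the query service.
    print(build_profile_query(SOFTWARE_PROFILE_ITEM, limit=5))
```

The same function would cover papers, formulae and datasets by passing a different profile item, which is what replaces the earlier "has zbMath ID" and "instance of software" workarounds.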
- arXiv Importer
- What is the plan?
- --> Use zbMath data about arXiv paper meta-data (blocker: the API does not yet expose that information)
- Import of formulae (can we use an LLM to describe a particular formula? parameters etc.?)
- --> Do this on a small sub-set of arXiv papers to showcase the idea
- Import of paper-meta-data? (->Disambiguation)
- Use zbMath data
- Next steps?
- --> Take 2..10 arXiv papers, extract formulas, add to MaRDI KG, try Moritz's formula search service
- --> Discuss results and see whether this is useful at all
- LLMs for MaRDI portal
- What is the overall plan?
- What is the status?
- Chat-Bot (LLM to query the portal)
- How to integrate more of the cool Scholia stuff? (Simple example: number of citations of a paper, see e.g. https://scholia.portal.mardi4nfdi.de/work/Q25938997)
- --> Define what "cool" Scholia stuff is
- --> For the citation example: Use available services such as OpenCitations.net to get needed meta-data
- Zenodo importer (for Math+ integration)
- What is the plan?
- Set up a workflow to harvest the Math+ Zenodo Community items
- Workflows for periodic updates (for any source we have)
- --> Use the Zenodo example as demonstrator
- zbMath MSC Keyword import? (We only have the IDs)
- --> Put the ID<->Keyword relations in an SQL database to avoid license issues
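- Wikidata graph split (https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/thread/IIA5LVHBYK45FSMLPIVZI6WXA5QSRPF4/)
- --> If this happens, Scholia might become dysfunctional for many of the queries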
- Licensing
- Put a general "this is our licensing strategy" page on our Wiki
- Author disambiguation
- OKMaps (https://openknowledgemaps.org/)
- environmental footprint
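For the zbMath MSC keyword topic, the ID<->Keyword relation kept in an SQL database could look roughly like the sketch below. The table name `msc_keywords`, the column names and the helper functions are assumptions made for illustration; the sample row is only there to show the lookup and is not taken from the meeting.

```python
import sqlite3
from typing import Optional


def create_msc_table(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) lookup table mapping MSC IDs to keywords."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS msc_keywords ("
        "msc_id TEXT PRIMARY KEY, keyword TEXT NOT NULL)"
    )


def add_keyword(conn: sqlite3.Connection, msc_id: str, keyword: str) -> None:
    """Insert or update one ID<->keyword relation."""
    conn.execute(
        "INSERT OR REPLACE INTO msc_keywords (msc_id, keyword) VALUES (?, ?)",
        (msc_id, keyword),
    )


def lookup_keyword(conn: sqlite3.Connection, msc_id: str) -> Optional[str]:
    """Return the keyword for an MSC ID, or None if the ID is unknown."""
    row = conn.execute(
        "SELECT keyword FROM msc_keywords WHERE msc_id = ?", (msc_id,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database, for demonstration only
    create_msc_table(conn)
    add_keyword(conn, "35Q30", "Navier-Stokes equations")  # illustrative sample row
    print(lookup_keyword(conn, "35Q30"))
```

Keeping the keywords in a separate table means the knowledge graph only stores the IDs, and the keyword text is joined in at display time.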