Project:ZenodoDocumentation: Difference between revisions

Latest revision as of 17:00, 25 March 2025

Overview

The Zenodo importer retrieves metadata for Zenodo items in JSON format using the Zenodo API (https://developers.zenodo.org/#representation35), parses it, and uploads it to the MaRDI portal.

The general steps executed by the importer are as follows:

  1. The Zenodo API is used to retrieve the Zenodo IDs of the items to be imported. This happens in the pull() function in mardi_importer/mardi_importer/zenodo/ZenodoSource.py.
  2. Each Zenodo ID is used to retrieve a JSON file with the metadata needed to create a Zenodo item that is uploaded to the MaRDI graph (see mardi_importer/mardi_importer/publications/ZenodoResource.py).
  3. The metadata is parsed and processed appropriately.
  4. Claims are added as properties of the Zenodo item in the create() function in ZenodoResource.py.
  5. The item is created and uploaded to the MaRDI portal if it does not already exist, and updated if it does.
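
The steps above can be sketched in a few lines of Python. This is a minimal illustration and not the importer's actual code: the function names (http_get_json, pull_ids, parse_record, upsert), the in-memory store standing in for the MaRDI portal, and the selection of metadata fields are all assumptions made for the example. Only the search endpoint and its q, size and page parameters come from the public Zenodo API.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator

ZENODO_API = "https://zenodo.org/api/records"


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body (needs network access)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def pull_ids(query: str, fetch: Callable[[str], dict] = http_get_json,
             page_size: int = 25, max_pages: int = 2) -> Iterator[int]:
    """Step 1: page through the search results and yield Zenodo record IDs."""
    for page in range(1, max_pages + 1):
        params = urllib.parse.urlencode(
            {"q": query, "size": page_size, "page": page})
        hits = fetch(f"{ZENODO_API}?{params}")["hits"]["hits"]
        if not hits:
            break  # no more results
        for hit in hits:
            yield hit["id"]


def parse_record(record: dict) -> Dict[str, object]:
    """Steps 2-3: keep only the metadata fields used for the claims.

    The chosen fields are an assumption for this sketch.
    """
    meta = record.get("metadata", {})
    return {
        "zenodo_id": record["id"],
        "title": meta.get("title", ""),
        "publication_date": meta.get("publication_date"),
        "authors": [c.get("name", "") for c in meta.get("creators", [])],
    }


def upsert(item: dict, store: Dict[int, dict]) -> str:
    """Steps 4-5: create the item if it is new, otherwise update it.

    `store` is a hypothetical stand-in for the MaRDI portal.
    """
    action = "updated" if item["zenodo_id"] in store else "created"
    store[item["zenodo_id"]] = item
    return action
```

Passing the fetch function as a parameter keeps the sketch testable without network access: a stub that returns canned JSON can replace http_get_json.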

Details on how to set up the repository and run the importer can be found at https://portal.mardi4nfdi.de/wiki/Project:RunImporter.

Execution

Execute the import.py script in the scripts folder:

python3 import.py --mode zenodo

The importer can be run with one or more of the following optional arguments:

  • communities: List[str]: a list of Zenodo community IDs.
  • resourceTypes: List[str]: a list of Zenodo resource types (see https://help.zenodo.org/guides/search/ for valid values).
  • orcid_id_file: str: path to a comma-separated CSV file containing information about the authors. The file must contain a column ‘orcid’.
  • customQ: str: a custom query string (see https://help.zenodo.org/guides/search/ for how to build one).

The arguments are currently set directly in the import.py script, as parameters to the ZenodoSource() call. Although all arguments are optional, using them is strongly recommended; otherwise the importer will attempt to import the entire Zenodo database.

Note: this may later change so that the parameters are passed in a config file instead. The instructions will be updated then.
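
As an illustration of the orcid_id_file argument, the following sketch reads the ‘orcid’ column from a comma-separated CSV file. The helper name load_orcids and the example values in importer_args are hypothetical. The dictionary keys mirror the argument names listed above, but whether ZenodoSource() accepts them as keyword arguments in exactly this form is an assumption that should be checked against import.py.

```python
import csv
from typing import List


def load_orcids(path: str) -> List[str]:
    """Return the non-empty values of the 'orcid' column of a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None or "orcid" not in reader.fieldnames:
            raise ValueError("the CSV file must contain an 'orcid' column")
        return [row["orcid"].strip() for row in reader if row["orcid"]]


# Hypothetical argument values; restricting the import this way avoids
# querying the entire Zenodo database.
importer_args = {
    "communities": ["example-community"],  # hypothetical community ID
    "resourceTypes": ["dataset"],
    "orcid_id_file": "authors.csv",        # hypothetical file name
    "customQ": None,
}
```

Validating the column up front gives a clear error message when the file does not have the expected layout.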

Updating code

The importer source code can be found and modified in the GitHub repository at https://github.com/MaRDI4NFDI/docker-importer.

To modify a Zenodo resource (e.g. add or modify properties), edit the code in

/mardi_importer/mardi_importer/publications/ZenodoResource.py

To change how queries to the Zenodo API are run, edit the code in

/mardi_importer/mardi_importer/zenodo/ZenodoSource.py