Project:ZenodoDocumentation: Difference between revisions
Revision as of 16:56, 25 March 2025
Overview
The Zenodo importer retrieves metadata for Zenodo items in JSON format using the Zenodo API (https://developers.zenodo.org/#representation35), parses it, and uploads it to the MaRDI portal.
The general steps executed by the importer are as follows:
1. The Zenodo API is used to retrieve the Zenodo IDs of the items to be imported in the pull() function (ZenodoSource.py).
2. Each Zenodo ID is used to retrieve a JSON file with the relevant metadata to create a Zenodo item that is uploaded to the MaRDI graph (see ZenodoResource.py).
3. The metadata is parsed and processed appropriately.
4. Claims are added as properties of the Zenodo item in the create() function (ZenodoResource.py).
5. The item is created and uploaded to the MaRDI portal if it does not already exist, and updated if it does.
Details on how to set up the repository and run the importer can be found at https://portal.mardi4nfdi.de/wiki/Project:RunImporter.
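The parse and create-or-update steps above can be sketched as follows. This is a minimal illustration, not the importer's actual code: the function names extract_claims() and upsert() are hypothetical, the sample record is made up, the field names assume the usual layout of a Zenodo record JSON (id, metadata.title, metadata.creators, ...), and a plain dictionary stands in for the MaRDI portal.

```python
from typing import Any, Dict


def extract_claims(record: Dict[str, Any]) -> Dict[str, Any]:
    """Pick the fields of interest out of one Zenodo record (hypothetical helper)."""
    meta = record.get("metadata", {})
    return {
        "zenodo_id": str(record["id"]),
        "title": meta.get("title", ""),
        "publication_date": meta.get("publication_date"),
        "resource_type": meta.get("resource_type", {}).get("type"),
        "authors": [c.get("name", "") for c in meta.get("creators", [])],
    }


def upsert(store: Dict[str, Dict[str, Any]], claims: Dict[str, Any]) -> str:
    """Create the item if its Zenodo ID is unknown, otherwise overwrite it."""
    key = claims["zenodo_id"]
    action = "updated" if key in store else "created"
    store[key] = claims
    return action


# Made-up record for illustration; a real one would come from the Zenodo API.
sample_record = {
    "id": 1234567,
    "metadata": {
        "title": "Example dataset",
        "publication_date": "2024-01-15",
        "resource_type": {"type": "dataset"},
        "creators": [{"name": "Example, Ada", "orcid": "0000-0002-1825-0097"}],
    },
}

portal: Dict[str, Dict[str, Any]] = {}  # in-memory stand-in for the portal
first = upsert(portal, extract_claims(sample_record))
second = upsert(portal, extract_claims(sample_record))
print(first, second)
```

Running the same record through upsert() twice shows the two branches of the last step: the first call creates the item, the second updates it.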
Execution
Execute the import.py script in the scripts folder:
python3 import.py --mode zenodo
The importer can be run with one or more of the following optional arguments:
- communities: List[str]: a list of Zenodo community IDs.
- resourceTypes: List[str]: a list of Zenodo resource types (see https://help.zenodo.org/guides/search/ for valid values).
- orcid_id_file: str: path to a comma-separated (CSV) file containing information about the authors. The file must contain a column ‘orcid’.
- customQ: str: a custom query string (see https://help.zenodo.org/guides/search/ for how to build one).
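As an illustration of the file expected by orcid_id_file, the sketch below reads a CSV and collects the values of its ‘orcid’ column. The helper name read_orcids() and the sample rows are hypothetical; how the importer itself reads the file may differ.

```python
import csv
import io
from typing import List, TextIO


def read_orcids(handle: TextIO) -> List[str]:
    """Return the distinct, non-empty values of the 'orcid' column, in file order."""
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or "orcid" not in reader.fieldnames:
        raise ValueError("author file must contain an 'orcid' column")
    orcids: List[str] = []
    for row in reader:
        value = (row["orcid"] or "").strip()  # short rows yield None
        if value and value not in orcids:
            orcids.append(value)
    return orcids


# Made-up file content; other columns (here 'name') are ignored.
sample = io.StringIO(
    "name,orcid\n"
    "Ada Example,0000-0002-1825-0097\n"
    "No Identifier,\n"
    "Ada Example,0000-0002-1825-0097\n"
)
orcids = read_orcids(sample)
print(orcids)
```

For a file on disk, the same function can be called as read_orcids(open(path, newline="", encoding="utf-8")).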
The arguments are currently set directly in the import.py script, as parameters to ZenodoSource(). Although all arguments are optional, using them is strongly recommended; otherwise the importer will attempt to import the entire Zenodo database.
Note: this might later be changed so that the parameters are passed in a config file instead. The instructions will be updated then.
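To make the effect of the filters concrete, the sketch below turns them into search URLs for the Zenodo records endpoint and refuses to run when no filter is given. It is an assumption-laden illustration: build_search_urls() is a hypothetical helper, not part of the importer, and the mapping of the arguments to the API's q, communities and type query parameters is assumed rather than taken from ZenodoSource.py. One URL is generated per community and resource type combination to avoid relying on how the API treats repeated parameters.

```python
from itertools import product
from typing import List, Optional
from urllib.parse import urlencode

ZENODO_RECORDS_URL = "https://zenodo.org/api/records"


def build_search_urls(
    communities: Optional[List[str]] = None,
    resource_types: Optional[List[str]] = None,
    custom_q: Optional[str] = None,
    page_size: int = 25,
) -> List[str]:
    """Build one search URL per (community, resource type) combination."""
    if not (communities or resource_types or custom_q):
        # Without any filter the search would cover all of Zenodo.
        raise ValueError("set at least one of communities, resource_types, custom_q")
    urls = []
    # An empty filter contributes a single None so the other filter still applies.
    for community, rtype in product(communities or [None], resource_types or [None]):
        params = {"size": page_size}
        if custom_q:
            params["q"] = custom_q
        if community:
            params["communities"] = community
        if rtype:
            params["type"] = rtype
        urls.append(f"{ZENODO_RECORDS_URL}?{urlencode(params)}")
    return urls


# 'example-community' is a placeholder community ID.
urls = build_search_urls(
    communities=["example-community"],
    resource_types=["dataset", "software"],
)
for url in urls:
    print(url)
```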
Updating code
The importer source code can be found and modified in the GitHub repository at https://github.com/MaRDI4NFDI/docker-importer.
To modify a Zenodo resource (e.g. add or modify properties), edit the code in
/mardi_importer/mardi_importer/publications/ZenodoResource.py
To change how queries to the Zenodo API are run, edit the code in
/mardi_importer/mardi_importer/zenodo/ZenodoSource.py