Answers to specific technical questions

From MaRDI portal
Revision as of 18:46, 16 March 2022 by Alvaro (talk | contribs) (Backup)

Collection page for open questions (English / German).


Specific questions

Portal-Compose:

These are questions related to Readme.md in the project:

1. Is submodules init no longer necessary? If so, it can be moved to Docker-Wikibase or removed from the README completely.

  • Thanks, removed from README

2. Add a volume in dev-extensions for the extension in the develop-locally section of readme.md (Johannes)

3. readme.md for CI: why is it required to set default test passwords in CI, as the README suggests? Some tests seem to fail locally with the default password. Point out that the CI steps are defined in main.yml. Also point out that CI builds are triggered by GitHub itself, that the trigger is usually a commit, and that this can be defined in the GitHub environment.

  • "some tests seem to fail locally with default password" See question 5.
  • "CI steps are defined in main.yml": added that
  • "CI builds is triggered from GitHub": added that
  • "this can be defined in the github environment": that's already there isn't it?

4. Test locally: run_tests.sh. Add a note about this to the README: run compose-up locally, then execute the script.

  • Thanks, added to README

5. The admin password doesn't change locally (from the default password), although all containers were deleted beforehand and a new password was defined in the env variable, so the local test for this also fails.

  • The short answer is: since the password is stored in the database the first time you do docker-compose up, you would have to delete the volumes (not necessarily the containers), e.g. `docker volume prune`
  • The long answer is: I have yet to document how to change passwords on another wiki page.

6. Deploy on the MaRDI server: the notes seem OK but incomplete. How is deployment done?

  • Please ask physikerwelt

7. Documentation for Traefik is missing.

  • Please ask physikerwelt or dajuno

8. Add a hint in the documentation of portal-compose about the linked repositories that create the custom portal containers

  • Added that
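
For question 5 above, the reset could be scripted roughly as follows. This is a minimal sketch, assuming the stack is managed with `docker-compose` and that the database volume is declared in the compose file; `reset_commands` and `reset_stack` are hypothetical helper names, not part of portal-compose. It uses `docker-compose down --volumes`, which removes only the volumes of this composition, instead of the broader `docker volume prune`.

```python
import subprocess


def reset_commands(compose_file="docker-compose.yml"):
    """Return the commands that recreate the stack with empty volumes."""
    base = ["docker-compose", "-f", compose_file]
    return [
        # Stop the containers and delete the volumes declared in the compose
        # file, including the database volume that still holds the old password.
        base + ["down", "--volumes"],
        # Start again; the database is then initialised from the current
        # env variables.
        base + ["up", "-d"],
    ]


def reset_stack(compose_file="docker-compose.yml", dry_run=True):
    """Print the commands, and run them only when dry_run is False."""
    for command in reset_commands(compose_file):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
```

Calling `reset_stack()` only prints the commands; `reset_stack(dry_run=False)` executes them. Deleting the volumes also deletes all wiki data stored in them, so this is only suitable for local test setups.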

Portal-Examples:

- Check PR

  • Did that

- Recheck WB_wikidata_properties.ipynb with credentials. See the PR.

  • Did that

- Other scripts seem ok

- Reminder: briefly go through all the scripts here again in the meeting. Question for discussion: should scripts for constant data updates that run in the CLI be stored in ipynb format?

  • "should scripts for constant data updates running in cli be stored as ipynb?": No, these are prototypes. The real thing is in docker-importer.

docker-importer:

- "The docker-importer is a Docker container that provides functionality for data import (e.g. from swMATH, zbMATH) and can cyclically trigger the import of data into a Wikibase container in the same Docker composition. The import scripts in the importer are written in Python, and examples of these scripts can be seen as Jupyter notebooks in the repository Portal-Examples." Is this description correct? If so, could you add it to the README.md?

  • Thanks, added

- Once the data import goes live, the data importer will be located in the portal-compose files. The current state is WIP, and therefore a custom compose with Wikibase is provided. Is that correct?

  • Yes

- https://github.com/MaRDI4NFDI/docker-importer/blob/main/doc/activity.drawio.svg shows that data is read from the Wikibase database but not stored anywhere. If this is correct, what are the plans and the current state for updating the fetched data in Wikibase?

  • Please show me tomorrow

- Is the shell script import.sh meant to trigger the import process as soon as the crontab is activated? The script import.sh is meant to invoke the Python script src/import.py, which starts the import mechanisms. Is the exact functionality of import.py defined by the CLI parameters passed on invocation? Why is import.sh not calling the Python script?

  • That's the idea, but the import script is not finished yet.

- Could the docker-importer already be located in portal-compose, with a flag that deactivates the cron pattern and an external shell script that triggers the import manually as a next step?

  • No, the script is not finished.

Testing-Concept:

- Tests in Selenium are defined like this: https://github.com/MaRDI4NFDI/portal-compose/blob/main/test/MathExtensionsTest.py Would it make sense for system testing to just have a selenium-ci container running that tests against the deployed URL portal.mardi4nfdi.de? This could be triggered by a shell script, similar to the local unit tests, after the deployment (and not by GitHub CI). Some cases of course can't be validated by an externally running script, but this can be sufficient as a set of smoke tests to check basic functionality in the deployed portal. A flag that tags external test cases and a URL switch should make it possible to reuse the test cases already written.

  • You could do that. You would have to add the container to the docker-compose that is deployed. However, some tests could write data, delete data etc. so look out.
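
The flag and URL switch discussed above could look roughly like this. It is a sketch only, not how the existing tests are written: the environment variable `PORTAL_URL`, the local default `http://localhost:8080`, the `/wiki/Main_Page` path and the `writes_data` tag are all assumptions made for illustration. Selenium itself is left out so that the switch logic can be read on its own.

```python
import os
import unittest

# Assumption: the target is chosen with a hypothetical PORTAL_URL variable,
# and the local stack is reachable at this default address.
DEFAULT_URL = "http://localhost:8080"


def portal_url():
    """Base URL the tests run against (local stack unless PORTAL_URL is set)."""
    return os.environ.get("PORTAL_URL", DEFAULT_URL).rstrip("/")


def is_external():
    """True when the tests target a deployed portal instead of the local stack."""
    return portal_url() != DEFAULT_URL


def writes_data(test_item):
    """Tag for tests that create or delete data; skipped on external targets."""
    return unittest.skipIf(is_external(), "modifies data, local runs only")(test_item)


class SmokeTest(unittest.TestCase):
    def test_main_page_url(self):
        # Read-only check; a real test would open this URL with Selenium.
        # The page path is an assumption about the wiki configuration.
        self.assertTrue(f"{portal_url()}/wiki/Main_Page".startswith("http"))

    @writes_data
    def test_create_page(self):
        # Placeholder for a test that would write to the wiki.
        self.assertFalse(is_external())
```

The tag is evaluated when the test module is imported, so `PORTAL_URL` has to be set before the tests are loaded, for example `PORTAL_URL=https://portal.mardi4nfdi.de python -m unittest`. Tests that write or delete data are then skipped on the deployed portal and still run locally.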


docker-backup :

- What is the XML files backup? Is it a backup of the wiki pages? Are the wiki pages themselves also in the MySQL dumps?

  • Backups are redundant, however there are some important differences regarding page revisions. See backup documentation linked below.

- In case you know, where are the private pages backed up?

  • That's a non-problem

- Could you fix the images/files naming inconsistency in the docker-backup repo (is it files or all images?), e.g. in line 46 of backup.sh?

- Is there any content of wikibase/mediawiki not considered for backup currently ?

- Discussion: Would it make sense to add a prune functionality for backups, so that only some of the backups older than keep-days are kept?

- Discussion: Would it also make sense to prune/delete obsolete logs of the other containers from this container (since they seem to require a lot of space)?

- Next steps, discussion: mail or notifications in Grafana/monitoring about backups? Also, should the monitoring observe the size of the backup and logging folders and the maximum available space, or should these be reported as mail content?

~~js: eventually check backup folders on mardi01
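
For the prune question above, the simplest variant (delete every backup file older than keep-days) could be sketched like this. The function name `prune_backups` and its parameters are hypothetical and not part of docker-backup; the sketch assumes that all backups are plain files in one directory and that their modification time reflects when they were created.

```python
import time
from pathlib import Path


def prune_backups(backup_dir, keep_days, now=None, dry_run=True):
    """Return backup files older than keep_days; delete them unless dry_run."""
    now = time.time() if now is None else now
    cutoff = now - keep_days * 24 * 60 * 60
    expired = sorted(
        path
        for path in Path(backup_dir).iterdir()
        # Only regular files are considered; subdirectories are left alone.
        if path.is_file() and path.stat().st_mtime < cutoff
    )
    if not dry_run:
        for path in expired:
            path.unlink()
    return expired
```

With the default `dry_run=True` the function only reports what it would delete, which makes it easy to check the selection before anything is removed. Keeping a subset of the older backups (for example one per month) would need an extra selection step on top of this.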

MaRDI Wikibase Import Fork:

- This is a fork with additional Readme.md information about the MaRDI Portal. It is a MediaWiki extension that is already referenced by source in docker-wikibase, in case there are future MaRDI customizations of the Wikibase importer. Is that correct?

- Are the wikibase-importer calls already implemented in docker-importer? If not, is the idea to call the maintenance scripts through the Python CLI interface from here: https://github.com/MaRDI4NFDI/docker-importer/blob/main/src/importer/Importer.py ?

... something like: python >> php extensions/WikibaseImport/maintenance/importEntities.php {some cmd from python, e.g. --entity P31 --do-not-recurse}

... and then just capture the script exit codes in Python?

- Is the configuration set here: https://github.com/MaRDI4NFDI/WikibaseImport/blob/master_mardi/extension.json ? So, for example, when referencing a Wikibase other than Wikidata, the URLs would be changed?

- In case you know: what is the behaviour of php maintenance/importEntities.php --all-properties when it is called twice on different states of the Wikibase to import (for example, calling it for Wikidata once a year)? Would this overwrite the previous properties in the portal database? Is it usable for syncing a Wikibase?

- Discussion: usage example, in case you know: when using SPARQL like this https://w.wiki/Sjx (which refers to Wikidata items), and assuming the related Wikidata properties have been imported into the portal's Wikibase and the query is run in the mardi-portal-docker-query-service, could a namespace be defined for the imported data to prevent ID collisions?

- Discussion: Is the Wikibase import fork necessary to realize federated queries? Probably not, because items can usually be imported through an HTTP specification in SPARQL queries.
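
For the question above about calling the maintenance script from Python and capturing its exit code, a minimal sketch could look like this. It is not the code in docker-importer: `build_import_command` and `run_import` are hypothetical names, and the sketch assumes that the Python process runs where `php` and the MediaWiki installation are available (for example inside the Wikibase container). Only the options quoted in the question are used.

```python
import subprocess

# Path of the maintenance script, relative to the MediaWiki root.
SCRIPT = "extensions/WikibaseImport/maintenance/importEntities.php"


def build_import_command(entity, recurse=True):
    """Build the argument list for importing one entity, e.g. P31."""
    command = ["php", SCRIPT, "--entity", entity]
    if not recurse:
        command.append("--do-not-recurse")
    return command


def run_import(entity, recurse=True, cwd=None):
    """Run the maintenance script and return its exit code (0 means success)."""
    # cwd should point at the MediaWiki root so that the relative path resolves.
    completed = subprocess.run(build_import_command(entity, recurse), cwd=cwd)
    return completed.returncode
```

A caller could then check `run_import("P31", recurse=False) == 0` and log or retry on failure. If the importer runs in a separate container, the command would additionally have to be executed in the Wikibase container, which this sketch does not cover.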

Docker-Wikibase :

- Dev-Dockerfile documentation missing

- N

docker-quickstatements :

- tbd wednesday



Future MaRDI Steps:

- Suggestions and discussion on future MaRDI steps that have not already been explicitly mentioned in the other questions.

  • Perhaps I would do that on another page

Overall:

- Could you re-check that everybody has admin access to all repos in MaRDI4NFDI?

  • Done

- Also for Traefik dashboard https://traefik.portal.mardi4nfdi.de if this relates to aot

  • Please ask physikerwelt or dajuno



See also: Technical_introduction

General stuff

Backup

Backup and restore
How to configure automatic and manual backups of the data, and how to restore a backup.

Testing

Testing concept
General guidelines about testing the MaRDI Portal.
Selenium, see also Deployment
Selenium container documentation

How to import data in the portal (in development)

Import process overview
UML action diagram showing import process from zwMath.
Import properties from Wikidata
How to import items and properties from Wikidata into the Portal.
Read data from zbMath
There are millions of references to papers in the zbMath database. We just need (for now) those related to the list of mathematical software that has been imported into the MaRDI-Portal.
Populate the portal using the Wikibase API
Import the data read from zbMath into the data structure set up from the properties imported from Wikidata.

Where to ask for help when new wikibase features are needed

WBSG projects and WBSG Rhizome Loomio (requires login)
The Wikibase Stakeholder Group coordinates a variety of projects for the broader Wikibase ecosystem.

Passwords

How to change passwords
Change wiki passwords and database passwords.