Project talk:OpenMLDatamodels: Difference between revisions
From MaRDI portal
Revision as of 16:12, 21 March 2024
Initial feedback on the data model for OpenML dataset items
The following remarks are based on this version of the documentation page and this version of the sample item.
On the documentation page, each statement type should link to the respective property. For a generally useful approach to sharing Wikibase data models, see the corresponding pages on some WikiProjects over on Wikidata, e.g. here. It is also advisable to create and document the necessary properties in advance in order to facilitate their discussion.
Some specific points regarding individual properties:
- dataset version
  - The property description page is essentially empty. Should this be specific to OpenML or generic?
    - Answer: We can keep it generic, in my opinion.
  - Some of the other properties may change with the version number (certainly the checksum, for instance). How should that be handled?
    - Answer: My plan was to always update it to the newest version, including all properties.
- author name string
  - Keep track of the order in the author list, as per series ordinal, so as to facilitate conversion to author statements.
- default target attribute
  - The property description page is essentially empty and needs to be fleshed out.
- checksum
  - Depends on the version, so it should be coordinated with that (see above).
- has feature
  - The property description page is essentially empty and needs to be fleshed out.
- number of binary features
  - The property description page is essentially empty and needs to be fleshed out.
- number of classes
  - The property description page is essentially empty and needs to be fleshed out.
- number of features
  - The property description page is essentially empty and needs to be fleshed out.
- number of instances
  - The property description page is essentially empty and needs to be fleshed out.
- number of instances with missing values
  - The property description page is essentially empty and needs to be fleshed out.
- number of missing values
  - The property description page is essentially empty and needs to be fleshed out.
- number of numeric features
  - The property description page is essentially empty and needs to be fleshed out.
- number of symbolic features
  - The property description page is essentially empty and needs to be fleshed out.
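
To illustrate the point about author order, here is a minimal sketch of how an ordered author list could be turned into author name string statements that each carry a series ordinal qualifier. The property IDs P43 and P146 are the ones linked on this page. The function name `author_statements`, the simplified dict layout, and the choice to store the ordinal as a string are assumptions for illustration only; this is not the actual import code and not the Wikibase JSON format.

```python
# Property IDs as linked on this page (MaRDI portal Wikibase).
AUTHOR_NAME_STRING = "P43"
SERIES_ORDINAL = "P146"


def author_statements(creators):
    """Build simplified statement dicts from an ordered list of author names.

    Hypothetical helper: the dict layout is a stand-in for whatever
    structure the import tooling actually uses.
    """
    # Drop blank entries first so the ordinals stay contiguous.
    names = [name.strip() for name in creators if name.strip()]
    statements = []
    for position, name in enumerate(names, start=1):
        statements.append({
            "property": AUTHOR_NAME_STRING,
            "value": name,
            # Assumption: the ordinal is stored as a string value.
            "qualifiers": {SERIES_ORDINAL: str(position)},
        })
    return statements


# Placeholder names, not taken from any real dataset record.
example = author_statements(["First Author", " Second Author ", ""])
```

With the position kept as a qualifier, a later conversion to author statements can sort by the ordinal instead of relying on the order in which statements happen to be stored.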