Project talk:OpenMLDatamodels

From MaRDI portal
Revision as of 17:05, 21 March 2024 by Daniel (talk | contribs) (started)

Initial feedback on the data model for OpenML dataset items

The following remarks are based on this version of the documentation page and this version of the sample item.

On the documentation page, each statement type should link to the respective property. For a generally useful approach to sharing Wikibase data models, see the corresponding pages on some WikiProjects over on Wikidata, e.g. here. It is also advisable to create and document the necessary properties in advance in order to facilitate their discussion.

Some specific points regarding individual properties:

  • dataset version
    • The property description page is essentially empty. Should this property be specific to OpenML or generic?
    • Some of the other properties may change with the version number (the checksum certainly does). How should that be handled?
  • author name string
    • Keep track of the order in the author list, as per series ordinal, to facilitate conversion to author statements.
  • default target attribute
    • The property description page is essentially empty and needs to be fleshed out.
  • checksum
    • Depends on the version, so it should be coordinated with dataset version (see above).
  • has feature
    • The property description page is essentially empty and needs to be fleshed out.
  • number of binary features
    • The property description page is essentially empty and needs to be fleshed out.
  • number of classes
    • The property description page is essentially empty and needs to be fleshed out.
  • number of features
    • The property description page is essentially empty and needs to be fleshed out.
  • number of instances
    • The property description page is essentially empty and needs to be fleshed out.
  • number of instances with missing values
    • The property description page is essentially empty and needs to be fleshed out.
  • number of missing values
    • The property description page is essentially empty and needs to be fleshed out.
  • number of numeric features
    • The property description page is essentially empty and needs to be fleshed out.
  • number of symbolic features
    • The property description page is essentially empty and needs to be fleshed out.
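
To make the two structural points above (author order and version-dependent checksums) more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not a proposal for the actual import code: the `Statement` class and both helper functions are hypothetical, the author names and checksum values are placeholders, and qualifying the checksum with the dataset version is just one possible answer to the open question, not something the current data model prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """Hypothetical stand-in for one Wikibase statement with qualifiers."""
    prop: str                      # property label, e.g. "author name string"
    value: str                     # main value of the statement
    qualifiers: dict = field(default_factory=dict)


def ordered_authors(statements):
    """Return author name strings sorted by their 'series ordinal' qualifier."""
    authors = [s for s in statements if s.prop == "author name string"]
    # Ordinals are stored as strings here, so compare them as integers.
    authors.sort(key=lambda s: int(s.qualifiers["series ordinal"]))
    return [s.value for s in authors]


def checksum_for_version(statements, version):
    """Return the checksum qualified with the given dataset version, or None."""
    for s in statements:
        if s.prop == "checksum" and s.qualifiers.get("dataset version") == version:
            return s.value
    return None


# Placeholder item: names and checksum values are invented for illustration.
item = [
    Statement("author name string", "B. Author", {"series ordinal": "2"}),
    Statement("author name string", "A. Author", {"series ordinal": "1"}),
    Statement("checksum", "abc123", {"dataset version": "1"}),
    Statement("checksum", "def456", {"dataset version": "2"}),
]

print(ordered_authors(item))
print(checksum_for_version(item, "2"))
```

With the ordinal stored on each author name string, the original author order can be recovered regardless of the order in which statements were added, which is what a later conversion to author statements would need. An alternative to the qualifier approach for checksums would be a separate item per dataset version; the sketch does not argue for either.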

--Daniel (talk) 16:05, 21 March 2024 (CET)