WhichDog: A crowdsourced dataset including candidate set-based labelling (Q6704497)

From MaRDI portal

Dataset published at Zenodo repository.

      Statements

      A dataset with crowdsourced labels for aggregation and supervised classification. It contains 400 images of dogs from the Stanford Dogs dataset (http://vision.stanford.edu/aditya86/ImageNetDogs/), covering 32 different breeds (classes).

      Annotators were asked to provide two types of labelling: full labelling (each labeler provides a single label for each image) and candidate labelling (each labeler provides a set of candidate labels for each image). The dataset includes a total of 61227 annotations (30628 full and 30599 candidate) obtained from 1028 different labelers. The labels were collected through the online crowdsourcing platform Amazon mTurk, with funds provided by the Basque Government through the Elkartek program (KK-2018/00071).

      The assignments were designed as sequences of 64 images given to the annotators. Each image in the sequence was presented together with a specific subset of possible labels (with the number of options ranging from 4 to 32) and an instruction for the annotator to perform a specific type of labelling (full or candidate). Each labeler performed at least one assignment. Not all labelers completed the 64 annotations in their assignments.

      The file whichdog.zip contains a folder (images) with the 400 images of dogs, a text file (breed_names.txt) that lists the names of the breeds and their assigned labels (a number from 0 to 31), and a CSV file (whichdog_all_annots.csv) with the information about the annotations. Each row of the CSV file represents a single annotation, and the columns are:
      - image_id: ID number of the image.
      - is_candidate: indicates whether the requested labelling is full (0) or candidate (1).
      - labeler_id: ID number of the labeler.
      - time: time taken by the labeler to perform the annotation.
      - answer: label or set of labels provided by the labeler as annotation.
      - options: subset of possible labels shown to the labeler.
      - assignment_id: ID number of the assignment.
      - sequence_point: position in the assignment's sequence of images at which the annotation was provided.
      - class: ground truth label of the image.
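
      As an illustration of how the full-labelling annotations might be aggregated, the following is a minimal Python sketch of plurality (majority) voting per image. It is not part of the dataset, and it rests on several assumptions that the description does not confirm: that the CSV has a header row using the column names listed above, that is_candidate is stored as the text 0 or 1, and that answer and class use the same textual encoding for full labels. The function names are hypothetical. Candidate-set rows are skipped because the encoding of label sets in the answer column is not specified here.

```python
import csv
from collections import Counter, defaultdict


def read_full_annotations(csv_file):
    """Yield (image_id, answer, true_class) for full-labelling rows only.

    Assumes a header row with the column names from the dataset
    description and is_candidate stored as "0" (full) or "1" (candidate).
    """
    for row in csv.DictReader(csv_file):
        if row["is_candidate"].strip() == "0":
            yield row["image_id"], row["answer"].strip(), row["class"].strip()


def majority_vote(annotations):
    """Pick the most frequent full label per image.

    Ties are resolved in favour of the label seen first, which is how
    Counter.most_common orders elements with equal counts.
    """
    votes = defaultdict(Counter)
    truth = {}
    for image_id, answer, true_class in annotations:
        votes[image_id][answer] += 1
        truth[image_id] = true_class
    predicted = {img: counts.most_common(1)[0][0] for img, counts in votes.items()}
    return predicted, truth


def accuracy(predicted, truth):
    """Fraction of images whose aggregated label equals the ground truth."""
    hits = sum(predicted[img] == truth[img] for img in predicted)
    return hits / len(predicted)
```

      To try this on the real file, one would open whichdog_all_annots.csv (with newline="" as the csv module recommends), pass the handle to read_full_annotations, and feed the result to majority_vote. Labels are compared as strings on purpose, so the sketch makes no guess about how the values should be parsed.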
      22 September 2022
