Meta_Album_MD_MIX_Mini

From MaRDI portal
Dataset:6037282



OpenML: 44287
MaRDI QID: Q6037282
FDO: Q6037282
RO-Crate: Q6037282

OpenML dataset with id 44287

Ihsan Ullah

Full work available at URL: https://api.openml.org/data/v1/download/22110987/Meta_Album_MD_MIX_Mini.arff

Upload date: 28 October 2022
Copyright license: Creative Commons Attribution-NonCommercial 4.0 International



Dataset Characteristics

Number of classes: 0
Number of features: 69 (numeric: 46; symbolic: 1; binary in total: 1)
Number of instances: 28,240
Number of instances with missing values: 28,240
Number of missing values: 665,053
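
The figures above can be put in proportion with a little arithmetic. The sketch below is only an illustration: it assumes that every one of the 69 features counts as one cell per instance, and the function name `missing_summary` is made up for this example.

```python
# Figures copied from the "Dataset Characteristics" list above.
N_INSTANCES = 28_240
N_FEATURES = 69
N_MISSING = 665_053


def missing_summary(n_instances: int, n_features: int, n_missing: int) -> dict:
    """Relate the missing-value count to the size of the data table.

    Assumption: the table has n_instances * n_features cells in total.
    """
    total_cells = n_instances * n_features
    return {
        "total_cells": total_cells,
        "missing_fraction": n_missing / total_cells,
        "missing_per_instance": n_missing / n_instances,
    }


summary = missing_summary(N_INSTANCES, N_FEATURES, N_MISSING)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under that assumption, roughly a third of all cells are empty, which is consistent with every instance being reported as having missing values.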

Meta-Album OmniPrint-MD-mix Dataset (Mini)

The OmniPrint-MD-mix dataset consists of 28 240 images (128x128, RGB) from 706 categories. The images were synthesized with OmniPrint, and no further processing was done. The OmniPrint synthesis parameters are as follows:

  • font size: 192
  • image size: 128
  • strength of random perspective transformation: 0.04
  • left/right/top/bottom margins: 20% of the image size each
  • strength of pre-rasterization elastic transformation: 0.035
  • random translation: activated both horizontally and vertically
  • rotation: between -60 and 60 degrees
  • horizontal shear: between -0.5 and 0.5
  • brightness: between 0.8333 and 1.2
  • contrast: between 0.8333 and 1.2
  • color enhancement: between 0.8333 and 1.2

The other parameters vary between images. We designed 20 settings, and each setting is used to synthesize 2 images. All images/textures consist of photos taken with a personal mobile phone.
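
The stated parameter values and counts can be gathered into a small configuration object, which also allows a quick consistency check. This is a sketch under assumptions: the field names are hypothetical and are not OmniPrint command-line options, and the description is read as "20 settings times 2 images per category".

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SynthesisSettings:
    """Values quoted in the description; field names are illustrative only."""
    font_size: int = 192
    image_size: int = 128
    perspective_strength: float = 0.04
    margin_fraction: float = 0.20          # each of left/right/top/bottom
    elastic_strength: float = 0.035
    rotation_range: Tuple[float, float] = (-60.0, 60.0)   # degrees
    shear_x_range: Tuple[float, float] = (-0.5, 0.5)
    brightness_range: Tuple[float, float] = (0.8333, 1.2)
    contrast_range: Tuple[float, float] = (0.8333, 1.2)
    color_range: Tuple[float, float] = (0.8333, 1.2)


N_CATEGORIES = 706
N_SETTINGS = 20
IMAGES_PER_SETTING = 2


def expected_image_count(categories: int, settings: int, per_setting: int) -> int:
    """Total images if every category is rendered under every setting."""
    return categories * settings * per_setting


total = expected_image_count(N_CATEGORIES, N_SETTINGS, IMAGES_PER_SETTING)
print(total)
```

With this reading, each category receives 40 images, and the product matches the 28 240 images reported for the dataset.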


Dataset Details


Meta Album ID: OCR.MD_MIX
Meta Album URL: https://meta-album.github.io/datasets/MD_MIX.html
Domain ID: OCR
Domain Name: Optical Character Recognition
Dataset ID: MD_MIX
Dataset Name: OmniPrint-MD-mix
Short Description: Character images with a specific set of nuisance parameters
Number of classes: 706
Number of images: 28240
Keywords: ocr
Data Format: images
Image size: 128x128

License (original data release): CC BY 4.0
License URL (original data release): https://creativecommons.org/licenses/by/4.0/

License (Meta-Album data release): CC BY 4.0
License URL (Meta-Album data release): https://creativecommons.org/licenses/by/4.0/

Source: OmniPrint
Source URL: https://github.com/SunHaozhe/OmniPrint

Original Author: Haozhe Sun
Original contact: sunhaozhe275940200@gmail.com

Meta Album author: Haozhe Sun
Created Date: 25 June 2021
Contact Name: Haozhe Sun
Contact Email: meta-album@chalearn.org
Contact URL: https://meta-album.github.io/


Cite this dataset

```
@inproceedings{sun2021omniprint,
   title={OmniPrint: A Configurable Printed Character Synthesizer},
   author={Haozhe Sun and Wei-Wei Tu and Isabelle M Guyon},
   booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1)},
   year={2021},
   url={https://openreview.net/forum?id=R07XwJPmgpl}
}
```


Cite Meta-Album

```
@inproceedings{meta-album-2022,
   title={Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification},
   author={Ullah, Ihsan and Carrion, Dustin and Escalera, Sergio and Guyon, Isabelle M and Huisman, Mike and Mohr, Felix and van Rijn, Jan N and Sun, Haozhe and Vanschoren, Joaquin and Vu, Phan Anh},
   booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
   url={https://meta-album.github.io/},
   year={2022}
}
```


More

For more information on the Meta-Album dataset, please see the [NeurIPS 2022 paper]. For details on the dataset preprocessing, please see the [supplementary materials]. Supporting code can be found on our [GitHub repo]. Meta-Album on Papers with Code: [Meta-Album].


Other versions of this dataset

[Micro]





RO-Crate

What is an RO-Crate?

An RO-Crate is a standardized research object package used to bundle data together with rich machine-readable metadata. Each RO-Crate contains:

  • the files belonging to the dataset (e.g. CSVs, images, code, documentation)
  • a ro-crate-metadata.json file describing the content, provenance, and context
  • persistent identifiers and references to related research objects (e.g. software, publications)

This ensures that the dataset can be easily reused, cited, validated, and interpreted in a reproducible manner.
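
To illustrate how the ro-crate-metadata.json file described above is typically read, here is a minimal sketch. It assumes the usual RO-Crate layout: a JSON-LD document with an `@graph` list, in which a metadata descriptor entity points to the root dataset through its `about` property. The sample values are invented for the example and are not taken from the actual crate for this dataset.

```python
import json


def find_root_entity(metadata: dict) -> dict:
    """Return the root data entity of an RO-Crate metadata document.

    Assumes the descriptor entity has the id "ro-crate-metadata.json"
    and references the root entity via "about".
    """
    graph = metadata.get("@graph", [])
    by_id = {entity.get("@id"): entity for entity in graph}
    descriptor = by_id.get("ro-crate-metadata.json")
    if descriptor is None:
        raise ValueError("no metadata descriptor found in @graph")
    root_id = descriptor.get("about", {}).get("@id")
    if root_id not in by_id:
        raise ValueError("descriptor does not point to a known root entity")
    return by_id[root_id]


# Hypothetical, hand-written sample; a real crate would be loaded with
# json.load(open("ro-crate-metadata.json")).
SAMPLE = json.loads("""
{
  "@context": "https://w3id.org/ro/crate/1.1/context",
  "@graph": [
    {"@id": "ro-crate-metadata.json", "@type": "CreativeWork",
     "about": {"@id": "./"}},
    {"@id": "./", "@type": "Dataset", "name": "Meta_Album_MD_MIX_Mini"}
  ]
}
""")

root = find_root_entity(SAMPLE)
print(root["@type"], root["name"])
```

Looking up the root entity through the descriptor, rather than hard-coding its id, keeps the sketch usable for crates whose root id is not "./".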


