Low resolution scanned text dataset for optical character recognition

From MaRDI portal



DOI: 10.5281/zenodo.3945525
Zenodo: 3945525
MaRDI QID: Q6717338
FDO: Q6717338

Dataset published in the Zenodo repository.

Julian Gilbey, Carola-Bibiane Schönlieb

Publication date: 15 July 2020

Copyright license: Creative Commons Attribution 4.0 International



A collection of scanned pages of English text designed for testing low-resolution OCR systems. There are 11 different pieces of text, each of which contains 5 pages. Each of these 55 pages is typeset in 18 different fonts and then scanned at 300 dpi, producing a total of 990 pages of scanned text. Downsampled versions at 60 dpi and 75 dpi are also included.
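
The page counts above can be checked with a short Python sketch. It only reproduces the arithmetic in the description (11 texts, 5 pages each, 18 fonts, 300 dpi scans, 60 dpi and 75 dpi downsampled versions). The constant and function names are hypothetical, and the (text, page, font) index tuples are an illustrative assumption, not the dataset's actual file naming or directory layout.

```python
from itertools import product

# Figures taken from the dataset description.
NUM_TEXTS = 11
PAGES_PER_TEXT = 5
NUM_FONTS = 18
SCAN_DPI = 300
DOWNSAMPLED_DPIS = (60, 75)


def enumerate_pages():
    """Return every (text, page, font) index combination, numbered from 1.

    The tuple layout is a hypothetical indexing scheme for illustration;
    it does not reflect how the files are named in the archive.
    """
    return list(
        product(
            range(1, NUM_TEXTS + 1),
            range(1, PAGES_PER_TEXT + 1),
            range(1, NUM_FONTS + 1),
        )
    )


def resolution_ratios():
    """Return the ratio of the scan resolution to each downsampled resolution."""
    return {dpi: SCAN_DPI / dpi for dpi in DOWNSAMPLED_DPIS}


if __name__ == "__main__":
    pages = enumerate_pages()
    print(f"distinct typeset pages: {NUM_TEXTS * PAGES_PER_TEXT}")
    print(f"scanned pages at {SCAN_DPI} dpi: {len(pages)}")
    for dpi, ratio in resolution_ratios().items():
        print(f"{dpi} dpi is 1/{ratio:g} of the scan resolution per axis")
```

The ratios indicate that the 60 dpi and 75 dpi versions have one fifth and one quarter of the linear resolution of the 300 dpi scans, respectively. How the downsampling was actually performed is not stated here.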






