Dataset relating to the study "Open government data: usage trends and metadata quality"
DOI: 10.5281/zenodo.4054743
Zenodo: 4054743
MaRDI QID: Q6704637
FDO: Q6704637
Dataset published at Zenodo repository.
Publication date: 5 March 2021
Copyright license: Creative Commons Attribution 4.0 International
Open Government Data (OGD) has the potential to support social and economic progress. However, this potential can be frustrated if the data remains unused. Although the literature suggests that the metadata quality of OGD datasets is one of the main factors affecting their use, to the best of our knowledge, no quantitative study has provided evidence of this relationship. Considering about 400,000 datasets of 28 national, municipal, and international OGD portals, we have programmatically analyzed their usage, their metadata quality, and the relationship between the two. Our analysis has highlighted three main findings. First, regardless of their size, the software platform adopted, and their administrative and territorial coverage, most OGD datasets are underutilized. Second, OGD portals pay varying attention to the quality of their dataset metadata. Third, we did not find clear evidence that dataset usage is positively correlated with better metadata publishing practices. Finally, we have considered other factors, such as dataset category and some demographic characteristics of the OGD portals, and analyzed their relationship with dataset usage, obtaining partially affirmative answers.
The dataset consists of three zipped CSV files containing the collected dataset usage data, full metadata, and computed quality values for about 400,000 datasets belonging to the 8 national, 4 international, and 16 US municipal OGD portals considered in the study. Data were collected between 2019-12-19 and 2019-12-23.
________________________________________
Portal               #Datasets  Platform
________________________________________
US                     261,514  CKAN
France                  39,412  Other
Colombia                 9,795  Socrata
IE                       9,598  CKAN
Slovenia                 4,892  CKAN
Poland                   1,032  Other
Latvia                     336  CKAN
Puerto Rico                178  Socrata
New York, NY             2,771  Socrata
Baltimore, MD            2,617  Socrata
Austin, TX               2,353  Socrata
Chicago, IL              1,368  Socrata
San Francisco, CA        1,001  Socrata
Dallas, TX               1,001  Socrata
Los Angeles, CA            943  Socrata
Seattle, WA                718  Socrata
Providence, RI             288  Socrata
Honolulu, HI               244  Socrata
New Orleans, LA            215  Socrata
Buffalo, NY                213  Socrata
Nashville, TN              172  Socrata
Boston, MA                 170  CKAN
Albuquerque, NM             60  CKAN
Albany, NY                  50  Socrata
HDX                     17,325  CKAN
EUODP                   14,058  CKAN
NASA                     9,664  Socrata
World Bank Finances      2,177  Socrata
________________________________________

The three datasets share the same table structure:

Table Fields
portalid: portal identifier
id: dataset identifier
engine: identifier of the supporting portal platform: 1 (CKAN), 2 (Socrata)
admindomain: 1 (National), 2 (US), 3 (International)
downloaddate: date of data collection
views: number of total views for the dataset
downloads: number of total downloads for the dataset
overallq: overall quality value computed by applying the methodology presented by Neumaier et al. in [1]
qvalues: JSON object containing the quality values computed for the 17 metrics presented by Neumaier et al. [1]
assessdate: date of quality assessment
metadata: the overall dataset metadata downloaded via API from the portal, according to the supporting platform schema

[1] Neumaier, S.; Umbrich, J.; Polleres, A. Automated Quality Assessment of Metadata Across Open Data Portals. J. Data and Information Quality 2016, 8, 2:1–2:29. doi:10.1145/2964909
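The table fields listed above can be read with a few lines of standard-library Python. The following is a minimal sketch, not part of the published dataset: the function name `read_records` is hypothetical, and the code assumes that each archive holds a comma-separated, UTF-8 encoded CSV file with a header row using the field names above, that `qvalues` is stored as JSON text, and that any engine code other than 1 or 2 can be labeled "Other". None of these details is confirmed by the dataset description, so they should be checked against the actual files.

```python
import csv
import io
import json
import zipfile

# Engine codes as given in the field list; other codes are an assumption.
ENGINES = {1: "CKAN", 2: "Socrata"}

# The metadata column may hold large JSON documents, so raise the csv
# module's per-field size limit as a precaution.
csv.field_size_limit(2**31 - 1)


def _num(value, cast):
    """Convert a CSV cell to a number, or return None if the cell is empty."""
    value = (value or "").strip()
    return cast(value) if value else None


def read_records(archive, encoding="utf-8"):
    """Yield one dict per row from the first CSV file inside a zip archive.

    `archive` is a file path or a binary file-like object. Assumes a header
    row with the field names listed in the dataset description.
    """
    with zipfile.ZipFile(archive) as zf:
        csv_names = [n for n in zf.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("no CSV file found in archive")
        with zf.open(csv_names[0]) as raw:
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            for row in csv.DictReader(text):
                row["views"] = _num(row.get("views"), int)
                row["downloads"] = _num(row.get("downloads"), int)
                row["overallq"] = _num(row.get("overallq"), float)
                # qvalues is assumed to be JSON text; empty cells become {}.
                row["qvalues"] = json.loads(row["qvalues"]) if row.get("qvalues") else {}
                code = _num(row.get("engine"), int)
                row["engine_name"] = ENGINES.get(code, "Other")
                yield row
```

Because the function is a generator, rows are parsed one at a time, which may help with the larger portals. The `metadata` column is left as raw text, since its schema depends on the supporting platform.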