WikiLinkGraphs: A complete, longitudinal and multilanguage dataset of the Wikipedia link networks
DOI: 10.5281/zenodo.2539424
Zenodo: 2539424
MaRDI QID: Q6704199
FDO: Q6704199
Dataset published in the Zenodo repository.
Alberto Montresor, Cristian Consonni, David Laniado
Publication date: 14 January 2019
Copyright license: Creative Commons Attribution 4.0 International
This dataset contains yearly snapshots of Wikipedia's internal link network for the 9 largest language editions (de, en, es, fr, it, nl, pl, ru, sv). The dataset spans more than 17 years, from the creation of Wikipedia in 2001 to March 2018. The snapshots are taken on March 1st of every year. The graphs include the links extracted from the wikitext of each page (i.e., in the form [[wikilink]]). Links transcluded from templates are not included. Redirects are resolved to their target page. More detailed information and supporting datasets are available at: http://disi.unitn.it/~consonni/datasets/.

IMPORTANT NOTICE: Gzipped files are compressed twice by Zenodo. The MD5 provided by Zenodo and the SHA512 sums provided in the `.sha512sums.txt` files match the files compressed once. In other words, when you download a `.gz` file, save it as `.gz.gz` and uncompress it once; the result should match both the MD5 provided by Zenodo and the SHA512 sum provided by us. We have opened a bug report for this behavior on Zenodo's repository at: https://github.com/zenodo/zenodo/issues/1705
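The checksum procedure described in the notice can be sketched in Python as follows. This is only an illustrative sketch, not tooling shipped with the dataset: the function name `digests_after_one_gunzip` and the example file name are assumptions. It removes a single gzip layer from the downloaded file and hashes the resulting bytes, which should be the once-compressed `.gz` file.

```python
import gzip
import hashlib


def digests_after_one_gunzip(path, chunk_size=1 << 20):
    """Strip ONE gzip layer from `path` and hash the bytes that remain.

    For a doubly compressed download, the remaining bytes are the
    once-compressed .gz file that the published checksums refer to.
    Returns a tuple (md5_hex, sha512_hex).
    """
    md5 = hashlib.md5()
    sha512 = hashlib.sha512()
    # gzip.open undoes only the outer compression layer.
    with gzip.open(path, "rb") as outer:
        while True:
            chunk = outer.read(chunk_size)
            if not chunk:
                break
            md5.update(chunk)
            sha512.update(chunk)
    return md5.hexdigest(), sha512.hexdigest()
```

For a download saved under a hypothetical name such as `snapshot.csv.gz.gz`, calling `digests_after_one_gunzip("snapshot.csv.gz.gz")` returns the two hex digests, which can then be compared by hand with the MD5 shown by Zenodo and the matching entry in the `.sha512sums.txt` file.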