Diachronic word embeddings from 19th-century newspapers digitised by the British Library (1800-1919)
DOI: 10.5281/zenodo.7181682
Zenodo: 7181682
MaRDI QID: Q6704400
FDO: Q6704400
Dataset published in the Zenodo repository.
Barbara McGillivray, Nilo Pedrazzini
Publication date: 10 October 2022
Copyright license: Creative Commons Attribution 4.0 International
Word vectors related to the paper "Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers" by Nilo Pedrazzini and Barbara McGillivray (2022).

The embeddings were trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and the following parameters:

sg = True
min_count = 1
window = 3
vector_size = 200
epochs = 5

The embeddings are divided into periods of ten years each, with the vectors from each decade aligned to the ones from the most recent decade (1910s) using Orthogonal Procrustes.

See the related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData

Project webpage (Living with Machines): https://livingwithmachines.ac.uk/
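
To illustrate the alignment step, here is a minimal NumPy sketch of Orthogonal Procrustes on toy data. It is not taken from the project's code: the function name procrustes_align and the random matrices are hypothetical stand-ins, and the sketch assumes both matrices hold vectors for the same words in the same row order. The actual pipeline in the GitHub repository may differ, for example in how the shared vocabulary is selected or whether vectors are normalised first.

```python
import numpy as np


def procrustes_align(source, target):
    """Rotate `source` onto `target`.

    Finds the orthogonal matrix R that minimises ||source @ R - target||_F,
    assuming row i of both matrices is the vector of the same word.
    (Hypothetical helper, not the project's own function.)
    """
    # The SVD of the cross-covariance matrix yields the optimal rotation.
    u, _, vt = np.linalg.svd(source.T @ target)
    rotation = u @ vt
    return source @ rotation, rotation


# Toy data standing in for real embeddings (assumption: 500 shared words,
# 200 dimensions to match vector_size = 200).
rng = np.random.default_rng(0)
target = rng.normal(size=(500, 200))  # stand-in for the 1910s vectors

# Build an "earlier decade" that is the same space under an arbitrary rotation.
q, _ = np.linalg.qr(rng.normal(size=(200, 200)))
source = target @ q.T

aligned, rotation = procrustes_align(source, target)

# After alignment the two toy spaces coincide up to floating-point error.
print(np.allclose(aligned, target, atol=1e-6))
```

Because the rotation is orthogonal, cosine similarities within each decade are unchanged by the alignment; only the coordinate system is made comparable across decades.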