SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection
DOI: 10.5281/zenodo.3931969; Zenodo: 3931969; MaRDI QID: Q6704360; FDO: Q6704360
Dataset published at Zenodo repository.
Nina Tahmasebi, Simon Hengchen, Barbara McGillivray, Dominik Schlechtweg, Haim Dubossarsky
Publication date: 27 May 2020
Copyright license: Creative Commons Attribution 4.0 International
Authors: Dominik Schlechtweg, Barbara McGillivray, Simon Hengchen, Haim Dubossarsky, and Nina Tahmasebi

Description

This data collection contains the post-evaluation data for SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection:

- the starting kit to download data, and examples for competing in the CodaLab challenge, including baselines,
- the true binary change scores of the targets for Subtask 1, and their true graded change scores for Subtask 2 (test_data_truth/),
- the scoring program used to score submissions against the true test data in the evaluation and post-evaluation phase (scoring_program/),
- the results of the evaluation phase, including:
  - the final rankings of the participating teams by their best submission (results/rankings_teams.csv),
  - the submitted files of each team (results/submissions/),
  - an overview of the results for each submission, ordered by team (results/submissions_results.csv),
- analysis plots (plots/) displaying the results:
  - under per_target/ we provide the gold change scores and the normalized prediction error of target words plotted against their frequency and polysemy statistics,
  - under per_team/ we provide the model predictions from the best submission per team (per subtask) plotted against frequency/polysemy statistics and performance on gold data (gray lines give the correlation with the respective variable in the gold data); we also provide plots visualizing the similarities between the teams' predictions.

Some remarks:

- the paper referenced below remains the only source for the rankings between teams,
- some teams were disqualified, and are thus removed from the analyses and the rankings present in the paper,
- some teams have changed names, resulting in a discrepancy between team names under results/ and team names in the paper. The paper contains a key to match old names with new names.
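To illustrate how a prediction file could be compared against the files in test_data_truth/, here is a minimal sketch. It is not the program shipped in scoring_program/. It assumes the metrics described in the task paper (accuracy for the binary Subtask 1, Spearman's rank correlation for the graded Subtask 2) and assumes each file holds one tab-separated "target, value" pair per line. The function names and the toy words and scores are hypothetical.

```python
import csv
import io


def read_scores(handle):
    """Read 'target<TAB>value' lines into a dict (this file layout is an assumption)."""
    reader = csv.reader(handle, delimiter="\t")
    return {row[0]: float(row[1]) for row in reader if len(row) >= 2}


def accuracy(truth, pred):
    """Fraction of targets whose predicted binary label matches the true label."""
    return sum(int(truth[w]) == int(pred[w]) for w in truth) / len(truth)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(truth, pred):
    """Spearman's rho, computed as the Pearson correlation of the rank vectors.

    Raises ZeroDivisionError if either side assigns the same score to every target.
    """
    words = sorted(truth)
    x = average_ranks([truth[w] for w in words])
    y = average_ranks([pred[w] for w in words])
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Toy data standing in for real truth and submission files (hypothetical values).
truth_binary = read_scores(io.StringIO("w1\t1\nw2\t0\nw3\t1\nw4\t0\n"))
pred_binary = read_scores(io.StringIO("w1\t1\nw2\t1\nw3\t1\nw4\t0\n"))
truth_graded = read_scores(io.StringIO("w1\t0.9\nw2\t0.1\nw3\t0.5\nw4\t0.3\n"))
pred_graded = read_scores(io.StringIO("w1\t0.8\nw2\t0.2\nw3\t0.4\nw4\t0.6\n"))

print("Subtask 1 accuracy:", accuracy(truth_binary, pred_binary))
print("Subtask 2 Spearman:", round(spearman(truth_graded, pred_graded), 3))
```

With real data, the io.StringIO objects would be replaced by opened files. For official numbers, the scoring program in scoring_program/ and the paper remain the reference.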
Test Data for SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection can be found using the links below:

- English
- German
- Latin
- Swedish

Please find more information on the provided data in the paper referenced below.

Reference

Dominik Schlechtweg, Barbara McGillivray, Simon Hengchen, Haim Dubossarsky and Nina Tahmasebi. 2020. SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection. SemEval@COLING2020.

The resources are freely available for education, research and other non-commercial purposes.

@inproceedings{schlechtweg2020semeval,
  title = "{S}em{E}val-2020 {T}ask 1: {U}nsupervised {L}exical {S}emantic {C}hange {D}etection",
  author = "Schlechtweg, Dominik and McGillivray, Barbara and Hengchen, Simon and Dubossarsky, Haim and Tahmasebi, Nina",
  booktitle = "To appear in Proceedings of the 14th International Workshop on Semantic Evaluation",
  year = "2020",
  address = "Barcelona, Spain",
  publisher = "Association for Computational Linguistics"
}