DOI: 10.5281/zenodo.4501273
Zenodo: 4501273
MaRDI QID: Q6717248
FDO: Q6717248
Dataset published at Zenodo repository.
Paul Groth, Daniel Daza, Michael Cochez
Publication date: 4 February 2021
Copyright license: Creative Commons Attribution 4.0 International
This repository contains knowledge graphs based on the WN18RR and FB15k-237 datasets. We generate new training, validation, and test splits for the inductive setting, where some entities are removed from the training set. The splits are used in the experiments described in the paper *Inductive Entity Representations from Text via Link Prediction*.

To generate inductive splits, we remove nodes so that no other node becomes isolated, and the number of edges of a particular relation type does not drop below 100.

The following are statistics for the datasets.

|                     | WN18RR-ind | FB15k-237-ind |
|---------------------|------------|---------------|
| Relations           | 11         | 237           |
| Training entities   | 32,755     | 11,633        |
| Training triples    | 69,585     | 215,082       |
| Validation entities | 4,094      | 1,454         |
| Validation triples  | 11,381     | 42,164        |
| Test entities       | 4,094      | 1,454         |
| Test triples        | 12,087     | 52,870        |

The splits for each dataset are called ind-train.tsv, ind-dev.tsv, and ind-test.tsv. We also include textual descriptions for each entity, as well as type information.
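As a minimal sketch of how the split files might be read, the Python snippet below parses tab-separated triples and lists the entities of one split that never occur in the training triples. It assumes each row holds exactly three columns in the order head, relation, tail; this layout is not stated in the description above and should be checked against the files. The helper names `read_triples`, `entities`, and `unseen_entities` are hypothetical, and the toy rows are made up for illustration.

```python
import csv
import io
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def read_triples(lines: Iterable[str]) -> List[Triple]:
    """Parse tab-separated rows into (head, relation, tail) tuples.

    ASSUMPTION: three columns per row in head/relation/tail order.
    Rows with any other number of columns (including blank lines) are skipped.
    """
    triples: List[Triple] = []
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) != 3:
            continue
        triples.append((row[0], row[1], row[2]))
    return triples


def entities(triples: Iterable[Triple]) -> Set[str]:
    """Collect every entity that appears as a head or a tail."""
    return {e for head, _, tail in triples for e in (head, tail)}


def unseen_entities(train: Iterable[Triple], other: Iterable[Triple]) -> Set[str]:
    """Entities present in `other` that do not occur in `train`."""
    return entities(other) - entities(train)


# Toy in-memory data with invented identifiers, standing in for the real files.
toy_train = read_triples(io.StringIO("a\tr1\tb\nb\tr2\tc\n"))
toy_test = read_triples(io.StringIO("d\tr1\ta\n"))
print(sorted(unseen_entities(toy_train, toy_test)))  # prints ['d']
```

With the actual files, the same functions can be applied to an open file handle, for example `with open("ind-train.tsv", encoding="utf-8", newline="") as f: train = read_triples(f)`, and likewise for ind-dev.tsv and ind-test.tsv.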
This page was built for dataset: Inductive WN18RR and FB15k-237