Datasets for the Hierarchy Transformer Encoders (HiTs)

Dataset:6718576



DOI: 10.5281/zenodo.14036213
Zenodo: 14036213
MaRDI QID: Q6718576
FDO: Q6718576

Dataset published at Zenodo repository.

Ian Horrocks, Yuan He, Yuan Zhangdie, Jiaoyan Chen

Copyright license: Creative Commons Attribution 4.0 International



About

Datasets for training and evaluating the Hierarchy Transformer encoders (HiTs) proposed in the paper "Language Models as Hierarchy Encoders".

Files with the multi suffix correspond to the Multi-hop Inference evaluation.
Files with the mixed suffix correspond to the Mixed-hop Prediction evaluation (and its transfer setting).

schemaorg, foodon, and doid are involved only in the transfer evaluation, but the datasets here for foodon and doid also include their training sets (see the explanation in the paper for why we opted not to generate a training set for schemaorg).

The previous version of this dataset collection has been marked deprecated because it appears to contain broken files for snomed.

Huggingface Datasets

We offer a convenient Huggingface Datasets entry, enabling users to load data directly using the load_dataset method. The datasets are available as either entity triplets or labelled entity pairs. Please note that data loaded this way does not retain the original entity IDs. To map entities back to their original hierarchies, refer to this Zenodo release.

Citation

The relevant paper has been accepted at NeurIPS 2024 (to appear).

Links

GitHub repository: https://github.com/KRR-Oxford/HierarchyTransformers
Models and Datasets on Huggingface Hub: https://huggingface.co/Hierarchy-Transformers
Arxiv preprint: https://arxiv.org/abs/2401.11374

Contact

Yuan He (yuan.he(at)cs.ox.ac.uk)
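As a rough illustration of the Huggingface Datasets route described above, the sketch below wraps the load_dataset call and shows one plausible relationship between the two formats (entity triplets and labelled entity pairs). It makes several assumptions not stated on this page. The repository ID and configuration name are left as arguments, to be looked up on the Huggingface Hub page listed under Links. The field names child, parent, negative, and label are hypothetical, and the release may build its labelled pairs differently. The example row is made-up toy data, not taken from the datasets.

```python
from typing import Dict, Iterable, List


def load_hit_split(repo_id: str, config_name: str, split: str = "train"):
    """Load one split of a dataset from the Huggingface Hub.

    repo_id and config_name are not given on this page; find them at
    https://huggingface.co/Hierarchy-Transformers before calling.
    """
    # Imported lazily so the pure-Python helper below works even when the
    # `datasets` package is not installed.
    from datasets import load_dataset

    return load_dataset(repo_id, config_name, split=split)


def triplets_to_pairs(
    rows: Iterable[Dict[str, str]],
    child_key: str = "child",        # hypothetical field name
    parent_key: str = "parent",      # hypothetical field name
    negative_key: str = "negative",  # hypothetical field name
) -> List[Dict[str, object]]:
    """Expand each (child, parent, negative) triplet into two labelled pairs.

    This is an assumed conversion for illustration only: the true parent
    yields a pair labelled 1 and the negative yields a pair labelled 0.
    """
    pairs: List[Dict[str, object]] = []
    for row in rows:
        pairs.append({"child": row[child_key], "parent": row[parent_key], "label": 1})
        pairs.append({"child": row[child_key], "parent": row[negative_key], "label": 0})
    return pairs


# Toy row with made-up entity names, used only to show the shape of the output.
toy_triplets = [{"child": "dog", "parent": "canine", "negative": "table"}]
toy_pairs = triplets_to_pairs(toy_triplets)
```

With real data, a call such as load_hit_split(repo_id, config_name) would return the rows to pass to triplets_to_pairs, after adjusting the key arguments to match the actual column names. Because the Hub copies do not retain the original entity IDs, any mapping back to the source hierarchies still requires the files in the Zenodo release.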






