Relations from Italian Wikipedia using Unsupervised Information Extraction




DOI: 10.5281/zenodo.5498034; Zenodo: 5498034; MaRDI QID: Q6718850; FDO: Q6718850

Dataset published at Zenodo repository.

Lucia Siciliani, Pasquale Lops, Pierpaolo Basile, Pierluigi Cassotti, Marco Degemmis

Publication date: 9 September 2021

Copyright license: Creative Commons Attribution 4.0 International



This dataset contains relations extracted from the Italian Wikipedia by the WikiOIE framework. WikiOIE is based on UDPipe and the Universal Dependencies project for text processing. It makes it easy to customize the information extraction (IE) approach used to automatically extract triples (subject, predicate, object).

The relations in this dataset were extracted by two unsupervised IE methods. The first (simple) is based only on PoS-tag patterns; the second (simpledep) also uses syntactic dependencies. The extracted relations are provided in JSON format. More information and the Java code are available at https://github.com/pippokill/WikiOIE

Pierluigi Cassotti, Lucia Siciliani, Pierpaolo Basile, Marco de Gemmis, and Pasquale Lops. 2021. Extracting relations from Italian Wikipedia using unsupervised information extraction. In Proceedings of the 11th Italian Information Retrieval Workshop 2021 (IIR 2021). CEUR-WS.
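To give a rough idea of what extraction based only on PoS-tag patterns can look like, the following is a minimal Python sketch. It is not the WikiOIE implementation (which is written in Java) and does not reproduce its actual patterns: the pattern used here (a run of nouns or proper nouns, then a run of verbs optionally followed by one preposition, then optional determiners and another run of nouns or proper nouns) is an assumption chosen for illustration, as are the function names `take_run` and `extract_triples` and the hand-tagged input. In practice the Universal PoS tags would come from a tagger such as UDPipe.

```python
from typing import List, Set, Tuple

Token = Tuple[str, str]          # (word form, Universal PoS tag)
Triple = Tuple[str, str, str]    # (subject, predicate, object)

NOMINAL: Set[str] = {"NOUN", "PROPN"}   # tags allowed in subject/object (assumption)
VERBAL: Set[str] = {"AUX", "VERB"}      # tags allowed in the predicate (assumption)


def take_run(tokens: List[Token], start: int, allowed: Set[str]) -> int:
    """Return the index just past the longest run of allowed tags from start."""
    end = start
    while end < len(tokens) and tokens[end][1] in allowed:
        end += 1
    return end


def extract_triples(tokens: List[Token]) -> List[Triple]:
    """Scan left to right for the hypothetical pattern
    NOMINAL+  VERBAL+ [ADP]  DET*  NOMINAL+ and emit one triple per match."""
    def text(span: List[Token]) -> str:
        return " ".join(word for word, _ in span)

    triples: List[Triple] = []
    i = 0
    while i < len(tokens):
        subj_end = take_run(tokens, i, NOMINAL)
        if subj_end == i:            # no subject candidate starts here
            i += 1
            continue
        pred_end = take_run(tokens, subj_end, VERBAL)
        has_pred = pred_end > subj_end
        # Allow a single preposition to close the predicate, e.g. "nacque a".
        if has_pred and pred_end < len(tokens) and tokens[pred_end][1] == "ADP":
            pred_end += 1
        obj_start = take_run(tokens, pred_end, {"DET"})   # skip determiners
        obj_end = take_run(tokens, obj_start, NOMINAL)
        if has_pred and obj_end > obj_start:
            triples.append((text(tokens[i:subj_end]),
                            text(tokens[subj_end:pred_end]),
                            text(tokens[obj_start:obj_end])))
            i = obj_end              # continue after the matched object
        else:
            i = subj_end             # no match; move past the subject run
    return triples


# Hand-tagged toy sentence (tags are illustrative, not tagger output).
sentence = [("Dante", "PROPN"), ("scrisse", "VERB"), ("la", "DET"),
            ("Divina", "PROPN"), ("Commedia", "PROPN")]
print(extract_triples(sentence))
```

A method in the spirit of simpledep would additionally consult the dependency relations between tokens, for example to check that the subject candidate is actually attached to the verb, which a purely linear tag pattern like this one cannot verify.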






