The benefit of combining a deep neural network architecture with ideal ratio mask estimation in computational speech segregation to improve speech intelligibility
DOI: 10.5281/zenodo.1202206; Zenodo: 1202206; MaRDI QID: Q6698422; FDO: Q6698422
Dataset published at Zenodo repository.
Abigail Anne Kressner, Thomas Bentsen, Torsten Dau, Tobias May
Publication date: 17 March 2018
Copyright license: Creative Commons Attribution 4.0 International
Contains all the data for: Bentsen, T., T. May, A. A. Kressner, and T. Dau. The benefit of combining a deep neural network architecture with ideal ratio mask estimation in computational speech segregation to improve speech intelligibility. PLOS ONE, in review.

There are two folders:

WRSs: The Word Recognition Scores (WRSs) from the listener study. The matrix has dimensions 9 conditions x 20 subjects. The data are ordered according to the following condition order: UP, GMM, GMM (3 subbands), GMM (7 subbands), GMM (11 subbands), DNN (IBM), DNN (IBM, 40 ms), DNN (IRM), DNN (IRM, 40 ms).

Masks:
- GMM-IBMs: IBMs and estimated IBMs for the models GMM, GMM (3 subbands), GMM (7 subbands), GMM (11 subbands)
- DNN-IBMs: IBMs and estimated IBMs for the models DNN (IBM), DNN (IBM, 40 ms)
- DNN-IRMs: IRMs and estimated IRMs for the models DNN (IRM), DNN (IRM, 40 ms)
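The WRS matrix layout described above (9 conditions x 20 subjects, in the stated condition order) can be summarized per condition as in the following Python sketch. The file format of the WRS data is not stated here, so loading is left out; the function name `summarize_wrs` and the synthetic placeholder values are assumptions for illustration only and are not taken from the dataset.

```python
import numpy as np

# Condition order as listed in the dataset description (rows of the WRS matrix).
CONDITIONS = [
    "UP",
    "GMM",
    "GMM (3 subbands)",
    "GMM (7 subbands)",
    "GMM (11 subbands)",
    "DNN (IBM)",
    "DNN (IBM, 40 ms)",
    "DNN (IRM)",
    "DNN (IRM, 40 ms)",
]
N_SUBJECTS = 20


def summarize_wrs(wrs):
    """Return {condition: (mean, sample SD)} across subjects.

    `wrs` is assumed to be array-like with shape (9 conditions, 20 subjects).
    """
    wrs = np.asarray(wrs, dtype=float)
    expected = (len(CONDITIONS), N_SUBJECTS)
    if wrs.shape != expected:
        raise ValueError(f"expected shape {expected}, got {wrs.shape}")
    means = wrs.mean(axis=1)        # average over subjects (columns)
    sds = wrs.std(axis=1, ddof=1)   # sample standard deviation over subjects
    return {
        name: (float(m), float(s))
        for name, m, s in zip(CONDITIONS, means, sds)
    }


# Synthetic placeholder values (NOT real scores): row i is filled with 10 * i.
placeholder = np.tile(np.arange(len(CONDITIONS))[:, None] * 10.0, (1, N_SUBJECTS))
summary = summarize_wrs(placeholder)
```

Once the actual matrix has been loaded from the WRSs folder by whatever means suits its file format, it can be passed to `summarize_wrs` in place of `placeholder`. The shape check guards against a transposed (20 x 9) matrix.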