The benefit of combining a deep neural network architecture with ideal ratio mask estimation in computational speech segregation to improve speech intelligibility

DOI 10.5281/zenodo.1202206 (Zenodo)

MaRDI QID Q6698422

Authors Abigail Anne Kressner, Thomas Bentsen, Torsten Dau, Tobias May

Publication date 17 March 2018

Copyright license Creative Commons Attribution 4.0 International

Contains all the data for: Bentsen, T., T. May, A. A. Kressner, and T. Dau. The benefit of combining a deep neural network architecture with ideal ratio mask estimation in computational speech segregation to improve speech intelligibility. PLOS ONE, in review.

There are two folders:

WRSs: the Word Recognition Scores (WRSs) from the listener study. The matrix has dimensions 9 conditions x 20 subjects. The rows follow this condition order: UP, GMM, GMM (3 subbands), GMM (7 subbands), GMM (11 subbands), DNN (IBM), DNN (IBM, 40 ms), DNN (IRM), DNN (IRM, 40 ms).

Masks:
  GMM-IBMs: IBMs and estimated IBMs for the models GMM, GMM (3 subbands), GMM (7 subbands), GMM (11 subbands).
  DNN-IBMs: IBMs and estimated IBMs for the models DNN (IBM), DNN (IBM, 40 ms).
  DNN-IRMs: IRMs and estimated IRMs for the models DNN (IRM), DNN (IRM, 40 ms).
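The WRS matrix layout described above (9 conditions x 20 subjects, in the stated condition order) could be handled along the following lines. This is only a minimal sketch: the description does not give the file name or file format of the WRS data, so loading is left out, and the helper names `label_wrs` and `mean_per_condition` are hypothetical. The array used at the end is a placeholder of zeros, not the real scores.

```python
import numpy as np

# Condition order as listed in the dataset description (one row per condition).
CONDITIONS = [
    "UP",
    "GMM",
    "GMM (3 subbands)",
    "GMM (7 subbands)",
    "GMM (11 subbands)",
    "DNN (IBM)",
    "DNN (IBM, 40 ms)",
    "DNN (IRM)",
    "DNN (IRM, 40 ms)",
]
N_SUBJECTS = 20  # one column per listener


def label_wrs(wrs):
    """Map each condition name to its row of per-subject scores.

    Hypothetical helper: assumes `wrs` is already loaded as a
    9 x 20 array-like (conditions x subjects).
    """
    wrs = np.asarray(wrs, dtype=float)
    expected = (len(CONDITIONS), N_SUBJECTS)
    if wrs.shape != expected:
        raise ValueError(f"expected shape {expected}, got {wrs.shape}")
    return {name: wrs[i] for i, name in enumerate(CONDITIONS)}


def mean_per_condition(wrs):
    """Average the scores across subjects for every condition."""
    return {name: float(scores.mean()) for name, scores in label_wrs(wrs).items()}


# Placeholder values only (NOT the real scores); replace with the loaded matrix.
placeholder = np.zeros((len(CONDITIONS), N_SUBJECTS))
means = mean_per_condition(placeholder)
```

The shape check is there because a transposed matrix (20 x 9) would otherwise silently pair scores with the wrong condition labels.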
