Speech/non-speech segmentation based on phoneme recognition features (Q2500916)
From MaRDI portal
scientific article
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Speech/non-speech segmentation based on phoneme recognition features | scientific article | |
Statements
Speech/non-speech segmentation based on phoneme recognition features (English)
28 August 2006
Summary: This work assesses different approaches to speech and non-speech segmentation of audio data and proposes a new, high-level representation of audio signals based on phoneme recognition features, suited to speech/non-speech discrimination tasks. Unlike previous model-based approaches, in which the speech and non-speech classes are usually represented by several models each, we develop a representation in which just one model per class is used in the segmentation process. For this purpose, four measures based on consonant-vowel pairs obtained from different phoneme speech recognizers are introduced and applied in two different segmentation-classification frameworks. The segmentation systems were evaluated on different broadcast-news databases. The evaluation results indicate that the proposed phoneme recognition features outperform both standard mel-frequency cepstral coefficients and posterior-probability-based features (entropy and dynamism). The proposed features also proved more robust and less sensitive to differing training and unforeseen conditions. Additional experiments with fusion models combining cepstral and the proposed phoneme recognition features produced the highest overall scores, indicating that the most suitable method for speech/non-speech segmentation is a combination of low-level acoustic features and high-level recognition features.
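The intuition behind consonant-vowel features can be sketched with a toy measure (this is an illustration only, not one of the paper's four measures, whose exact definitions are not given in the summary): when a phoneme recognizer decodes genuine speech, consonants and vowels tend to alternate, whereas non-speech audio often decodes into degenerate, repetitive phoneme runs. The phoneme inventory and example sequences below are assumptions for illustration.

```python
# Toy consonant-vowel (CV) measure over a decoded phoneme sequence.
# Assumption: a simplified phoneme inventory where vowels are {a,e,i,o,u}.
VOWELS = {"a", "e", "i", "o", "u"}

def cv_transition_rate(phonemes):
    """Fraction of adjacent phoneme pairs that switch between the
    consonant and vowel classes; tends to be high for speech-like
    decodings and low for degenerate non-speech decodings."""
    if len(phonemes) < 2:
        return 0.0
    switches = sum(
        (a in VOWELS) != (b in VOWELS)
        for a, b in zip(phonemes, phonemes[1:])
    )
    return switches / (len(phonemes) - 1)

# Hypothetical recognizer outputs: speech alternates C and V,
# while music/noise often decodes into long runs of one phoneme.
speech_like = list("segmentacija")
noise_like = list("mmmmmmssssss")
print(cv_transition_rate(speech_like))  # high (CV alternation)
print(cv_transition_rate(noise_like))   # low (repetitive runs)
```

A per-segment statistic like this can then feed a single model per class (speech vs. non-speech), in the spirit of the one-model-per-class design described in the summary.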