Audio classification in speech and music: A comparison between a statistical and a neural approach (Q1607684)

scientific article; zbMATH DE number 1779598

Language	Label	Description	Also known as
default for all languages	No label defined
English	Audio classification in speech and music: A comparison between a statistical and a neural approach	scientific article; zbMATH DE number 1779598

Statements

instance of

scholarly article

0 references

title

Audio classification in speech and music: A comparison between a statistical and a neural approach (English)

0 references

author

Alessandro Bugatti

0 references

Alessandra Flammini

0 references

Pierangelo Migliorati

0 references

published in

EURASIP Journal on Applied Signal Processing

0 references

publication date

14 October 2002

0 references

review text

Summary: We address the problem of classifying audio as speech or music for multimedia applications. In particular, we compare two techniques for speech/music discrimination. The first method is based on the zero crossing rate and Bayesian classification. It is computationally very simple and gives good results for pure music or pure speech. Simulation results show that performance degrades when a music segment also contains superimposed speech or strong rhythmic components. To overcome these problems, we propose a second method that uses more features and is based on neural networks (specifically, a multi-layer perceptron). This method performs better, at the expense of a limited increase in computational complexity. In practice, the proposed neural network is simple to implement if a suitable polynomial is used as the activation function, and a real-time implementation is possible even on low-cost embedded systems.

0 references
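
The first method named in the summary (zero crossing rate plus Bayesian classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the one-dimensional Gaussian class models, the equal priors, the synthetic signals, and the class names `tone_like` and `noise_like` are all assumptions made for illustration, and the toy signals stand in for real speech and music recordings.

```python
import math
import random


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def fit_gaussian(values):
    """Maximum-likelihood mean and variance of a scalar feature."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, 1e-9)  # variance floor keeps the log-density finite


def log_gaussian(x, mean, var):
    """Log of the Gaussian density N(x; mean, var)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def classify(x, models, priors):
    """Bayes decision rule: choose the class with the largest log posterior."""
    return max(
        models,
        key=lambda c: math.log(priors[c]) + log_gaussian(x, *models[c]),
    )


# Hypothetical toy signals (assumption): sinusoids cross zero rarely,
# white noise crosses zero on roughly half of the sample pairs.
def tone(freq, n=400):
    return [math.sin(2 * math.pi * freq * i / n) for i in range(n)]


def noise(n=400):
    return [random.gauss(0.0, 1.0) for _ in range(n)]


random.seed(0)
training = {
    "tone_like": [zero_crossing_rate(tone(f)) for f in range(3, 9)],
    "noise_like": [zero_crossing_rate(noise()) for _ in range(6)],
}
models = {label: fit_gaussian(zcrs) for label, zcrs in training.items()}
priors = {label: 0.5 for label in models}  # equal priors, an assumption

print(classify(zero_crossing_rate(tone(6)), models, priors))
print(classify(zero_crossing_rate(noise()), models, priors))
```

A single scalar feature keeps the classifier cheap, which matches the summary's remark about computational simplicity. The summary also notes that this kind of method degrades on mixed or strongly rhythmic material, which is the motivation it gives for the second, multi-feature neural approach.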

zbMATH Keywords

audio classification

0 references

speech/music discrimination

0 references

zero crossing rate

0 references

Bayesian classification

0 references

neural networks

0 references