Practical speech emotion recognition based on online learning: from acted data to elicited data (Q460386): Difference between revisions

Summary: We study cross-database speech emotion recognition based on online learning. Applying a classifier trained on acted data to naturalistic data, such as elicited data, remains a major challenge for today's speech emotion recognition systems. We introduce three different data sources: first, a basic speech emotion dataset collected from acted speech by professional actors and actresses; second, a speaker-independent dataset that contains a large number of speakers; third, an elicited speech dataset collected during a cognitive task. Acoustic features are extracted from emotional utterances and evaluated using the maximal information coefficient (MIC). A baseline valence and arousal classifier is designed based on Gaussian mixture models. The online training module is implemented using AdaBoost. While the offline recognizer is trained on the acted data, the online testing data include the speaker-independent data and the elicited data. Experimental results show that, by introducing the online learning module, our speech emotion recognition system adapts better to new data, which is an important characteristic in real-world applications.

Identifiers

zbMATH Open document ID

1296.68157

DOI

10.1155/2013/265819

Mathematics Subject Classification ID

68T10

68T05

68T50

@@ Property / review text @@
+Summary: We study cross-database speech emotion recognition based on online learning. Applying a classifier trained on acted data to naturalistic data, such as elicited data, remains a major challenge for today's speech emotion recognition systems. We introduce three different data sources: first, a basic speech emotion dataset collected from acted speech by professional actors and actresses; second, a speaker-independent dataset that contains a large number of speakers; third, an elicited speech dataset collected during a cognitive task. Acoustic features are extracted from emotional utterances and evaluated using the maximal information coefficient (MIC). A baseline valence and arousal classifier is designed based on Gaussian mixture models. The online training module is implemented using AdaBoost. While the offline recognizer is trained on the acted data, the online testing data include the speaker-independent data and the elicited data. Experimental results show that, by introducing the online learning module, our speech emotion recognition system adapts better to new data, which is an important characteristic in real-world applications.
+Normal rank
@@ Property / Mathematics Subject Classification ID @@
+68T10
@@ Property / Mathematics Subject Classification ID: 68T10 / rank @@
+Normal rank
@@ Property / Mathematics Subject Classification ID @@
+68T05
@@ Property / Mathematics Subject Classification ID: 68T05 / rank @@
+Normal rank
@@ Property / Mathematics Subject Classification ID @@
+68T50
@@ Property / Mathematics Subject Classification ID: 68T50 / rank @@
+Normal rank
@@ Property / zbMATH DE Number @@
+6354585
@@ Property / zbMATH DE Number: 6354585 / rank @@
+Normal rank

Revision as of 12:39, 30 June 2023
