Pitch correlogram clustering for fast speaker identification (Q2570287): Difference between revisions

Summary: Gaussian mixture models (GMMs) are commonly used in text-independent speaker identification systems. However, for large speaker databases, their high computational run-time limits their use in online or real-time speaker identification situations. Two-stage identification systems, in which the database is partitioned into clusters based on some proximity criteria and only a single-cluster GMM is run in every test, have been suggested in the literature to speed up the identification process. However, most clustering algorithms used have shown limited success, apparently because the clustering and GMM feature spaces used are derived from similar speech characteristics. This paper presents a new clustering approach based on the concept of a pitch correlogram that captures frame-to-frame pitch variations of a speaker rather than short-time spectral characteristics such as cepstral coefficients, spectral slopes, and so forth. The effectiveness of this two-stage identification process is demonstrated on the IVIE corpus of 110 speakers. The overall system achieves a run-time advantage of 500% as well as a 10% reduction of error in overall speaker identification.
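
The two-stage search described in the summary can be illustrated with a minimal Python sketch. It is not the paper's implementation: the one-dimensional clustering feature is only a stand-in for a pitch-correlogram descriptor (whose construction the summary does not give), each speaker is modelled by a single diagonal Gaussian rather than a full GMM, and every name and number below (identify, centroids, clusters, models, the toy speakers) is hypothetical.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def log_likelihood(frames, model):
    """Total log-likelihood of the frames under one diagonal Gaussian.

    Assumption: a single Gaussian stands in for a speaker's GMM.
    """
    mean, var = model
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total

def identify(cluster_feature, frames, centroids, clusters, models):
    """Two-stage identification: choose a cluster, then score only its speakers."""
    # Stage 1: nearest cluster centroid in the clustering feature space.
    best_cluster = min(centroids, key=lambda c: sq_dist(cluster_feature, centroids[c]))
    candidates = clusters[best_cluster]
    # Stage 2: evaluate speaker models for that cluster only.
    best_speaker = max(candidates, key=lambda s: log_likelihood(frames, models[s]))
    return best_cluster, best_speaker, len(candidates)

# Hypothetical toy database: two clusters of two speakers each.
centroids = {"low": [100.0], "high": [200.0]}
clusters = {"low": ["alice", "bob"], "high": ["carol", "dave"]}
models = {
    "alice": ([0.0, 0.0], [1.0, 1.0]),
    "bob": ([3.0, 3.0], [1.0, 1.0]),
    "carol": ([0.0, 3.0], [1.0, 1.0]),
    "dave": ([3.0, 0.0], [1.0, 1.0]),
}

result = identify([110.0], [[2.9, 3.1], [3.2, 2.8]], centroids, clusters, models)
print(result)
```

In this toy setup only two of the four speaker models are scored in the second stage, which is the source of the speed-up. A wrong cluster choice in the first stage cannot be corrected in the second, which is why the choice of clustering feature matters.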

zbMATH Keywords

speaker identification

clustering

pitch

correlogram

Identifiers

zbMATH Open document ID

1107.68461

DOI

10.1155/S1110865704408026

Mathematics Subject Classification ID

68T10

zbMATH DE Number

2220178

Sitelinks

Mathematics (1 entry)

mardi Publication:2570287

Revision as of 02:28, 6 August 2023, by Importer (bot): Created a new Item
Revision as of 06:55, 3 February 2024, by Import240129110113 (bot): Added link to MaRDI item.
links / mardi / name: (empty) → Publication:2570287