Acoustic MIMO signal processing (Q2509122)

The book is devoted to the problems of speech acquisition in complex acoustic environments. It is tutorial in nature, with a detailed exposition of the latest achievements and the state of the art in the field. Open questions and further research directions are also explored. The main body of the book is organized into two parts. In the first part (Chap. 2 to Chap. 7), the theory of acoustic Multiple Input Multiple Output (MIMO) signal processing is developed. The second part (Chap. 8 to Chap. 11) focuses on various applications. The content of the book is organized as follows: Chapter 1 provides a brief introduction to acoustic MIMO signal processing. Chapter 2 gives a complete overview of acoustic systems. It discusses how to model an acoustic system according to the number of its inputs and outputs, and introduces the acoustic MIMO models in both the time and frequency domains. An attempt is made to characterize acoustic channels, and both single-channel and multichannel properties are discussed. Finally, facilities for directly measuring acoustic impulse responses and the image method for their simulation are explored. Chapter 3 derives the Wiener filter in the context of identification of acoustic MIMO systems when a reference signal is available. Some fundamental differences between the Single Input Single Output (SISO) and Multiple Input Single Output (MISO) cases are discussed. Also in this chapter, some basic adaptive algorithms such as the deterministic algorithm, Least Mean Square (LMS), Normalized LMS (NLMS), and sign algorithms are derived. Acoustic impulse responses are usually sparse. However, all classical algorithms, such as the NLMS and Recursive Least Squares (RLS) adaptive filters, do not take this information into account. In Chap. 4, it is shown how this feature can be utilized to obtain adaptive algorithms with much better initial convergence and tracking abilities than the classical ones.
Important sparse adaptive algorithms, such as Proportionate Normalized Least Mean Square (PNLMS), Improved PNLMS (IPNLMS), and exponentiated gradient algorithms, are explained. It is also shown how all these algorithms are related to each other. When it comes to implementation, it is essential to have adaptive filters that are very efficient in terms of arithmetic complexity. Because frequency-domain algorithms rely essentially on the Fast Fourier Transform (FFT) to compute the convolution and to update the coefficients of the filter, they are excellent candidates. Chapter 5 explains how frequency-domain algorithms can be derived rigorously from a recursive least-squares criterion, with a block size independent of the length of the adaptive filter. All classical algorithms (in single-channel and multichannel cases) can be obtained from this approach. In most cases, the speech inputs to an acoustic MIMO system are unknown, and only the observations at the outputs can be used to blindly identify the system. Chapter 6 presents how blind identification of Single Input Multiple Output (SIMO) and MIMO systems can be performed, with emphasis on methods based on second-order statistics. For SIMO systems, a rich set of adaptive algorithms (Multichannel LMS (MCLMS), Multichannel Newton (MCN), Variable Step Size Unconstrained Multichannel LMS (VSS-UMCLMS), Frequency-domain Normalized Multichannel LMS (FNMCLMS), etc.) is developed. For MIMO systems, different scenarios are analyzed and it is shown when the problem can be solved and when it cannot. One of the challenges in acoustic MIMO signal processing problems lies in the fact that both co-channel and temporal interference exist in the outputs. Chapter 7 discusses the key issue of separating and suppressing co-channel and temporal interference. The conditions of separability for co-channel and temporal interference are presented.
The three approaches to suppressing temporal interference, namely the direct inverse, Minimum Mean Square Error (MMSE), and Multichannel Inverse Theorem (MINT) methods, are illustrated. Acoustic signal processing for speech and audio made its debut with acoustic echo cancellation (AEC). It is well known that in hands-free telephony, the acoustic coupling between the loudspeaker and the microphone generates echoes that can be extremely harmful. AEC is therefore required. Chapter 8 gives an overview of this important area of research. The multichannel aspect is emphasized, since teleconferencing systems of the future will, without a doubt, have multiple loudspeakers and microphones. This chapter also discusses the concept of stereo audio bridging, which the authors believe will be a big part of the picture of next-generation voice over IP systems. Chapter 9 consists of two parts, one on time delay estimation and the other on acoustic source localization. The former deals with the measurement of the time difference of arrival (TDOA) between signals received by spatially separated microphones. It addresses a wide variety of techniques, ranging from the simple cross-correlation method to advanced algorithms based on blind system identification, with significant attention paid to combating reverberation effects. The latter discusses how to employ the TDOA measurements to locate, in the acoustic wavefield, radiating sources that emit signals to microphones. It formulates the problem from an estimation-theory point of view and presents various location estimators, some of which have the potential to achieve the Cramér-Rao lower bound. Chapter 10 is devoted to the speech-enhancement/noise-reduction problem, which aims at estimating the speech of interest from its observations corrupted by additive noise. In an acoustic MIMO system, speech enhancement can be achieved by processing the waveform received by a single microphone, but often it is advantageous to use multiple microphones.
This chapter covers not only the well-recognized single-channel techniques such as Wiener filtering and spectral subtraction, but also multichannel techniques such as adaptive noise cancellation and spatio-temporal filtering approaches. Finally, in Chap. 11, source separation and speech dereverberation are discussed. These two difficult problems are studied together in one chapter because they are closely associated with each other in acoustic MIMO systems. Based on a review of the cocktail party effect, some implications for the development of more effective source separation algorithms are derived. Beamforming and independent component analysis methods are compared for the source separation problem. A synergistic solution to source separation and speech dereverberation is presented, based on blind identification of acoustic MIMO systems. The book will be useful for a large audience, ranging from researchers and PhD students to practicing engineers, whether just entering the field or already having years of experience.
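
As a small illustration of the "simple cross-correlation method" for time delay estimation mentioned in connection with Chapter 9, the following is a minimal sketch and is not taken from the book. It assumes an idealized free-field setting with no reverberation and no noise, in which the second microphone receives an attenuated, delayed copy of the first. The function name `estimate_tdoa`, the signal length, the attenuation factor 0.8, and the 7-sample delay are all hypothetical choices made for this example.

```python
import random


def estimate_tdoa(x1, x2, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation
    r(lag) = sum_n x1[n] * x2[n + lag], searched over [-max_lag, max_lag].
    A positive result means x2 is a delayed version of x1."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(x1)):
            j = i + lag
            if 0 <= j < len(x2):  # skip samples that fall outside x2
                score += x1[i] * x2[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical test signal: white Gaussian noise standing in for a source.
random.seed(0)
source = [random.gauss(0.0, 1.0) for _ in range(400)]

true_delay = 7  # assumed delay between the microphones, in samples
mic1 = source
# Second microphone: attenuated copy of the source, delayed by true_delay.
mic2 = [0.0] * true_delay + [0.8 * s for s in source[:-true_delay]]

estimated_delay = estimate_tdoa(mic1, mic2, max_lag=20)
print("estimated delay in samples:", estimated_delay)
```

In this idealized setup the correlation peak coincides with the true delay. In a reverberant room the peak becomes ambiguous, which is the situation the review says the more advanced, blind-system-identification-based algorithms are meant to address.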

zbMATH Keywords

signal processing

signal filtering

signal detection

signal theory

reviewed by

Tzvetan Semerdjiev