PhD Theses
No. 17 - Arthur Wolf
Arthur Wolf: Entwurf und Evaluierung von Algorithmen für ein Innenraumkommunikationssystem
Shaker-Verlag, 2023
Commission
- Prof. Dr.-Ing. Gerhard Schmidt (first reviewer)
- Prof. Dr.-Ing. Tim Fingscheidt (second reviewer)
- Prof. Dr.-Ing. Nilesh Madhu (third reviewer)
- Prof. Dr.-Ing. Michael Höft (examiner)
- Prof. Dr.-Ing. Jan Trieschmann (head of the examination board)
Abstract
Communication inside a car is often difficult because of the unfavorable seating positions and the high background noise level. An in-car communication (ICC) system can improve communication between passengers while driving. For this purpose, the voice is recorded with a microphone placed close to the speaker’s mouth and, after appropriate signal processing, played back via loudspeakers positioned close to the listener. The ICC system operates in a closed electro-acoustic loop, so the maximum system amplification is limited by the feedback of the playback signal. Because the system gain required for sufficient support is often in the range of the maximum gain, additional measures must be taken to suppress feedback. In addition to the speech and feedback signals, the microphones also pick up driving noise within the vehicle. If the noisy microphone signal is merely amplified, the recorded interference degrades the speech quality. These signal components should be reduced by suitable noise suppression to improve the ICC output signal.
For the passengers, the perceived system quality depends not only on the amplification level but also on the delay that the ICC system introduces between the direct sound and the system reproduction. This delay should be as low as possible and should typically not exceed 15 ms. A long delay disturbs both the acoustic localization of the speaker and the speech intelligibility of the playback. Due to psychoacoustic effects, the permissible delay depends heavily on the system gain and on the arrangement of the loudspeakers in relation to the passengers.
This work presents the signal processing for an ICC system that works robustly in a vehicle under real conditions. The focus is on runtime-optimized and computationally efficient algorithms for noise and feedback suppression. The delay at the listener’s ear is reduced to only 10 ms. With the feedback suppression measures described here, the ICC system can also be operated around maximum system gain without instabilities. Implementing the algorithms as real-time digital audio processing and building a real-time demonstrator vehicle made it possible to evaluate the improvement in speech intelligibility achieved in the vehicle environment. In addition to the improvement in speech transmission recorded by measurements, the speech intelligibility and speech quality were confirmed in experiments with test subjects. The results of this evaluation show that, with an active ICC system and the signal processing and measures presented here, the passengers in the rear seats (worst listening position) can hear and understand the driver just as well as the front passenger (best listening position), even at high speeds.
No. 16 - Robbin Romijnders
Robbin Romijnders: Inertial Measurement Unit-Based Gait Event Detection in Healthy and Neurological Cohorts
PDF-based submission (available freely via the MACAU system), 2023
Commission
- Prof. Dr.-Ing. Gerhard Schmidt (first reviewer)
- Prof. Dr. med. Walter Maetzler (second reviewer)
- Prof. Dr.-Ing. Jan Trieschmann (examiner)
- Prof. Dr. Martina Gerken (head of the examination board)
Abstract
Walking impairments are common in elderly people and their prevalence increases with age. Walking impairments have devastating consequences and are associated with a loss of mobility, increased institutionalization, increased fall risk and decreased quality of life. Numerous disorders of both the central and peripheral nervous system can cause an impaired walking pattern. The objective quantification of walking is therefore of high clinical interest for clinicians, researchers and neurological patients.
Walking is made up of repetitive gait cycles that can be divided into a stance phase, during which the foot is in contact with the ground, and a swing phase, during which the same foot is swinging forward. These phases are demarcated by gait events that are referred to as initial and final contact. The robust and accurate detection of these gait events is critical for any clinical gait analysis. Recent advances in wearable inertial sensor technology potentially allow clinical gait analysis to shift to long-term continuous monitoring in the habitual environment. However, to date, the algorithms that extract gait events from inertial measurement unit (IMU) data have limited ecological validity, as they have been validated mainly in clinical research settings with straight-line walking trials.
In this thesis, a deep learning (DL)-based network is developed to determine gait events from the IMU data of a shank- or foot-worn device. The DL network takes the raw IMU data as input and predicts, for each time step, the probability that it corresponds to an initial or final contact. The algorithm is validated for walking at different self-selected speeds, across multiple neurological diseases, and in both clinical research settings and the habitual environment. The algorithm shows a high detection rate for initial and final contacts, and a small time error when compared to reference events obtained with an optical motion capture system or pressure insoles.
Based on the excellent performance, it is concluded that the DL algorithm is well suited for continuous long-term monitoring of gait in the habitual environment.
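The abstract above describes a network that outputs, for each time step, the probability of a gait event. One common way to turn such a probability trace into discrete event indices is thresholded peak picking with a minimum spacing between events. The following is a minimal sketch under that assumption; the function name `pick_events`, the threshold, and the spacing are hypothetical and are not taken from the thesis, which does not state how events are extracted from the probabilities.

```python
def pick_events(probabilities, threshold=0.5, min_distance=3):
    """Return sorted indices of local probability maxima.

    Hypothetical post-processing step (not from the thesis): a sample is
    a candidate if it reaches `threshold` and is a local maximum; of two
    candidates closer than `min_distance` samples, the stronger one wins.
    """
    n = len(probabilities)
    candidates = [
        i for i in range(n)
        if probabilities[i] >= threshold
        and (i == 0 or probabilities[i] > probabilities[i - 1])
        and (i == n - 1 or probabilities[i] >= probabilities[i + 1])
    ]
    events = []
    # Visit the strongest candidates first so weaker neighbours are suppressed.
    for i in sorted(candidates, key=lambda k: probabilities[k], reverse=True):
        if all(abs(i - j) >= min_distance for j in events):
            events.append(i)
    return sorted(events)


# Toy probability trace for one event type (for example, initial contact).
example_trace = [0.0, 0.1, 0.8, 0.3, 0.6, 0.1, 0.0, 0.2, 0.9, 0.4]
example_events = pick_events(example_trace)
```

Dividing the returned indices by the sampling rate of the IMU would give event times in seconds, which could then be compared against reference events.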
No. 15 - Michael Brodersen
Michael Brodersen: Signalverarbeitung für Kommunikationssysteme von Atemschutzvollmasken
PDF-based submission (available freely via the MACAU system), 2021
Commission
- Prof. Dr.-Ing. Gerhard Schmidt (first reviewer)
- Prof. Dr.-Ing. Peter Jax (second reviewer)
- Prof. Dr.-Ing. Michael Höft (examiner)
- Prof. Dr.-Ing. Ludger Klinkenbusch (head of the examination board)
Abstract
Full-face masks are essential for fire fighters to ensure respiratory protection in smoke diving incidents. While such masks are absolutely necessary for protection, they drastically impair the voice communication of fire fighters. For this reason, mask-integrated communication systems can be used to amplify the speech and thereby improve the intelligibility and quality of communication. The communication system picks up the speech signal with a microphone in the mask and enhances it with a digital signal processor. It then plays back the amplified signal via loudspeakers located on the outside of the mask, transmits the signal via a local wireless network to other communication systems, and routes the signal to an attached tactical radio. The enhancement via microphone and loudspeakers is only possible to a limited extent, due to the disturbing breathing and ambient noise and the resulting feedback coupling from the loudspeaker to the microphone.
To increase the speech intelligibility and solve the problems described above, this work examines different algorithms, based on digital signal processing, for improving communication with masks. Since breathing noise is picked up by the microphone, it is detected and suppressed by a voice activity detector. This algorithm ensures that only speech components are played back. In addition, the ambient noise is estimated and suppressed. Because the microphone is located close to the loudspeaker, feedback occurs; it is reduced by feedback cancelation. To enhance the functionality of the canceler, a decorrelation stage is applied to the signal. After the microphone enhancement, the signals are mixed to the dedicated output signals. Post-processing is possible for each output signal and includes an exciter, an equalizer, a dynamic range control, and a hard limiter. The exciter uses non-linear characteristics to regenerate signal components that were lost due to attenuation. Equalization filters are applied both to improve the stability of the system and to enhance the perceived quality of the output signals.
All described processing steps are implemented on a 16-bit fixed-point digital signal processor and optimized for efficiency. Finally, possible evaluation scenarios for mask communication systems are presented.
No. 14 - Simon Graf
Simon Graf: Design of Scenario-specific Features for Voice Activity Detection and Evaluation for Different Speech Enhancement Applications
To appear soon, 2023
Commission
- Prof. Dr.-Ing. Gerhard Schmidt (first reviewer)
- Prof. Dr.-Ing. Tim Fingscheidt (second reviewer)
- Prof. Dr.-Ing. Michael Höft (examiner)
- Prof. Dr.-Ing. Hermann Kohlstedt (head of the examination board)
Abstract
Many technical applications nowadays make use of human speech. In situations where controlling a device by hand is not possible or is inconvenient, voice can be employed instead. Important use cases can be found in automotive environments, where distractions of the driver have to be reduced. Speech-enabled applications allow for dictating messages, controlling devices by voice, or making phone calls from a moving car via hands-free telephony. Even the communication between passengers inside the car can be facilitated using modern speech applications. In-car communication (ICC) systems amplify the passengers’ speech and allow for convenient conversations at high travel speeds. Outside the car, too, mobile speech applications, such as those on smartphones, are becoming more and more ubiquitous.
The desired speech signal that is recorded by microphones is inevitably superposed by background noise. In automotive scenarios, primarily stationary noise components are observable. In contrast, smartphones can be employed at almost every location resulting in a much higher variability of noise scenarios. Distinguishing the desired speech from background noise is an essential prerequisite for many algorithms that are incorporated in speech applications. When speech is present in the signal, capturing and preserving these desired components is targeted. Contrariwise, the suppression of noise usually requires information on the background noise that can be gathered primarily during speech pauses.
Voice activity detection (VAD) aims at identifying the presence of speech in a noisy signal. For this purpose, features are extracted: the signal is processed in such a way that certain distinctive properties of human speech are emphasized. Various features focusing on different speech properties have been introduced with the goal of telling speech and noise apart. A detector finally decides whether speech is present in the signal. Beyond this, automatic speech recognition systems, which usually incorporate a VAD, may identify what is said.
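The feature-plus-detector structure described above can be illustrated with the simplest textbook feature, short-term frame energy, compared against a crude noise-floor estimate. This is only a sketch of the general idea, not one of the features developed in the thesis; the function names, the frame length of 160 samples, and the 6 dB margin are assumptions chosen for illustration.

```python
import math


def frame_energies_db(signal, frame_len):
    """Feature extraction: short-term energy in dB per non-overlapping frame."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small offset avoids log10(0) for all-zero frames.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def detect_speech(signal, frame_len=160, margin_db=6.0):
    """Detector: flag frames whose energy exceeds the noise floor by a margin.

    Assumption for this sketch: the quietest frame approximates the
    background noise level, which only holds for stationary noise.
    """
    energies = frame_energies_db(signal, frame_len)
    if not energies:
        return []
    noise_floor_db = min(energies)
    return [e > noise_floor_db + margin_db for e in energies]


# Synthetic test signal: quiet "noise", a loud "speech" burst, quiet again.
quiet = [0.01, -0.01] * 160
loud = [0.5, -0.5] * 160
example_decisions = detect_speech(quiet + loud + quiet)
```

A single energy feature like this fails as soon as the noise is non-stationary or louder than the speech, which is one reason to combine several features that capture different speech properties.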
In this thesis, many features for VAD are summarized and classified with respect to properties of human speech that are exploited. New features are introduced considering speech properties that are typically not taken into account. Since different features represent different aspects of human speech, a combination of multiple features is desirable. By considering advantages and drawbacks of each feature, the final detection result can be improved. Adequate feature combinations may increase the robustness against interferences.
In the literature, the results of VAD algorithms are typically evaluated without considering a specific application. Different aspects of the detection are evaluated; however, they are not related to the final application’s performance. The evaluations in this thesis are therefore dedicated to the requirements of the target application. Some important applications are analyzed with respect to their dependency on VAD results. The importance of accurate VAD results is exemplified for algorithms in an ICC system and for the suppression of babble noise. These applications cover important use cases of VAD with particularly challenging yet contrary conditions. Tried and tested for these rather extreme cases, the approaches discussed in this thesis are also well suited for other applications with less strict constraints.
No. 13 - Eric Elzenheimer
Eric Elzenheimer: Analyse stimulationsevozierter Muskel- und Nervensignale mithilfe elektrischer und magnetischer Sensorik
Shaker-Verlag, 2022
Commission
- Prof. Dr.-Ing. Gerhard Schmidt (first reviewer)
- Priv.-Doz. Dr. med. Helmut Laufs (second reviewer)
- Univ.-Prof. Dr.-Ing. Daniel Baumgarten (third reviewer)
- Prof. Dr.-Ing. Michael Höft (examiner)
- Prof. Dr.-Ing. Eckhard Quandt (head of the examination board)
Abstract
Polyneuropathies (PNPs) are neurological diseases with a prevalence of 5.5 % in people over 50 years old. Such systemic diseases of the peripheral nerves can be categorized into inherited metabolic, acquired metabolic, immune-mediated, and toxic forms. Medical doctors must be able to differentiate among these forms to determine which type of therapy is needed. The burden on patients and the costs to health care providers may vary considerably, depending on the therapy administered. Motor nerve conduction studies assess compound muscle action potentials (CMAPs) by using neurography, from which neurophysiological variables are derived. These are used in addition to clinical evaluations to distinguish between the different etiologies. Despite the existing applications of neurography, current analytical strategies for PNP differentiation are inadequate for differential diagnostics, and improvements are needed. To overcome these problems, this thesis presents digital signal processing methods and approaches that can support medical doctors in making clinical decisions. The focus of these efforts was to quantitatively describe pathological CMAP signal differences without additional effort so that diagnoses can be made in a timely manner. In this context, a system-theoretical signal model was also developed to describe various physiological and pathological processes in human nerves. This model enables realistic insights into the pathophysiology of polyneuropathies.
In principle, electrode-based neurography can be complemented by magnetic detection. The use of novel magnetic field sensors would require a more precise inspection in the field of neurophysiology. These sensors facilitate contactless data acquisition, which is an advantage over conventional methods that require electrodes. However, the pilot measurements of nerves and muscles presented in this study revealed some limitations, specifically for non-cryogenic magnetic field sensors. The observed disadvantages mainly resulted from the measurement bandwidth the sensors were able to support and from the available detection limit. Consequently, these magnetic field sensors may be more suitable for other medical applications. Cardiology is particularly noteworthy here, since the signal with the highest field amplitude originates from the human heart. In a dedicated field study, the magnetic equivalent of a human R-wave was successfully detected within one minute for the first time by using a magnetoelectric (ME) sensor. This affirms the hypothesis that ME sensors are valuable in magnetic diagnostics, promoting further development of this particular sensor type. Finally, sensor-specific advancements combined with digital readout techniques could advance magnetic detection in neurophysiology.
In this collaborative engineering and neuroscience work, the research methods utilized provide an in-depth assessment of nerves and may therefore be valuable for performing diagnostic tests in the long term. The experiments and results presented in this research represent the foundation of technical concepts and analytical procedures necessary for a semiautomated disease classification system in clinical practice. An interdisciplinary team of researchers and an international manufacturer of neurography equipment have already joined forces to make such a system a reality in the form of a diagnostic tool.