The SUMS Corpus


The "SUMS" (Speech Under Multiple Stressors) corpus allows investigating the effects of multiple stressors on speech. The corpus addresses the question of how, and on what basis, the different stress types manifest themselves and combine in the acoustic speech signal. We used several stressors that form a taxonomy of stress factors according to Murray et al. (1996). Pink noise served as an external stressor. A further stress factor, cognitive load, was created by asking quiz questions. Physiological stressors were induced by training on an ergometer and by wearing a respirator mask (full-face mask). The speech signals were produced by native speakers of German while answering the quiz questions and further reference questions. The results of our acoustic analysis allow conclusions to be drawn on whether and how stress factors can be distinguished from each other, interfere with each other, and/or add up in the speech signal. Furthermore, we touch upon the issue of whether measurable stress can increase ad infinitum or whether there is an upper limit for the manifestation of stress in speech.

The corpus will be annotated on nine levels in total:

  • Level 1: Orthographic annotation on the sentence level
  • Level 2: ProsodyPro label on the sentence level
  • Level 3: Segmentation into intonation phrases (IPs)
  • Level 4: ProsodyPro label of the target IP
  • Level 5: Target word
  • Level 6: ProsodyPro label of the target word
  • Level 7: Target phone
  • Level 8: ProsodyPro label of the target phone
  • Level 9: Discontinuities


Some impressions (by Laura Tiedtke)


To find out whether the individual stressors behave differently, each was first applied on its own. Subsequently, several were applied together until all of the stressors mentioned interacted. To avoid a possible order effect, the recordings were made in two different condition orders (Group 1 and Group 2).

Group 1       Explanation                                               Group 2
Condition 1   Without noise, without physical exertion, without mask    Condition 8
Condition 2   With noise, without physical exertion, without mask       Condition 7
Condition 3   Without noise, with physical exertion, without mask       Condition 6
Condition 4   With noise, with physical exertion, without mask          Condition 5
Condition 5   With noise, with physical exertion, with mask             Condition 4
Condition 6   Without noise, with physical exertion, with mask          Condition 3
Condition 7   With noise, without physical exertion, with mask          Condition 2
Condition 8   Without noise, without physical exertion, with mask       Condition 1
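The design in the table can be read as all eight on/off combinations of three stressors (noise, physical exertion, mask), with Group 2 running through the same combinations in reverse order. The following Python sketch only illustrates that reading; the names Condition, GROUP_1, GROUP_2 and describe are hypothetical and not part of the corpus or its tools.

```python
from typing import NamedTuple


class Condition(NamedTuple):
    """One stressor combination (field names are illustrative assumptions)."""
    noise: bool      # pink noise present
    exertion: bool   # physical exertion on the ergometer
    mask: bool       # full-face respirator mask worn


# Row order of the table, i.e. conditions 1 to 8 of Group 1.
GROUP_1 = [
    Condition(noise=False, exertion=False, mask=False),
    Condition(noise=True,  exertion=False, mask=False),
    Condition(noise=False, exertion=True,  mask=False),
    Condition(noise=True,  exertion=True,  mask=False),
    Condition(noise=True,  exertion=True,  mask=True),
    Condition(noise=False, exertion=True,  mask=True),
    Condition(noise=True,  exertion=False, mask=True),
    Condition(noise=False, exertion=False, mask=True),
]

# Group 2 passes through the same combinations in reverse order.
GROUP_2 = GROUP_1[::-1]


def describe(c: Condition) -> str:
    """Render a condition in the wording used in the table."""
    return ", ".join([
        ("with" if c.noise else "without") + " noise",
        ("with" if c.exertion else "without") + " physical exertion",
        ("with" if c.mask else "without") + " mask",
    ])


for number, cond in enumerate(GROUP_2, start=1):
    print(f"Group 2, condition {number}: {describe(cond)}")
```

Listing the combinations explicitly, rather than generating them, keeps the order identical to the table, which matters because the order itself is the experimental variable that distinguishes the two groups.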


Key Features

The corpus comprises approximately 30 minutes of pure semi-spontaneous speech, recorded from the subjects during the quiz paradigm. The individual durations vary depending on how long the question was and how extensively the subject responded. In all, the corpus contains 6 speakers of Standard German, one of whom is a woman. Five of the six speakers are from northern Germany, while one comes from the south of Germany. However, this speaker has been living in northern Germany for many years. The age of the subjects originally ranged from 21 to 49 years, with an average age of 35.14 years. However, the oldest participant had to be removed from the corpus due to technical problems. Thus, the age of the remaining subjects ranges from 21 to 42 years, and the average age is 32.83 years.
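The two average ages agree with each other if one assumes that seven subjects were recorded originally, i.e. the six remaining speakers plus the excluded 49-year-old. That count is an inference from the figures above, not something stated explicitly. A small Python check under this assumption (the function name mean_after_removal is hypothetical):

```python
def mean_after_removal(mean_before: float, n_before: int, removed: float) -> float:
    """Mean of a group after one member with value `removed` is taken out."""
    total_before = mean_before * n_before
    return (total_before - removed) / (n_before - 1)


# Assumption: 7 subjects originally, average age 35.14, oldest subject aged 49.
new_mean = mean_after_removal(35.14, 7, 49)
print(round(new_mean, 2))
```

The result matches the reported average of 32.83 years up to rounding.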


State of the Corpus

You can find the current state of the corpus here.



The corpus can be used for non-commercial research purposes. Details can be found here.



Example 1. Example 2.


Creators of the Database

The database was created as a joint effort of Kiel University (CAU) and the University of Southern Denmark (SDU, Mads Clausen Institute). The researchers involved are:

  • Carina Marquard (CAU)
  • Oliver Niebuhr (SDU)
  • Gerhard Schmidt (CAU)


Corresponding Publications


C. Marquard, C. Baasch, M. Brodersen, O. Niebuhr, and G. Schmidt: Speech, Think, Act: A Phonetic Analysis of the Combinatorial Effects of Respiratory Mask, Physical and Cognitive Stress on Phonation and Articulation, Proc. DAGA, Kiel, Germany, 2017