<p dir="ltr"><b><u>Danish Sentence Test (DAST)</u></b></p><p><br></p><p dir="ltr">Abigail Anne Kressner [1,2,*], Kirsten Maria Jensen Rico [1,2], Johannes Kizach [1], Brian Kai Loong Man [1,3], Anja Kofoed Pedersen [4], Lars Bramsløw [3], Lise Bruun Hansen [3], Laura Winther Balling [4], Brent Kirkwood [5], & Tobias May [1]</p><p><br></p><p dir="ltr">1 Technical University of Denmark, Kgs. Lyngby, Denmark<br>2 Rigshospitalet, University Hospital of Copenhagen, Denmark<br>3 Demant, Smørum, Denmark<br>4 WS Audiology, Lynge, Denmark<br>5 GN Hearing, Ballerup, Denmark<br>* Corresponding author: aakress@dtu.dk, abigail.anne.kressner@regionh.dk</p><p><br></p><p dir="ltr">This corpus consists of audio and audio-visual recordings of 1200 linguistically balanced sentences, each spoken by two female and two male talkers. The sentences were constructed using a novel, template-based method that facilitated control over both word frequency and sentence structure. The sentences were evaluated linguistically in terms of phonemic distributions, naturalness, and connotation, and were thereafter recorded, post-processed, and rated on their audio, visual, and pronunciation qualities. The sentences were then psychometrically evaluated for one of the talkers, using both the audio-only and audiovisual versions of each sentence. Green screen versions of the visual material can be made available upon request. Additional details are contained in the associated publications.</p><p><br></p><p><br></p><p dir="ltr"><b><u>Publications</u></b></p><p><br></p><p dir="ltr">Kressner, A. A., Jensen-Rico, K. M., Pedersen, A. K., Bramsløw, L., & Kirkwood, B. (in preparation). An adaptive Danish Sentence Test (DAST) for measuring speech reception thresholds.</p><p><br></p><p dir="ltr">Man, B. K. L., Andersen, T., & Kressner, A. A. (in review). Measuring the audiovisual benefit in linguistically and psychoacoustically balanced sentences using the Audiovisual Danish Sentence Test. 
International Journal of Audiology.</p><p><br></p><p dir="ltr">Kressner, A. A., Jensen-Rico, K. M., Kofoed Pedersen, A., Bramsløw, L., & Kirkwood, B. (2025). Psychoacoustic characterisation of linguistically balanced, Danish sentences for speech-in-noise experiments. International Journal of Audiology, 1-9. doi: https://doi.org/10.1080/14992027.2025.2470378</p><p><br></p><p dir="ltr">Kressner, A. A., Rico, K. M. J., Kizach, J., Man, B. K. L., Pedersen, A. K., Bramsløw, L., Hansen, L. B., Balling, L. W., Kirkwood, B., & May, T. (2024). A corpus of audio-visual recordings of linguistically balanced, Danish sentences for speech-in-noise experiments. Speech Communication. doi: https://doi.org/10.1016/j.specom.2024.103141</p><p><br></p><p><br></p><p dir="ltr"><b><u>License</u></b></p><p dir="ltr">This work is licensed under Attribution-NonCommercial-NoDerivatives 4.0 International. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/</p><p><br></p><p><br></p><p dir="ltr"><b><u>Contents</u></b></p><p dir="ltr">The following files are located in the top-level directory of this dataset:</p><p dir="ltr">• samples.zip - zip file containing the following subfolders, each of which contains one mp4 sample file with embedded audio: F1, F2, M1, and M2. 
Download this file to preview the stimuli and speakers before downloading the entire database.</p><p dir="ltr">• metadata.zip - zip file containing csv files with information about each sentence and list in the corpus.</p><p dir="ltr">o Corpus_DAST.csv - includes the following information about each sentence: the words contained within it; the specific template employed; the keywords in the sentence; the results of the screening of the audio (A), visual (V), and pronunciation (P) qualities for recordings from Female 1 (F1), Female 2 (F2), Male 1 (M1), and Male 2 (M2) (i.e., either 0 for not sufficient for speech-in-noise testing or 1 for sufficient); the mean naturalness score from the online survey; the number of survey participants who indicated that the sentence evoked discomfort; the number of survey responses; and the psychometric function properties for the acoustic (A) and audiovisual (AV) versions of each sentence. See publications for more details.</p><p dir="ltr">o List_F1A.csv - includes information about a set of balanced lists, the sentences they consist of, and the RMS adjustments that should be applied to each sentence in order to normalise the intelligibility of the sentences. The manuscript describing the creation of these lists and the listener study that validates them is in preparation and can be made available upon request.</p><p dir="ltr">o List_F1AV.csv - includes information about a set of balanced lists, the sentences they consist of, and the RMS adjustments that should be applied to each sentence in order to normalise the intelligibility of the AV sentences. See Man et al. (in review) for more details.</p><p dir="ltr">o List_F1AV_offsetsA.csv - includes information about the same lists as in List_F1AV.csv, but with offsets that are computed based on the psychometric function estimates of the audio-only sentences. See Man et al. (in review) for more details.</p><p dir="ltr">• speech.zip - zip file containing the speech material. 
It contains the following subfolders, each of which contains 1200 wav files (i.e., one wav file for each sentence): F1, F2, M1, and M2.</p><p dir="ltr">• noise.zip - zip file containing the noise material. The noise folder contains a wav file with speech-shaped noise (SSN) for each talker (i.e., noise that has been shaped to have the same long-term spectrum as the associated talker). There is also one wav file with unmodulated noise.</p><p dir="ltr">• video.zip - zip file containing the video material. It contains the following subfolders, each of which contains 1200 mp4 files (i.e., one mp4 file for each sentence): F1, F2, M1, and M2. Note that the audio is *not* embedded in these files and is instead included separately in wav files, in order to avoid distortions introduced by lossy audio codecs.</p><p><br></p><p dir="ltr"><b><u>Versions</u></b></p><p dir="ltr">• Version 4 (2025-09-08)</p><p dir="ltr">o Updated readme file to describe tracked changes</p><p dir="ltr">o Revised title from "Danish Sentence Test (DAST) Sentences" to "Danish Sentence Test (DAST)"</p><p dir="ltr">o Revised file organization to coincide with updates to the software</p><p dir="ltr">o Updated publication list</p><p dir="ltr">o Updated csv to include psychometric function estimates</p><p dir="ltr">• Version 3 (2025-05-06)</p><p dir="ltr">o Added publication list</p><p dir="ltr">o Updated the audio files to be the post-processed versions, as described in the above paper</p><p dir="ltr">• Version 2 (2024-05-06)</p><p dir="ltr">o Added DAST_sentences_samples.zip</p><p dir="ltr">• Version 1 (2023-10-09)</p><p dir="ltr">o First date online with preliminary files and the associated article in review</p><p><br></p>
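<p><br></p><p dir="ltr"><b><u>Usage sketch</u></b></p><p dir="ltr">The list files in metadata.zip give an RMS adjustment for each sentence, and noise.zip provides speech-shaped noise for each talker. The following minimal Python sketch illustrates one way such an adjustment might be applied before mixing a sentence with noise at a chosen signal-to-noise ratio. It is not the authors' software. It assumes that the adjustment is a level offset expressed in dB and that a positive value raises the level; check the publications for the actual unit and sign convention. The function names, the offset value, and the synthetic signals are hypothetical placeholders for samples read from the wav files.</p>

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def apply_offset_db(samples, offset_db):
    """Scale samples by a level offset in dB (assumed unit and sign)."""
    gain = 10.0 ** (offset_db / 20.0)
    return [s * gain for s in samples]


def scale_noise_to_snr(speech, noise, snr_db):
    """Scale noise so that the speech-to-noise RMS ratio equals snr_db."""
    target_noise_rms = rms(speech) / (10.0 ** (snr_db / 20.0))
    gain = target_noise_rms / rms(noise)
    return [n * gain for n in noise]


# Synthetic stand-ins for one sentence and one noise excerpt
# (real use would read the samples from the wav files instead).
speech = [0.1 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
noise = [0.05 * math.sin(2 * math.pi * 97 * n / 16000 + 1.0) for n in range(16000)]

# Hypothetical per-sentence offset, as might be listed in a list file.
adjusted = apply_offset_db(speech, -1.5)

# Scale the noise relative to the adjusted sentence, then sum the two.
scaled_noise = scale_noise_to_snr(adjusted, noise, snr_db=-3.0)
mixture = [s + n for s, n in zip(adjusted, scaled_noise)]
```

<p dir="ltr">In this sketch the offset is applied to the sentence before the noise is scaled, so the signal-to-noise ratio is defined relative to the adjusted sentence level. Whether that ordering matches the intended use of the lists is an assumption and should be verified against the associated publications.</p>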