"Voice Theft": Chances and risks of digital voice technology

Responsible: Volker Dellwo, Sascha Frühholz
Duration of the project: Dec. 2018 - June 2020

Project Description:

Contemporary voice processing technology offers novel chances and risks to digital infrastructures. In this project, we will address key issues concerning the digital processing of manipulated voices and their cognitive and neural-perceptual processing in humans. From a ‘chances’ perspective, we will study how digital voice manipulation can be used to enhance voice technology. From a ‘risks’ perspective, we will study the fraud potential of manipulated voices in humans and machines. The results of our research will be fundamental to understanding the chances and risks in human-machine voice interaction and to the creation of safe digital voice technology. State-of-the-art acoustic-phonetic voice manipulation algorithms will be used to manipulate and understand the acoustic cues to personality in voice. Using behavioral and functional Magnetic Resonance Imaging (fMRI) techniques, we will investigate the differences in human perception of natural and manipulated signals. Our research will make a significant contribution to understanding the true chances, risks and possible threats of digital voice manipulation for industrial and social digital infrastructures in which voice identity is at stake, and to understanding the trust that users have in such digital infrastructures. The project is critical to numerous industry sectors seeking to apply voice technology in the future for civil or forensic purposes. As such, the project is fully in line with the central issues of the Digital Lives grant call, and it has strong practical implications for ‘trust and ethics’ and ‘digital economy and working life’, two key areas in the National Research Program on Digitalization of the State Secretary for Education. The topic is also central to key strategic research decisions at UZH as part of the Communication Section in the Digital Society Initiative (www.dsi.uzh.ch), and it will play a central role in the interdisciplinary creation of a Center for Voice Analysis at UZH.

Publications (or conference presentations):


E. Pellegrino, T. Kathiresan, C. Roswandovitz, S. Frühholz, V. Dellwo. Can prosody be the key to spot fake voices? Acoustic and automatic speaker verification analyses on digital and natural voices. International Conference of the International Association for Forensic Phonetics and Acoustics, Istanbul (TR), 14-17 July 2019.


Keywords: digital voice technology, automatic voice recognition, voice synthesis, human voice processing, neural voice decoding

Funding source(s): Swiss National Science Foundation