20th International Conference on Speech and Computer


Tanja Schultz

Cognitive Systems Lab, University of Bremen, and Carnegie Mellon University, Pittsburgh

Biosignal-based Spoken Communication

Abstract: Speech is a complex process emitting a wide range of biosignals, including, but not limited to, acoustics. These biosignals – stemming from the articulators, the articulator muscle activities, the neural pathways, and the brain itself – can be used to circumvent limitations of conventional speech processing in particular, and to gain insights into the process of speech production in general.

In my talk, I will present ongoing research at the Cognitive Systems Lab (CSL), where we apply machine learning methods to process and interpret a variety of speech-related activities, such as muscle and brain activities. Our goals are to create biosignal-based speech processing devices for communication applications in everyday situations and to gain a deeper understanding of spoken communication. Several applications will be described, such as Silent Speech Interfaces that rely on articulatory muscle movements captured by electromyography to recognize and synthesize silently produced speech, Brain-to-Text interfaces that use brain activity captured by electrocorticography to eventually recognize imagined speech, and brain-computer interfaces based on near-infrared spectroscopy.

Biography: Tanja Schultz received her diploma and doctoral degrees in Informatics from the University of Karlsruhe, Germany, in 1995 and 2000, respectively. Prior to these degrees, she completed the state exam in Mathematics, Sports, Physical and Educational Science at Heidelberg University, Germany, in 1989. She joined Carnegie Mellon University, Pittsburgh, PA, in 2000 and is an adjunct Research Professor at the Language Technologies Institute. From 2007 to 2015, she was a Full Professor in Informatics at the Karlsruhe Institute of Technology (KIT) in Germany, before she became a Professor for Cognitive Systems at the University of Bremen, Germany, in April 2015. Since 2007, she has directed the Cognitive Systems Lab, where her research activities include multilingual speech processing and the processing, recognition, and interpretation of biosignals for human-centered technologies and applications.

Dr. Schultz is an Associate Editor of ACM Transactions on Asian Language Information Processing (since 2010), serves on the Editorial Board of Speech Communication (since 2004), and was Associate Editor of IEEE Transactions on Speech and Audio Processing (2002-2004). She was President (2014-2015) and elected Board Member (2006-2013) of ISCA, and a General Co-Chair of Interspeech 2006. She was elevated to Fellow of ISCA (2016) and to member of the European Academy of Sciences and Arts (2017). Dr. Schultz received the Otto Haxel Award (2013), the Alcatel Lucent Award for Technical Communication (2012), the PLUX Wireless Biosignals Award (2011), the Allen Newell Medal for Research Excellence (2002), and the ISCA / EURASIP Speech Communication Best Paper Awards in 2001 and 2015.

Sebastian Möller

Quality and Usability Lab, TU Berlin, and German Research Center for Artificial Intelligence, DFKI

Quality Engineering of Speech and Language Services

Abstract: Speech- and language-based services have become highly popular and reach a growing group of users. To guarantee high acceptance in the long run, quality and user experience have to be considered systematically in the service development cycle. The term “quality engineering” has been used as an umbrella term for such systematic approaches.

In the talk, such approaches will be illustrated for two exemplary services. The first one is a spoken dialogue service where (synthetic) speech production, dialogue management, and speech perception need to be considered from a human quality point of view. Instrumental prediction of text-to-speech quality, dialogue simulation, and instrumental recognition of speech and speaker characteristics are building blocks for ensuring a high quality of experience. The second service is a language translation service in which automatic and human intelligence are combined in order to reach the best possible outcome. Here, crowdsourcing approaches are used to translate and evaluate language data. For each of the two services, experimental data is presented that illustrates the state-of-the-art performance, along with open research questions that need to be answered to improve quality and user experience.

Biography: Sebastian Möller studied electrical engineering at the universities of Bochum (Germany), Orléans (France) and Bologna (Italy). From 1994 to 2005, he was a scientific researcher at the Institute of Communication Acoustics (IKA), Ruhr-University Bochum, where he worked on speech signal processing, speech technology, communication acoustics, and speech communication quality aspects. From 2005 to 2015, he worked at Telekom Innovation Laboratories, an An-Institut of TU Berlin. He was appointed Full Professor for the subject "Quality and Usability" at TU Berlin in April 2007. From 2015 to 2017, he was Vice Dean for Research of the Faculty for Electrical Engineering and Computer Science at TU Berlin, and since April 2017, he has served as the Dean of this faculty. He also leads the research department "Language Technology" at the German Research Center for Artificial Intelligence, DFKI.

He received a Doctor-of-Engineering degree at Ruhr-University Bochum in 1999 for his work on the assessment and prediction of speech quality in telecommunications. In 2000, he was a guest scientist at the IDIAP in Martigny (Switzerland), where he worked on the quality of speech recognition systems. He gained the qualification needed to be a professor (venia legendi) at the Faculty of Electrical Engineering and Information Technology at Ruhr-University Bochum in 2004, with a book on the quality of telephone-based spoken dialogue systems. He worked as a Visiting Fellow/Visiting Professor at MARCS Auditory Laboratories, University of Western Sydney, at the Universidad de Granada (Spain), at the Ben Gurion University of the Negev in Be'er Sheva (Israel), and at NTNU in Trondheim (Norway). Since 2012, he has been Adjunct Professor at the University of Canberra. His most recent book on "Quality Engineering" was published in 2010, and his co-edited book on "Quality of Experience: Advanced Concepts, Applications and Methods" in 2014.

Sebastian Möller was awarded the GEERS prize in 1998 for his interdisciplinary work on the analysis of infant cries for early hearing-impairment detection, the ITG prize of the German Association for Electrical, Electronic & Information Technologies (VDE) in 2001, the Lothar-Cremer prize of the German Acoustical Association (DEGA) in 2003, a Heisenberg fellowship of the German Research Foundation (DFG) in 2005, and the Johann Philipp Reis prize in 2009. Since 1997, he has taken part in the standardisation activities of the International Telecommunication Union (ITU-T) on transmission performance of telephone networks and terminals. He acted as Rapporteur for question Q.8/12 from 2001 to 2016. He headed the special interest group on speech acoustics of DEGA from 2009 to 2015, has been a board member of the ITG since 2015, and of the International Speech Communication Association (ISCA) since 2016. He served as General Chair for Interspeech 2015 in Dresden.

Dongheui Lee

Technical University of Munich

Robot Learning through Physical Interaction and Human Guidance

Abstract: As a fundamental cornerstone in the development of intelligent robotic assistants, the research community on robot learning has addressed autonomous motor skill learning and control in complex task scenarios by working on a variety of fundamental sub-problems: movement primitive representation, reaction and adaptation, the link between perception and action, learning under supervision, and learning from self-practice. Imitation learning provides an efficient way to learn new skills through human guidance, which can reduce the time and cost of programming the robot. Robot learning architectures can provide a comprehensive framework for learning, recognition and reproduction of whole body motions. Such an architecture can also be integrated with different types of teaching modalities and be applied even in situations with incomplete measurement data. The inference mechanism can support learning not only the robot's free body motion but also physical interaction tasks, including human-robot interaction. I will discuss incremental learning in different problem domains, including the refinement of learned skills via heterogeneous learning modalities, enhancement of human-robot cooperation tasks over time, and improvement of stability in bipedal walking by iterative learning control. Empirical evaluation on several robotic systems will illustrate the effectiveness and applicability of these approaches to learning the control of high-dimensional anthropomorphic robots.

Biography: Dongheui Lee is Associate Professor of Human-centered Assistive Robotics at the TUM Department of Electrical and Computer Engineering. She is also director of a Human-centered Assistive Robotics group at the German Aerospace Center (DLR). Her research interests include human motion understanding, human-robot interaction, machine learning in robotics, and assistive robotics. Prior to her appointment as Associate Professor, she was an Assistant Professor at TUM (2009-2017), Project Assistant Professor at the University of Tokyo (2007-2009), and a research scientist at the Korea Institute of Science and Technology (KIST) (2001-2004). After completing her B.S. (2001) and M.S. (2003) degrees in mechanical engineering at Kyung Hee University, Korea, she went on to obtain a PhD degree from the Department of Mechano-Informatics, University of Tokyo, Japan, in 2007. She was awarded a Carl von Linde Fellowship at the TUM Institute for Advanced Study (2011) and a Helmholtz professorship prize (2015). She is coordinator of both the euRobotics Topic Group on physical Human Robot Interaction and the TUM Center of Competence Robotics, Autonomy and Interaction.