The difference between speech and voice: Computational description and control of sentiment information embedded in speech


The Department of Linguistics, Faculty of Arts, Chulalongkorn University cordially invites you to a special lecture:

The difference between speech and voice:
Computational description and control of sentiment information embedded in speech

By Prof. Yoshinori Sagisaka

Applied Mathematics Department, Global Information and Telecommunication Institute, and Language and Speech Science Research Labs., Waseda University

Tuesday 5 March 2019, 10:00–11:30

Room 501/5-7 Maha Chakri Sirindhorn Building, Faculty of Arts, Chulalongkorn University


This talk introduces recent research activities at our laboratory on sentiment correlation analyses among speech, language, and color. Building on fundamental correlations found between communicative speech prosody and the impressions it evokes, as expressed in language, a communicative F0 pattern is calculated. Using Japanese sentences consisting of adverbs expressing magnitude, adjectives, and final particles, the possibility of F0 control for communicative speech synthesis is demonstrated. A further experimental trial is also introduced, showing that this sentiment mapping between impression (language) and prosody (speech) can likewise be observed between speech and color. Together, these two studies provide a new paradigm for cross-modal computational modeling of sentiment information processing across speech, language, and image. Furthermore, they point to a possible redefinition of “linguistic information” that captures the difference between voice and speech, enabling computational treatment of linguistic notions such as lexicon and semantics.


Yoshinori Sagisaka has been a professor at Waseda University since 2001. He has worked in the field of speech and language science and engineering for more than forty years. During this period, he has worked at NTT Electrical Communication Res. Labs. (1975-1986), ATR (1986-2007), Edinburgh University CSTR (1988), AT&T Bell Labs. (1993), Kobe University (1997-2001), NICT (2007-2012), and Waseda University (2000- ). His research interests cover speech synthesis, prosody modeling, speech recognition, speech perception, and language processing.

He has been engaged in numerous international research activities, including as IEEE Signal Processing Society Committee Member (1990-1994), Speech Communication Journal Editorial Board member (1993-2009), France Telecom CNET (Centre National d’Etudes des Telecommunications) Conseiller Scientifique (1993), KTH (Royal Institute of Technology, Stockholm) CTT (Speech Technology Center) International Advisory Committee Member (1993- ), Computer Speech and Language Journal Editorial Board member (1994-2009), Natural Language Engineering Journal Editorial Board member (1994-2004), Permanent Council of International Conferences on Spoken Language Processing Member (1998- ), Speech Communication Journal Chief Co-Editor (2001-2004), International Speech Communication Association Board member (2007-2011), and International Congress of Phonetic Sciences Board member (2007-2015), and he has contributed as an international scientific committee member to many speech-related conferences and workshops.

In 2008, he initiated the AESOP (Asian English Speech cOrpus Project) research consortium to promote second-language research and collaboration among Asian countries. He continues his research on the scientific understanding of prosody control and cross-modal information expression reflecting human sentiment characteristics.