Computational Paralinguistics : Emotion, Affect and Personality in Speech and Language Processing.
Title:
Computational Paralinguistics : Emotion, Affect and Personality in Speech and Language Processing.
Author:
Schuller, Björn.
ISBN:
9781118706633
Edition:
1st ed.
Physical Description:
1 online resource (345 pages)
Contents:
COMPUTATIONAL PARALINGUISTICS -- Contents -- Preface -- Acknowledgements -- List of Abbreviations -- Part I Foundations -- 1 Introduction -- 1.1 What is Computational Paralinguistics? A First Approximation -- 1.2 History and Subject Area -- 1.3 Form versus Function -- 1.4 Further Aspects -- 1.4.1 The Synthesis of Emotion and Personality -- 1.4.2 Multimodality: Analysis and Generation -- 1.4.3 Applications, Usability and Ethics -- 1.5 Summary and Structure of the Book -- References -- 2 Taxonomies -- 2.1 Traits versus States -- 2.2 Acted versus Spontaneous -- 2.3 Complex versus Simple -- 2.4 Measured versus Assessed -- 2.5 Categorical versus Continuous -- 2.6 Felt versus Perceived -- 2.7 Intentional versus Instinctual -- 2.8 Consistent versus Discrepant -- 2.9 Private versus Social -- 2.10 Prototypical versus Peripheral -- 2.11 Universal versus Culture-Specific -- 2.12 Unimodal versus Multimodal -- 2.13 All These Taxonomies - So What? -- 2.13.1 Emotion Data: The FAU AEC -- 2.13.2 Non-native Data: The C-AuDiT corpus -- References -- 3 Aspects of Modelling -- 3.1 Theories and Models of Personality -- 3.2 Theories and Models of Emotion and Affect -- 3.3 Type and Segmentation of Units -- 3.4 Typical versus Atypical Speech -- 3.5 Context -- 3.6 Lab versus Life, or Through the Looking Glass -- 3.7 Sheep and Goats, or Single Instance Decision versus Cumulative Evidence and Overall Performance -- 3.8 The Few and the Many, or How to Analyse a Hamburger -- 3.9 Reifications, and What You are Looking for is What You Get -- 3.10 Magical Numbers versus Sound Reasoning -- References -- 4 Formal Aspects -- 4.1 The Linguistic Code and Beyond -- 4.2 The Non-Distinctive Use of Phonetic Elements -- 4.2.1 Segmental Level: The Case of /r/ Variants -- 4.2.2 Supra-segmental Level: The Case of Pitch and Fundamental Frequency - and of Other Prosodic Parameters.

4.2.3 In Between: The Case of Other Voice Qualities, Especially Laryngealisation -- 4.3 The Non-Distinctive Use of Linguistic Elements -- 4.3.1 Words and Word Classes -- 4.3.2 Phrase Level: The Case of Filler Phrases and Hedges -- 4.4 Disfluencies -- 4.5 Non-Verbal, Vocal Events -- 4.6 Common Traits of Formal Aspects -- References -- 5 Functional Aspects -- 5.1 Biological Trait Primitives -- 5.1.1 Speaker Characteristics -- 5.2 Cultural Trait Primitives -- 5.2.1 Speech Characteristics -- 5.3 Personality -- 5.4 Emotion and Affect -- 5.5 Subjectivity and Sentiment Analysis -- 5.6 Deviant Speech -- 5.6.1 Pathological Speech -- 5.6.2 Temporarily Deviant Speech -- 5.6.3 Non-native Speech -- 5.7 Social Signals -- 5.8 Discrepant Communication -- 5.8.1 Indirect Speech, Irony, and Sarcasm -- 5.8.2 Deceptive Speech -- 5.8.3 Off-Talk -- 5.9 Common Traits of Functional Aspects -- References -- 6 Corpus Engineering -- 6.1 Annotation -- 6.1.1 Assessment of Annotations -- 6.1.2 New Trends -- 6.2 Corpora and Benchmarks: Some Examples -- 6.2.1 FAU Aibo Emotion Corpus -- 6.2.2 aGender Corpus -- 6.2.3 TUM AVIC Corpus -- 6.2.4 Alcohol Language Corpus -- 6.2.5 Sleepy Language Corpus -- 6.2.6 Speaker Personality Corpus -- 6.2.7 Speaker Likability Database -- 6.2.8 NKI CCRT Speech Corpus -- 6.2.9 TIMIT Database -- 6.2.10 Final Remarks on Databases -- References -- Part II Modelling -- 7 Computational Modelling of Paralinguistics: Overview -- References -- 8 Acoustic Features -- 8.1 Digital Signal Representation -- 8.2 Short Time Analysis -- 8.3 Acoustic Segmentation -- 8.4 Continuous Descriptors -- 8.4.1 Intensity -- 8.4.2 Zero Crossings -- 8.4.3 Autocorrelation -- 8.4.4 Spectrum and Cepstrum -- 8.4.5 Linear Prediction -- 8.4.6 Line Spectral Pairs -- 8.4.7 Perceptual Linear Prediction -- 8.4.8 Formants -- 8.4.9 Fundamental Frequency and Voicing Probability.

8.4.10 Jitter and Shimmer -- 8.4.11 Derived Low-Level Descriptors -- References -- 9 Linguistic Features -- 9.1 Textual Descriptors -- 9.2 Preprocessing -- 9.3 Reduction -- 9.3.1 Stopping -- 9.3.2 Stemming -- 9.3.3 Tagging -- 9.4 Modelling -- 9.4.1 Vector Space Modelling -- 9.4.2 On-line Knowledge -- References -- 10 Supra-segmental Features -- 10.1 Functionals -- 10.2 Feature Brute-Forcing -- 10.3 Feature Stacking -- References -- 11 Machine-Based Modelling -- 11.1 Feature Relevance Analysis -- 11.2 Machine Learning -- 11.2.1 Static Classification -- 11.2.2 Dynamic Classification: Hidden Markov Models -- 11.2.3 Regression -- 11.3 Testing Protocols -- 11.3.1 Partitioning -- 11.3.2 Balancing -- 11.3.3 Performance Measures -- 11.3.4 Result Interpretation -- References -- 12 System Integration and Application -- 12.1 Distributed Processing -- 12.2 Autonomous and Collaborative Learning -- 12.3 Confidence Measures -- References -- 13 'Hands-On': Existing Toolkits and Practical Tutorial -- 13.1 Related Toolkits -- 13.2 openSMILE -- 13.2.1 Available Feature Extractors -- 13.3 Practical Computational Paralinguistics How-to -- 13.3.1 Obtaining and Installing openSMILE -- 13.3.2 Extracting Features -- 13.3.3 Classification and Regression -- References -- 14 Epilogue -- Appendix -- A.1 openSMILE Feature Sets Used at Interspeech Challenges -- A.2 Feature Encoding Scheme -- References -- Index.
Abstract:
This book presents the methods, tools and techniques that are currently being used to recognise (automatically) the affect, emotion, personality and everything else beyond linguistics ('paralinguistics') expressed by or embedded in human speech and language. It is the first book to provide such a systematic survey of paralinguistics in speech and language processing. The technology described has evolved mainly from automatic speech and speaker recognition and processing, but also takes into account recent developments within speech signal processing, machine intelligence and data mining. Moreover, the book offers a hands-on approach by integrating actual data sets, software, and open-source utilities which will make the book invaluable as a teaching tool and similarly useful for those professionals already in the field. Key features: Provides an integrated presentation of basic research (in phonetics/linguistics and humanities) with state-of-the-art engineering approaches for speech signal processing and machine intelligence. Explains the history and state of the art of all of the sub-fields which contribute to the topic of computational paralinguistics. Covers the signal processing and machine learning aspects of the actual computational modelling of emotion and personality and explains the detection process from corpus collection to feature extraction and from model testing to system integration. Details aspects of real-world system integration including distribution, weakly supervised learning and confidence measures. Outlines machine learning approaches including static, dynamic and context-sensitive algorithms for classification and regression. Includes a tutorial on freely available toolkits, such as the open-source 'openEAR' toolkit for emotion and affect recognition co-developed by one of the authors, and a listing of standard databases and feature sets used in the field to allow for immediate experimentation enabling the reader to build an emotion detection model on an existing corpus.
Local Note:
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.