
Speech in Mobile and Pervasive Environments.
Title:
Speech in Mobile and Pervasive Environments.
Author:
Rajput, Nitendra.
ISBN:
9781119961703
Personal Author:
Rajput, Nitendra.
Edition:
1st ed.
Physical Description:
1 online resource (309 pages)
Series:
Wireless Communications and Mobile Computing ; v.30
Contents:
Speech in Mobile and Pervasive Environments -- Contents -- About the Series Editors -- List of Contributors -- Foreword -- Preface -- Acknowledgments -- 1 Introduction -- 1.1 Application design -- 1.2 Interaction modality -- 1.3 Speech processing -- 1.4 Evaluations -- 2 Mobile Speech Hardware: The Case for Custom Silicon -- 2.1 Introduction -- 2.2 Mobile hardware: Capabilities and limitations -- 2.2.1 Looking inside a mobile device: Smartphone example -- 2.2.2 Processing limitations -- 2.2.3 Memory limitations -- 2.2.4 Power limitations -- 2.2.5 Silicon technology and mobile hardware -- 2.3 Profiling existing software systems -- 2.3.1 Speech recognition overview -- 2.3.2 Profiling techniques summary -- 2.3.3 Processing time breakdown -- 2.3.4 Memory usage -- 2.3.5 Power and energy breakdown -- 2.3.6 Summary -- 2.4 Recognizers for mobile hardware: Conventional approaches -- 2.4.1 Reduced-resource embedded recognizers -- 2.4.2 Network recognizers -- 2.4.3 Distributed recognizers -- 2.4.4 An alternative approach: Custom hardware -- 2.5 Custom hardware for mobile speech recognition -- 2.5.1 Motivation -- 2.5.2 Hardware implementation: Feature extraction -- 2.5.3 Hardware implementation: Feature scoring -- 2.5.4 Hardware implementation: Search -- 2.5.5 Hardware implementation: Performance and power evaluation -- 2.5.6 Hardware implementation: Summary -- 2.6 Conclusion -- Bibliography -- 3 Embedded Automatic Speech Recognition and Text-to-Speech Synthesis -- 3.1 Automatic speech recognition -- 3.2 Mathematical formulation -- 3.3 Acoustic parameterization -- 3.3.1 Landmark-based approach -- 3.4 Acoustic modeling -- 3.4.1 Unit selection -- 3.4.2 Hidden Markov models -- 3.5 Language modeling -- 3.6 Modifications for embedded speech recognition -- 3.6.1 Feature computation -- 3.6.2 Likelihood computation -- 3.7 Applications -- 3.7.1 Car navigation systems.
3.7.2 Smart homes -- 3.7.3 Interactive toys -- 3.7.4 Smartphones -- 3.8 Text-to-speech synthesis -- 3.9 Text to speech in a nutshell -- 3.10 Front end -- 3.11 Back end -- 3.11.1 Rule-based synthesis -- 3.11.2 Data-driven synthesis -- 3.11.3 Statistical parametric speech synthesis -- 3.12 Embedded text-to-speech -- 3.13 Evaluation -- 3.14 Summary -- Bibliography -- 4 Distributed Speech Recognition -- 4.1 Elements of distributed speech processing -- 4.2 Front-end processing -- 4.2.1 Device requirements -- 4.2.2 Transmission issues in DSR -- 4.2.3 Back-end processing -- 4.3 ETSI standards -- 4.3.1 Basic front-end standard ES 201 108 -- 4.3.2 Noise-robust front-end standard ES 202 050 -- 4.3.3 Tonal-language recognition standard ES 202 211 -- 4.4 Transfer protocol -- 4.4.1 Signaling -- 4.4.2 RTP payload format -- 4.5 Energy-aware distributed speech recognition -- 4.6 ESR, NSR, DSR -- Bibliography -- 5 Context in Conversation -- 5.1 Context modeling and aggregation -- 5.1.1 An example of composer specification -- 5.2 Context-based speech applications: Conspeakuous -- 5.2.1 Conspeakuous architecture -- 5.2.2 B-Conspeakuous -- 5.2.3 Learning as a source of context -- 5.2.4 Implementation -- 5.2.5 A tourist portal application -- 5.3 Context-based speech applications: Responsive information architect -- 5.4 Conclusion -- Bibliography -- 6 Software: Infrastructure, Standards, Technologies -- 6.1 Introduction -- 6.2 Mobile operating systems -- 6.3 Voice over internet protocol -- 6.3.1 Implications for mobile speech -- 6.3.2 Sample speech applications -- 6.3.3 Access channels -- 6.4 Standards -- 6.5 Standards: VXML -- 6.6 Standards: VoiceFleXML -- 6.6.1 Brief overview of speech-based systems -- 6.6.2 System architecture -- 6.6.3 System architecture: VoiceFleXML interpreter -- 6.6.4 VoiceFleXML: Voice browser -- 6.6.5 A prototype implementation -- 6.7 SAMVAAD.
6.7.1 Background and problem setting -- 6.7.2 Reorganization algorithms -- 6.7.3 Minimizing the number of dialogs -- 6.7.4 Hybrid call-flows -- 6.7.5 Minimally altered call-flows -- 6.7.6 Device-independent call-flow characterization -- 6.7.7 SAMVAAD: Architecture, implementation and experiments -- 6.7.8 Splitting dialog call-flows -- 6.8 Conclusion -- 6.9 Summary and future work -- Bibliography -- 7 Architecture of Mobile Speech-Based and Multimodal Dialog Systems -- 7.1 Introduction -- 7.2 Multimodal architectures -- 7.3 Multimodal frameworks -- 7.4 Multimodal mobile applications -- 7.4.1 Mobile companion -- 7.4.2 MUMS -- 7.4.3 TravelMan -- 7.4.4 Stopman -- 7.5 Architectural models -- 7.5.1 Client-server systems -- 7.5.2 Dialog description systems -- 7.5.3 Generic model for distributed mobile multimodal speech systems -- 7.6 Distribution in the Stopman system -- 7.7 Conclusions -- Bibliography -- 8 Evaluation of Mobile and Pervasive Speech Applications -- 8.1 Introduction -- 8.1.1 Spoken interaction -- 8.1.2 Mobile-use context -- 8.1.3 Speech and mobility -- 8.2 Evaluation of mobile speech-based systems -- 8.2.1 User interface evaluation methodology -- 8.2.2 Technical evaluation of speech-based systems -- 8.2.3 Usability evaluations -- 8.2.4 Subjective metrics and objective metrics -- 8.2.5 Laboratory and field studies -- 8.2.6 Simulating mobility in the laboratory -- 8.2.7 Studying social context -- 8.2.8 Long- and short-term studies -- 8.2.9 Validity -- 8.3 Case studies -- 8.3.1 STOPMAN evaluation -- 8.3.2 TravelMan evaluation -- 8.3.3 Discussion -- 8.4 Theoretical measures for dialog call-flows -- 8.4.1 Introduction -- 8.4.2 Dialog call-flow characterization -- 8.4.3 (m,q,a)-characterization -- 8.4.4 (m,q,a)-complexity -- 8.4.5 Call-flow analysis using (m,q,a)-complexity -- 8.5 Conclusions -- Bibliography -- 9 Developing Regions.
9.1 Introduction -- 9.2 Applications and studies -- 9.2.1 VoiKiosk -- 9.2.2 HealthLine -- 9.2.3 The spoken web -- 9.2.4 TapBack -- 9.3 Systems -- 9.4 Challenges -- Bibliography -- Index.
Abstract:
In this book, the authors address the issues related to speech processing on resource-constrained mobile devices. These include speech recognition in noisy environments, specialized hardware for speech recognition and synthesis, the use of context to enhance recognition and the user experience, and the emerging software standards required for interoperability. The book takes a multi-disciplinary look at these matters and offers insight into the opportunities and challenges of speech processing in mobile environments. In developing regions, speech-on-mobile is set to play a momentous role, socially and economically; the authors discuss how voice-based solutions and applications offer a compelling and natural fit in this setting. Key Features:
Provides an overview of speech-technology-related topics in the context of mobility
Brings together the latest research in a logically connected way in a single volume
Covers hardware, embedded recognition and synthesis, distributed speech recognition, software technologies, and contextual interfaces
Discusses multimodal dialogue systems and their evaluation
Introduces speech in mobile and pervasive environments for developing regions
This book provides a comprehensive overview for beginners and experts alike. It can be used as a textbook for advanced undergraduate and postgraduate students in electrical engineering and computer science. Students, practitioners, and researchers in the areas of mobile computing, speech processing, voice applications, human-computer interfaces, and information and communication technologies will also find this reference insightful. For experts in these domains, the book complements their strengths; it will also serve as a guide to practitioners working in telecom-related industries.
Local Note:
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.
Genre:
Added Author:
Electronic Access: