Parallel Programming with Intel Parallel Studio XE.
Title:
Parallel Programming with Intel Parallel Studio XE.
Author:
Blair-Chappell, Stephen.
ISBN:
9781118221136
Edition:
1st ed.
Physical Description:
1 online resource (556 pages)
Contents:
Parallel Programming with Intel Parallel Studio XE -- Contents -- Foreword -- Introduction -- Part I: An Introduction to Parallelism -- Chapter 1: Parallelism Today -- The Arrival of Parallelism -- The Power Density Race -- The Emergence of Multi-Core and Many-Core Computing -- The Top Six Challenges -- Legacy Code -- Tools -- Education -- Fear of Many-Core Computing -- Maintainability -- Return on Investment -- Parallelism and the Programmer -- Types of Parallelism -- Intel's Family of Parallel Models -- Cilk Plus and Threading Building Blocks -- Domain-Specific Libraries -- Established Standards -- Research and Development -- Choosing the Right Parallel Constructs -- High-Level vs. Low-Level Constructs -- Data Parallelism vs. General Parallelism -- Examples of Mixing and Matching Parallel Constructs -- Parallel Programming Errors -- Data Races -- Determinacy Races -- Deadlocks -- Poor Load Balancing -- Threading/Tasking Overhead -- Synchronization Overhead -- Memory Errors -- Speedup and Scalability -- Calculating Speedup -- Predicting Scalability -- Parallelism and Real-Time Systems -- Hard and Soft Real-Time -- A Hard Real-Time Example using RTX -- Advice for Real-Time Programmers -- Summary -- Chapter 2: An Overview of Parallel Studio XE -- Why Parallel Studio XE? -- What's in Parallel Studio XE? -- Intel Parallel Studio XE -- Intel Parallel Advisor -- The Advisor Workflow -- Surveying the Site -- Annotating Code -- Checking Suitability -- Checking Correctness -- Replacing Annotations -- Intel Parallel Composer XE -- Intel C/C++ Optimizing Compiler -- Profile-Guided Optimization -- Cilk Plus -- OpenMP -- Intel Threading Building Blocks -- Intel Integrated Performance Primitives -- An Application Example -- IPP and Threading -- Intel Parallel Debugger Extension -- Intel Debugger -- Math Kernel Library.

VTune Amplifier XE -- Hotspot Analysis -- Concurrency Analysis -- Locks and Waits Analysis -- Disassembly Source View -- Parallel Inspector XE -- Predefined Analysis Types -- Errors and Warnings -- Static Security Analysis -- Different Approaches to Using Parallel Studio XE -- Summary -- Chapter 3: Parallel Studio XE for the Impatient -- The Four-Step Methodology -- Example 1: Working with Cilk Plus -- Obtaining a Suitable Serial Program -- Running the Serial Example Program -- Creating the Project -- Running the Serial Version of the Code -- Step 1: Analyze the Serial Program -- Using Intel Parallel Amplifier XE for Hotspot Analysis -- Step 2: Implement Parallelism using Cilk Plus -- Step 3: Debug and Check for Errors -- Checking for Errors -- Narrowing the Scope of the Shared Variables -- Adding Cilk Plus Reducers -- Running the Corrected Application -- Step 4: Tune the Cilk Plus Program -- Example 2: Working with OpenMP -- Step 1: Analyze the Serial Program -- Step 2: Implement Parallelism using OpenMP -- Step 3: Debug and Check for Errors -- Making the Shared Variables Private -- Adding a Reduction Clause -- Step 4: Tune the OpenMP Program -- Improving the Load Balancing -- Summary -- Part II: Using Parallel Studio XE -- Chapter 4: Producing Optimized Code -- Introduction -- The Example Application -- Optimizing Code in Seven Steps -- Using the Compiler's Reporting Features -- Step 1: Build with Optimizations Disabled -- Step 2: Use General Optimizations -- Using the General Options on the Example Application -- Generating Optimization Reports Using /Qopt-report -- Step 3: Use Processor-Specific Optimizations -- What Is Auto-Vectorization? -- Auto-Vectorization Guidelines -- Turning On Auto-Vectorization -- Enhancing Auto-Vectorization -- Building for Non-Intel CPUs.

Determining That Auto-Vectorization Has Happened -- When Auto-Vectorization Fails -- Helping the Compiler to Vectorize -- Step 4: Add Interprocedural Optimization -- Adding Interprocedural Optimization to the Example Application -- The Impact of Interprocedural Optimization on Auto-Vectorization -- Step 5: Use Profile-Guided Optimization -- Benefits of Profile-Guided Optimization -- The Profile-Guided Optimization Steps -- The Results -- Step 6: Tune Auto-Vectorization -- Activating Guided Auto-Parallelization -- An Example Session -- More on Auto-Vectorization -- Building Applications to Run on More Than One Type of CPU -- Additional Ways to Insert Vectorization -- Using Cilk Plus Array Notation -- Manual CPU Dispatch: Rolling Your Own CPU-Specific Code -- Source Code -- Summary -- Chapter 5: Writing Secure Code -- A Simple Security Flaw Example -- Understanding Static Security Analysis -- False Positives -- Static Security Analysis Workflow -- Conducting a Static Security Analysis -- Investigating the Results of the Analysis -- Working with Problem States -- The Build Specification -- Creating a Build Specification File by Injection -- Utility Options -- The Directory Structure of the Results -- Using Static Security Analysis in a QA Environment -- Regression Testing -- Metrics Tracking -- Source Code -- Summary -- Chapter 6: Where to Parallelize -- Different Ways of Profiling -- The Example Application -- Hotspot Analysis Using the Intel Compiler -- Profiling Steps -- An Example Session -- Overhead Introduced by Profiling -- Hotspot Analysis Using the Auto-Parallelizer -- Profiling Steps -- An Example Session -- Programming Guidelines for Auto-Parallelism -- Additional Options -- Helping the Compiler to Auto-Parallelize -- Hotspot Analysis with Amplifier XE -- Conducting a Default Analysis.

Finding the Right Loop to Parallelize -- Large or Long-Running Applications -- Reducing the Size of Data Collected -- Using the Pause and Resume APIs -- Source Code -- Summary -- Chapter 7: Implementing Parallelism -- C or C++, That Is the Question -- Taking a Simple Approach -- The Beauty of Lambda Functions -- Parallelizing Loops -- The for Loop -- The Cilk Plus cilk_for Loop -- The OpenMP for Loop -- The TBB for Loop -- Nested for Loops -- The for Loop with Reduction -- Cilk Plus Reduction -- OpenMP Reduction -- TBB Reduction -- The while Loop -- Cilk Plus -- OpenMP -- TBB -- Parallelizing Sections and Functions -- The Serial Version -- Cilk Plus -- OpenMP -- TBB -- Parallelizing Recursive Functions -- The Serial Version -- Cilk Plus -- OpenMP -- TBB -- Parallelizing Pipelined Applications -- Parallel Pipelined Patterns -- The Serial Version -- OpenMP -- TBB -- Parallelizing Linked Lists -- Serial Iteration of the Linked List -- Parallel Iteration of the Linked List -- Source Code -- Summary -- Chapter 8: Checking for Errors -- Parallel Inspector XE Analysis Types -- Detecting Threading Errors -- Types of Threading Problems -- Thread Information -- Potential Privacy Infringement -- Data Races -- Deadlocks -- An Example Application Involving Deadlocks -- Detecting Deadlocks -- Detecting Data Races -- Running the Threaded Program -- First Results of the Analysis -- Controlling the Right Level of Detail -- Testing All the Code Paths -- Avoiding Being Overwhelmed by the Amount of Data -- Using Suppression Files -- Fixing Data Races -- Using Cilk Plus -- Cilk Plus Reducers -- Cilk Plus Holders -- Using OpenMP -- Using Locks -- Using Critical Sections -- Using Atomic Operations -- Using a reduction Clause -- Using TBB -- Detecting Memory Errors -- Types of Memory Errors.

An Example Application for Memory Analysis -- Creating a Custom Analysis -- The Source Code -- Summary -- Chapter 9: Tuning Parallel Applications -- Introduction -- Defining a Baseline -- Ensuring Consistency -- Measuring the Performance Improvements -- Measuring the Baseline Using the Amplifier XE Command Line -- Identifying Concurrency Hotspots -- Thread Concurrency and CPU Usage -- Identifying Hotspots in the Code -- Analyzing the Timeline -- Questions to Answer -- Fixing the Critical Section Hotspot -- Analyzing an Algorithm -- Conducting Further Analysis and Tuning -- Using Other Viewpoints -- Using Locks and Waits Analysis -- Other Analysis Types -- Using the Intel Software Autotuning Tool -- Source Code -- Summary -- Chapter 10: Parallel Advisor-Driven Design -- Using Parallel Advisor -- Understanding the Advisor Workflow -- Finding Documentation -- Getting Started with the NQueens Example Program -- Surveying the Site -- Running a Survey Analysis -- The Survey Report -- Finding Candidate Parallel Regions -- The Survey Source Window -- How Survey Analysis Works -- Annotating Your Code -- Site Annotations -- Lock Annotations -- Adding Annotations -- Checking Suitability -- Running a Suitability Analysis -- The Suitability Report -- Parallel Choices -- Using the Suitability Report -- How Suitability Analysis Works -- Checking for Correctness -- Running a Correctness Analysis -- The Correctness Report -- The Correctness Source Window -- Understanding Common Problems -- Using the Correctness Report -- Correctness Analysis Limitation -- How Correctness Analysis Works -- Replacing Annotations -- The Summary Report -- Common Mappings -- Summary -- Chapter 11: Debugging Parallel Applications -- Introduction to the Intel Debugger -- The Parallel Debugger Workflow.

Using the Intel Debugger to Detect Data Races.
Abstract:
Optimize code for multi-core processors with Intel's Parallel Studio. Parallel programming is rapidly becoming a "must-know" skill for developers. Yet, where to start? This teach-yourself tutorial is an ideal starting point for developers who already know Windows C and C++ and are eager to add parallelism to their code. With a focus on applying tools, techniques, and language extensions to implement parallelism, this resource teaches you how to write programs for multicore processors and take advantage of their power. Sharing hands-on case studies and real-world examples, the authors examine the challenges of each project and show you how to overcome them. The book explores conversion of serial code to parallel; focuses on implementing Intel Parallel Studio; highlights the benefits of using parallel code; addresses errors and performance optimization of code; and includes real-world scenarios that illustrate advanced parallel programming techniques. Parallel Programming with Intel Parallel Studio dispels any concerns of difficulty and gets you started creating faster code with Intel Parallel Studio.
Local Note:
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.