
R High Performance Programming.
Title:
R High Performance Programming.
Author:
Lim, Aloysius.
ISBN:
9781783989270
Physical Description:
1 online resource (201 pages)
Contents:
R High Performance Programming -- Table of Contents -- R High Performance Programming -- Credits -- About the Authors -- About the Reviewers -- www.PacktPub.com -- Support files, eBooks, discount offers, and more -- Why subscribe? -- Free access for Packt account holders -- Preface -- What this book covers -- What you need for this book -- Who this book is for -- Conventions -- Reader feedback -- Customer support -- Downloading the example code -- Errata -- Piracy -- Questions --
1. Understanding R's Performance - Why Are R Programs Sometimes Slow? -- Three constraints on computing performance - CPU, RAM, and disk I/O -- R is interpreted on the fly -- R is single-threaded -- R requires all data to be loaded into memory -- Algorithm design affects time and space complexity -- Summary --
2. Profiling - Measuring Code's Performance -- Measuring total execution time -- Measuring execution time with system.time() -- Repeating time measurements with rbenchmark -- Measuring distribution of execution time with microbenchmark -- Profiling the execution time -- Profiling a function with Rprof() -- The profiling results -- Profiling memory utilization -- Monitoring memory utilization, CPU utilization, and disk I/O using OS tools -- Identifying and resolving bottlenecks -- Summary --
3. Simple Tweaks to Make R Run Faster -- Vectorization -- Use of built-in functions -- Preallocating memory -- Use of simpler data structures -- Use of hash tables for frequent lookups on large data -- Seeking fast alternative packages in CRAN -- Summary --
4. Using Compiled Code for Greater Speed -- Compiling R code before execution -- Compiling functions -- Just-in-time (JIT) compilation of R code -- Using compiled languages in R -- Prerequisites -- Including compiled code inline -- Calling external compiled code -- Considerations for using compiled code -- R APIs -- R data types versus native data types -- Creating R objects and garbage collection -- Allocating memory for non-R objects -- Summary --
5. Using GPUs to Run R Even Faster -- General purpose computing on GPUs -- R and GPUs -- Installing gputools -- Fast statistical modeling in R with gputools -- Summary --
6. Simple Tweaks to Use Less RAM -- Reusing objects without taking up more memory -- Removing intermediate data when it is no longer needed -- Calculating values on the fly instead of storing them persistently -- Swapping active and nonactive data -- Summary --
7. Processing Large Datasets with Limited RAM -- Using memory-efficient data structures -- Smaller data types -- Sparse matrices -- Symmetric matrices -- Bit vectors -- Using memory-mapped files and processing data in chunks -- The bigmemory package -- The ff package -- Summary --
8. Multiplying Performance with Parallel Computing -- Data parallelism versus task parallelism -- Implementing data parallel algorithms -- Implementing task parallel algorithms -- Running the same task on workers in a cluster -- Running different tasks on workers in a cluster -- Executing tasks in parallel on a cluster of computers -- Shared memory versus distributed memory parallelism -- Optimizing parallel performance -- Summary --
9. Offloading Data Processing to Database Systems -- Extracting data into R versus processing data in a database -- Preprocessing data in a relational database using SQL -- Converting R expressions to SQL -- Using dplyr -- Using PivotalR -- Running statistical and machine learning algorithms in a database -- Using columnar databases for improved performance -- Using array databases for maximum scientific-computing performance -- Summary --
10. R and Big Data -- Understanding Hadoop -- Setting up Hadoop on Amazon Web Services -- Processing large datasets in batches using Hadoop -- Uploading data to HDFS -- Analyzing HDFS data with RHadoop -- Other Hadoop packages for R -- Summary --
Index.
Abstract:
This book is for programmers and developers who want to improve the performance of their R programs by making them run faster with large datasets, or who are trying to solve a pesky performance problem.
Local Note:
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.