Example Parallel Overview snow fork Summary
Parallel Computing with R
Péter Sólymos
Edmonton R User Group meeting, April 26, 2013
Ovenbird example from 'detect' package
> str(oven)
'data.frame': 891 obs. of 11 variables:
$ count : int 1 0 0 1 0 0 0 0 0 0 ...
$ route : int 2 2 2 2 2 2 2 2 2 2 ...
$ stop : int 2 4 6 8 10 12 14 16 18 20 ...
$ pforest: num 0.947 0.903 0.814 0.89 0.542 ...
$ pdecid : num 0.575 0.562 0.549 0.679 0.344 ...
$ pagri : num 0 0 0 0 0.414 ...
$ long : num 609343 608556 607738 607680 607944 ...
$ lat : num 5949071 5947735 5946301 5944720 5943088 ...
$ observ : Factor w/ 4 levels "ARS","DW","RDW",..: 4 4 4 4 4 4 4 4 4 4 ...
$ julian : int 181 181 181 181 181 181 181 181 181 181 ...
$ timeday: int 2 4 6 8 10 12 14 16 18 20 ...
NegBin GLM with bootstrap
> library(MASS)
> m <- glm.nb(count ~ pforest, oven)
> fun1 <- function(i) {
+ id <- sample.int(nrow(oven), nrow(oven), replace = TRUE)
+ coef(glm.nb(count ~ pforest, oven[id, ]))
+ }
> B <- 199
> system.time(bm <- sapply(1:B, fun1))
user system elapsed
26.79 0.02 27.11
> bm <- cbind(coef(m), bm)
> cbind(coef(summary(m))[, 1:2], `Boot. SE` = apply(bm, 1, sd))
            Estimate Std. Error Boot. SE
(Intercept)   -2.177     0.1277   0.1229
pforest        2.674     0.1709   0.1553
Parallel bootstrap
> library(parallel)
> (cl <- makePSOCKcluster(3))
socket cluster with 3 nodes on host 'localhost'
> clusterExport(cl, "oven")
> tmp <- clusterEvalQ(cl, library(MASS))
> t0 <- proc.time()
> bm2 <- parSapply(cl, 1:B, fun1)
> proc.time() - t0
user system elapsed
0.00 0.00 11.06
> stopCluster(cl)
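The same pattern can be tried without the 'detect' package. The following is a minimal sketch, assuming simulated Poisson counts as a stand-in for the ovenbird data; `dat`, `boot_fun`, the Poisson GLM, and the cluster size of 2 are illustrative choices, not part of the original example.

```r
library(parallel)

# Simulated stand-in for the ovenbird data (an assumption, not the real 'oven')
set.seed(1)
dat <- data.frame(x = runif(200))
dat$count <- rpois(200, exp(-1 + 2 * dat$x))

# One bootstrap replicate: resample rows with replacement and refit the model
boot_fun <- function(i) {
  id <- sample.int(nrow(dat), nrow(dat), replace = TRUE)
  coef(glm(count ~ x, family = poisson, data = dat[id, ]))
}

B <- 50
cl <- makePSOCKcluster(2)
clusterExport(cl, "dat")            # workers need the data boot_fun refers to
clusterSetRNGStream(cl, 123)        # separate, reproducible stream per worker
bm <- parSapply(cl, 1:B, boot_fun)  # 2 x B matrix of coefficients
stopCluster(cl)

boot_se <- apply(bm, 1, sd)         # bootstrap standard errors
print(boot_se)
```

The function itself is shipped to the workers by `parSapply`, but any global object it refers to (here `dat`) has to be exported first.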
High performance computing (HPC)
- Parallel computing,
- large memory and out-of-memory data,
- interfaces for compiled code,
- profiling tools,
- batch scheduling.
CRAN Task View: High-Performance and Parallel Computing with R
Parallel computing
Embarrassingly parallel problems:
- bootstrap,
- MCMC,
- simulations.
They can be broken down into independent pieces. [1]
[1] Schmidberger et al. 2009 JSS: State of the Art in Parallel Computing with R
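What "independent pieces" means can be sketched with a small Monte Carlo estimate of pi. This is an illustration only: `estimate_piece`, the per-piece seeds, and the piece size are assumptions, and the pieces are run serially here to keep the focus on the decomposition.

```r
# Each piece depends only on its own inputs, so pieces can run in any order
# or on any worker. Here they run serially with lapply() for illustration.
estimate_piece <- function(seed, n = 10000) {
  set.seed(seed)                    # simplification: one seed per piece
  x <- runif(n)
  y <- runif(n)
  sum(x^2 + y^2 <= 1)               # points falling inside the quarter circle
}

seeds <- 1:8
hits <- lapply(seeds, estimate_piece)                       # independent pieces
pi_hat <- 4 * sum(unlist(hits)) / (length(seeds) * 10000)   # combine step
print(pi_hat)
```

Because no piece reads another piece's result, replacing `lapply` with a parallel apply function would not change the logic.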
Parallel computing
- explicit (distributed memory),
- implicit (shared memory),
- grid,
- Hadoop,
- GPUs.
Starting a cluster
> library(snow)
> cl <- makeCluster(3, type = "SOCK")
Cluster types:
- SOCK, multicore
- PVM, Parallel Virtual Machine
- MPI, Message Passing Interface
- NWS, NetWorkSpaces (multicore and grid)
Error: invalid connection
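A socket cluster can also be started with the 'parallel' package that ships with R. The sketch below assumes a local machine and caps the worker count at 2; `n_workers` and `n_started` are illustrative names.

```r
library(parallel)

# Leave one core free; fall back to 1 if the core count cannot be determined
n_cores <- detectCores()
n_workers <- if (is.na(n_cores)) 1L else max(1L, min(2L, n_cores - 1L))

cl <- makeCluster(n_workers, type = "PSOCK")   # socket cluster on localhost
print(cl)
n_started <- length(cl)                        # number of worker processes
stopCluster(cl)                                # always release the workers
```

Stopping the cluster when done matters, because the workers are separate R processes that otherwise keep running.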
Distribute stuff, evaluate expressions
> clusterExport(cl, "oven")
> clusterEvalQ(cl, library(MASS))
[[1]]
[1] "MASS"      "methods"   "stats"     "graphics"
[5] "grDevices" "utils"     "datasets"  "base"
[[2]]
[1] "MASS"      "methods"   "stats"     "graphics"
[5] "grDevices" "utils"     "datasets"  "base"
[[3]]
[1] "MASS"      "methods"   "stats"     "graphics"
[5] "grDevices" "utils"     "datasets"  "base"
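The two steps can be reproduced with any object and any installed package. In this sketch `scale_by` is a hypothetical object and 'tools' stands in for MASS; both are assumptions chosen so that the example runs on a plain R installation.

```r
library(parallel)
cl <- makePSOCKcluster(2)

scale_by <- 10                                  # hypothetical object on the master
clusterExport(cl, "scale_by")                   # copy it to every worker, by name
vals <- clusterEvalQ(cl, scale_by * 2)          # same expression on each worker

attached <- clusterEvalQ(cl, library(tools))    # attach a package on each worker
stopCluster(cl)

print(unlist(vals))
```

`clusterExport` takes object names as character strings, while `clusterEvalQ` takes an unevaluated expression; each returns a list with one element per worker.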
Random Number Generation (RNG)
> library(rlecuyer)
> tmp <- clusterEvalQ(cl, set.seed(1234))
> clusterEvalQ(cl, rnorm(5))
[[1]]
[1] -1.2071 0.2774 1.0844 -2.3457 0.4291
[[2]]
[1] -1.2071 0.2774 1.0844 -2.3457 0.4291
> snow:::clusterSetupRNG(cl)
[1] "RNGstream"
> clusterEvalQ(cl, rnorm(5))
[[1]]
[1] -1.14063 -0.49816 -0.76670 -0.04821 -1.09852
[[2]]
[1] 0.7050 0.4821 -1.2848 0.7198 0.7386
Important when calculating indices or doing simulations.
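The 'parallel' package offers `clusterSetRNGStream` for the same purpose, without needing 'rlecuyer'. The sketch below is a swapped-in alternative to the snow call above, not the original author's code; `same`, `run1`, and `run2` are illustrative names.

```r
library(parallel)
cl <- makePSOCKcluster(2)

# Same seed on every worker: the workers draw identical numbers
tmp <- clusterEvalQ(cl, set.seed(1234))
same <- clusterEvalQ(cl, rnorm(3))

# L'Ecuyer-CMRG streams: a distinct stream per worker, reproducible from iseed
clusterSetRNGStream(cl, iseed = 1234)
run1 <- clusterEvalQ(cl, rnorm(3))
clusterSetRNGStream(cl, iseed = 1234)
run2 <- clusterEvalQ(cl, rnorm(3))
stopCluster(cl)

print(run1)
```

Resetting the streams with the same `iseed` reproduces the draws, while the workers still differ from each other.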
Apply operations: split
> parallel:::parLapply
function (cl = NULL, X, fun, ...)
{
    cl <- defaultCluster(cl)
    do.call(c, clusterApply(cl, x = splitList(X, length(cl)),
        fun = lapply, fun, ...), quote = TRUE)
}
<bytecode: 0x04c1eba8>
<environment: namespace:parallel>
> snow:::splitList(1:10, length(cl))
[[1]]
[1] 1 2 3 4 5
[[2]]
[1] 6 7 8 9 10
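The split step can be inspected without a running cluster. This sketch uses the exported `parallel::splitIndices` rather than the internal `splitList`; the choice of 3 pieces and the names `idx` and `chunks` are assumptions for illustration.

```r
library(parallel)

X <- 1:10
idx <- splitIndices(length(X), 3)        # list of index vectors, one per worker
chunks <- lapply(idx, function(i) X[i])  # the pieces each worker would receive
print(chunks)
```

Every element of `X` lands in exactly one chunk and the original order is preserved, which is what makes the later combine step a plain concatenation.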
Apply operations: evaluate and combine
> f <- function(i) i * 2
> (res <- clusterApply(cl, snow:::splitList(1:10, length(cl)),
+ f))
[[1]]
[1] 2 4 6
[[2]]
[1] 8 10 12 14
[[3]]
[1] 16 18 20
> do.call(c, res)
[1] 2 4 6 8 10 12 14 16 18 20
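The evaluate and combine steps can be put together in a self-contained sketch. It assumes a 2-worker local cluster and passes `lapply` as the worker-side function, mirroring the `parLapply` source shown earlier (the slide above instead applies the vectorized `f` to each chunk directly).

```r
library(parallel)
cl <- makePSOCKcluster(2)

f <- function(i) i * 2
chunks <- lapply(splitIndices(10, length(cl)), function(i) (1:10)[i])

# One task per chunk; lapply runs f over the chunk's elements on the worker
res <- clusterApply(cl, chunks, lapply, f)
combined <- unlist(do.call(c, res))      # flatten chunks back into input order
stopCluster(cl)

print(combined)
```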
Apply operations: load balancing
> f <- function(i) i * 2
> unlist(parallel:::parLapplyLB(cl, 1:10, f))
[1] 2 4 6 8 10 12 14 16 18 20
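Load balancing pays off when tasks take unequal time. The sketch below assumes a hypothetical uneven workload (`slow_double`, with made-up sleep times) and only checks that both strategies return the same values; how tasks are chunked in `parLapplyLB` can differ between R versions.

```r
library(parallel)
cl <- makePSOCKcluster(2)

# Hypothetical uneven workload: early tasks sleep longer than later ones
slow_double <- function(i) {
  Sys.sleep(if (i <= 2) 0.2 else 0.01)
  i * 2
}

static  <- unlist(parLapply(cl, 1:8, slow_double))    # chunks fixed up front
dynamic <- unlist(parLapplyLB(cl, 1:8, slow_double))  # work handed out as workers free up
stopCluster(cl)

print(dynamic)
```

Results come back in input order in both cases; only the assignment of tasks to workers differs.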
Implicit parallelism
No need to distribute stuff: forked child processes inherit the parent's workspace, so we only evaluate on them.
> mclapply(X, FUN, mc.cores)
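A minimal sketch of the forking approach follows. `big_table` and the per-task function are hypothetical; the guard reflects that forking is unavailable on Windows, where `mc.cores` has to be 1.

```r
library(parallel)

# Forking is not available on Windows, where mc.cores must be 1
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

big_table <- data.frame(x = 1:1000)   # hypothetical data in the parent session

# No clusterExport step: forked children see big_table directly
scaled <- mclapply(1:4, function(k) sum(big_table$x) * k, mc.cores = n_cores)
scaled <- unlist(scaled)
print(scaled)
```

Compared with the socket cluster examples, there is no cluster object to create or stop, and nothing is exported explicitly.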
Summary
Parallel computing is not hard on a single computer.
Difficulty comes in when using large, shared, and heterogeneous resources.
> stopCluster(cl)
