Parallel Computing with R and Python
- Course type
- STATISTICS
- Contact
- François PORTIER
- Unit
- UE-MSD04: Advanced Tools for Data Analysis & Computing
- Number of ECTS
- 2
- Course code
- MSD 04-2
- Distribution of courses
- Lecture hours: 18
- Language of teaching
- English
Objectives
– Detecting the slow parts of a script using graphical code-profiling tools. Students will be able to identify the parts of a script where the code should be improved and where memory allocations should be reduced.
– Improving code performance using CPU parallel computation. Students will be able to use both the forking and the socket methods of parallel computation.
Outline
First, code profiling is introduced (micro and macro profiling, memory monitoring). Then the two standard methods for CPU parallel computation, forking and sockets, are presented.
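As an illustration of the three profiling activities named above, here is a minimal Python sketch using only the standard library (`timeit`, `cProfile`, `tracemalloc`). It is not taken from the course material: the function `slow_sum_of_squares` and the helpers `micro_time`, `profile_call` and `peak_memory` are hypothetical names, and "micro profiling" is assumed here to mean timing a small snippet, while "macro profiling" is assumed to mean a per-function report for a whole call.

```python
import cProfile
import io
import pstats
import timeit
import tracemalloc


def slow_sum_of_squares(n):
    """Hypothetical workload: builds a full list before summing it."""
    squares = [i * i for i in range(n)]
    return sum(squares)


def micro_time(func, *args, repeats=5):
    """Micro profiling (assumed meaning): total seconds for `repeats` calls."""
    return timeit.timeit(lambda: func(*args), number=repeats)


def profile_call(func, *args):
    """Macro profiling (assumed meaning): run func under cProfile.

    Returns (result, text report of the 5 entries with the largest
    cumulative time).
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


def peak_memory(func, *args):
    """Memory monitoring: return (result, peak traced bytes during the call)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    print(f"5 calls took {micro_time(slow_sum_of_squares, 200_000):.4f} s")
    _, report = profile_call(slow_sum_of_squares, 200_000)
    print(report)
    _, peak = peak_memory(slow_sum_of_squares, 200_000)
    print(f"peak traced memory: {peak} bytes")
```

The graphical tools mentioned in the objectives would present the same kind of information visually; this sketch only prints text.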
In the R section, students will go through the basic tools of parallel programming, learn how to detect bottlenecks in their code, and perform simulations using parallelization.
In the Python section, we will cover basic ideas and common patterns in parallel computing, including embarrassingly parallel map, unstructured asynchronous submit, and large collections.
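The first two patterns can be sketched with the standard library's `concurrent.futures` module. This is an assumption made for illustration only: the course description does not say which Python library is used, and the helper names `square`, `parallel_map` and `parallel_submit` are hypothetical. The "large collections" pattern is not covered by this sketch.

```python
from concurrent.futures import (
    ProcessPoolExecutor,
    ThreadPoolExecutor,
    as_completed,
)


def square(x):
    """Hypothetical stand-in for an independent, CPU-bound task."""
    return x * x


def parallel_map(func, items, executor_cls=ProcessPoolExecutor, workers=2):
    """Embarrassingly parallel map: same function, independent inputs.

    Results come back in the same order as the inputs.
    """
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(func, items))


def parallel_submit(func, items, executor_cls=ProcessPoolExecutor, workers=2):
    """Asynchronous submit: one future per task, collected as each finishes.

    Returns a dict mapping each input to its result; completion order
    is not guaranteed, which is why the results are keyed by input.
    """
    with executor_cls(max_workers=workers) as pool:
        futures = {pool.submit(func, item): item for item in items}
        return {futures[f]: f.result() for f in as_completed(futures)}


if __name__ == "__main__":
    # Worker processes need the functions to be importable, hence the guard.
    print(parallel_map(square, range(5)))
    print(parallel_submit(square, [3, 4]))
    # Threads share one interpreter; shown only to illustrate the common API.
    print(parallel_map(square, range(5), executor_cls=ThreadPoolExecutor))
```

Both executor classes expose the same `map` and `submit` methods, so the pattern stays the same when the backend changes; for CPU-bound work, processes are usually the relevant choice.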
Prerequisites
Knowledge of R and Python