Running R jobs quickly on many machines (repost)
As we demonstrated in “A gentle introduction to parallel computing in R”, one of the great things about R is how easy it is to take advantage of parallel processing capabilities to speed up calculation. In this note we will show how to move from running jobs on multiple CPUs/cores to running jobs on multiple machines (for even larger scaling and greater speedup). Using the technique on Amazon EC2 even turns your credit card into a supercomputer.
[Image: Colossus supercomputer, The Forbin Project]
R itself is not a language designed for parallel computing. It doesn’t have a lot of great user-exposed parallel constructs. What saves us is that the data science tasks we tend to use R for are themselves very well suited to parallel programming, and many people have prepared very good pragmatic libraries to exploit this. There are three main ways for a user to benefit from library-supplied parallelism:
- Link against superior and parallel libraries such as the Intel BLAS library (supplied on Linux, OSX, and Windows as part of the Microsoft R Open distribution of R). This replaces libraries you are already using with parallel ones, and you get a speedup for free (on appropriate tasks, such as the linear algebra portions of lm()/glm()).
- Ship your modeling tasks out of R into an external parallel system for processing. This is the strategy of systems such as the rx methods from RevoScaleR (now Microsoft Open R), the h2o methods from h2o.ai, or RHadoop.
- Use R’s parallel facility to ship jobs to cooperating R instances. This is the strategy used in “A gentle introduction to parallel computing in R” and in many libraries that sit on top of parallel. This is essentially implementing remote procedure call through sockets or networking.
We are going to write more about the third technique.
The third technique is essentially very coarse-grained remote procedure call. It depends on shipping copies of code and data to remote processes and then returning results. It is ill suited to very small tasks, but very well suited to a reasonable number of moderate to large tasks. This is the strategy used by R’s parallel library and Python’s multiprocessing library (though with Python multiprocessing you pretty much need to bring in additional libraries to move from single machine to cluster computing).
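The coarse-grained pattern can be sketched on a single machine. This is only an illustration under stated assumptions: two local workers stand in for remote R instances, and the function name slowSquare and the chunk count are hypothetical, not part of the original post. Each task ships one moderately sized chunk of data to a worker and returns one result, so the communication cost is spread over a reasonable amount of work.

```r
library(parallel)

# Hypothetical task: stands in for a moderately expensive computation.
slowSquare <- function(chunk) {
  sum(as.numeric(chunk)^2)
}

# ASSUMPTION: two local workers stand in for cooperating R instances.
cl <- makeCluster(2, type = "PSOCK")

# Split the data into a few moderately sized chunks (coarse-grained tasks)
# instead of sending one number at a time.
x <- 1:100000
chunks <- split(x, cut(seq_along(x), 4, labels = FALSE))

# The function and each chunk are serialized, sent to a worker,
# evaluated there, and the result is sent back.
partial <- parLapply(cl, chunks, slowSquare)
total <- sum(unlist(partial))

stopCluster(cl)
print(total)
```

Sending each number as its own task would compute the same total, but the per-task serialization and socket overhead would likely dominate the run time.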
This method may seem less efficient and less sophisticated than shared memory methods, but relying on object transmission means it is in principle very easy to extend the technique from a single machine to many machines (also called “cluster computing”). Here we will demonstrate the R portion of that extension (moving from a single machine to a cluster necessarily brings in a lot of systems/networking/security issues, which we will have to defer).
Here is the complete R portion of the lesson. This assumes you already understand how to configure “ssh” or have a systems person who can help you with the ssh system steps.
Take the examples from “A gentle introduction to parallel computing in R” and, instead of starting your parallel cluster with the command “parallelCluster <- parallel::makeCluster(parallel::detectCores())”, do the following:
Collect a list of addresses of machines you can ssh to. This is the hard part: it depends on your operating system, and it is something you should get help with if you have not tried it before. In this case I am using IPv4 addresses, but when using Amazon EC2 I use hostnames.
In my case my list is:
- My machine (primary): “192.168.1.235”, user “johnmount”
- Another Win-Vector LLC machine: “192.168.1.70”, user “johnmount”
Notice we are not collecting passwords, as we are assuming we have set up proper “authorized_keys” and keypairs in the “.ssh” configurations of all of these machines. We are calling the machine we are using to issue the overall computation “primary.”
It is vital that you try all of these addresses with “ssh” in a terminal shell before trying them with R. Also, the machine address you choose as “primary” must be an address the worker machines can use to reach back to the primary machine (so you can’t use “localhost”, or use an unreachable machine as primary). Try ssh by hand back and forth from primary to all of these machines, and from all of these machines back to your primary, before trying to use ssh with R.
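One way to keep the by-hand checks organized is to generate the ssh commands from the same machine list. The following is a sketch only: the helper name sshCheckCommand is hypothetical, and the choice of the standard OpenSSH options BatchMode and ConnectTimeout is an assumption about what makes a useful check, not something the original post prescribes. The code only builds the command strings; running them is left commented out because it needs network access and configured keys.

```r
# Hypothetical helper: build an ssh command that tests key-based login.
sshCheckCommand <- function(host, user) {
  # BatchMode=yes makes ssh fail instead of prompting for a password;
  # ConnectTimeout limits how long an unreachable host can stall the check.
  sprintf("ssh -o BatchMode=yes -o ConnectTimeout=5 %s@%s true", user, host)
}

machines <- list(
  list(host = "192.168.1.235", user = "johnmount"),
  list(host = "192.168.1.70",  user = "johnmount")
)

commands <- vapply(machines,
                   function(m) sshCheckCommand(m$host, m$user),
                   character(1))
print(commands)

# To actually run the checks from this machine:
# status <- vapply(commands, function(cmd) system(cmd), integer(1))
# An exit status of 0 means the login worked without a password prompt.
```

This covers only the direction from the machine where it is run. The reverse direction (each worker back to the primary) still has to be tried from each worker.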
Now with the system stuff behind us the R part is as follows. Start your cluster with:
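# Address the workers will use to connect back to this machine.
primary <- '192.168.1.235'

# One entry per machine: address, ssh user, and number of workers to start.
machineAddresses <- list(
  list(host=primary, user='johnmount',
       ncore=4),
  list(host='192.168.1.70', user='johnmount',
       ncore=4)
)

# Expand to one (host, user) entry per worker process.
spec <- lapply(machineAddresses,
               function(machine) {
                 rep(list(list(host=machine$host,
                               user=machine$user)),
                     machine$ncore)
               })
spec <- unlist(spec, recursive=FALSE)

# Start the workers over ssh and connect them back to the primary.
parallelCluster <- parallel::makeCluster(type='PSOCK',
                                         master=primary,
                                         spec=spec)
print(parallelCluster)
## socket cluster with 8 nodes on hosts
##                   ‘192.168.1.235’, ‘192.168.1.70’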
And that is it. You can now run your job on many cores on many machines. For the right tasks this represents a substantial speedup. As always, separate your concerns when starting: first get a trivial “hello world” task to work on your cluster, then get a smaller version of your computation to work on a local machine, and only after these throw your real work at the cluster.
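A trivial “hello world” task might look like the following sketch. It assumes a two-worker local cluster as a stand-in so it runs without any ssh setup. On a real multi-machine cluster the same clusterCall would be issued against the cluster object built from the machine list, and the reported host names would then differ between machines.

```r
library(parallel)

# ASSUMPTION: a two-worker local cluster stands in for the
# multi-machine cluster, so no ssh configuration is needed here.
parallelCluster <- makeCluster(2, type = "PSOCK")

# "Hello world": ask every worker which host and process it runs in.
hello <- clusterCall(parallelCluster, function() {
  list(host = Sys.info()[["nodename"]], pid = Sys.getpid())
})

hosts <- vapply(hello, function(h) h$host, character(1))
pids <- vapply(hello, function(h) h$pid, integer(1))
print(table(hosts))

# Shut the workers down when finished.
stopCluster(parallelCluster)
```

If every worker answers, the cluster plumbing works, and any later failure is more likely to be in the computation itself than in the networking.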
As we have mentioned before, with some more system work you can spin up transient Amazon EC2 instances to join your computation. At this point your credit card becomes a supercomputer (though you do have to remember to shut the instances down to prevent extra expenses!).
Reposted from: http://www.win-vector.com/blog/2016/01/running-r-jobs-quickly-on-many-machines/