My "Top 5 R Functions" (repost)

In preparation for an R Workgroup meeting, I started thinking about what my "Top 5 R Functions" would be. I ruled out the functions for basic mechanics (save, load, mean, etc.): they're obviously critical, but every programming language has them, so there's nothing especially "R" about them. I also ruled out the fancy statistical analysis functions like (g)lmer: most people (including me) start using R because they want to run those analyses, so it seemed a little redundant. I started using R because I wanted to do growth curve analysis, so it seems like a weak endorsement to say that I like R because it can do growth curve analysis. No, I like R because it makes (many) somewhat complex data operations really, really easy. Understanding how to take advantage of these R functions is what transformed my view of R from a purely functional one (I need to do analysis X and R has functions for doing analysis X) to seeing it as an all-purpose tool that lets me do data processing, management, analysis, and visualization extremely quickly and easily. So, here are the 5 functions that did that for me:

subset() for making subsets of data (natch)
merge() for combining data sets in a smart and easy way
melt() for converting from wide to long data formats
dcast() for converting from long to wide data formats, and for making summary tables
ddply() for doing split-apply-combine operations, which covers a huge swath of the most tricky data operations
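
To make the list concrete, here is a minimal sketch of all five functions on a toy data set. It is not taken from the original workgroup notes: the data frames (`wide`, `groups`) and their column names are hypothetical, made up for illustration. subset() and merge() are base R, while melt() and dcast() come from the reshape2 package and ddply() from the plyr package, so the sketch assumes both packages are installed.

```r
# Assumes the reshape2 and plyr packages are installed:
# install.packages(c("reshape2", "plyr"))
library(reshape2)  # provides melt() and dcast()
library(plyr)      # provides ddply()

# Hypothetical wide-format data: one row per subject, one column per time point
wide <- data.frame(
  subject = c("s1", "s2", "s3", "s4"),
  t1 = c(10, 12, 9, 11),
  t2 = c(14, 15, 10, 13),
  stringsAsFactors = FALSE
)

# Hypothetical subject-level information to combine with the scores
groups <- data.frame(
  subject = c("s1", "s2", "s3", "s4"),
  group = c("control", "control", "treatment", "treatment"),
  stringsAsFactors = FALSE
)

# merge(): join the two data frames on their shared column
full <- merge(wide, groups, by = "subject")

# subset(): keep rows matching a condition, and only some columns
controls <- subset(full, group == "control", select = c(subject, t1, t2))

# melt(): wide to long, one row per subject-by-time observation
long <- melt(full, id.vars = c("subject", "group"),
             variable.name = "time", value.name = "score")

# dcast(): long back to wide, here as a group-by-time table of means
means <- dcast(long, group ~ time, value.var = "score", fun.aggregate = mean)

# ddply(): split by group and time, summarise each piece, combine the results
stats <- ddply(long, .(group, time), summarise,
               mean_score = mean(score), n = length(score))
```

Printing `means` and `stats` shows the same group-by-time averages in two shapes: dcast() lays them out as a wide summary table, while ddply() returns one row per group-and-time combination and can carry several summary columns at once.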

For anyone interested, I posted my R Workgroup notes on how to use these functions on RPubs. Side note: after a little configuration, I found it super easy to write these using knitr, "knit" them into a webpage, and post that page on RPubs.

Conspicuously missing from the above list is ggplot, which I think deserves a special lifetime achievement award for how it has transformed how I think about data exploration and data visualization. I'm planning that for the next R Workgroup meeting.
