R Programming week 3-Loop functions
Looping on the Command Line
Writing for and while loops is useful when programming, but not particularly easy when working interactively on the command line. R provides several functions that implement looping to make life easier:
lapply: Loop over a list and evaluate a function on each element
sapply: Same as lapply but try to simplify the result
apply: Apply a function over the margins of an array
tapply: Apply a function over subsets of a vector
mapply: Multivariate version of lapply
An auxiliary function split is also useful, particularly in conjunction with lapply
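To make the contrast concrete, here is a minimal sketch that computes the same result with an explicit for loop and with lapply. The list `scores` and the result names are made up for illustration.

```r
# Three small numeric vectors stored in a named list (illustrative data).
scores <- list(a = c(1, 2, 3), b = c(10, 20), c = c(5, 5, 5, 5))

# Explicit loop: pre-allocate a list, then fill it element by element.
loop_means <- vector("list", length(scores))
names(loop_means) <- names(scores)
for (nm in names(scores)) {
  loop_means[[nm]] <- mean(scores[[nm]])
}

# The same computation as a single call.
apply_means <- lapply(scores, mean)

# Both approaches should produce the same named list.
identical(loop_means, apply_means)
```

The loop needs bookkeeping (allocation, names, indexing) that lapply handles for us, which is what makes it convenient at the prompt.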
lapply
lapply takes three arguments: (1) a list X; (2) a function (or the name of a function) FUN; (3) other arguments via its ... argument. If X is not a list, it will be coerced to a list using as.list.
## function (X, FUN, ...)
## {
## FUN <- match.fun(FUN)
## if (!is.vector(X) || is.object(X))
## X <- as.list(X)
## .Internal(lapply(X, FUN))
## }
## <bytecode: 0x7ff7a1951c00>
## <environment: namespace:base>
The actual looping is done internally in C code.
lapply always returns a list, regardless of the class of the input.
> x <- list(a = 1:5, b = rnorm(10))
> lapply(x, mean)
> x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))
> lapply(x, mean)
> x <- 1:4
> lapply(x, runif)
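The third argument of lapply, `...`, is how extra arguments reach FUN. A small sketch, assuming we want uniform draws between 0 and 10 instead of the default 0 to 1 (the seed and the name `draws` are arbitrary choices for illustration):

```r
set.seed(1)  # arbitrary seed so the draws are reproducible
x <- 1:4

# min and max are not arguments of lapply itself;
# they are passed through ... to runif on every call.
draws <- lapply(x, runif, min = 0, max = 10)

# x is an integer vector, not a list, yet the result is still a list:
# one element per value of x, holding 1, 2, 3 and 4 draws respectively.
is.list(draws)
lengths(draws)
```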
lapply and friends make heavy use of anonymous functions.
> x <- list(a = matrix(1:4, 2, 2), b = matrix(1:6, 3, 2))
> x
$a
[,1] [,2]
[1,] 1 3
[2,] 2 4
$b
[,1] [,2]
[1,] 1 4
[2,] 2 5
[3,] 3 6
An anonymous function for extracting the first column of each matrix.
> lapply(x, function(elt) elt[,1])
$a
[1] 1 2
$b
[1] 1 2 3
sapply
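sapply tries to simplify the result of lapply where possible. Compare the two calls on the same list:
> x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))
> lapply(x, mean)
> sapply(x, mean)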
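What "simplify" means depends on the shape of the individual results. The sketch below walks through the three usual cases; the variable names are illustrative only.

```r
x <- list(a = 1:4, b = 5:8)

# Every result has length 1: sapply returns a named vector.
m <- sapply(x, mean)

# Every result has the same length (> 1): sapply returns a matrix,
# with one column per list element.
r <- sapply(x, range)

# Results of differing lengths cannot be simplified: a list comes back,
# just as with lapply.
l <- sapply(list(a = 1:2, b = 1:3), function(v) v * 2)

str(m)
str(r)
str(l)
```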
apply
apply is used to evaluate a function (often an anonymous one) over the margins of an array.
It is most often used to apply a function to the rows or columns of a matrix.
It can be used with general arrays, e.g. taking the average of an array of matrices.
It is not really faster than writing a loop, but it works in one line!
> str(apply)
function (X, MARGIN, FUN, ...)
X is an array
MARGIN is an integer vector indicating which margins should be “retained”.
FUN is a function to be applied
... is for other arguments to be passed to FUN
> x <- matrix(rnorm(200), 20, 10)
> apply(x, 2, mean)
[1] 0.04868268 0.35743615 -0.09104379
[4] -0.05381370 -0.16552070 -0.18192493
[7] 0.10285727 0.36519270 0.14898850
[10] 0.26767260
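For the general-array case mentioned above (averaging an array of matrices), a minimal sketch might look like this. The array `a` is made-up data: ten 2 x 2 matrices stacked along the third dimension.

```r
# Ten 2 x 2 matrices stacked along the third dimension (illustrative data).
a <- array(rnorm(2 * 2 * 10), c(2, 2, 10))

# Retain margins 1 and 2 (rows and columns); collapse the third with mean.
avg <- apply(a, c(1, 2), mean)

dim(avg)

# rowMeans with dims = 2 computes the same element-wise average.
all.equal(avg, rowMeans(a, dims = 2))
```

Here MARGIN = c(1, 2) says which dimensions survive, so the result is a single 2 x 2 matrix.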
col/row sums and means
For sums and means of matrix dimensions, we have some shortcuts.
rowSums = apply(x, 1, sum)
rowMeans = apply(x, 1, mean)
colSums = apply(x, 2, sum)
colMeans = apply(x, 2, mean)
The shortcut functions are much faster, but you won’t notice unless you’re using a large matrix.
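A quick way to convince yourself of both claims is to check the equivalence on a small matrix and time the two versions on a larger one. This is only a sketch: the matrix `big` and its size are arbitrary, and the timings will vary by machine.

```r
x <- matrix(rnorm(200), 20, 10)

# The shortcuts agree with the corresponding apply calls.
all.equal(rowSums(x),  apply(x, 1, sum))
all.equal(colMeans(x), apply(x, 2, mean))

# Timing comparison on a larger matrix (results depend on the machine).
big <- matrix(rnorm(1e6), 1000, 1000)
system.time(apply(big, 1, sum))
system.time(rowSums(big))
```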
Other Ways to Apply
Quantiles of the rows of a matrix.
> x <- matrix(rnorm(200), 20, 10)
> apply(x, 1, quantile, probs = c(0.25, 0.75))
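When FUN returns more than one value, apply stacks the results as columns, which can be surprising at first. A short sketch of what to expect from the call above (the name `q` is illustrative):

```r
x <- matrix(rnorm(200), 20, 10)

# probs is passed through ... to quantile for every row of x.
q <- apply(x, 1, quantile, probs = c(0.25, 0.75))

# One column per row of x, one row per requested quantile.
dim(q)
rownames(q)
```

Use t(q) if you would rather have one row per row of x.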
mapply
mapply is a multivariate apply of sorts which applies a function in parallel over a set of arguments.
> str(mapply)
function (FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)
FUN is a function to apply
... contains arguments to apply over
MoreArgs is a list of other arguments to FUN.
SIMPLIFY indicates whether the result should be simplified
The following is tedious to type
list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1))
Instead we can do
> mapply(rep, 1:4, 4:1)
Vectorizing a Function
> noise <- function(n, mean, sd) {
+ rnorm(n, mean, sd)
+ }
> noise(5, 1, 2)
[1] 2.4831198 2.4790100 0.4855190 -1.2117759
[5] -0.2743532
> noise(1:5, 1:5, 2)
[1] -4.2128648 -0.3989266 4.2507057 1.1572738
[5] 3.7413584
Instant Vectorization
> mapply(noise, 1:5, 1:5, 2)
Which is the same as
list(noise(1, 1, 2), noise(2, 2, 2), noise(3, 3, 2), noise(4, 4, 2), noise(5, 5, 2))
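The sketch below spells out why the plain call noise(1:5, 1:5, 2) does not do what we want, and shows the mapply version, including the MoreArgs route for the constant sd. The names `res` and `res2` are illustrative.

```r
noise <- function(n, mean, sd) {
  rnorm(n, mean, sd)
}

# rnorm treats a vector n as "draw length(n) values", so this yields
# 5 numbers in total rather than 1 + 2 + 3 + 4 + 5 = 15.
length(noise(1:5, 1:5, 2))

# mapply calls noise once per (n, mean) pair; the single sd = 2 is recycled.
res <- mapply(noise, 1:5, 1:5, 2)
lengths(res)

# Constant arguments can also be supplied once through MoreArgs.
res2 <- mapply(noise, n = 1:5, mean = 1:5, MoreArgs = list(sd = 2))
lengths(res2)
```

Because the five results have different lengths, SIMPLIFY = TRUE cannot collapse them and a list is returned.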
tapply
tapply is used to apply a function over subsets of a vector. I don’t know why it’s called tapply.
> str(tapply)
function (X, INDEX, FUN = NULL, ..., simplify = TRUE)
X is a vector
INDEX is a factor or a list of factors (or else they are coerced to factors)
FUN is a function to be applied
... contains other arguments to be passed to FUN
simplify indicates whether the result should be simplified
Take group means.
> x <- c(rnorm(10), runif(10), rnorm(10, 1))
> f <- gl(3, 10)
> f
[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3
[24] 3 3 3 3 3 3 3
Levels: 1 2 3
> tapply(x, f, mean)
1 2 3
0.1144464 0.5163468 1.2463678
Take group means without simplification.
> tapply(x, f, mean, simplify = FALSE)
$`1`
[1] 0.1144464
$`2`
[1] 0.5163468
$`3`
[1] 1.246368
Find group ranges.
> tapply(x, f, range)
$`1`
[1] -1.097309 2.694970
$`2`
[1] 0.09479023 0.79107293
$`3`
[1] 0.4717443 2.5887025
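INDEX can also be a list of factors, in which case tapply returns a table with one dimension per factor. A small sketch with made-up data (the factor labels are arbitrary):

```r
x  <- 1:12
f1 <- gl(2, 6, labels = c("lo", "hi"))          # lo x6, then hi x6
f2 <- gl(3, 2, 12, labels = c("a", "b", "c"))   # a a b b c c, repeated

# One cell per (f1, f2) combination, each holding the sum of its group.
m <- tapply(x, list(f1, f2), sum)
m
```

For example, the "lo"/"a" cell is x[1] + x[2] = 3 and the "hi"/"c" cell is x[11] + x[12] = 23.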
split
split takes a vector or other object and splits it into groups determined by a factor or list of factors.
> str(split)
function (x, f, drop = FALSE, ...)
x is a vector (or list) or data frame
f is a factor (or coerced to one) or a list of factors
drop indicates whether empty factor levels should be dropped
A common idiom is split followed by an lapply.
> lapply(split(x, f), mean)
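For a simple summary like the mean, split followed by sapply gives the same numbers as tapply. A minimal sketch (the seed is arbitrary, and the result names are illustrative):

```r
set.seed(42)  # arbitrary seed so the example is reproducible
x <- c(rnorm(10), runif(10), rnorm(10, 1))
f <- gl(3, 10)

by_split  <- sapply(split(x, f), mean)   # named numeric vector
by_tapply <- tapply(x, f, mean)          # 1-d array carrying the same values

all.equal(as.vector(by_tapply), unname(by_split))
names(by_split)
```

The split route pays off when the per-group computation is more involved than a single summary function.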
Splitting a Data Frame
> library(datasets)
> head(airquality)
> s <- split(airquality, airquality$Month)
> lapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))
> sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))
> sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")], na.rm = TRUE))
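The difference between the last two calls is the handling of missing values: airquality has NAs in Ozone and Solar.R, so without na.rm = TRUE those means come back as NA. A sketch that makes this visible (the names `raw` and `clean` are illustrative):

```r
library(datasets)

# One data frame per month; the list names are the month numbers.
s <- split(airquality, airquality$Month)
names(s)

cols <- c("Ozone", "Solar.R", "Wind")

# Without na.rm, any column containing an NA yields an NA mean.
raw <- sapply(s, function(d) colMeans(d[, cols]))

# With na.rm = TRUE, missing values are dropped before averaging.
clean <- sapply(s, function(d) colMeans(d[, cols], na.rm = TRUE))

dim(clean)      # 3 variables by 5 months
anyNA(raw)
anyNA(clean)
```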
Splitting on More than One Level
> x <- rnorm(10)
> f1 <- gl(2, 5)
> f2 <- gl(5, 2)
Interactions can create empty levels.
> str(split(x, list(f1, f2)))
Empty levels can be dropped
> str(split(x, list(f1, f2), drop = TRUE))
List of 6
$ 1.1: num [1:2] -0.378 0.445
$ 1.2: num [1:2] 1.4066 0.0166
$ 1.3: num -0.355
$ 2.3: num 0.315
$ 2.4: num [1:2] -0.907 0.723
$ 2.5: num [1:2] 0.732 0.360
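To see where the empty groups come from, it helps to count them. The sketch below uses 1:10 instead of random numbers so the group contents are easy to follow; `all_groups` and `kept_groups` are illustrative names.

```r
x  <- 1:10
f1 <- gl(2, 5)   # 1 1 1 1 1 2 2 2 2 2
f2 <- gl(5, 2)   # 1 1 2 2 3 3 4 4 5 5

# Two levels times five levels gives ten possible combinations.
nlevels(interaction(f1, f2))

all_groups  <- split(x, list(f1, f2))
kept_groups <- split(x, list(f1, f2), drop = TRUE)

length(all_groups)    # includes the combinations that never occur
length(kept_groups)   # only the combinations observed in the data
names(kept_groups)
```

Only six of the ten combinations actually occur, which matches the six groups in the output above.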