Looping on the Command Line

Writing for and while loops is useful when programming but not particularly easy when working interactively on the command line. There are some functions that implement looping to make life easier:

lapply: Loop over a list and evaluate a function on each element

sapply: Same as lapply but try to simplify the result

apply: Apply a function over the margins of an array

tapply: Apply a function over subsets of a vector

mapply: Multivariate version of lapply

An auxiliary function, split, is also useful, particularly in conjunction with lapply.

lapply

lapply takes three arguments: (1) a list X; (2) a function (or the name of a function) FUN; (3) other arguments via its ... argument. If X is not a list, it will be coerced to a list using as.list.

## function (X, FUN, ...)

## {

## FUN <- match.fun(FUN)

## if (!is.vector(X) || is.object(X))

## X <- as.list(X)

## .Internal(lapply(X, FUN))

## }

## <bytecode: 0x7ff7a1951c00>

## <environment: namespace:base>

The actual looping is done internally in C code.

lapply always returns a list, regardless of the class of the input.

x <- list(a = 1:5, b = rnorm(10))

lapply(x, mean)

x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))

lapply(x, mean)

> x <- 1:4

> lapply(x, runif)
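The third argument of lapply, the ... argument, can be illustrated with the same runif call. The following is a minimal sketch; the min and max values and the fixed seed are assumptions chosen only for the illustration.

```r
set.seed(1)  # fixed seed only so the sketch is reproducible
x <- 1:4

# min and max are not arguments of lapply; they are forwarded
# to runif() through lapply's ... argument
draws <- lapply(x, runif, min = 0, max = 10)

# element i holds i uniform draws between 0 and 10
lengths(draws)
```

Each element of x is passed as the first argument of runif (the number of draws), and everything after FUN is handed on unchanged.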

lapply and friends make heavy use of anonymous functions.

> x <- list(a = matrix(1:4, 2, 2), b = matrix(1:6, 3, 2))

> x

$a

[,1] [,2]

[1,] 1 3

[2,] 2 4

$b

[,1] [,2]

[1,] 1 4

[2,] 2 5

[3,] 3 6

An anonymous function for extracting the first column of each matrix.

> lapply(x, function(elt) elt[,1])

$a

[1] 1 2

$b

[1] 1 2 3

sapply

> x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))

> lapply(x, mean)
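For comparison, here is a small sketch of what sapply does with the same input. It assumes the usual simplification behaviour: results of length 1 are collapsed into a vector, results of equal length greater than 1 become a matrix, and anything else stays a list. The seed is an assumption added for reproducibility.

```r
set.seed(1)  # fixed seed only so the sketch is reproducible
x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))

as_list   <- lapply(x, mean)  # always a list
as_vector <- sapply(x, mean)  # each mean has length 1, so a named numeric vector

is.list(as_list)
is.numeric(as_vector)

# every result has length 2, so sapply returns a 2 x 4 matrix
ranges <- sapply(x, range)
dim(ranges)
```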

apply

apply is used to evaluate a function (often an anonymous one) over the margins of an array.

It is most often used to apply a function to the rows or columns of a matrix.

It can be used with general arrays, e.g. taking the average of an array of matrices.

It is not really faster than writing a loop, but it works in one line!

> str(apply)

function (X, MARGIN, FUN, ...)

X is an array

MARGIN is an integer vector indicating which margins should be “retained”.

FUN is a function to be applied

... is for other arguments to be passed to FUN

> x <- matrix(rnorm(200), 20, 10)

> apply(x, 2, mean)

[1] 0.04868268 0.35743615 -0.09104379

[4] -0.05381370 -0.16552070 -0.18192493

[7] 0.10285727 0.36519270 0.14898850

[10] 0.26767260

col/row sums and means

For sums and means of matrix dimensions, we have some shortcuts.

rowSums = apply(x, 1, sum)

rowMeans = apply(x, 1, mean)

colSums = apply(x, 2, sum)

colMeans = apply(x, 2, mean)

The shortcut functions are much faster, but you won’t notice unless you’re using a large matrix.
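A quick way to convince yourself that the shortcuts and the apply calls agree is to compare them on a random matrix. This is only a sketch; the seed and the name checks are assumptions for the illustration.

```r
set.seed(1)  # fixed seed only so the sketch is reproducible
x <- matrix(rnorm(200), 20, 10)

# compare each shortcut with its apply() counterpart
checks <- c(
  rowSums  = isTRUE(all.equal(rowSums(x),  apply(x, 1, sum))),
  rowMeans = isTRUE(all.equal(rowMeans(x), apply(x, 1, mean))),
  colSums  = isTRUE(all.equal(colSums(x),  apply(x, 2, sum))),
  colMeans = isTRUE(all.equal(colMeans(x), apply(x, 2, mean)))
)
checks
```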

Other Ways to Apply

Quantiles of the rows of a matrix.

> x <- matrix(rnorm(200), 20, 10)

> apply(x, 1, quantile, probs = c(0.25, 0.75))
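Two things are worth checking here. When FUN returns more than one value, apply stacks the results as columns, so the quantile call gives a matrix with one column per row of x. And, as mentioned earlier, apply also works on higher-dimensional arrays. The sketch below shows both; the array a and the seed are assumptions for the illustration.

```r
set.seed(1)  # fixed seed only so the sketch is reproducible
x <- matrix(rnorm(200), 20, 10)

# two quantiles per row: the result is a 2 x 20 matrix
q <- apply(x, 1, quantile, probs = c(0.25, 0.75))
dim(q)

# an array of ten 2 x 2 matrices
a <- array(rnorm(2 * 2 * 10), c(2, 2, 10))

# keep margins 1 and 2, average over the third dimension
avg <- apply(a, c(1, 2), mean)

# rowMeans with dims = 2 computes the same average
avg_fast <- rowMeans(a, dims = 2)
all.equal(avg, avg_fast)
```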

mapply

mapply is a multivariate apply of sorts which applies a function in parallel over a set of arguments.

> str(mapply)

function (FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)

FUN is a function to apply

... contains arguments to apply over

MoreArgs is a list of other arguments to FUN

SIMPLIFY indicates whether the result should be simplified

SIMPLIFY indicates whether the result should be simplified

The following is tedious to type

list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1))

Instead we can do
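The call itself seems to be missing at this point. A sketch that reproduces the list above with mapply could look like this (the names by_hand and by_mapply are assumptions for the illustration):

```r
# the tedious version from above
by_hand <- list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1))

# mapply pairs up the arguments: rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1)
by_mapply <- mapply(rep, 1:4, 4:1)

# same values; the results differ in length, so mapply returns a list
all.equal(by_hand, by_mapply)
```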

Vectorizing a Function

> noise <- function(n, mean, sd) {

+ rnorm(n, mean, sd)

+ }

> noise(5, 1, 2)

[1] 2.4831198 2.4790100 0.4855190 -1.2117759

[5] -0.2743532

> noise(1:5, 1:5, 2)

[1] -4.2128648 -0.3989266 4.2507057 1.1572738

[5] 3.7413584

Instant Vectorization

> mapply(noise, 1:5, 1:5, 2)

Which is the same as

list(noise(1, 1, 2), noise(2, 2, 2), noise(3, 3, 2), noise(4, 4, 2), noise(5, 5, 2))
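The equivalence can be checked by resetting the random seed before each version. The sketch below also passes the constant sd through MoreArgs, which is an alternative way of supplying arguments that should not be looped over. The seed value and the variable names are assumptions.

```r
noise <- function(n, mean, sd) {
  rnorm(n, mean, sd)
}

set.seed(10)
vectorized <- mapply(noise, 1:5, 1:5, 2)

set.seed(10)
by_hand <- list(noise(1, 1, 2), noise(2, 2, 2), noise(3, 3, 2),
                noise(4, 4, 2), noise(5, 5, 2))

# same draws in the same order
all.equal(vectorized, by_hand)

# sd is fixed for every call, so it can also go into MoreArgs
set.seed(10)
with_more_args <- mapply(noise, 1:5, 1:5, MoreArgs = list(sd = 2))
all.equal(vectorized, with_more_args)
```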

tapply

tapply is used to apply a function over subsets of a vector. I don’t know why it’s called tapply.

> str(tapply)

function (X, INDEX, FUN = NULL, ..., simplify = TRUE)

X is a vector

INDEX is a factor or a list of factors (or else they are coerced to factors)

FUN is a function to be applied

... contains other arguments to be passed to FUN

simplify indicates whether the result should be simplified

Take group means.

> x <- c(rnorm(10), runif(10), rnorm(10, 1))

> f <- gl(3, 10)

> f

[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3

[24] 3 3 3 3 3 3 3

Levels: 1 2 3

> tapply(x, f, mean)

1 2 3

0.1144464 0.5163468 1.2463678

Take group means without simplification.

> tapply(x, f, mean, simplify = FALSE)

$`1`

[1] 0.1144464

$`2`

[1] 0.5163468

$`3`

[1] 1.246368

Find group ranges.

> tapply(x, f, range)

$`1`

[1] -1.097309 2.694970

$`2`

[1] 0.09479023 0.79107293

$`3`

[1] 0.4717443 2.5887025

split

split takes a vector or other object and splits it into groups determined by a factor or list of factors.

> str(split)
function (x, f, drop = FALSE, ...)

x is a vector (or list) or data frame

f is a factor (or coerced to one) or a list of factors

drop indicates whether empty factor levels should be dropped

A common idiom is split followed by an lapply.

> lapply(split(x, f), mean)
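For a simple summary like the mean, split followed by sapply gives the same numbers as tapply. A small sketch of that comparison, with an assumed seed and variable names:

```r
set.seed(42)  # fixed seed only so the sketch is reproducible
x <- c(rnorm(10), runif(10), rnorm(10, 1))
f <- gl(3, 10)

by_split  <- sapply(split(x, f), mean)  # named numeric vector
by_tapply <- tapply(x, f, mean)         # 1-d array with the same names

all.equal(as.vector(by_split), as.vector(by_tapply))
identical(names(by_split), names(by_tapply))
```

The advantage of split is that the intermediate list of groups is available for anything else you want to do with it, not just a single summary function.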

Splitting a Data Frame

> library(datasets)

> head(airquality)

> s <- split(airquality, airquality$Month)

> lapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))

> sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))

> sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")], na.rm = TRUE))
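The reason for the last call is that colMeans returns NA for any column containing a missing value unless na.rm = TRUE is given. The effect is easier to see on a tiny data frame; the data frame df below is made up for the illustration and is not part of airquality.

```r
# hypothetical data: one missing ozone reading in month 5
df <- data.frame(month = c(5, 5, 6, 6),
                 ozone = c(41, NA, 30, 20),
                 wind  = c(7, 8, 9, 11))

s <- split(df, df$month)

# without na.rm the month-5 ozone mean is NA
with_na <- sapply(s, function(d) colMeans(d[, c("ozone", "wind")]))

# with na.rm = TRUE the missing value is dropped before averaging
without_na <- sapply(s, function(d) colMeans(d[, c("ozone", "wind")], na.rm = TRUE))

with_na
without_na
```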

Splitting on More than One Level

> x <- rnorm(10)

> f1 <- gl(2, 5)

> f2 <- gl(5, 2)

Interactions can create empty levels.

> str(split(x, list(f1, f2)))
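To see where the empty levels come from, count the observations in each group. With two levels in f1 and five in f2 there are ten possible combinations, but only six of them occur in the data. A sketch, with an assumed seed and variable names:

```r
set.seed(1)  # fixed seed only so the sketch is reproducible
x  <- rnorm(10)
f1 <- gl(2, 5)
f2 <- gl(5, 2)

groups <- split(x, list(f1, f2))
sizes  <- lengths(groups)

length(groups)             # all 2 * 5 combinations are present as list elements
names(groups)[sizes == 0]  # the combinations that never occur

# drop = TRUE keeps only the combinations that actually occur
kept <- split(x, list(f1, f2), drop = TRUE)
length(kept)
```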


Empty levels can be dropped.

> str(split(x, list(f1, f2), drop = TRUE))

List of 6

$ 1.1: num [1:2] -0.378 0.445

$ 1.2: num [1:2] 1.4066 0.0166

$ 1.3: num -0.355

$ 2.3: num 0.315

$ 2.4: num [1:2] -0.907 0.723

$ 2.5: num [1:2] 0.732 0.360
