Data Manipulation with dplyr in R

select

select(data,变量名)

The filter and arrange verbs

arrange

counties_selected <- counties %>%
select(state, county, population, private_work, public_work, self_employed) # Add a verb to sort in descending order of public_work
counties_selected %>%arrange(desc(public_work))

filter

counties_selected <- counties %>%
select(state, county, population) # Filter for counties in the state of California that have a population above 1000000
counties_selected %>%
filter(state == "California",
population > 1000000)
#筛选多个变量
filter(id %in% c("a","b","c"...)) 存在
filter(id %in% c("a","b","c"...)) 不存在

fct_relevel {forcats}

Reorder factor levels by hand

排序,order不好使的时候

f <- factor(c("a", "b", "c", "d"), levels = c("b", "c", "d", "a"))
fct_relevel(f)
fct_relevel(f, "a")
fct_relevel(f, "b", "a") # Move to the third position
fct_relevel(f, "a", after = 2) # Relevel to the end
fct_relevel(f, "a", after = Inf)
fct_relevel(f, "a", after = 3) # Revel with a function
fct_relevel(f, sort)
fct_relevel(f, sample)
fct_relevel(f, rev)

Filtering and arranging

 counties_selected <- counties %>%
select(state, county, population, private_work, public_work, self_employed)
>
> # Filter for Texas and more than 10000 people; sort in descending order of private_work
> counties_selected %>%filter(state=='Texas',population>10000)%>%arrange(desc(private_work))
# A tibble: 169 x 6
state county population private_work public_work self_employed
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 Texas Gregg 123178 84.7 9.8 5.4
2 Texas Collin 862215 84.1 10 5.8
3 Texas Dallas 2485003 83.9 9.5 6.4
4 Texas Harris 4356362 83.4 10.1 6.3
5 Texas Andrews 16775 83.1 9.6 6.8
6 Texas Tarrant 1914526 83.1 11.4 5.4
7 Texas Titus 32553 82.5 10 7.4
8 Texas Denton 731851 82.2 11.9 5.7
9 Texas Ector 149557 82 11.2 6.7
10 Texas Moore 22281 82 11.7 5.9
# ... with 159 more rows

Mutate

counties_selected <- counties %>%
select(state, county, population, public_work) # Sort in descending order of the public_workers column
counties_selected %>%
mutate(public_workers = public_work * population / 100) %>%arrange(desc(public_workers))
counties %>%
# Select the five columns
select(state, county, population, men, women) %>%
# Add the proportion_men variable
mutate(proportion_men = men / population) %>%
# Filter for population of at least 10,000
filter(population >= 10000) %>%
# Arrange proportion of men in descending order
arrange(desc(proportion_men))

The count verb

counties_selected %>%count(region,sort=TRUE)
counties_selected %>%count(state,wt=citizens,sort=TRUE)

Summarizing

# Summarize to find minimum population, maximum unemployment, and average income
counties_selected %>%summarize(
min_population=min(population),
max_unemployment=max(unemployment),
average_income=mean(income)
)
# Add a density column, then sort in descending order
counties_selected %>%
group_by(state) %>%
summarize(total_area = sum(land_area),
total_population = sum(population),
density=total_population/total_area) %>%arrange(desc(density))

发现了,归根到底是一种函数关系,看看该怎样处理这个函数比较简单,如果写不出来,可能和小学的时候应用题写不出来有关系

top_n

按照优先级来筛选

# Extract the most populated row for each state
counties_selected %>%
group_by(state, metro) %>%
summarize(total_pop = sum(population)) %>%
top_n(1, total_pop)

Selecting

Using the select verb, we can answer interesting questions about our dataset by focusing in on related groups of verbs.

The colon (

Data Manipulation with dplyr in R的更多相关文章

  1. Data manipulation primitives in R and Python

    Data manipulation primitives in R and Python Both R and Python are incredibly good tools to manipula ...

  2. Best packages for data manipulation in R

    dplyr and data.table are amazing packages that make data manipulation in R fun. Both packages have t ...

  3. The dplyr package has been updated with new data manipulation commands for filters, joins and set operations.(转)

    dplyr 0.4.0 January 9, 2015 in Uncategorized I’m very pleased to announce that dplyr 0.4.0 is now av ...

  4. java.sql.SQLException: Can not issue data manipulation statements with executeQuery().

    1.错误描写叙述 java.sql.SQLException: Can not issue data manipulation statements with executeQuery(). at c ...

  5. Can not issue data manipulation statements with executeQuery()错误解决

    转: Can not issue data manipulation statements with executeQuery()错误解决 2012年03月27日 15:47:52 katalya 阅 ...

  6. 数据库原理及应用-SQL数据操纵语言(Data Manipulation Language)和嵌入式SQL&存储过程

    2018-02-19 18:03:54 一.数据操纵语言(Data Manipulation Language) 数据操纵语言是指插入,删除和更新语言. 二.视图(View) 数据库三级模式,两级映射 ...

  7. Can not issue data manipulation statements with executeQuery().解决方案

    这个错误提示是说无法发行sql语句到指定的位置 错误写法: 正确写法: excuteQuery是查询语句,而我要调用的是更新的语句,所以这样数据库很为难到底要干嘛,实际我想用的是更新,但是我写成了查询 ...

  8. Can not issue data manipulation statements with executeQuery()的解决方案

     Can not issue data manipulation statements with executeQuery() 报错的解决方案: 把“ResultSet rs = statement. ...

  9. 【转】Hive Data Manipulation Language

    Hive Data Manipulation Language Hive Data Manipulation Language Loading files into tables Syntax Syn ...

随机推荐

  1. Mysql 两种引擎的区别

    MyISAM与InnoDB的区别是什么? 1. 存储结构 MyISAM:每个MyISAM在磁盘上存储成三个文件.第一个文件的名字以表的名字开始,扩展名指出文件类型..frm文件存储表定义.数据文件的扩 ...

  2. centos7网卡启动不了

    网上查了很多资料了解网卡启动不了的原因,今天总结一下几种网卡启动不了的解决方案,以备参考. systemctl restart network         //重启网卡 返回报错: Restart ...

  3. VISIO 的一些技巧

    1.复制绘图 如果格式改变,在“设计”选项卡里将“将主题运用于新建的形状”前面的√去掉

  4. Linux系统下的CPU、内存、IO、网络的压力测试

    本文转载自:小豆芽博客 一.对CPU进行简单测试: 1.通过bc命令计算特别函数 例:计算圆周率 echo "scale=5000; 4*a(1)" | bc -l -q MATH ...

  5. qt creator源码全方面分析(1)

    目录介绍 首先我们对软件源代码根目录下的各个重要文件(夹)做一个简单的介绍,对整体有一个大概的了解. 下面对目录及其内容做一个大概的初步的介绍,后面我尽量按照目录顺序进行依次介绍,当然可能会有一些交叉 ...

  6. py 二级习题(加密与解密)

    题目: 1.比如说,我想    “我喜欢月月”  这句话加密即:将字符串中的每个字符的unicode值全都向后移动三位,即unicode 值加3,然后输出. 2.将按照上述规则加密的文字解密即:将字符 ...

  7. bugkuCTF-管理员系统(IP伪造)

    题目地址:http://123.206.31.85:1003/ 登进去是一个管理员后台登录的样子 试了sql的万能密码,发现进不了,而且下面还报错了ip禁止 禁止了我们的ip,但是他本地的ip肯定没有 ...

  8. JavaScript 中的构造函数

    典型的面向对象编程语言(比如C++和Java),存在“类”(class)这个概念.所谓“类”就是对象的模板,对象就是“类”的实例.但是,在JavaScript语言的对象体系,不是基于“类”的,而是基于 ...

  9. vue自学入门-4(vue slot)

    vue自学入门-1(Windows下搭建vue环境) vue自学入门-2(vue创建项目) vue自学入门-3(vue第一个例子) vue自学入门-4(vue slot) vue自学入门-5(vuex ...

  10. Python论做游戏外挂,Python输过谁?

    玩过电脑游戏的同学对于外挂肯定不陌生,但是你在用外挂的时候有没有想过如何做一个外挂呢? 我打开了4399小游戏网,点开了一个不知名的游戏,唔,做寿司的,有材料在一边,客人过来后说出他们的要求,你按照菜 ...