R Programming week1-Data Type
Objects
R has five basic or “atomic” classes of objects:
character
numeric (real numbers)
integer
complex
logical (TRUE/FALSE)
The most basic object is a vector
A vector can only contain objects of the same class
BUT: The one exception is a list, which is represented as a vector but can contain objects of
different classes (indeed, that’s usually why we use them)
Empty vectors can be created with the vector() function.
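As a quick check of the points above, the following sketch inspects the class of one value of each atomic type, creates an empty vector with vector(), and builds a list holding mixed classes. The variable names and values are illustrative and not taken from the course material.

```r
# One value of each atomic class (illustrative values)
ch  <- "hello"
num <- 3.14
int <- 42L
cpx <- 2 + 3i
lgl <- TRUE

classes <- c(class(ch), class(num), class(int), class(cpx), class(lgl))
print(classes)

# vector() with no arguments returns an empty logical vector
empty <- vector()
print(class(empty))
print(length(empty))

# A list may hold elements of different classes
mixed <- list(ch, num, lgl)
mixed_classes <- sapply(mixed, class)
print(mixed_classes)
```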
Numbers
Numbers in R are generally treated as numeric objects (i.e. double precision real numbers)
If you explicitly want an integer, you need to specify the L suffix
Ex: Entering 1 gives you a numeric object; entering 1L explicitly gives you an integer.
There is also a special number Inf which represents infinity; e.g. 1 / 0; Inf can be used in
ordinary calculations; e.g. 1 / Inf is 0
The value NaN represents an undefined value (“not a number”); e.g. 0 / 0; NaN can also be
thought of as a missing value (more on that later)
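The behaviour described above can be verified with a short sketch. The variable names are made up for illustration.

```r
a <- 1      # numeric (double precision)
b <- 1L     # integer, because of the L suffix
print(class(a))
print(class(b))

pos_inf <- 1 / 0          # Inf
print(pos_inf)
print(1 / pos_inf)        # Inf can take part in ordinary arithmetic

undefined <- 0 / 0        # NaN
print(is.nan(undefined))
```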
Attributes
R objects can have attributes
names, dimnames
dimensions (e.g. matrices, arrays)
class
length
other user-defined attributes/metadata
Attributes of an object can be accessed using the attributes() function.
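A minimal sketch of inspecting and setting attributes follows. The object names and the "units" attribute are hypothetical and chosen only for illustration. Note that attributes() lists only explicitly stored attributes such as dim and names; length and class are queried with length() and class().

```r
m <- matrix(1:6, nrow = 2, ncol = 3)
named_vec <- c(a = 1, b = 2)

print(attributes(m))           # contains $dim
print(attributes(named_vec))   # contains $names

# Attach a user-defined attribute ("units" is a made-up name)
attr(named_vec, "units") <- "cm"
print(attributes(named_vec))

# length and class have their own accessor functions
print(length(m))
print(class(m))
```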
Creating Vectors
The c() function can be used to create vectors of objects.
Using the vector() function
> x <- vector("numeric", length = 10)
> x
[1] 0 0 0 0 0 0 0 0 0 0
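For comparison, here is a sketch of creating vectors of each atomic class with c(), alongside vector() for other modes. The variable names are illustrative.

```r
num_vec <- c(0.5, 0.6)          # numeric
lgl_vec <- c(TRUE, FALSE)       # logical
chr_vec <- c("a", "b", "c")     # character
int_vec <- 9:29                 # integer (the : operator yields integers)
cpx_vec <- c(1 + 0i, 2 + 4i)    # complex

print(class(num_vec))
print(class(lgl_vec))
print(class(chr_vec))
print(class(int_vec))
print(class(cpx_vec))

# vector() takes a mode and a length; elements get the mode's default value
chr_empty <- vector("character", length = 3)   # three empty strings
lgl_empty <- vector("logical", length = 2)     # FALSE FALSE
print(chr_empty)
print(lgl_empty)
```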
Mixing Objects
> y <- c(1.7, "a") ## character
> y <- c(TRUE, 2) ## numeric
> y <- c("a", TRUE) ## character
When different objects are mixed in a vector, coercion occurs so that every element in the vector is
of the same class.
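To see what implicit coercion actually produces, the sketch below prints the coerced values and their classes for the three cases above. The names y1, y2 and y3 are illustrative.

```r
y1 <- c(1.7, "a")    # numeric and character -> character
y2 <- c(TRUE, 2)     # logical and numeric   -> numeric
y3 <- c("a", TRUE)   # character and logical -> character

print(y1)   # the number becomes the string "1.7"
print(y2)   # TRUE becomes 1
print(y3)   # TRUE becomes the string "TRUE"

print(class(y1))
print(class(y2))
print(class(y3))
```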
Explicit Coercion
Objects can be explicitly coerced from one class to another using the as.* functions, if available.
> x <- 0:6
> class(x)
[1] "integer"
> as.numeric(x)
[1] 0 1 2 3 4 5 6
> as.logical(x)
[1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE
> as.character(x)
[1] "0" "1" "2" "3" "4" "5" "6"
Nonsensical coercion results in NAs.
> x <- c("a", "b", "c")
> as.numeric(x)
[1] NA NA NA
Warning message:
NAs introduced by coercion
> as.logical(x)
[1] NA NA NA
> as.complex(x)
[1] NA NA NA
Warning message:
NAs introduced by coercion
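One practical use of this behaviour, sketched here under the assumption that the input is a character vector of possibly malformed numbers, is to find the entries that failed to convert. The names raw, converted and failed are hypothetical.

```r
raw <- c("1", "2.5", "abc")

# suppressWarnings() hides the "NAs introduced by coercion" warning
converted <- suppressWarnings(as.numeric(raw))
print(converted)

# Entries that could not be coerced show up as NA
failed <- raw[is.na(converted)]
print(failed)
```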
Lists
Lists are a special type of vector that can contain elements of different classes. Lists are a very
important data type in R and you should get to know them well.
> x <- list(1, "a", TRUE, 1 + 4i)
> x
[[1]]
[1] 1
[[2]]
[1] "a"
[[3]]
[1] TRUE
[[4]]
[1] 1+4i
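The following sketch confirms that each element of the list keeps its own class. The double-bracket operator used to pull out a single element is standard R but is not covered in the notes above.

```r
x <- list(1, "a", TRUE, 1 + 4i)

# Class of every element
element_classes <- sapply(x, class)
print(element_classes)

print(length(x))

# [[ ]] extracts the element itself rather than a one-element list
second <- x[[2]]
print(second)
print(class(second))
```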
Matrices
Matrices are vectors with a dimension attribute. The dimension attribute is itself an integer vector of length 2 (nrow, ncol).
> m <- matrix(nrow = 2, ncol = 3)
> m
     [,1] [,2] [,3]
[1,]   NA   NA   NA
[2,]   NA   NA   NA
> dim(m)
[1] 2 3
> attributes(m)
$dim
[1] 2 3
Matrices (cont’d)
Matrices are constructed column-wise, so entries can be thought of starting in the “upper left” corner and running down the columns.
> m <- matrix(1:6, nrow = 2, ncol = 3)
> m
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
Matrices can also be created directly from vectors by adding a dimension attribute.
> m <- 1:10
> m
[1] 1 2 3 4 5 6 7 8 9 10
> dim(m) <- c(2, 5)
> m
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
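The column-wise fill order can be checked by indexing individual cells, as in the sketch below. The byrow argument of matrix() is shown as an alternative and is not part of the notes above. Variable names are illustrative.

```r
m <- matrix(1:6, nrow = 2, ncol = 3)

# Values fill down the first column, then the second, and so on
print(m[2, 1])   # second value of 1:6
print(m[1, 2])   # third value of 1:6

# Setting dim on a plain vector gives the same layout
v <- 1:6
dim(v) <- c(2, 3)
print(all(m == v))

# byrow = TRUE fills across rows instead
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
print(by_row[1, ])
```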
cbind-ing and rbind-ing
Matrices can be created by column-binding or row-binding with cbind() and rbind().
> x <- 1:3
> y <- 10:12
> cbind(x, y)
     x  y
[1,] 1 10
[2,] 2 11
[3,] 3 12
> rbind(x, y)
  [,1] [,2] [,3]
x    1    2    3
y   10   11   12
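A short sketch of how the two results relate: the vector names become column names under cbind() and row names under rbind(), and one result is the transpose of the other. The names cols and rows are illustrative.

```r
x <- 1:3
y <- 10:12

cols <- cbind(x, y)   # 3 rows, 2 columns
rows <- rbind(x, y)   # 2 rows, 3 columns

print(dim(cols))
print(dim(rows))

# The argument names label the bound dimension
print(colnames(cols))
print(rownames(rows))

# Transposing one gives the values of the other
print(all(t(cols) == rows))
```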
Factors
Factors are used to represent categorical data. Factors can be unordered or ordered. One can think
of a factor as an integer vector where each integer has a label.
Factors are treated specially by modelling functions like lm() and glm()
Using factors with labels is better than using integers because factors are self-describing; having
a variable that has values “Male” and “Female” is better than a variable that has values 1 and 2.
> x <- factor(c("yes", "yes", "no", "yes", "no"))
> x
[1] yes yes no yes no
Levels: no yes
> table(x)
x
 no yes
  2   3
> unclass(x)
[1] 2 2 1 2 1
attr(,"levels")
[1] "no" "yes"
The order of the levels can be set using the levels argument to factor(). This can be important
in linear modelling because the first level is used as the baseline level.
> x <- factor(c("yes", "yes", "no", "yes", "no"),
+             levels = c("yes", "no"))
> x
[1] yes yes no yes no
Levels: yes no
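The effect of the levels argument on the underlying integer codes can be seen in the sketch below. The variable names are illustrative.

```r
answers <- c("yes", "yes", "no", "yes", "no")

default_order  <- factor(answers)                           # levels sorted: no, yes
explicit_order <- factor(answers, levels = c("yes", "no"))  # yes is the first level

print(levels(default_order))
print(levels(explicit_order))

# The integer codes follow the level order
print(as.integer(default_order))
print(as.integer(explicit_order))

# table() reports counts in level order
print(table(explicit_order))
```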
Missing Values
Missing values are denoted by NA; NaN denotes the result of an undefined mathematical operation.
is.na() tests whether each element of an object is NA
is.nan() is used to test for NaN
NA values have a class also, so there are integer NA, character NA, etc.
A NaN value is also NA but the converse is not true
> x <- c(1, 2, NA, 10, 3)
> is.na(x)
[1] FALSE FALSE TRUE FALSE FALSE
> is.nan(x)
[1] FALSE FALSE FALSE FALSE FALSE
> x <- c(1, 2, NaN, NA, 4)
> is.na(x)
[1] FALSE FALSE TRUE TRUE FALSE
> is.nan(x)
[1] FALSE FALSE TRUE FALSE FALSE
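The statement that NA values have a class can be illustrated with R's typed NA constants, as sketched below. The na.rm argument of mean() is a standard way to skip missing values and is not part of the notes above.

```r
# Typed NA constants
print(class(NA))              # plain NA is logical
print(class(NA_integer_))
print(class(NA_real_))
print(class(NA_character_))

x <- c(1, 2, NaN, NA, 4)

# is.na() counts both NA and NaN; is.nan() counts only NaN
n_missing <- sum(is.na(x))
n_nan     <- sum(is.nan(x))
print(n_missing)
print(n_nan)

# na.rm = TRUE drops NA and NaN before computing the mean
clean_mean <- mean(x, na.rm = TRUE)
print(clean_mean)
```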
Data Frames
Data frames are used to store tabular data
They are represented as a special type of list where every element of the list has to have the
same length
Each element of the list can be thought of as a column and the length of each element of the list
is the number of rows
Unlike matrices, data frames can store different classes of objects in each column (just like lists);
matrices must have every element be the same class
Data frames also have a special attribute called row.names
Data frames are usually created by calling read.table() or read.csv()
A data frame can be converted to a matrix by calling data.matrix()
> x <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))
> x
  foo   bar
1   1  TRUE
2   2  TRUE
3   3 FALSE
4   4 FALSE
> nrow(x)
[1] 4
> ncol(x)
[1] 2
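The list-like nature of a data frame and the conversion with data.matrix() can be sketched as follows. The names df and mat are illustrative.

```r
df <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))

# Each column keeps its own class
column_classes <- sapply(df, class)
print(column_classes)

# A data frame is a list of equal-length columns
print(is.list(df))
print(row.names(df))

# data.matrix() forces every column to a number (TRUE becomes 1, FALSE becomes 0)
mat <- data.matrix(df)
print(mat)
print(dim(mat))
```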
Names
R objects can also have names, which is very useful for writing readable code and self-describing
objects.
> x <- 1:3
> names(x)
NULL
> names(x) <- c("foo", "bar", "norf")
> x
 foo  bar norf
   1    2    3
> names(x)
[1] "foo" "bar" "norf"
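Names are not limited to atomic vectors. The sketch below, using illustrative object names, shows named list elements and matrix dimnames (the attribute mentioned in the Attributes section).

```r
# Lists can have named elements
lst <- list(a = 1, b = 2, c = 3)
print(names(lst))
print(lst$b)

# Matrices take row and column names through dimnames()
m <- matrix(1:4, nrow = 2, ncol = 2)
dimnames(m) <- list(c("a", "b"), c("c", "d"))
print(m)

# Names can then be used for indexing
cell <- m["a", "d"]
print(cell)
```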
Summary
Data Types
atomic classes: numeric, logical, character, integer, complex
vectors, lists
factors
missing values
data frames
names