[译]pandas中的iloc loc的区别？

loc 从特定的

gets rows (or columns) with particular labels from the index.

iloc gets rows (or columns) at particular positions in the index (so it only takes integers).

ix usually tries to behave like loc but falls back to behaving like iloc if a label is not present in the index.

It's important to note some subtleties that can make ix slightly tricky to use:

if the index is of integer type, ix will only use label-based indexing and not fall back to position-based indexing. If the label is not in the index, an error is raised.

if the index does not contain only integers, then given an integer, ix will immediately use position-based indexing rather than label-based indexing. If however ix is given another type (e.g. a string), it can use label-based indexing.

To illustrate the differences between the three methods, consider the following Series:

>>> s = pd.Series(np.nan, index=[49,48,47,46,45, 1, 2, 3, 4, 5])

>>> s

49   NaN

48   NaN

47   NaN

46   NaN

45   NaN

1    NaN

2    NaN

3    NaN

4    NaN

5    NaN

We'll look at slicing with the integer value 3.

In this case, s.iloc[:3] returns us the first 3 rows (since it treats 3 as a position) and s.loc[:3] returns us the first 8 rows (since it treats 3 as a label):

>>> s.iloc[:3] # slice the first three rows

49   NaN

48   NaN

47   NaN

>>> s.loc[:3] # slice up to and including label 3

49   NaN

48   NaN

47   NaN

46   NaN

45   NaN

1    NaN

2    NaN

3    NaN

>>> s.ix[:3] # the integer is in the index so s.ix[:3] works like loc

49   NaN

48   NaN

47   NaN

46   NaN

45   NaN

1    NaN

2    NaN

3    NaN

Notice s.ix[:3] returns the same Series as s.loc[:3] since it looks for the label first rather than working on the position (and the index for s is of integer type).

What if we try with an integer label that isn't in the index (say 6)?

Here s.iloc[:6] returns the first 6 rows of the Series as expected. However, s.loc[:6] raises a KeyError since 6 is not in the index.

>>> s.iloc[:6]

49   NaN

48   NaN

47   NaN

46   NaN

45   NaN

1    NaN

>>> s.loc[:6]

KeyError: 6

>>> s.ix[:6]

KeyError: 6

As per the subtleties noted above, s.ix[:6] now raises a KeyError because it tries to work like loc but can't find a 6 in the index. Because our index is of integer type ix doesn't fall back to behaving like iloc.

If, however, our index was of mixed type, given an integer ix would behave like iloc immediately instead of raising a KeyError:

>>> s2 = pd.Series(np.nan, index=['a','b','c','d','e', 1, 2, 3, 4, 5])

>>> s2.index.is_mixed() # index is mix of different types

True

>>> s2.ix[:6] # now behaves like iloc given integer

a NaN

b NaN

c NaN

d NaN

e NaN

1 NaN

Keep in mind that ix can still accept non-integers and behave like loc:

>>> s2.ix[:'c'] # behaves like loc given non-integer

a NaN

b NaN

c NaN

As general advice, if you're only indexing using labels, or only indexing using integer positions, stick with loc or iloc to avoid unexpected results - try not use ix.

Combining position-based and label-based indexing

Sometimes given a DataFrame, you will want to mix label and positional indexing methods for the rows and columns.

For example, consider the following DataFrame. How best to slice the rows up to and including 'c' and take the first four columns?

>>> df = pd.DataFrame(np.nan,

                      index=list('abcde'),

                      columns=['x','y','z', 8, 9])

>>> df

x   y   z   8   9

a NaN NaN NaN NaN NaN

b NaN NaN NaN NaN NaN

c NaN NaN NaN NaN NaN

d NaN NaN NaN NaN NaN

e NaN NaN NaN NaN NaN

In earlier versions of pandas (before 0.20.0) ix lets you do this quite neatly - we can slice the rows by label and the columns by position (note that for the columns, ix will default to position-based slicing since 4 is not a column name):

>>> df.ix[:'c', :4]

x   y   z   8

a NaN NaN NaN NaN

b NaN NaN NaN NaN

c NaN NaN NaN NaN

In later versions of pandas, we can achieve this result using iloc and the help of another method:

>>> df.iloc[:df.index.get_loc('c') + 1, :4]

x   y   z   8

a NaN NaN NaN NaN

b NaN NaN NaN NaN

c NaN NaN NaN NaN

get_loc() is an index method meaning "get the position of the label in this index". Note that since slicing with iloc is exclusive of its endpoint, we must add 1 to this value if we want row 'c' as well.

There are further examples in pandas' documentation here.

注：

https://stackoverflow.com/questions/31593201/pandas-iloc-vs-ix-vs-loc-explanation-how-are-they-different

[译]pandas中的iloc loc的区别？的更多相关文章

python pandas（ix & iloc &loc）
python pandas(ix & iloc &loc) loc——通过行标签索引行数据 iloc——通过行号索引行数据 ix——通过行标签或者行号索引行数据(基于loc和iloc ...
[译] Pandas中根据列的值选取多行数据
# 选取等于某些值的行记录用 == df.loc[df['column_name'] == some_value] # 选取某列是否是某一类型的数值用 isin df.loc[df['column ...
Pandas中Series与Dataframe的区别
1. Series Series通俗来讲就是一维数组,索引(index)为每个元素的下标,值(value)为下标对应的值例如: arr = ['Tom', 'Nancy', 'Jack', 'Ton ...
Pandas中merge和join的区别
可以说merge包含了join的操作,merge支持通过列或索引连表,而join只支持通过索引连表,只是简化了merge的索引连表的参数示例定义一个left的DataFrame left=pd.D ...
pandas中loc-iloc-ix的使用
转自:https://www.jianshu.com/p/d6a9845a0a34 Pandas中loc,iloc,ix的使用使用 iloc 从DataFrame中筛选数据 iloc 是基于“位置” ...
pandas中df.ix, df.loc, df.iloc 的使用场景以及区别
pandas中df.ix, df.loc, df.iloc 的使用场景以及区别: https://stackoverflow.com/questions/31593201/pandas-iloc-vs ...
pandas中DataFrame的ix，loc，iloc索引方式的异同
pandas中DataFrame的ix,loc,iloc索引方式的异同 1.loc: 按照标签索引,范围包括start和end 2.iloc: 在位置上进行索引,不包括end 3.ix: 先在inde ...
pandas-03 DataFrame()中的iloc和loc用法
pandas-03 DataFrame()中的iloc和loc用法简单的说: iloc,即index locate 用index索引进行定位,所以参数是整型,如:df.iloc[10:20, 3:5 ...
Pandas中Series和DataFrame的索引
在对Series对象和DataFrame对象进行索引的时候要明确这么一个概念:是使用下标进行索引,还是使用关键字进行索引.比如list进行索引的时候使用的是下标,而dict索引的时候使用的是关键字. ...

随机推荐

Python之邮件发送
Python的smtplib提供了一种很方便的途径用来发送电子邮件,它有SMTP协议进行简单的封装,可以使用SMTP对象的sendmail方法发送邮件,通过help()查看SMTP所提供的方法如下: ...
LINQ 组合查询和分页查询的使用
前端代码 <%@ Page Language="C#" AutoEventWireup="true" Debug="true" Cod ...
Array - Container With Most Water
/** * 此为暴力解法 * Find two lines, which together with x-axis forms a container, such that the container ...
2018.5.13 oracle遇到的问题
安装Oracle 11g 出现交换空间不够在计算机那里右键打开属性进入高级系统设置然后找到第一个设置找到高级然后更改一下自定义范围(云服务器是16-10000) 然后确定完成了. 快安装结束之后显 ...
appium---启动app
自动化测试是测试人员必备的一项技能,所谓的自动化就是通过代码完成了手工的操作,今天就总结下如何通过python启动app 环境条件 1.安装python:下载地址 2.安装JDK:下载地址 3.安装A ...
exportfs: /mnt/demo requires fsid= for NFS export
解决方法:/mnt/demo 10.0.1.57(fsid=0,rw,async) //加入fsid=0参数就可.
如何用纯 CSS 创作一个行驶中的火车 loader
效果预览在线演示按下右侧的"点击预览"按钮可以在当前页面预览,点击链接可以全屏预览. https://codepen.io/comehope/pen/RBLWzJ 可交互视频 ...
paper:synthesizable finit state machine design techniques using the new systemverilog 3.0 enhancements之fsm summary
主要是1.不要用1段式写FSM 2.不要用状态编码写one-hot FSM ,要用索引编码写one-hot FSM.
python里字典的用法介绍
一.什么是字典字典是python里的一种数据类型,特点是元素的无序性,和键key的唯一性.字典的创建方法是{key:values},字典里的键key只能是不可变的数据类型(整型,字符串或者是元组), ...
使用TensorFlow的卷积神经网络识别手写数字（2）-训练篇
import numpy as np import tensorflow as tf import matplotlib import matplotlib.pyplot as plt import ...

[译]pandas中的iloc loc的区别？

[译]pandas中的iloc loc的区别？的更多相关文章

随机推荐

热门专题