pandas-06 Sorting Series and DataFrame

Sorting a pandas Series or DataFrame is mainly done with sort_values() and sort_index().

DataFrame.sort_values(by, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')

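by: a column name, or a list of column names, to sort by

axis: the axis to sort along; 0 (the default) reorders the rows by the values in the given column(s), 1 reorders the columns by the values in the given row(s)

ascending: whether to sort in ascending order; the default is True

inplace: if True, sort the object itself instead of returning a new one; the default is False

kind: the sorting algorithm, one of {'quicksort', 'mergesort', 'heapsort'}; the default is 'quicksort'

na_position: where NaN values are placed, one of {'first', 'last'}; the default is 'last'

sort_index() takes much the same parameters, except that it sorts by the index labels and therefore needs no `by`.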
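The effect of na_position and kind is easier to see on a tiny frame. The following is a minimal sketch, not part of the original example; the column names and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one NaN and one tie (two rows with score 2.0).
df = pd.DataFrame({
    "name": ["w", "x", "y", "z"],
    "score": [2.0, np.nan, 1.0, 2.0],
})

# By default NaN is placed last.
default_order = df.sort_values(by="score")

# na_position="first" moves the NaN row to the top instead.
nan_first = df.sort_values(by="score", na_position="first")

# NaN stays last even when the order is descending.
descending = df.sort_values(by="score", ascending=False)

# mergesort is stable: rows with equal keys keep their original relative order.
stable = df.sort_values(by="score", kind="mergesort")

print(nan_first)
print(stable)
```

A stable algorithm is worth choosing when rows with equal keys must keep their existing order, for example after an earlier sort on another column.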

Example:

import numpy as np

import pandas as pd

from pandas import Series, DataFrame

np.random.seed(666)

s1 = Series(np.random.randn(10))

print(s1)

'''

0    0.824188

1    0.479966

2    1.173468

3    0.909048

4   -0.571721

5   -0.109497

6    0.019028

7   -0.943761

8    0.640573

9   -0.786443

dtype: float64

'''

# A Series can be sorted in two ways: 1. by value, 2. by index

s2 = s1.sort_values() # sort by value

print(s2)

'''

7   -0.943761

9   -0.786443

4   -0.571721

5   -0.109497

6    0.019028

1    0.479966

8    0.640573

0    0.824188

3    0.909048

2    1.173468

dtype: float64

'''

# axis selects the axis (a Series only has axis 0); ascending=False sorts in descending order

s2 = s1.sort_values(axis = 0, ascending=False)

print(s2)

'''

2    1.173468

3    0.909048

0    0.824188

8    0.640573

1    0.479966

6    0.019028

5   -0.109497

4   -0.571721

9   -0.786443

7   -0.943761

dtype: float64

'''

# Sort by index; sort_index() returns a new Series, so the result has to be assigned

s2 = s2.sort_index()

print(s2)

'''

0    0.824188

1    0.479966

2    1.173468

3    0.909048

4   -0.571721

5   -0.109497

6    0.019028

7   -0.943761

8    0.640573

9   -0.786443

dtype: float64

'''
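Both sort_index() and sort_values() return a new object by default and leave the caller unchanged, so a call whose result is thrown away has no visible effect. A small sketch with made-up values shows the difference between assigning the result and using inplace=True:

```python
import pandas as pd

# Hypothetical Series with a deliberately shuffled index.
s = pd.Series([30, 10, 20], index=[2, 0, 1])

# sort_index() returns a new, sorted Series; `s` itself is left untouched.
sorted_copy = s.sort_index()

# Discarding the return value therefore does not change `s`.
s.sort_index()
unchanged = s.index.tolist()

# inplace=True sorts the object itself and returns None.
t = s.copy()
result = t.sort_index(inplace=True)

print(unchanged)
print(t)
```

Assigning the result is usually the clearer choice, because the original data stays available for comparison.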

# Sorting a DataFrame

df1 = DataFrame(np.random.randn(40).reshape(8, 5), columns=['a', 'b', 'c', 'd', 'e'])

print(df1)

'''

          a         b         c         d         e

0  0.608870 -0.931012  0.978222 -0.736918 -0.298733

1 -0.460587 -1.088793 -0.575771 -1.682901  0.229185

2 -1.756625  0.844633  0.277220  0.852902  0.194600

3  1.310638  1.543844 -0.529048 -0.656472 -0.201506

4 -0.700616  0.687138 -0.026076 -0.829758  0.296554

5 -0.312680 -0.611301 -0.821752  0.897123  0.136079

6 -0.258655  1.110766 -0.188424 -0.041489 -0.984792

7 -1.352282  0.194324  0.267239 -0.426474  1.447735

'''

# Sort a single column: df1['a'] is a Series, so this works just like sorting a Series

print(df1['a'].sort_values())

'''

2   -1.756625

7   -1.352282

4   -0.700616

1   -0.460587

5   -0.312680

6   -0.258655

0    0.608870

3    1.310638

Name: a, dtype: float64

'''

# Sort the whole DataFrame by one of its columns; the ascending parameter controls ascending/descending order

print(df1.sort_values('a'))

'''

          a         b         c         d         e

2 -1.756625  0.844633  0.277220  0.852902  0.194600

7 -1.352282  0.194324  0.267239 -0.426474  1.447735

4 -0.700616  0.687138 -0.026076 -0.829758  0.296554

1 -0.460587 -1.088793 -0.575771 -1.682901  0.229185

5 -0.312680 -0.611301 -0.821752  0.897123  0.136079

6 -0.258655  1.110766 -0.188424 -0.041489 -0.984792

0  0.608870 -0.931012  0.978222 -0.736918 -0.298733

3  1.310638  1.543844 -0.529048 -0.656472 -0.201506

'''

df2 = df1.sort_values('a')

print(df2)

'''

          a         b         c         d         e

2 -1.756625  0.844633  0.277220  0.852902  0.194600

7 -1.352282  0.194324  0.267239 -0.426474  1.447735

4 -0.700616  0.687138 -0.026076 -0.829758  0.296554

1 -0.460587 -1.088793 -0.575771 -1.682901  0.229185

5 -0.312680 -0.611301 -0.821752  0.897123  0.136079

6 -0.258655  1.110766 -0.188424 -0.041489 -0.984792

0  0.608870 -0.931012  0.978222 -0.736918 -0.298733

3  1.310638  1.543844 -0.529048 -0.656472 -0.201506

'''
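The calls above sort by a single column in ascending order. As a hedged sketch of the other options, the hypothetical frame below is sorted by two columns with a different direction for each, and then once more with ignore_index=True (available in pandas 1.0 and later) so that the result is relabelled from 0.

```python
import pandas as pd

# Hypothetical small frame so the result is easy to check by eye.
df = pd.DataFrame({"a": [2, 1, 2, 1], "b": [0.5, 0.1, 0.9, 0.3]})

# Largest values of 'a' first; ties in 'a' are broken by 'b' in ascending order.
out = df.sort_values(by=["a", "b"], ascending=[False, True])

# ignore_index=True relabels the rows 0..n-1 instead of keeping the old labels.
out_reset = df.sort_values(by=["a", "b"], ascending=[False, True], ignore_index=True)

print(out)
print(out_reset)
```

Keeping the old labels, as in `out`, makes it possible to restore the original row order later with sort_index().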

# Sorting df2 by its index restores the original row order of df1

print(df2.sort_index())

'''

          a         b         c         d         e

0  0.608870 -0.931012  0.978222 -0.736918 -0.298733

1 -0.460587 -1.088793 -0.575771 -1.682901  0.229185

2 -1.756625  0.844633  0.277220  0.852902  0.194600

3  1.310638  1.543844 -0.529048 -0.656472 -0.201506

4 -0.700616  0.687138 -0.026076 -0.829758  0.296554

5 -0.312680 -0.611301 -0.821752  0.897123  0.136079

6 -0.258655  1.110766 -0.188424 -0.041489 -0.984792

7 -1.352282  0.194324  0.267239 -0.426474  1.447735

'''
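sort_index() also accepts axis and ascending. The sketch below uses a hypothetical frame whose row labels and column names are both out of order, to show what each axis reorders.

```python
import pandas as pd

# Hypothetical frame with shuffled row labels and column names.
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    index=[1, 0],
    columns=["c", "a", "b"],
)

# axis=0 (default): reorder the rows by their index labels.
by_rows = df.sort_index()

# axis=1: reorder the columns by their names; the rows stay where they are.
by_cols = df.sort_index(axis=1)

# ascending=False reverses the direction.
rows_desc = df.sort_index(ascending=False)

print(by_rows)
print(by_cols)
```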
