pandas的corsstab

pandas.crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, dropna=True, normalize=False)

index : array-like, Series, or list of arrays/Series

Values to group by in the rows

columns : array-like, Series, or list of arrays/Series

Values to group by in the columns

values : array-like, optional

Array of values to aggregate according to the factors. Requires aggfunc be specified.

aggfunc : function, optional

If specified, requires values be specified as well

rownames : sequence, default None

If passed, must match number of row arrays passed

colnames : sequence, default None

If passed, must match number of column arrays passed

margins : boolean, default False

Add row/column margins (subtotals)

dropna : boolean, default True

Do not include columns whose entries are all NaN

normalize : boolean, {‘all’, ‘index’, ‘columns’}, or {0,1}, default False

Normalize by dividing all values by the sum of values.

If passed ‘all’ or True, will normalize over all values.

If passed ‘index’ will normalize over each row.

If passed ‘columns’ will normalize over each column.

If margins is True, will also normalize margin values.

New in version 0.18.1.

In [1]:

import numpy as np

a = np.array(["foo", "foo", "foo", "foo", "bar", "bar","bar", "bar", "foo", "foo", "foo"], dtype=object)

a

In [2]:

b = np.array(["one", "one", "one", "two", "one", "one", "one", "two", "two", "two", "one"], dtype=object)

b

In [3]:

pd.crosstab(a,b)

Out[3]:

col_0	one	two
row_0
bar	3	1
foo	4	3

In [4]:

 pd.crosstab(a, b, rownames=['a'], colnames=['b'])

Out[4]:

b	one	two
a
bar	3	1
foo	4	3

In [5]

c = np.array(["dull", "dull", "shiny", "dull", "dull", "shiny","shiny", "dull", "shiny", "shiny", "shiny"],

               dtype=object)

c

In [6]:

import pandas as pd

pd.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'])

Out[6]:

b	one		two
c	dull	shiny	dull	shiny
a
bar	1	2	1	0
foo	2	2	1	2

In [7]:

foo1 = pd.Categorical(['a', 'b'], categories=['a', 'b', 'c'])

bar1= pd.Categorical(['d', 'e'], categories=['d', 'e', 'f'])

pd.crosstab(foo1, bar1,dropna='true')

# 'c' and 'f' are not represented in the data,

# and will not be shown in the output because

# dropna is True by default. Set 'dropna=False'

# to preserve categories with no data

Out[7]:

col_0	d	e	f
row_0
a	1	0	0
b	0	1	0
c	0	0	0

pandas的corsstab的更多相关文章

pandas基础-Python3
未完 for examples: example 1: # Code based on Python 3.x # _*_ coding: utf-8 _*_ # __Author: "LEM ...
10 Minutes to pandas
摘要一.创建对象二.查看数据三.选择和设置四.缺失值处理五.相关操作六.聚合七.重排(Reshaping) 八.时间序列九.Categorical类型十.画图十一 ...
利用Python进行数据分析(15) pandas基础: 字符串操作
字符串对象方法 split()方法拆分字符串: strip()方法去掉空白符和换行符: split()结合strip()使用: "+"符号可以将多个字符串连接起来: join( ...
利用Python进行数据分析(10) pandas基础: 处理缺失数据
数据不完整在数据分析的过程中很常见. pandas使用浮点值NaN表示浮点和非浮点数组里的缺失数据. pandas使用isnull()和notnull()函数来判断缺失情况. 对于缺失数据一般处理 ...
利用Python进行数据分析(12) pandas基础: 数据合并
pandas 提供了三种主要方法可以对数据进行合并: pandas.merge()方法:数据库风格的合并: pandas.concat()方法:轴向连接,即沿着一条轴将多个对象堆叠到一起: 实例方法c ...
利用Python进行数据分析(9) pandas基础: 汇总统计和计算
pandas 对象拥有一些常用的数学和统计方法. 例如,sum() 方法,进行列小计: sum() 方法传入 axis=1 指定为横向汇总,即行小计: idxmax() 获取最大值对应的索 ...
利用Python进行数据分析(8) pandas基础: Series和DataFrame的基本操作
一.reindex() 方法:重新索引针对 Series 重新索引指的是根据index参数重新进行排序. 如果传入的索引值在数据里不存在,则不会报错,而是添加缺失值的新行. 不想用缺失值,可以用 ...
利用Python进行数据分析(7) pandas基础: Series和DataFrame的简单介绍
一.pandas 是什么 pandas 是基于 NumPy 的一个 Python 数据分析包,主要目的是为了数据分析.它提供了大量高级的数据结构和对数据处理的方法. pandas 有两个主要的数据结构 ...
pandas.DataFrame对行和列求和及添加新行和列
导入模块: from pandas import DataFrame import pandas as pd import numpy as np 生成DataFrame数据 df = DataFra ...

随机推荐

阶段3 2.Spring_02.程序间耦合_6 工厂模式解耦
使用类加载器去加载文件定义getBean的方法运行测试方法报错. 在工厂类里面打印输出BeanPath 删除dao的实现类没有dao的实现类.再次运行程序.编译不报错.运行时报错以上就是工厂模 ...
java：struts框架4（Ajax）
1.Ajax: 先导入jar包: struts.xml: <?xml version="1.0" encoding="UTF-8"?> <!D ...
【JAVA系列】Google爬虫如何抓取JavaScript的？
公众号:SAP Technical 本文作者:matinal 原文出处:http://www.cnblogs.com/SAPmatinal/ 原文链接:[JAVA系列]Google爬虫如何抓取Java ...
cocos2dx[3.2](11) 新事件分发机制
在2.x中处理事件需要用到委托代理(delegate),相信学过2.x的触摸事件的同学,都知道创建和移除的流程十分繁琐. 而在3.x中由于加入了C++11的特性,而对事件的分发机制通过事件分发器Eve ...
使用movielens数据集动手实现youtube推荐候选集生成
综述之前在博客中总结过nce损失和YouTuBe DNN推荐;但大多都还是停留在理论层面,没有实践经验.所以笔者想借由此文继续深入探索YouTuBe DNN推荐,另外也进一步总结TensorFlow ...
BZOJ3129方程（SDOI2013）
https://blog.csdn.net/Maxwei_wzj/article/details/80152116 对变量有上界限制及下界限制.对于下界,可以从总数中减去即可,对于上界,容斥定理.
value（C# ）
上下文关键字 value 用在普通属性声明的 set 访问器中. 此关键字类似于方法的输入参数. 关键字 value 引用客户端代码尝试分配给属性的值. 在以下示例中,MyDerivedClass 有 ...
python 并发编程协程 gevent模块
一 gevent模块 gevent应用场景: 单线程下,多个任务,io密集型程序安装 pip3 install gevent Gevent 是一个第三方库,可以轻松通过gevent实现并发同步或异步 ...
spring boot-7.日志系统
日志系统分为两部分,一部分是日志抽象层,一部分是日志实现层.常见的日志抽象层JCL,SLF4J,JBoss-Logging,日志实现层有logback,log4j,log4j2,JUL.日志抽象层的功 ...
【转帖】大话Spring Cloud
springcloud(一):大话Spring Cloud 2017/05/01 http://www.ityouknow.com/springcloud/2017/05/01/simple-sp ...

pandas的corsstab

pandas的corsstab的更多相关文章

随机推荐

热门专题