pandas.DataFrame.join

自己弄了很久,一看官网。感觉自己宛如智障。不要脸了,直接抄

DataFrame.join(otheron=Nonehow='left'lsuffix=''rsuffix=''sort=False)

Join columns with other DataFrame either on index or on a key column. Efficiently Join multiple DataFrame objects by index at once by passing a list.

Parameters:

other : DataFrame, Series with name field set, or list of DataFrame

Index should be similar to one of the columns in this one. If a Series is passed, its name attribute must be set, and that will be used as the column name in the resulting joined DataFrame

on : column name, tuple/list of column names, or array-like

Column(s) in the caller to join on the index in other, otherwise joins index-on-index. If multiples columns given, the passed DataFrame must have a MultiIndex. Can pass an array as the join key if not already contained in the calling DataFrame. Like an Excel VLOOKUP operation

how : {‘left’, ‘right’, ‘outer’, ‘inner’}, default: ‘left’

How to handle the operation of the two objects.

  • left: use calling frame’s index (or column if on is specified)

  • right: use other frame’s index

  • outer: form union of calling frame’s index (or column if on is

    specified) with other frame’s index

  • inner: form intersection of calling frame’s index (or column if

    on is specified) with other frame’s index

lsuffix : string

Suffix to use from left frame’s overlapping columns

rsuffix : string

Suffix to use from right frame’s overlapping columns

sort : boolean, default False

Order result DataFrame lexicographically by the join key. If False, preserves the index order of the calling (left) DataFrame

Returns:

joined : DataFrame

See also

DataFrame.merge
For column(s)-on-columns(s) operations

Notes

on, lsuffix, and rsuffix options are not supported when passing a list of DataFrame objects

Examples

>>> caller = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
... 'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
>>> caller
A key
0 A0 K0
1 A1 K1
2 A2 K2
3 A3 K3
4 A4 K4
5 A5 K5
>>> other = pd.DataFrame({'key': ['K0', 'K1', 'K2'],
... 'B': ['B0', 'B1', 'B2']})
>>> other
B key
0 B0 K0
1 B1 K1
2 B2 K2

Join DataFrames using their indexes.==》join on indexes

>>> caller.join(other, lsuffix='_caller', rsuffix='_other')
>>>     A key_caller    B key_other
0 A0 K0 B0 K0
1 A1 K1 B1 K1
2 A2 K2 B2 K2
3 A3 K3 NaN NaN
4 A4 K4 NaN NaN
5 A5 K5 NaN NaN

If we want to join using the key columns, we need to set key to be the index in both caller and other. The joined DataFrame will have key as its index.

>>> caller.set_index('key').join(other.set_index('key'))
>>>      A    B
key
K0 A0 B0
K1 A1 B1
K2 A2 B2
K3 A3 NaN
K4 A4 NaN
K5 A5 NaN

Another option to join using the key columns is to use the on parameter. DataFrame.join always uses other’s index but we can use any column in the caller. This method preserves the original caller’s index in the result.

>>> caller.join(other.set_index('key'), on='key')
>>>     A key    B
0 A0 K0 B0
1 A1 K1 B1
2 A2 K2 B2
3 A3 K3 NaN
4 A4 K4 NaN
5 A5 K5 NaN

Pandas中DataFrame数据合并、连接(concat、merge、join)之join的更多相关文章

  1. Pandas中DataFrame数据合并、连接(concat、merge、join)之concat

    一.concat:沿着一条轴,将多个对象堆叠到一起 concat(objs, axis=0, join='outer', join_axes=None, ignore_index=False, key ...

  2. Pandas中DataFrame数据合并、连接(concat、merge、join)之merge

    二.merge:通过键拼接列 类似于关系型数据库的连接方式,可以根据一个或多个键将不同的DatFrame连接起来. 该函数的典型应用场景是,针对同一个主键存在两张不同字段的表,根据主键整合到一张表里面 ...

  3. Python基础 | pandas中dataframe的整合与形变(merge & reshape)

    目录 行的union pd.concat df.append 列的join pd.concat pd.merge df.join 行列转置 pivot stack & unstack melt ...

  4. Spark与Pandas中DataFrame对比

      Pandas Spark 工作方式 单机single machine tool,没有并行机制parallelism不支持Hadoop,处理大量数据有瓶颈 分布式并行计算框架,内建并行机制paral ...

  5. Spark与Pandas中DataFrame对比(详细)

      Pandas Spark 工作方式 单机single machine tool,没有并行机制parallelism不支持Hadoop,处理大量数据有瓶颈 分布式并行计算框架,内建并行机制paral ...

  6. 将pandas的DataFrame数据写入MySQL数据库 + sqlalchemy

    将pandas的DataFrame数据写入MySQL数据库 + sqlalchemy import pandas as pd from sqlalchemy import create_engine ...

  7. 排序合并连接(sort merge join)的原理

    排序合并连接(sort merge join)的原理 排序合并连接(sort merge join)的原理     排序合并连接(sort merge join)       访问次数:两张表都只会访 ...

  8. Python3 Pandas的DataFrame数据的增、删、改、查

    Python3 Pandas的DataFrame数据的增.删.改.查 一.DataFrame数据准备 增.删.改.查的方法有很多很多种,这里只展示出常用的几种. 参数inplace默认为False,只 ...

  9. Pandas中DataFrame修改列名

    Pandas中DataFrame修改列名:使用 rename df = pd.read_csv('I:/Papers/consumer/codeandpaper/TmallData/result01- ...

随机推荐

  1. poj1741(入门点分治)

    题目链接:https://vjudge.net/problem/POJ-1741 题意:给出一棵树,求出树上距离不超过k的点对数量. 思路:点分治经典题.先找重心作为树根,然后求出子树中所有点到重心的 ...

  2. memcached命令行、Memcached数据导出和导入

    1.memcached命令行 telnet 127.0.0.1 11211set key2 0 30 2abSTOREDget key2VALUE key2 0 2abEND  如: set key3 ...

  3. linux 软连接和 硬链接的区别

    Linux软链接硬链接的区别   ln是linux中又一个非常重要命令,它的功能是为某一个文件在另外一个位置建立一个同步的链接.当我们需要在不同的目录,用到相同的文件时,我们不需要在每一个需要的目录下 ...

  4. 3d长方体

    html <div class="main"> <div class="a1">1</div> <div class= ...

  5. laravel-admin关联查询问题解决办法

    文档是这么说的: 按照文档上来,没有成功,网上找了好久,说是没有在模型中关联,关联之后的运行结果是这样的: 还是没有成功啊,仔细研究返现是这里写错了,whereHas后面跟的是model中的方法名,而 ...

  6. pycharm连接mysql是出现Connection to orm02@127.0.0.1 failed. [08001] Could not create connection to database server. Attempted reconnect 3 times. Giving up.

    下面这个问题反正我是遇到了,也是难为我好几天,于是我决定发一个教程出来给大家看看!希望能帮助你们 原因: 可能是数据库的版本与本机装的驱动不匹配导致的, 解决方案一: 在 url 后面街上一句 因为笔 ...

  7. sort()方法的用法,参数以及排序原理

    sort() 方法用于对数组的元素进行排序,并返回数组.默认排序顺序是根据字符串Unicode码点.语法:arrayObject.sort(sortby):参数sortby可选.规定排序顺序.必须是函 ...

  8. CSM(Certified Scrum Master) 敏捷认证是什么?

    Scrum 是用于开发和持续支持复杂产品的一个框架.Scrum 基于试验性过程控制理论,借鉴了精益思想.时间盒.模块化设计等,并完整地体现了敏捷宣言和敏捷原则.Scrum 采用一种迭代.增量式的方法来 ...

  9. JS基础_打印99乘法表

    <!DOCTYPE html> <html> <head> <meta charset="UTF-8"> <title> ...

  10. python3.3.2中的关键字(转)

    The following identifiers are used as reserved words, or keywords of the language, and cannot be use ...