Pandas中DataFrame数据合并、连接(concat、merge、join)之join
pandas.DataFrame.join
自己弄了很久,一看官网。感觉自己宛如智障。不要脸了,直接抄
DataFrame.
join
(other, on=None, how='left', lsuffix='', rsuffix='', sort=False)-
Join columns with other DataFrame either on index or on a key column. Efficiently Join multiple DataFrame objects by index at once by passing a list.
Parameters: other : DataFrame, Series with name field set, or list of DataFrame
Index should be similar to one of the columns in this one. If a Series is passed, its name attribute must be set, and that will be used as the column name in the resulting joined DataFrame
on : column name, tuple/list of column names, or array-like
Column(s) in the caller to join on the index in other, otherwise joins index-on-index. If multiples columns given, the passed DataFrame must have a MultiIndex. Can pass an array as the join key if not already contained in the calling DataFrame. Like an Excel VLOOKUP operation
how : {‘left’, ‘right’, ‘outer’, ‘inner’}, default: ‘left’
How to handle the operation of the two objects.
left: use calling frame’s index (or column if on is specified)
right: use other frame’s index
- outer: form union of calling frame’s index (or column if on is
-
specified) with other frame’s index
- inner: form intersection of calling frame’s index (or column if
-
on is specified) with other frame’s index
lsuffix : string
Suffix to use from left frame’s overlapping columns
rsuffix : string
Suffix to use from right frame’s overlapping columns
sort : boolean, default False
Order result DataFrame lexicographically by the join key. If False, preserves the index order of the calling (left) DataFrame
Returns: joined : DataFrame
See also
DataFrame.merge
- For column(s)-on-columns(s) operations
Notes
on, lsuffix, and rsuffix options are not supported when passing a list of DataFrame objects
Examples
>>> caller = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
... 'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})>>> caller
A key
0 A0 K0
1 A1 K1
2 A2 K2
3 A3 K3
4 A4 K4
5 A5 K5>>> other = pd.DataFrame({'key': ['K0', 'K1', 'K2'],
... 'B': ['B0', 'B1', 'B2']})>>> other
B key
0 B0 K0
1 B1 K1
2 B2 K2Join DataFrames using their indexes.==》join on indexes
>>> caller.join(other, lsuffix='_caller', rsuffix='_other')
>>> A key_caller B key_other
0 A0 K0 B0 K0
1 A1 K1 B1 K1
2 A2 K2 B2 K2
3 A3 K3 NaN NaN
4 A4 K4 NaN NaN
5 A5 K5 NaN NaNIf we want to join using the key columns, we need to set key to be the index in both caller and other. The joined DataFrame will have key as its index.
>>> caller.set_index('key').join(other.set_index('key'))
>>> A B
key
K0 A0 B0
K1 A1 B1
K2 A2 B2
K3 A3 NaN
K4 A4 NaN
K5 A5 NaNAnother option to join using the key columns is to use the on parameter. DataFrame.join always uses other’s index but we can use any column in the caller. This method preserves the original caller’s index in the result.
>>> caller.join(other.set_index('key'), on='key')
>>> A key B
0 A0 K0 B0
1 A1 K1 B1
2 A2 K2 B2
3 A3 K3 NaN
4 A4 K4 NaN
5 A5 K5 NaN
Pandas中DataFrame数据合并、连接(concat、merge、join)之join的更多相关文章
- Pandas中DataFrame数据合并、连接(concat、merge、join)之concat
一.concat:沿着一条轴,将多个对象堆叠到一起 concat(objs, axis=0, join='outer', join_axes=None, ignore_index=False, key ...
- Pandas中DataFrame数据合并、连接(concat、merge、join)之merge
二.merge:通过键拼接列 类似于关系型数据库的连接方式,可以根据一个或多个键将不同的DatFrame连接起来. 该函数的典型应用场景是,针对同一个主键存在两张不同字段的表,根据主键整合到一张表里面 ...
- Python基础 | pandas中dataframe的整合与形变(merge & reshape)
目录 行的union pd.concat df.append 列的join pd.concat pd.merge df.join 行列转置 pivot stack & unstack melt ...
- Spark与Pandas中DataFrame对比
Pandas Spark 工作方式 单机single machine tool,没有并行机制parallelism不支持Hadoop,处理大量数据有瓶颈 分布式并行计算框架,内建并行机制paral ...
- Spark与Pandas中DataFrame对比(详细)
Pandas Spark 工作方式 单机single machine tool,没有并行机制parallelism不支持Hadoop,处理大量数据有瓶颈 分布式并行计算框架,内建并行机制paral ...
- 将pandas的DataFrame数据写入MySQL数据库 + sqlalchemy
将pandas的DataFrame数据写入MySQL数据库 + sqlalchemy import pandas as pd from sqlalchemy import create_engine ...
- 排序合并连接(sort merge join)的原理
排序合并连接(sort merge join)的原理 排序合并连接(sort merge join)的原理 排序合并连接(sort merge join) 访问次数:两张表都只会访 ...
- Python3 Pandas的DataFrame数据的增、删、改、查
Python3 Pandas的DataFrame数据的增.删.改.查 一.DataFrame数据准备 增.删.改.查的方法有很多很多种,这里只展示出常用的几种. 参数inplace默认为False,只 ...
- Pandas中DataFrame修改列名
Pandas中DataFrame修改列名:使用 rename df = pd.read_csv('I:/Papers/consumer/codeandpaper/TmallData/result01- ...
随机推荐
- Linux多线程编程 - sleep 和 pthread_cond_timedwait
#include <stdio.h> #include <stdlib.h> int flag = 1; void * thr_fn(void * arg) { while ...
- Infix to Postfix Expression
Example : Infix : (A+B) * (C-D) ) Postfix: AB+CD-* 算法: 1. Scan the infix expression from left to rig ...
- odoo10实现单点登陆绕过登陆集成页面
背景:由于需要集成odoo平台在其他页面,需要绕开登陆. 解决办法:开辟一个自动登陆的路由用与集成页面. 1.修改web模块中controller/main.py文件,在class名字为Home中添加 ...
- bootstrap-table服务端分页操作
由于数据库查询的数据过多,所以采取服务端分页的操作,避免一次性加载的数据量过多,导致页面加载缓慢. 后端数据的封装格式json数据 rows里的数据是当前页的数据,total是总条数: { " ...
- dubbo分布式服务框架-study1
本文参考“如何给老婆解释RPC”一文进行的... 1.首先了解下dubbo: dubbo是一款高性能.轻量级的开源java RPC服务框架(RPC即远程过程调用,具体解释见:https://www.j ...
- php 正则替换特殊字符 和检测是否是中文
如果是只想输入中文的话,就这么写,要注意是分gb2312和utf-8的哦: gb2312:if(!preg_match("/^[".chr(0xa1)."-". ...
- winform中如何使用确认对话框
在系统中,常需要这样的功能,让用户确认一些信息:如下图: [退出系统]按钮关键代码如下: private void btnExit_Click(object sender, EventArgs e) ...
- Python格式化输出的三种方式
Python格式化输出的三种方式 一.占位符 程序中经常会有这样场景:要求用户输入信息,然后打印成固定的格式比如要求用户输入用户名和年龄,然后打印如下格式:My name is xxx,my age ...
- LinqToSQL2
扩展方法: 扩展方法是C#3.0的新特性,可以通过为已知类型添加新方法来到到扩展类型的目的,同时不需对此类型做任何改动. 重点记住的是,定义为静态方法的扩展方法只能在通过using指令显示地将名称空间 ...
- wcf可以返回的类型有哪些
Windows Communication Foundation (WCF) 使用 DataContractSerializer 作为其默认的序列化引擎以将数据转换到 XML 并将 XML 转换回数据 ...