Why is 'x' in ('x',) faster than 'x' == 'x'?
Question:
>>> timeit.timeit("'x' in ('x',)")
0.04869917374131205
>>> timeit.timeit("'x' == 'x'")
0.06144205736110564
Also works for multiple options, both seem to grow linearly:
>>> timeit.timeit("'x' in ('x', 'y')")
0.04866674801541748
>>> timeit.timeit("'x' == 'x' or 'x' == 'y'")
0.06565782838087131
>>> timeit.timeit("'x' in ('y', 'x')")
0.08975995576448526
>>> timeit.timeit("'x' == 'y' or 'x' == 'y'")
0.12992391047427532
Based on this, I think I should start using in
everywhere
instead of ==
!!
Answer:
As I mentioned to David Wolever, there's more to this than meets the eye; both methods dispatch to is
;
you can prove this by doing
min(Timer("x == x", setup="x = 'a' * 1000000").repeat(10, 10000))
#>>> 0.00045456900261342525
min(Timer("x == y", setup="x = 'a' * 1000000; y = 'a' * 1000000").repeat(10, 10000))
#>>> 0.5256857610074803
The first can only be so fast because it checks by identity.
To find out why one would take longer than the other, let's trace through execution.
They both start in ceval.c
,
from COMPARE_OP
since
that is the bytecode involved
TARGET(COMPARE_OP) {
PyObject *right = POP();
PyObject *left = TOP();
PyObject *res = cmp_outcome(oparg, left, right);
Py_DECREF(left);
Py_DECREF(right);
SET_TOP(res);
if (res == NULL)
goto error;
PREDICT(POP_JUMP_IF_FALSE);
PREDICT(POP_JUMP_IF_TRUE);
DISPATCH();
}
This pops the values from the stack (technically it only pops one)
PyObject *right = POP();
PyObject *left = TOP();
and runs the compare:
PyObject *res = cmp_outcome(oparg, left, right);
cmp_outcome
is
this:
static PyObject *
cmp_outcome(int op, PyObject *v, PyObject *w)
{
int res = 0;
switch (op) {
case PyCmp_IS: ...
case PyCmp_IS_NOT: ...
case PyCmp_IN:
res = PySequence_Contains(w, v);
if (res < 0)
return NULL;
break;
case PyCmp_NOT_IN: ...
case PyCmp_EXC_MATCH: ...
default:
return PyObject_RichCompare(v, w, op);
}
v = res ? Py_True : Py_False;
Py_INCREF(v);
return v;
}
This is where the paths split. The PyCmp_IN
branch
does
int
PySequence_Contains(PyObject *seq, PyObject *ob)
{
Py_ssize_t result;
PySequenceMethods *sqm = seq->ob_type->tp_as_sequence;
if (sqm != NULL && sqm->sq_contains != NULL)
return (*sqm->sq_contains)(seq, ob);
result = _PySequence_IterSearch(seq, ob, PY_ITERSEARCH_CONTAINS);
return Py_SAFE_DOWNCAST(result, Py_ssize_t, int);
}
Note that a tuple is defined as
static PySequenceMethods tuple_as_sequence = {
...
(objobjproc)tuplecontains, /* sq_contains */
};
PyTypeObject PyTuple_Type = {
...
&tuple_as_sequence, /* tp_as_sequence */
...
};
So the branch
if (sqm != NULL && sqm->sq_contains != NULL)
will be taken and *sqm->sq_contains
,
which is the function (objobjproc)tuplecontains
,
will be taken.
This does
static int
tuplecontains(PyTupleObject *a, PyObject *el)
{
Py_ssize_t i;
int cmp;
for (i = 0, cmp = 0 ; cmp == 0 && i < Py_SIZE(a); ++i)
cmp = PyObject_RichCompareBool(el, PyTuple_GET_ITEM(a, i),
Py_EQ);
return cmp;
}
...Wait, wasn't that PyObject_RichCompareBool
what
the other branch took? Nope, that was PyObject_RichCompare
.
That code path was short so it likely just comes down to the speed of these two. Let's compare.
int
PyObject_RichCompareBool(PyObject *v, PyObject *w, int op)
{
PyObject *res;
int ok;
/* Quick result when objects are the same.
Guarantees that identity implies equality. */
if (v == w) {
if (op == Py_EQ)
return 1;
else if (op == Py_NE)
return 0;
}
...
}
The code path in PyObject_RichCompareBool
pretty
much immediately terminates. For PyObject_RichCompare
,
it does
PyObject *
PyObject_RichCompare(PyObject *v, PyObject *w, int op)
{
PyObject *res;
assert(Py_LT <= op && op <= Py_GE);
if (v == NULL || w == NULL) { ... }
if (Py_EnterRecursiveCall(" in comparison"))
return NULL;
res = do_richcompare(v, w, op);
Py_LeaveRecursiveCall();
return res;
}
The Py_EnterRecursiveCall
/Py_LeaveRecursiveCall
combo
are not taken in the previous path, but these are relatively quick macros that'll short-circuit after incrementing and decrementing some globals.
do_richcompare
does:
static PyObject *
do_richcompare(PyObject *v, PyObject *w, int op)
{
richcmpfunc f;
PyObject *res;
int checked_reverse_op = 0;
if (v->ob_type != w->ob_type && ...) { ... }
if ((f = v->ob_type->tp_richcompare) != NULL) {
res = (*f)(v, w, op);
if (res != Py_NotImplemented)
return res;
...
}
...
}
This does some quick checks to call v->ob_type->tp_richcompare
which
is
PyTypeObject PyUnicode_Type = {
...
PyUnicode_RichCompare, /* tp_richcompare */
...
};
which does
PyObject *
PyUnicode_RichCompare(PyObject *left, PyObject *right, int op)
{
int result;
PyObject *v;
if (!PyUnicode_Check(left) || !PyUnicode_Check(right))
Py_RETURN_NOTIMPLEMENTED;
if (PyUnicode_READY(left) == -1 ||
PyUnicode_READY(right) == -1)
return NULL;
if (left == right) {
switch (op) {
case Py_EQ:
case Py_LE:
case Py_GE:
/* a string is equal to itself */
v = Py_True;
break;
case Py_NE:
case Py_LT:
case Py_GT:
v = Py_False;
break;
default:
...
}
}
else if (...) { ... }
else { ...}
Py_INCREF(v);
return v;
}
Namely, this shortcuts on left
... but only after doing
== right
if (!PyUnicode_Check(left) || !PyUnicode_Check(right))
if (PyUnicode_READY(left) == -1 ||
PyUnicode_READY(right) == -1)
All in all the paths then look something like this (manually recursively inlining, unrolling and pruning known branches)
POP() # Stack stuff
TOP() #
#
case PyCmp_IN: # Dispatch on operation
#
sqm != NULL # Dispatch to builtin op
sqm->sq_contains != NULL #
*sqm->sq_contains #
#
cmp == 0 # Do comparison in loop
i < Py_SIZE(a) #
v == w #
op == Py_EQ #
++i #
cmp == 0 #
#
res < 0 # Convert to Python-space
res ? Py_True : Py_False #
Py_INCREF(v) #
#
Py_DECREF(left) # Stack stuff
Py_DECREF(right) #
SET_TOP(res) #
res == NULL #
DISPATCH() #
vs
POP() # Stack stuff
TOP() #
#
default: # Dispatch on operation
#
Py_LT <= op # Checking operation
op <= Py_GE #
v == NULL #
w == NULL #
Py_EnterRecursiveCall(...) # Recursive check
#
v->ob_type != w->ob_type # More operation checks
f = v->ob_type->tp_richcompare # Dispatch to builtin op
f != NULL #
#
!PyUnicode_Check(left) # ...More checks
!PyUnicode_Check(right)) #
PyUnicode_READY(left) == -1 #
PyUnicode_READY(right) == -1 #
left == right # Finally, doing comparison
case Py_EQ: # Immediately short circuit
Py_INCREF(v); #
#
res != Py_NotImplemented #
#
Py_LeaveRecursiveCall() # Recursive check
#
Py_DECREF(left) # Stack stuff
Py_DECREF(right) #
SET_TOP(res) #
res == NULL #
DISPATCH() #
Now, PyUnicode_Check
and PyUnicode_READY
are
pretty cheap since they only check a couple of fields, but it should be obvious that the top one is a smaller code path, it has fewer function calls, only one switch statement and is just a bit thinner.
TL;DR:
Both dispatch to if
; the difference is just how much work they do to get there.
(left_pointer == right_pointer)in
just
does less.
Why is 'x' in ('x',) faster than 'x' == 'x'?的更多相关文章
- faster r-cnn 在CPU配置下训练自己的数据
因为没有GPU,所以在CPU下训练自己的数据,中间遇到了各种各样的坑,还好没有放弃,特以此文记录此过程. 1.在CPU下配置faster r-cnn,参考博客:http://blog.csdn.net ...
- r-cnn学习系列(三):从r-cnn到faster r-cnn
把r-cnn系列总结下,让整个流程更清晰. 整个系列是从r-cnn至spp-net到fast r-cnn再到faster r-cnn. RCNN 输入图像,使用selective search来构造 ...
- faster with MyISAM tables than with InnoDB or NDB tables
http://dev.mysql.com/doc/refman/5.7/en/partitioning-limitations.html Performance considerations. So ...
- situations where MyISAM will be faster than InnoDB
http://www.tocker.ca/categories/myisam Converting MyISAM to InnoDB and a lesson on variance I'm abou ...
- Faster RNNLM (HS/NCE) toolkit
https://github.com/kjw0612/awesome-rnn Faster Recurrent Neural Network Language Modeling Toolkit wit ...
- Faster R-CNN CPU环境搭建
操作系统: bigtop@bigtop-SdcOS-Hypervisor:~/py-faster-rcnn/tools$ cat /etc/issue Ubuntu LTS \n \l Python版 ...
- Why is processing a sorted array faster than an unsorted array?
这是我在逛 Stack Overflow 时遇见的一个高分问题:Why is processing a sorted array faster than an unsorted array?,我觉得这 ...
- Introducing the Accelerated Mobile Pages Project, for a faster, open mobile web
https://googleblog.blogspot.com/2015/10/introducing-accelerated-mobile-pages.html October 7, 2015 Sm ...
- 论文阅读之:Is Faster R-CNN Doing Well for Pedestrian Detection?
Is Faster R-CNN Doing Well for Pedestrian Detection? ECCV 2016 Liliang Zhang & Kaiming He 原文链接 ...
- 如何才能将Faster R-CNN训练起来?
如何才能将Faster R-CNN训练起来? 首先进入 Faster RCNN 的官网啦,即:https://github.com/rbgirshick/py-faster-rcnn#installa ...
随机推荐
- docker安装镜像
CMD 容器启动命令 CMD指令用于为执行容器提供默认值.每个Dockerfile只有一个CMD命令,如果指定了多个CMD命令,那么只有最后一条会被执行,如果启动容器的时候指定了运行的命令,则会覆盖掉 ...
- 滑块视图容器 swiper
属性名 类型 默认值 说明 indicator-dots Boolean false 是否显示面板指示点 autoplay Boolean false 是否自动切换 current Number 0 ...
- 高性能mysql-锁的调试
锁的调试分为俩部分,一是服务器级别的锁的调试.二是存储引擎级别的锁的调试 对于服务器级别的锁的调试: 服务器级别的锁的类型有表锁,全局锁,命名锁,字符锁 调试命令: Show processlist ...
- DLL补丁劫持制作
DLL: 由于输入表中只包含 DLL 名而没有它的路径名,因此加载程序必须在磁盘上搜索 DLL 文件.首先会尝试从当前程序所在的目录加载 DLL,如果没找到,则在Windows 系统目录中查找,最后是 ...
- HashMap的源码分析
hashMap的底层实现是 数组+链表 的数据结构,数组是一个Entry<K,V>[] 的键值对对象数组,在数组的每个索引上存储的是包含Entry的节点对象,每个Entry对象是一个单链表 ...
- Linux快速目录间切换cd pushd popd
1. cd - 当前目录和之前所在的目录之间的切换 2. cd + Alt . 用上次命令的最后一个目录路径 要用上上次命令的最后一个目录,就Alt+.两次就可以了 3. push ...
- Spring Boot 核心配置文件 bootstrap & application 详解。
用过 Spring Boot 的都知道在 Spring Boot 中有以下两种配置文件 bootstrap (.yml 或者 .properties) application (.yml 或者 .pr ...
- Eruda 一个被人遗忘的调试神器
Eruda 一个被人遗忘的调试神器 引言 日常工作中再牛逼的大佬都不敢说自己的代码是完全没有问题的,既然有问题,那就也就有调试,说到调试工具,大家可能对于 fiddler.Charles.chro ...
- 使用Chrome开发者工具调试Android端内网页(微信,QQ,UC,App内嵌页等)
使用Chrome开发者工具调试Android端内网页(微信,QQ,UC,App内嵌页等) 前言 移动端页面调试一直是好多朋友头疼的问题,iOS 由于其封闭的特性和整体较高的性能,整体适配相对好做,调试 ...
- 测试工具之RobotFramework关键字和快捷键
RF中关键字很多,即使经常使用也有些关键字没有使用过,所以我们就需要记住一些常用的关键字,在使用中本人整理了部分关键字.快捷键和部分RF的常识 1.F5 如果只记得关键字部分,可以通过F5呼出关键字查 ...