Potential Pythonic Pitfalls
Monday, 11 May 2015
Python is a very expressive language. It provides us with a large standard library and many builtins to get the job done quickly. However, many get lost in the power it provides: they fail to make full use of the standard library, value one-liners over clarity, and misunderstand the language's basic constructs. This is a non-exhaustive list of the pitfalls that programmers new to Python fall into.
Not Knowing the Python Version
This is a recurring problem in StackOverflow questions. Many write code that works perfectly on one version of Python, but then run it on a different version installed on their system.[1] Make sure that you know the Python version you're working with. You can check via the following:
$ python --version
Python 2.7.9
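The same check can be made from inside a script, which helps when the python on your PATH is not the interpreter that actually runs your code. This is a minimal sketch using only the standard sys module; the helper name describe_interpreter is made up for illustration.

```python
import sys


def describe_interpreter():
    """Return a 'major.minor.micro' string for the running interpreter."""
    info = sys.version_info
    return "{}.{}.{}".format(info.major, info.minor, info.micro)


# Report the version and which binary is actually executing this script.
print("Python", describe_interpreter(), "at", sys.executable)

# Flag a Python 2 interpreter explicitly instead of discovering it
# later through a confusing SyntaxError.
if sys.version_info < (3,):
    print("Warning: this interpreter is Python 2")
```

Comparing sys.version_info against a tuple is usually safer than parsing the version string by hand.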
Not Using Pyenv
pyenv is a great tool for managing different Python versions. Unfortunately, it only works on *nix systems. On Mac OS, one can simply install it via brew install pyenv and on Linux, there is an automatic installer.
Obsessing Over One-Liners
Some get a real kick out of one-liners. Many boast about their one-liner solutions even when they are less efficient than a multi-line solution.
In Python, this usually means convoluted comprehensions with several loops and conditions crammed into one expression. For example:
l = [m for a, b in zip(this, that) if b.method(a) != b for m in b if not m.method(a, b) and reduce(lambda x, y: a + y.method(), (m, a, b))]
To be perfectly honest, I made the above example up. But I've seen plenty of people write code like it. Code like this will make no sense in a week's time. If you're trying to do something more complex than simply adding an item to a list or set with a condition, then you're probably making a mistake.
One-liners are not achievements. Yes, they can seem very clever, but it's like thinking that shoving everything into your closet is an actual attempt at cleaning your room. Good code is clean, easy to read and efficient.
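To illustrate the trade-off, here is the same filtering logic written once as a dense comprehension and once as a plain loop inside a named function. The data and the names orders and expensive_item_names are invented for this sketch.

```python
# Hypothetical data: each order is a (customer, items) pair,
# and each item is a (name, price) pair.
orders = [
    ("ann", [("pen", 2), ("desk", 120)]),
    ("bob", []),
    ("cat", [("lamp", 35), ("ink", 4)]),
]

# Dense version: two loops and two conditions packed into one expression.
dense = [name for _, items in orders if items for name, price in items if price > 10]


def expensive_item_names(orders, threshold=10):
    """Return the names of items costing more than `threshold`, in order."""
    names = []
    for _customer, items in orders:
        for name, price in items:
            if price > threshold:
                names.append(name)
    return names


clear = expensive_item_names(orders)
print(dense)
print(clear)
```

Both versions produce the same list, but the second one has a name, a docstring and a place to put a breakpoint.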
Initializing a Set the Wrong Way
This is a more subtle problem that can catch you off guard. Set comprehensions are a lot like list comprehensions.
>>> { n for n in range(10) if n % 2 == 0 }
{0, 8, 2, 4, 6}
>>> type({ n for n in range(10) if n % 2 == 0 })
<class 'set'>
The above is an example of a set comprehension. Sets are like lists in that they are containers. The difference is that a set cannot hold duplicate values and is unordered. Having seen set comprehensions, people often make the mistake of thinking that {} initializes an empty set. It does not; it initializes an empty dict.
>>> {}
{}
>>> type({})
<class 'dict'>
If we wish to initialize an empty set, then we simply call set().
>>> set()
set()
>>> type(set())
<class 'set'>
Note how an empty set is denoted as set() but a set containing something is denoted as items surrounded by curly braces.
>>> s = set()
>>> s
set()
>>> s.add(1)
>>> s
{1}
>>> s.add(2)
>>> s
{1, 2}
This is rather counterintuitive, since you'd expect something like set([1, 2]).
Misunderstanding the GIL
The GIL (Global Interpreter Lock) means that only one thread in a Python program can execute Python bytecode at any one time. This implies that when we create a thread and expect it to run in parallel, it doesn't. What the Python interpreter actually does is switch quickly between the running threads. This is an oversimplified version of what really happens, though. There are many cases in which things do run in parallel, such as when using libraries that are essentially C extensions. But when running Python code, you don't get parallel execution most of the time. In other words, threads in Python are not like threads in Java or C++.
Many will try to defend Python by saying that these are real threads.[2] This is indeed true, but it does not change the fact that the way Python handles threads is different from what you'd generally expect. The same is true of a language like Ruby (which also has an interpreter lock).
The prescribed solution to this is the multiprocessing module. It provides you with the Process class, which is basically a nice wrapper over a fork. However, a fork is much more expensive than a thread, so you might not always see a performance benefit, since the different processes now have to do a lot of work to coordinate with each other.
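A rough way to see the difference is to time the same CPU-bound function in a thread pool and in a process pool. The sketch below assumes Python 3 and uses the standard concurrent.futures module rather than Process directly; count_primes and timed_map are names made up for this example, and the timings will vary with the machine and the interpreter.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed_map(executor_cls, limits, workers=2):
    """Run count_primes over `limits` in a pool; return (results, seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(count_primes, limits))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    limits = [100000, 100000]
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        results, seconds = timed_map(cls, limits)
        print(cls.__name__, results, round(seconds, 2))
```

On an interpreter with a GIL and a multi-core machine, the thread pool will typically take about as long as running the two calls one after the other, while the process pool can overlap them at the cost of starting worker processes.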
However, this problem does not exist in every implementation of Python. PyPy-stm, for example, is an implementation of Python that tries to get rid of the GIL (it is not stable yet). Implementations built on top of other platforms like the JVM (Jython) or CLR (IronPython) do not have GIL problems.
All in all, be careful when using the Thread class: what you get might not be what you expect.
Using Old Style Classes
In Python 2 there are two types of classes: "old style" classes and "new style" classes. If you're using Python 3, then you're using "new style" classes by default. To make sure that you're using "new style" classes in Python 2, you need to inherit from object for any new class you create that isn't already inheriting from a builtin like int or list. In other words, your base class, the class that isn't inheriting from anything else, should always inherit from object.
class MyNewObject(object):
    # stuff here
    pass
These "new style" classes fix some very fundamental flaws in the old style classes that we really don't need to get into. However, if anyone is interested they can find the information in the related documentation.
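As a quick check, assuming Python 3, the snippet below shows that a class with no explicit base and a class that inherits from object end up as the same kind of class. The class names Implicit and Explicit are invented for this sketch.

```python
class Implicit:
    """No explicit base class."""


class Explicit(object):
    """Explicitly inherits from object, the spelling Python 2 needs."""


# On Python 3 both spellings have object at the end of their MRO.
print(Implicit.__mro__)
print(Explicit.__mro__)

# type() of an instance is the class itself for new-style classes.
print(type(Implicit()) is Implicit)
```

On Python 2 the first class would be old-style: it would have no __mro__ attribute, and type(Implicit()) would report a generic instance type rather than the class.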
Iterating the Wrong Way
It's very common to see the following code from users who are relatively new to the language:
for name_index in range(len(names)):
    print(names[name_index])
There is no need to call len in the above example, since iterating over the list is actually much simpler:
for name in names:
    print(name)
Furthermore, there are a whole host of other tools at your disposal to make iteration easier. For example, zip can be used to iterate over two lists at once:[3]
for cat, dog in zip(cats, dogs):
    print(cat, dog)
If we need both the index and the value of each item in the list, we can use enumerate:[4]
for index, cat in enumerate(cats):
    print(cat, index)
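Two details of these builtins are worth seeing in action: enumerate accepts a start value, and zip stops at the shortest input. The lists below are sample data invented for this sketch.

```python
cats = ["Tom", "Felix", "Garfield"]
dogs = ["Rex", "Fido"]

# enumerate accepts a start value, handy for 1-based numbering.
numbered = list(enumerate(cats, start=1))
print(numbered)

# zip stops at the shortest input, so "Garfield" is silently dropped.
pairs = list(zip(cats, dogs))
print(pairs)
```

If dropping unmatched items is not what you want, itertools.zip_longest (Python 3) pads the shorter input instead.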
There are also many useful functions to choose from in itertools. Please note, however, that using itertools functions is not always the right choice. If one of the functions in itertools offers a very convenient solution to the problem you're trying to solve, like flattening a list or getting the permutations of the contents of a given list, then go for it. But don't try to fit it into some part of your code just because you want to.
The problem with itertools abuse happens so often that one highly respected Python contributor on StackOverflow has dedicated a significant part of their profile to it.[5]
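For the two convenient cases mentioned above, flattening and permutations, a minimal sketch with sample data might look like this:

```python
from itertools import chain, permutations

nested = [[1, 2], [3], [4, 5]]

# Flatten one level of nesting.
flat = list(chain.from_iterable(nested))
print(flat)

# All orderings of a small list, each one as a tuple.
orderings = list(permutations(["a", "b", "c"]))
print(len(orderings))
print(orderings[0])
```

Note that chain.from_iterable flattens only one level; deeper nesting needs a different approach.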
Using Mutable Default Arguments
I've seen the following quite a lot:
def foo(a, b, c=[]):
    c.append(a)  # append to c
    # do some more stuff
Never use mutable default arguments. Instead, use the following:
def foo(a, b, c=None):
    if c is None:
        c = []
    c.append(a)  # append to c
    # do some more stuff
Instead of explaining what the problem is, it's better to show the effects of using mutable default arguments:
In [2]: def foo(a, b, c=[]):
   ...:     c.append(a)
   ...:     c.append(b)
   ...:     print(c)
   ...:
In [3]: foo(1, 1)
[1, 1]
In [4]: foo(1, 1)
[1, 1, 1, 1]
In [5]: foo(1, 1)
[1, 1, 1, 1, 1, 1]
The default value is evaluated only once, when the function is defined, so the same c is referenced every time the function is called without that argument. This can have some very unwanted consequences.
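The shared default can be inspected directly through the function's __defaults__ attribute. The following sketch puts the buggy and the fixed pattern side by side; append_shared and append_fresh are names invented for the example.

```python
def append_shared(item, bucket=[]):
    """Buggy: the default list is created once, when `def` runs."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """Fixed: build a new list on every call that omits `bucket`."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)
print(first is second)             # both names point at the one default list
print(append_shared.__defaults__)  # the stored default itself has grown

print(append_fresh(1), append_fresh(2))
```

With the None sentinel, the stored default never changes, so each call that omits bucket starts from an empty list.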
Takeaway
These are just some of the problems that one might run into when relatively new to Python, and the list is far from comprehensive. The other pitfalls largely come from people using Python like Java or C++ and trying to write Python in the way they are familiar with. So, as a continuation of this, try diving into things like Python's super function. Take a look at classmethod, staticmethod and __slots__.
Update
Last Updated on 12 May 2015 4:50 PM (GMT +6)
- Made the section on Misunderstanding the GIL
[1] Most people are taught Python using Python 2. However, when they go home and try things out themselves, they install Python 3 (quite a natural thing to install the latest version).
[2] When people talk about real threads what they essentially mean is that these threads are real CPU threads, which are scheduled by the OS (Operating System).
[3] https://docs.python.org/3/library/functions.html#zip
[4] enumerate can be further configured to produce the kind of index you want. https://docs.python.org/3/library/functions.html#enumerate
[5] http://stackoverflow.com/users/908494/abarnert