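Python 2 vs. Python 3: A Comparative Analysis

I have long seen discussions online comparing Python 2 and Python 3, and my company has recently been considering whether to upgrade our spark-python big-data development environment to Python 3. This post records a comparison of Python 2.7.13 and Python 3.5.3 across several aspects.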

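Environment Setup

We continue to use the environment configured in an earlier post.

Because the goal is to compare Python 2 and Python 3 side by side, simply upgrading the system Python would not work, so I used two tools, pyenv and virtualenv, to build isolated test environments.

References: "Setting up Python virtual environments with pyenv and virtualenv" and "Using pyenv to install multiple Python versions on one system"

The configuration steps are as follows:

Start by installing Tkinter. pyenv compiles Python from source, and the tkinter module is only built if the Tk development headers are already present, so skipping this step means redoing everything later. Don't ask me how I know...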

sudo yum install tkinter -y
sudo yum install tk-devel tcl-devel -y

Install the build dependencies that pyenv needs

sudo yum install readline readline-devel readline-static -y
sudo yum install openssl openssl-devel openssl-static -y
sudo yum install sqlite-devel -y
sudo yum install bzip2-devel bzip2-libs -y

Download and install pyenv, then install Python 2.7.13 and Python 3.5.3

git clone https://github.com/yyuu/pyenv.git ~/.pyenv
chmod 777 -R ~/.pyenv
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bash_profile
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bash_profile
echo 'eval "$(pyenv init -)"' >> ~/.bash_profile
exec $SHELL
source ~/.bash_profile 

pyenv  install --list
pyenv install -v 2.7.13
pyenv install -v 3.5.3

Download and install pyenv-virtualenv, then create the two isolated environments

git clone https://github.com/yyuu/pyenv-virtualenv.git ~/.pyenv/plugins/pyenv-virtualenv   
echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.bash_profile
source ~/.bash_profile
pyenv virtualenv 2.7.13 py2
pyenv virtualenv 3.5.3 py3

That's it: the two isolated Python environments are ready. The test below shows the active Python switching from CentOS 7's default 2.7.5 to 2.7.13 and then to 3.5.

[kejun@localhost ~]$ python -V
Python 2.7.5
[kejun@localhost ~]$ pyenv activate py2
(py2) [kejun@localhost ~]$ python -V
Python 2.7.13
(py2) [kejun@localhost ~]$ pyenv deactivate
[kejun@localhost ~]$ pyenv activate py3
(py3) [kejun@localhost ~]$ python -V
Python 3.5.
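The same check can be made from inside Python, which is handy when a script needs to know which of the two environments it is running in. The following is a minimal sketch; the helper name describe_interpreter is hypothetical and not part of the original setup.

```python
import sys


def describe_interpreter(version_info=None):
    """Return a short label such as 'Python2 (2.7.13)' for a version tuple.

    With no argument, the running interpreter's sys.version_info is used.
    """
    if version_info is None:
        version_info = sys.version_info
    major, minor, micro = version_info[0], version_info[1], version_info[2]
    return "Python%d (%d.%d.%d)" % (major, major, minor, micro)


if __name__ == "__main__":
    # Run this once inside py2 and once inside py3 to compare the output.
    print(describe_interpreter())
```

Reading sys.version_info avoids parsing the free-form sys.version string, and the code runs unchanged under both interpreters.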

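Detailed Testing:

We installed the third-party packages commonly used for data analysis and ran both an installation test and a sample test for each. The sample test script is listed at the end of this post.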

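Category	Tool	Purpose
Data collection	scrapy	Web scraping and crawling
Data collection	scrapy-redis	Distributed crawling
Data collection	selenium	Web testing, browser emulation
Data processing	beautifulsoup	HTML parsing library; supports lxml as a parser
Data processing	lxml	XML parsing library
Data processing	xlrd	Reading Excel files
Data processing	xlwt	Writing Excel files
Data processing	xlutils	Simple format modification of Excel files
Data processing	pywin32	Reading and writing Excel files with complex format customization
Data processing	Python-docx	Reading and writing Word files
Data analysis	numpy	Matrix-based numerical computing library
Data analysis	pandas	Table-based statistical analysis library
Data analysis	scipy	Scientific computing library; supports higher-level abstractions and complex models
Data analysis	statsmodels	Statistical modeling and econometrics toolkit
Data analysis	scikit-learn	Machine learning library
Data analysis	gensim	Natural language processing library
Data analysis	jieba	Chinese word segmentation library
Data storage	MySQL-python	MySQL read/write interface library
Data storage	mysqlclient	MySQL read/write interface library
Data storage	SQLAlchemy	ORM layer for databases
Data storage	pymssql	SQL Server read/write interface library
Data storage	redis	Redis read/write interface
Data storage	PyMongo	MongoDB read/write interface
Data presentation	matplotlib	Popular data visualization library
Data presentation	seaborn	Attractive data visualization library built on matplotlib
Tooling	jupyter	Web-based Python IDE, commonly used for data analysis
Tooling	chardet	Character encoding detection
Tooling	ConfigParser	Reading and writing configuration files
Tooling	requests	HTTP library for network access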
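The installation test for the packages listed above can be approximated by simply trying to import each one. This is a minimal sketch, not the check actually used for this post; the function name check_imports is hypothetical. Note that the import name sometimes differs from the package name: beautifulsoup is imported as bs4, scikit-learn as sklearn, Python-docx as docx, and MySQL-python as MySQLdb.

```python
import importlib


def check_imports(module_names):
    """Try to import each module and return a {name: True/False} mapping."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            # The package is missing (or not built) in this environment.
            results[name] = False
    return results


if __name__ == "__main__":
    # A few import names from the table; extend the list as needed.
    wanted = ["numpy", "pandas", "scipy", "sklearn", "bs4", "lxml"]
    for name, ok in sorted(check_imports(wanted).items()):
        print("%-10s %s" % (name, "ok" if ok else "MISSING"))
```

Running it once in py2 and once in py3 gives a quick side-by-side view of which packages are available in each environment.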

# encoding=utf-8
import sys
import platform
import traceback
import gc
import ctypes  


STD_OUTPUT_HANDLE= -11  
FOREGROUND_BLACK = 0x0  
FOREGROUND_BLUE = 0x01 # text color contains blue.  
FOREGROUND_GREEN= 0x02 # text color contains green.  
FOREGROUND_RED = 0x04 # text color contains red.  
FOREGROUND_INTENSITY = 0x08 # text color is intensified.  

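class WinPrint:
    """
    Prints colored text on Windows.
    """

    def __init__(self):
        # Resolve the console handle only when an instance is created, so that
        # defining this class on Linux does not touch ctypes.windll
        # (which exists only on Windows).
        self.std_out_handle = ctypes.windll.kernel32.GetStdHandle(STD_OUTPUT_HANDLE)

    def set_cmd_color(self, color, handle=None):
        if handle is None:
            handle = self.std_out_handle
        return ctypes.windll.kernel32.SetConsoleTextAttribute(handle, color)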

    def reset_color(self):  
        self.set_cmd_color(FOREGROUND_RED | FOREGROUND_GREEN | FOREGROUND_BLUE)  

    def print_red_text(self, print_text):  
        self.set_cmd_color(FOREGROUND_RED | FOREGROUND_INTENSITY)  
        print (print_text)  
        self.reset_color()  

    def print_green_text(self, print_text):  
        self.set_cmd_color(FOREGROUND_GREEN | FOREGROUND_INTENSITY)  
        print (print_text)  
        self.reset_color()  

class UnixPrint:
    """
    Prints colored text on CentOS.
    """
    def print_red_text(self, print_text): 
        print('\033[1;31m%s\033[0m'%print_text)

    def print_green_text(self, print_text): 
        print('\033[1;32m%s\033[0m'%print_text)

py_env = "Python2" if sys.version.find("2.7") > -1 else "Python3"
sys_ver = "Windows" if platform.system().find("indows") > -1 else "Centos"
my_print = WinPrint() if platform.system().find("indows") > -1 else UnixPrint()


def check(sys_ver, py_env):
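    """
    Decorator that unifies the pass/fail output of every test.
    It also exercises a decorator that takes arguments; the arguments are not strictly required.
    """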
    def _check(func):
        def __check():
            try:
                func()
                my_print.print_green_text(
                    "[%s,%s]: %s pass." % (sys_ver, py_env, func.__name__))
            except Exception:
                traceback.print_exc()
                my_print.print_red_text(
                    "[%s,%s]: %s fail." % (sys_ver, py_env, func.__name__))
        return __check
    return _check


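def make_requirement(filepath, filename):
    """
    Process a pip requirements file: strip the "==version" pins and
    write the result to "<name>-clean.txt" in the same directory.
    """
    import os  # local import keeps this helper self-contained

    result = []
    # os.path.join works on both Windows and CentOS, unlike a hard-coded "\\"
    with open(os.path.join(filepath, filename), "r") as f:
        for line in f:
            # Each line already ends with a newline; strip it so none is doubled
            line = line.rstrip("\r\n")
            if line.find("==") > -1:
                result.append(line.split("==")[0] + "\n")
            else:
                result.append(line + "\n")
    clean_name = filename.split(".")[0] + "-clean.txt"
    with open(os.path.join(filepath, clean_name), "w") as f1:
        f1.writelines(result)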


@check(sys_ver, py_env)
def test_scrapy():
    from scrapy import signals
    from selenium import webdriver
    from scrapy.http import HtmlResponse
    from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
    from selenium.webdriver.common.keys import Keys
    from selenium.common.exceptions import NoSuchElementException
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait


@check(sys_ver, py_env)
def test_matplotlib():
    import matplotlib.pyplot as plt
    l = [1, 2, 3, 4, 5]
    h = [20, 14, 38, 27, 9]
    w = [0.1, 0.2, 0.3, 0.4, 0.5]
    b = [1, 2, 3, 4, 5]
    fig = plt.figure()
    ax = fig.add_subplot(111)
    rects = ax.bar(l, h, w, b)
    # plt.show()


@check(sys_ver, py_env)
def test_beautifulSoup():
    from bs4 import BeautifulSoup
    html_str = "<html><meta/><head><title>Hello</title></head><body onload=crash()>Hi all<p></html>"
    soup = BeautifulSoup(html_str, "lxml")
    # print (soup.get_text())


@check(sys_ver, py_env)
def test_lxml():
    from lxml import html
    html_str = "<html><meta/><head><title>Hello</title></head><body onload=crash()>Hi all<p></html>"
    html.fromstring(html_str)


@check(sys_ver, py_env)
def test_xls():
    import xlrd
    import xlwt
    from xlutils.copy import copy
    excel_book2 = xlwt.Workbook()
    del excel_book2
    excel_book1 = xlrd.open_workbook("1.xlsx")
    del excel_book1
    import docx
    doc = docx.Document("1.docx")
    # print (doc)
    del doc
    gc.collect()


@check(sys_ver, py_env)
def test_data_analysis():
    import pandas as pd
    import numpy as np
    data_list = np.array([x for x in range(100)])
    data_serial = pd.Series(data_list)
    # print (data_serial)
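    # scipy.fftpack.fft is used because in newer SciPy releases
    # `scipy.fft` is a module rather than a callable function.
    from scipy.fftpack import fft
    b = fft(data_list)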
    # print (b)


@check(sys_ver, py_env)
def test_statsmodels():
    import statsmodels.api as sm
    data = sm.datasets.spector.load()
    data.exog = sm.add_constant(data.exog, prepend=False)
    # print(data.exog)


@check(sys_ver, py_env)
def test_sklearn():
    from sklearn import datasets
    iris = datasets.load_iris()
    data = iris.data
    # print(data.shape)


@check(sys_ver, py_env)
def test_gensim():
    import warnings
    warnings.filterwarnings(action='ignore', category=UserWarning, module='gensim')
    from gensim import corpora
    from collections import defaultdict
    documents = ["Human machine interface for lab abc computer applications",
                 "A survey of user opinion of computer system response time",
                 "The EPS user interface management system",
                 "System and human system engineering testing of EPS",
                 "Relation of user perceived response time to error measurement",
                 "The generation of random binary unordered trees",
                 "The intersection graph of paths in trees",
                 "Graph minors IV Widths of trees and well quasi ordering",
                 "Graph minors A survey"]
    stoplist = set('for a of the and to in'.split())
    texts = [[word for word in document.lower().split() if word not in stoplist]
             for document in documents]
    frequency = defaultdict(int)
    for text in texts:
        for token in text:
            frequency[token] += 1
    texts = [[token for token in text if frequency[token] > 1]
             for text in texts]
    dictionary = corpora.Dictionary(texts)
    dictionary.save('deerwester.dict')


@check(sys_ver, py_env)
def test_jieba():
    import jieba
    seg_list = jieba.cut("我来到了北京参观天安门。", cut_all=False)
    # print("Default Mode: " + "/ ".join(seg_list))  # accurate mode


@check(sys_ver, py_env)
def test_mysql():
    import MySQLdb as mysql
    # Test the connection to pet_shop
    db = mysql.connect(host="xx", user="yy", passwd="12345678", db="zz")
    cur = db.cursor()
    sql="select id from role;"
    cur.execute(sql)
    result = cur.fetchall()
    db.close()
    # print (result)


@check(sys_ver, py_env)
def test_SQLAlchemy():
    from sqlalchemy import Column, String, create_engine,Integer
    from sqlalchemy.orm import sessionmaker
    from sqlalchemy.ext.declarative import declarative_base
    engine = create_engine('mysql://xxx/yy',echo=False)
    DBSession = sessionmaker(bind=engine)
    Base = declarative_base()
    class rule(Base):
        __tablename__="role"
        id=Column(Integer,primary_key=True,autoincrement=True)
        role_name=Column(String(100))
        role_desc=Column(String(255))
    new_rule=rule(role_name="test_sqlalchemy",role_desc="forP2&P3")
    session=DBSession()
    session.add(new_rule)
    session.commit()
    session.close()


@check(sys_ver, py_env)
def test_redis():
    import redis 
    pool = redis.Redis(host='127.0.0.1', port=6379) 

@check(sys_ver, py_env)
def test_requests():
    import requests
    r=requests.get(url="http://www.cnblogs.com/kendrick/")
    # print (r.status_code)


@check(sys_ver, py_env)
def test_PyMongo():
    from pymongo import MongoClient
    conn=MongoClient("localhost",27017)

if __name__ == "__main__":
    print ("[%s,%s]  start checking..." % (sys_ver, py_env))
    test_scrapy()
    test_beautifulSoup()
    test_lxml()
    test_matplotlib()
    test_xls()
    test_data_analysis()
    test_sklearn()
    test_mysql()
    test_SQLAlchemy()
    test_PyMongo()
    test_gensim()
    test_jieba()
    test_redis()
    test_requests()
    test_statsmodels()
    print ("[%s,%s]  finish checking." % (sys_ver, py_env))
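The parameterized decorator used in the script above can be sketched in isolation as follows. This variant is an illustration rather than the script's exact code: it takes a single label argument, uses functools.wraps, and returns True or False, none of which appear in the original. The names demo_ok and demo_broken are hypothetical.

```python
import functools


def check(label):
    """Decorator factory: run the wrapped test and report pass/fail under label."""
    def decorator(func):
        @functools.wraps(func)  # keep func.__name__ on the wrapper
        def wrapper():
            try:
                func()
            except Exception:
                print("[%s]: %s fail." % (label, func.__name__))
                return False
            print("[%s]: %s pass." % (label, func.__name__))
            return True
        return wrapper
    return decorator


@check("demo")
def demo_ok():
    pass


@check("demo")
def demo_broken():
    raise ValueError("simulated failure")


if __name__ == "__main__":
    demo_ok()
    demo_broken()
```

Because check is called with arguments, it has to return the real decorator, which in turn returns the wrapper; that is why three nested functions are needed. Returning a boolean makes it easy to count failures per environment.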
