decode in Python

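Decoding fails with an exception because the data contains illegal characters. In programs written in C/C++ in particular, a full-width space is often produced in several different ways, such as \xa3\xa0 or \xa4\x57. These sequences all look like full-width spaces, but none of them is the "legal" full-width space (the real one is \xa1\xa1), so the conversion raises an exception.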

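Problems like this are a real headache: a single illegal character anywhere in a string (sometimes that string is an entire article) is enough to make the whole string impossible to convert.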

Solution:

s.decode('gbk', 'ignore').encode('utf-8')
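The one-liner above can be tried on a small byte string. This is a minimal sketch, not the original author's test data: the name raw and the stray byte 0xFF are assumptions chosen for illustration (0xFF is never a valid GBK lead byte), and the input is written as a bytes literal so the same code also runs on Python 3.

```python
# -*- coding: utf-8 -*-
# GBK bytes for "中文" (\xd6\xd0 \xce\xc4) with an illustrative stray byte 0xFF in between.
raw = b'\xd6\xd0\xff\xce\xc4'

# With the default errors='strict', one bad byte makes the whole decode fail.
try:
    raw.decode('gbk')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With 'ignore', the bad byte is dropped and the rest is converted to UTF-8.
utf8_bytes = raw.decode('gbk', 'ignore').encode('utf-8')

print(strict_failed)
print(utf8_bytes)
```

Note that 'ignore' silently discards data, so it is best suited to cases where losing the odd malformed character is acceptable.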

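This works because the prototype of decode is decode([encoding], [errors='strict']). The second argument controls the error-handling strategy. The default is strict, which raises an exception when an illegal character is encountered;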

If set to ignore, illegal characters are skipped;

If set to replace, illegal characters are replaced with a marker (U+FFFD when decoding, ? when encoding);

If set to xmlcharrefreplace, XML character references are used; this handler only applies when encoding, not when decoding.
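The difference between these strategies is easiest to see side by side. The sketch below is illustrative only: raw and text are made-up sample values, and the stray byte 0xFF is an assumption used to force a decoding error. It shows replace on the decoding side, then replace and xmlcharrefreplace on the encoding side, where the latter is actually available.

```python
# -*- coding: utf-8 -*-
# GBK bytes for "中文" with an illustrative invalid byte 0xFF inserted.
raw = b'\xd6\xd0\xff\xce\xc4'

# Decoding with 'replace' inserts U+FFFD (the Unicode replacement character).
replaced = raw.decode('gbk', 'replace')

# Encoding to a charset that cannot represent the characters:
text = u'\u4e2d\u6587'  # "中文"
ascii_q = text.encode('ascii', 'replace')               # one '?' per character
ascii_xml = text.encode('ascii', 'xmlcharrefreplace')   # numeric XML references

print(repr(replaced))
print(ascii_q)
print(ascii_xml)
```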

Python documentation

decode( [encoding[, errors]])

Decodes the string using the codec registered for encoding.
encoding defaults to the default string encoding. errors may be
given to set a different error handling scheme. The default is
'strict', meaning that encoding errors raise UnicodeError. Other
possible values are 'ignore', 'replace' and any other name
registered via codecs.register_error, see section 4.8.1.
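As the documentation notes, additional error handlers can be registered with codecs.register_error. The following is a rough sketch of that mechanism, not something from the original post: the handler name 'fullwidth_space' and the function fullwidth_space_handler are hypothetical, and substituting U+3000 (the ideographic, i.e. full-width, space) for every undecodable span is just one possible policy for the malformed full-width spaces described earlier.

```python
# -*- coding: utf-8 -*-
import codecs


def fullwidth_space_handler(exc):
    """Hypothetical handler: replace each undecodable span with U+3000."""
    if isinstance(exc, UnicodeDecodeError):
        # Return the replacement text and the position at which to resume decoding.
        return (u'\u3000', exc.end)
    # This sketch does not handle encoding or translation errors.
    raise exc


# Register the handler under an illustrative name.
codecs.register_error('fullwidth_space', fullwidth_space_handler)

# GBK bytes for "中文" with an illustrative invalid byte 0xFF inserted.
raw = b'\xd6\xd0\xff\xce\xc4'
fixed = raw.decode('gbk', 'fullwidth_space')

print(repr(fixed))
```

A custom handler like this keeps a visible placeholder in the output instead of dropping the bad bytes, at the cost of guessing what the original character was meant to be.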
