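An encoding problem encountered when writing a web crawler in Python:

Error: UnicodeEncodeError: 'gbk' codec can't encode character '\xXX' in position XX

Solution: change the encoding of standard output. print() encodes text with the console's encoding (gbk on a Chinese-locale Windows console), and gbk cannot represent every Unicode character. Re-wrapping sys.stdout with gb18030, which is a superset of gbk that covers all of Unicode, avoids the error.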

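from urllib import request
import io
import sys

# Re-wrap standard output so print() encodes with gb18030 instead of the console default (gbk)
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='gb18030')

# Fetch the page with a browser-like User-Agent header
req = request.Request('http://www.baidu.com')
req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36')
resp = request.urlopen(req)

# The response body is bytes; decode it to str before printing
print(resp.read().decode('UTF-8'))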
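The only line that needs to be added is the sys.stdout = io.TextIOWrapper(...) assignment (highlighted in red in the original post).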
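To see why the choice of encoding matters, here is a minimal sketch that reproduces the failure without a console. It assumes an in-memory io.BytesIO as a stand-in for sys.stdout.buffer, and the helper name write_with_encoding is hypothetical, introduced only for this illustration.

```python
import io


def write_with_encoding(text, encoding, errors="strict"):
    """Write text through a TextIOWrapper and return the bytes it produced."""
    raw = io.BytesIO()  # stands in for sys.stdout.buffer (an assumption of this sketch)
    wrapper = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # release the buffer without closing it
    return data


sample = "\xbb"  # a character that gbk cannot encode

# With gbk the write fails, just like print() on a gbk console
try:
    write_with_encoding(sample, "gbk")
except UnicodeEncodeError as exc:
    print("gbk failed:", exc)

# gb18030 can encode the same character
print("gb18030 bytes:", write_with_encoding(sample, "gb18030"))

# An error handler is another way out: the character is replaced instead of raising
print("gbk with errors='replace':", write_with_encoding(sample, "gbk", errors="replace"))
```

On Python 3.7 and later, sys.stdout.reconfigure(encoding='gb18030') or sys.stdout.reconfigure(errors='replace') should achieve a similar effect without building a new wrapper. Whether the console then displays the characters correctly still depends on the terminal itself.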

P.S.:

1. Converting str to bytes is called encode; converting bytes to str is called decode.

2. Common names of Chinese encodings
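The two notes above can be sketched as follows. The original list of encoding names did not survive, so the codec names used here (gb2312, gbk, gb18030, big5, utf-8) are my own selection of codecs shipped with Python's standard library, not the author's list. The helper round_trip is hypothetical.

```python
def round_trip(text, encoding):
    """Encode str to bytes, then decode the bytes back to str."""
    data = text.encode(encoding)     # str -> bytes
    restored = data.decode(encoding)  # bytes -> str
    return data, restored


text = "中文"

# The same two characters produce different bytes under different codecs
for name in ("gb2312", "gbk", "gb18030", "big5", "utf-8"):
    data, restored = round_trip(text, name)
    print(name, len(data), data, restored == text)
```

Decoding with a codec other than the one used for encoding either raises UnicodeDecodeError or silently produces garbled text, which is why the crawler above decodes the response with the encoding the page actually uses.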

Reference article: http://blog.csdn.net/jim7424994/article/details/22675759

