<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Typ…
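Note: to scrape information from the detail pages, add methods as needed.
import requests
import os
import re
import threading
from lxml import etree

# Fetch the HTML content of the detail page
class CnBeta(object):
    def get_congtent(self, url):
        # Fetch the HTML of the site's home page
        r = requests.get(url)
        # Decode the fetched HTML page
        html = r.content.decode('utf-8')
        # Return…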
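The note above suggests adding methods for the detail pages. A minimal sketch of what such helpers might look like follows. The names `decode_html` and `extract_detail_links` and the `example.com` URLs are hypothetical and not part of the original code. The sketch uses the standard-library `re` and `urllib.parse` modules instead of `lxml` so that it runs without extra dependencies, and it collects every link on the page rather than only detail-page links, since the original selectors are not shown.

```python
import re
from urllib.parse import urljoin


def decode_html(raw, encoding="utf-8"):
    """Decode response bytes; undecodable bytes are replaced instead of raising."""
    return raw.decode(encoding, errors="replace")


def extract_detail_links(html, base_url):
    """Return the absolute URL of every <a href="..."> in the page.

    Order is preserved and duplicates are dropped. Filtering down to
    detail pages only is left out because it depends on the site's markup.
    """
    hrefs = re.findall(
        r'<a\s[^>]*?href=["\']([^"\']+)["\']', html, flags=re.IGNORECASE
    )
    seen = set()
    links = []
    for href in hrefs:
        absolute = urljoin(base_url, href)  # resolve relative paths
        if absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links
```

In the class above, `decode_html(r.content)` could stand in for `r.content.decode('utf-8')`, and each URL returned by `extract_detail_links` could then be passed back to `get_congtent` to fetch the corresponding page.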