Python multiprocess crawler example
import json
import re
import time
from multiprocessing import Pool
import requests
from requests.exceptions import RequestException
from bs4 import BeautifulSoup
def get_one_page(url):
    try:
        response = requests.get(url)
        if response.sta…
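The excerpt above is cut off before the pool is used, so here is a minimal sketch of how a fetch function like `get_one_page` might be fanned out over several pages with `multiprocessing.Pool`. It is not the original post's code. It uses the standard library's `urllib.request` instead of `requests` and a regular expression instead of `BeautifulSoup` so that it runs without extra packages. `BASE_URL`, the `offset` paging scheme, the `<h2>` title pattern, and the helper names `page_urls`, `parse_titles`, `crawl_page`, and `crawl_all` are all assumptions made for illustration.

```python
import re
from multiprocessing import Pool
from urllib.request import Request, urlopen

# Hypothetical listing URL and paging scheme; replace with the real target site.
BASE_URL = "https://example.com/list?offset={offset}"

# Assumed markup: each item title sits inside an <h2> element.
TITLE_RE = re.compile(r"<h2[^>]*>(.*?)</h2>", re.S)


def page_urls(pages, step=10):
    """Build the list of page URLs to crawl (offset 0, step, 2*step, ...)."""
    return [BASE_URL.format(offset=i * step) for i in range(pages)]


def get_one_page(url):
    """Fetch one page and return its HTML, or None if the request fails."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urlopen(request, timeout=10) as response:
            if response.status == 200:
                return response.read().decode("utf-8", errors="replace")
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        pass
    return None


def parse_titles(html):
    """Extract the stripped text of every <h2> element; empty list for None."""
    return [title.strip() for title in TITLE_RE.findall(html or "")]


def crawl_page(url):
    """Worker run in each process: fetch one page, then parse it."""
    return parse_titles(get_one_page(url))


def crawl_all(pages, processes=4):
    """Distribute the page URLs over a pool of worker processes."""
    with Pool(processes) as pool:
        per_page = pool.map(crawl_page, page_urls(pages))
    # Flatten the per-page lists into one list of titles.
    return [title for titles in per_page for title in titles]
```

To run it, call `crawl_all(10)` from inside an `if __name__ == "__main__":` block. The guard is needed on platforms that start worker processes by re-importing the main module. `Pool.map` returns results in the same order as the input URLs, so the pages stay in order even though they are fetched in parallel.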
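I have been obsessed with learning Python lately, so while the enthusiasm lasts I am continuing with a beginner-level try at web crawling. I have written crawlers in Java before. This one is still the most basic kind: it only fetches the HTML, with no HTML parser, no regular expressions, and so on... and because it keeps looping, it is certainly very inefficient.
import urllib.request as urllib2
import random
ua_list = [
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv2.0.1) Gecko/20100101 Firefox/4.0.1",
    "…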
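The snippet above stops partway through the User-Agent list. As a rough sketch of where such a list usually leads, the code below picks one entry at random for each request and fetches the raw HTML with `urllib.request`. The first User-Agent string is copied from the excerpt as-is. The second string and the helper names `build_request` and `fetch_html` are assumptions added for illustration, not part of the original post.

```python
import random
import urllib.request

UA_LIST = [
    # Copied verbatim from the excerpt.
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv2.0.1) Gecko/20100101 Firefox/4.0.1",
    # Illustrative placeholder; the original list is truncated after the first entry.
    "Mozilla/5.0 (Windows NT 6.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1",
]


def build_request(url, ua_list=UA_LIST):
    """Return a Request whose User-Agent header is chosen at random."""
    user_agent = random.choice(ua_list)
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, timeout=10):
    """Download the page and return its HTML as text (no parsing)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

For example, `fetch_html("https://example.com/")` would return the page source as a string. Rotating the header only varies how each request identifies itself. It does nothing about the looping inefficiency the post mentions.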