Simulating Zhihu login with a requests session

Python 3.5

# -*- coding: utf-8 -*-

"""

Created on Wed May  3 16:26:55 2017

@author: x-power

"""

import requests

import http.cookiejar as cookielib

import re

import time

import os.path

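try:

    from PIL import Image

except ImportError:

    Image = None  # Pillow is optional; get_captcha() falls back to manual lookup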

# Build the request headers

headers = {

    "Host": "www.zhihu.com",

    "Referer": "https://www.zhihu.com/",

    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0',

}

# Set up the cookie store

session = requests.session()

session.cookies = cookielib.LWPCookieJar(filename='cookies')

try:

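    session.cookies.load(ignore_discard=True)  # If saved cookies exist, reuse them to log in

except (OSError, cookielib.LoadError):

    print("Could not load cookies")

# From here on, every request made through session carries the locally saved cookies, so all requests appear to come from the same machine.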

def get_xsrf():

    '''_xsrf is a parameter that changes dynamically'''

    index_url = 'https://www.zhihu.com'

    # Fetch the _xsrf token needed for login

    index_page = session.get(index_url, headers=headers)

    html = index_page.text

    pattern = r'name="_xsrf" value="(.*?)"'

    # re.findall returns a list, so _xsrf here is a list of matches

    _xsrf = re.findall(pattern, html)

    return _xsrf[0]

# Fetch the captcha

def get_captcha():

    t = str(int(time.time() * 1000))

    captcha_url = 'https://www.zhihu.com/captcha.gif?r=' + t + "&type=login"

    r = session.get(captcha_url, headers=headers)

    with open('captcha.jpg', 'wb') as f:

        f.write(r.content)

    # Display the captcha with Pillow's Image

    # If Pillow is not installed, find the captcha file in the source directory and enter it manually

    try:

        im = Image.open('captcha.jpg')

        im.show()

        im.close()

    except Exception:

        print('Please find captcha.jpg at %s and enter it manually' % os.path.abspath('captcha.jpg'))

    captcha = input("please input the captcha\n>")

    return captcha

def isLogin():

    # Check whether we are logged in by requesting the user's profile settings page

    url = "https://www.zhihu.com/settings/profile"

    # allow_redirects=False: do not follow redirects, so only a direct 200 counts as logged in

    login_code = session.get(url, headers=headers, allow_redirects=False).status_code

    return login_code == 200

def login(secret, account):

    _xsrf = get_xsrf()

    headers["X-Xsrftoken"] = _xsrf

    headers["X-Requested-With"] = "XMLHttpRequest"

    # Decide from the entered username whether it is a phone number

    if re.match(r"^1\d{10}$", account):

        print("Logging in with phone number\n")

        post_url = 'https://www.zhihu.com/login/phone_num'

        postdata = {

            '_xsrf': _xsrf,

            'password': secret,

            'phone_num': account

        }

    else:

        if "@" in account:

            print("Logging in with email\n")

        else:

            print("The account you entered is invalid, please log in again")

            return 0

        post_url = 'https://www.zhihu.com/login/email'

        postdata = {

            '_xsrf': _xsrf,

            'password': secret,

            'email': account

        }

    # First attempt: log in directly, without a captcha

    login_page = session.post(post_url, data=postdata, headers=headers)

    login_code = login_page.json()

    if login_code['r'] == 1:

        # Login without a captcha failed

        # Retry the login with a captcha

        postdata["captcha"] = get_captcha()

        login_page = session.post(post_url, data=postdata, headers=headers)

        login_code = login_page.json()

        print(login_code['msg'])

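    # Save the cookies to a file so that

    # the next run can log in with them, without entering the account and password

    # ignore_discard=True also saves session cookies, matching the load() call above

    session.cookies.save(ignore_discard=True)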

if __name__ == '__main__':

    if isLogin():

        print('You are already logged in')

    else:

        account = input('Enter your username\n>  ')

        secret = input("Enter your password\n>  ")

        login(secret, account)
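
The two pure-string steps of the script, pulling `_xsrf` out of the page and choosing the login endpoint from the account format, can be tried offline without touching the network. The following is a minimal sketch, assuming the hidden-input markup that the pattern in `get_xsrf` expects. The helper names `extract_xsrf` and `pick_login_endpoint` and the sample HTML are hypothetical and introduced only for illustration; they are not part of the original script.

```python
import re

# Same patterns as in the script above
XSRF_PATTERN = re.compile(r'name="_xsrf" value="(.*?)"')
PHONE_PATTERN = re.compile(r"^1\d{10}$")


def extract_xsrf(html):
    """Return the first _xsrf value found in html, or None if there is none."""
    match = XSRF_PATTERN.search(html)
    return match.group(1) if match else None


def pick_login_endpoint(account):
    """Return (post_url, form_field) for the account, or None if unrecognized."""
    if PHONE_PATTERN.match(account):
        return "https://www.zhihu.com/login/phone_num", "phone_num"
    if "@" in account:
        return "https://www.zhihu.com/login/email", "email"
    return None


# Made-up sample markup, not a real Zhihu response
sample_html = '<input type="hidden" name="_xsrf" value="abc123"/>'

print(extract_xsrf(sample_html))
print(pick_login_endpoint("13800000000"))
print(pick_login_endpoint("user@example.com"))
print(pick_login_endpoint("not-an-account"))
```

Returning `None` instead of indexing `[0]` on an empty list lets the caller report a missing token explicitly rather than failing with an `IndexError`.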
