Downloading Bilibili videos with Python

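Notes:

1. Choosing a resolution requires logging in, which is not implemented yet, so videos are currently downloaded at the default 480P.

2. The progress bar imitates the Linux style. I adapted it from a few blog posts and will remove it on request if it infringes.

3. Related crawler code for comments, danmaku, and so on is at https://github.com/teleJa/bilibili

4. The check on sys.argv in main() exists because some other crawlers invoke this file as a script. If that is inconvenient, just pass the video ID (the av number) in directly.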
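
The argument handling described in note 4 can be sketched in isolation as follows. This is a minimal illustration, not part of the original script: `resolve_aid` and `DEFAULT_AID` are hypothetical names, and the default value is simply the sample video ID used in the code further down.

```python
DEFAULT_AID = 55036734  # sample video ID taken from the script's main()


def resolve_aid(argv, default=DEFAULT_AID):
    """Return argv[1] if it is present and non-empty, otherwise the default ID."""
    if len(argv) >= 2 and argv[1]:
        return argv[1]
    return default


# Called without arguments the default is used; a caller (for example
# another crawler) can override it by passing the ID as the first argument.
print(resolve_aid(["download.py"]))
print(resolve_aid(["download.py", "25940642"]))
```

Note that an ID taken from the command line is a string while the default is an integer. Both work here because the ID is only ever formatted into URLs and paths.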

The download process is shown in the figure:
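
As a text-only illustration of the progress display, the bar can be produced by a small helper like the one below. It is a minimal sketch under stated assumptions rather than the script's exact code: `render_progress` and `show_progress` are hypothetical names, and the 50-character width mirrors the value used in the script.

```python
import sys


def render_progress(done, total, width=50):
    """Build a one-line text progress bar such as '[#####     ]  50%'."""
    # Clamp to 100% so a size mismatch cannot overflow the bar.
    fraction = min(done / total, 1.0) if total > 0 else 1.0
    filled = int(fraction * width)
    return "[%s%s] %3d%%" % ("#" * filled, " " * (width - filled), int(fraction * 100))


def show_progress(done, total):
    # '\r' moves the cursor back to the start of the line,
    # so each call redraws the bar in place instead of printing a new line.
    print("\r" + render_progress(done, total), end="", file=sys.stdout, flush=True)
```

Separating the string construction from the printing keeps the formatting easy to check on its own, for example `render_progress(5, 10, width=10)` yields a half-filled ten-character bar.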

Here is the code:

 import requests
 import re
 import os
 import json
 import sys
 from lxml import etree

 class BLDSplider:
     # Matches the numeric cid embedded in the page source (its length varies)
     regex_cid = re.compile(r'"cid":(\d+)')

     def __init__(self, aid):
         self.aid = aid
         self.origin_url = "https://www.bilibili.com/video/av{}?from=search&seid=9346373599622336536".format(aid)
         self.headers = {
             "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36",
         }
         # Play-URL API; the placeholders are filled with (aid, cid) in parse_url
         self.url = "https://api.bilibili.com/x/player/playurl?avid={}&cid={}&qn=0&type=&otype=json"

     def check_dir(self, author_name):
         # Create the output directory e:/bilibili/<author>/<aid>/ if it is missing
         self.parent_path = "e:/bilibili/" + author_name + "/" + str(self.aid) + "/"
         os.makedirs(self.parent_path, exist_ok=True)
         self.video_name = self.parent_path + str(self.aid) + ".mp4"

     def parse_url(self, item):
         cid = item["cid"]
         print("aid:%s   cid:%s" % (str(self.aid), cid))
         title = item.get("title", "")
         print("title:%s" % title)
         self.headers["Referer"] = self.origin_url
         # Ask the play-URL API for the video address
         response = requests.get(self.url.format(self.aid, cid), headers=self.headers)
         if response.status_code == 200:
             result = json.loads(response.content.decode())
             durl = result["data"]["durl"][0]
             video_url = durl["url"]
             print("video_url:%s" % video_url)
             # Video size in bytes
             size = durl["size"]
             print("size:%s (about %.2f MB)" % (size, size / (1024 * 1024)))
             video_response = requests.get(video_url, headers=self.headers, stream=True)
             if video_response.status_code == 200:
                 width = 50  # progress bar width in characters
                 count = 0   # bytes written so far
                 with open(self.video_name, "wb") as file:
                     # Stream in 1 KiB chunks and count the bytes actually received
                     for chunk in video_response.iter_content(chunk_size=1024):
                         file.write(chunk)
                         count += len(chunk)
                         percent = min(count / size, 1.0)
                         use_num = int(percent * width)
                         space_num = width - use_num
                         print('\rProgress:[%s%s]    %d%%' % (use_num * '#', space_num * ' ', percent * 100),
                               file=sys.stdout, flush=True, end="")
                 print("\r\n")

     # Fetch the video's metadata
     def get_video_info(self):
         response = requests.get(self.origin_url, headers=self.headers)
         item = dict()
         if response.status_code == 200:
             page = response.content.decode()
             html_element = etree.HTML(page)
             # XPath of the uploader (author) block on the video page
             upinfo = "/html/body/div[@id='app']/div[@class='v-wrap']/div[@class='r-con']/div[@id='v_upinfo']"
             author = dict()
             author["name"] = html_element.xpath(upinfo + "//a[@report-id='name']/text()")[0]
             # Usually contact details such as Weibo or a WeChat official account
             author["others"] = html_element.xpath(upinfo + "//div[@class='desc']/@title")[0]
             item["author"] = author
             # cid
             cid = BLDSplider.regex_cid.findall(page)[0]
             item["cid"] = cid
             info_url = "https://api.bilibili.com/x/web-interface/view?aid={}&cid={}".format(self.aid, cid)
             info_response = requests.get(info_url, headers=self.headers)
             if info_response.status_code == 200:
                 data = json.loads(info_response.content.decode())["data"]
                 item["desc"] = data["desc"]          # video description
                 item["title"] = data["title"]        # title
                 stat = data["stat"]
                 item["view"] = stat["view"]          # play count
                 item["danmaku"] = stat["danmaku"]    # danmaku (bullet comments)
                 item["reply"] = stat["reply"]        # comments
                 item["coin"] = stat["coin"]          # coins
                 item["like"] = stat["like"]          # likes
                 item["favorite"] = stat["favorite"]  # favorites
                 item["share"] = stat["share"]        # shares
             self.check_dir(item["author"]["name"])
             # Save the metadata next to the video; UTF-8 keeps non-ASCII text intact
             with open(self.parent_path + "video_info.txt", "w", encoding="utf-8") as file:
                 file.write(json.dumps(item, ensure_ascii=False, indent=2))
             return item

     def run(self):
         item = self.get_video_info()
         # get_video_info returns None when the video page cannot be fetched
         if item:
             self.parse_url(item)

 def main():
     # Default video ID (av number); the first command-line argument overrides it
     aid = 55036734
     if len(sys.argv) >= 2 and sys.argv[1]:
         aid = sys.argv[1]
     splider = BLDSplider(aid)
     splider.run()

 if __name__ == '__main__':
     main()
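
One detail of the download step worth isolating is the byte counter that drives the progress bar: it is safest to advance it by the number of bytes actually received rather than by the requested buffer size. The sketch below shows that idea offline, with in-memory data standing in for a network response. `write_chunks` and its `on_progress` callback are hypothetical names introduced here for illustration.

```python
import io


def write_chunks(chunks, out_file, on_progress=None):
    """Write an iterable of byte chunks to out_file and return the bytes written.

    The counter advances by len(chunk), i.e. by what was actually received,
    so a short read cannot desynchronise the progress display.
    """
    written = 0
    for chunk in chunks:
        if not chunk:  # skip empty chunks
            continue
        out_file.write(chunk)
        written += len(chunk)
        if on_progress is not None:
            on_progress(written)
    return written


# Offline demonstration: a list of byte strings replaces the HTTP stream.
buffer = io.BytesIO()
total = write_chunks([b"abc", b"", b"defg"], buffer)
print(total, buffer.getvalue())
```

With requests, the same function could presumably be fed from `response.iter_content(chunk_size=1024)` on a response obtained with `stream=True`, passing an opened binary file as `out_file`.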
