Taking Web Page Screenshots in Python (PyQt5)

Approach

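Requirement: once a web page has finished loading, capture the whole page as a single long image.
Modules used: PyQt5 (including QtWebEngineWidgets), PIL (Pillow)
How it works: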

1: Set up the window, load the URL with PyQt5's QWebEngineView, and call check_page once the page has finished loading.

import sys

from PyQt5.QtCore import Qt, QRect, QTimer, QUrl
from PyQt5.QtWidgets import QApplication, QMainWindow
from PyQt5.QtWebEngineWidgets import QWebEngineView


class MainWindow(QMainWindow):

    def __init__(self, parent=None):
        super(MainWindow, self).__init__(parent)
        self.setWindowTitle('易哈佛')
        self.temp_height = 0
        self.setWindowFlag(Qt.WindowMinMaxButtonsHint, False)  # disable the maximize/minimize buttons
        # self.setWindowFlag(Qt.WindowStaysOnTopHint, True)  # keep the window on top
        self.setWindowFlag(Qt.FramelessWindowHint, True)  # frameless window

    def urlScreenShot(self, url):
        self.browser = QWebEngineView()
        self.browser.load(QUrl(url))
        geometry = self.chose_screen()
        self.setGeometry(geometry)
        # check_page runs once the page has finished loading
        self.browser.loadFinished.connect(self.check_page)
        self.setCentralWidget(self.browser)

    def get_page_size(self):
        # contentsSize() returns a QSizeF, so convert to whole pixels
        size = self.browser.page().contentsSize()
        self.set_height = int(size.height())
        self.set_width = int(size.width())
        return self.set_width, self.set_height

    def chose_screen(self):
        # use the first screen whose available area is larger than the window
        width, height = 750, 1370
        desktop = QApplication.desktop()
        screen_count = desktop.screenCount()
        for i in range(0, screen_count):
            rect = desktop.availableGeometry(i)
            s_width, s_height = rect.width(), rect.height()
            if s_width > width and s_height > height:
                return QRect(rect.left(), rect.top(), width, height)
        return QRect(0, 0, width, height)


if __name__ == '__main__':
    url = 'http://blog.sina.com.cn/lm/rank/focusbang//'
    app = QApplication(sys.argv)
    win = MainWindow()
    win.urlScreenShot(url)
    win.show()
    sys.exit(app.exec_())
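The screen-selection rule in chose_screen can be tried without Qt. The following is a minimal sketch, assuming each screen is given as a plain (left, top, width, height) tuple; the function name choose_geometry and the sample screen sizes are made up for illustration and are not part of the original code.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, width, height)


def choose_geometry(screens: List[Rect], width: int = 750, height: int = 1370) -> Rect:
    """Place the window on the first screen strictly larger than it.

    Falls back to the origin when no screen is large enough,
    mirroring the fallback in chose_screen.
    """
    for left, top, s_width, s_height in screens:
        if s_width > width and s_height > height:
            return (left, top, width, height)
    return (0, 0, width, height)


# Hypothetical setup: a landscape monitor next to a tall portrait monitor.
screens = [(0, 0, 1920, 1040), (1920, 0, 1440, 2520)]
print(choose_geometry(screens))  # (1920, 0, 750, 1370)
```

The first monitor is rejected because its height (1040) is below the 1370 pixels the window asks for, so the window lands on the second one.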

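2: Read the page height and work out how many full-window captures are needed and how tall the leftover strip is; instantiate the image merge tool and start a timer whose timeout signal runs exe_command.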

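    def check_page(self):
        p_width, p_height = self.get_page_size()
        # number of full-window captures and height of the leftover strip
        self.page, self.over_flow_size = divmod(int(p_height), self.height())
        if self.page == 0:
            # the page is shorter than the window: one capture covers it
            self.page = 1
            self.over_flow_size = 0
        self.ssm = ScreenShotMerge(self.page, self.over_flow_size)
        self.timer = QTimer(self)
        self.timer.timeout.connect(self.exe_command)
        self.timer.setInterval(400)  # milliseconds between captures
        self.timer.start()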
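The arithmetic behind check_page is a single divmod. Below is a minimal sketch of it as a standalone function; the name plan_captures is hypothetical, and the sketch assumes that a page shorter than the window needs exactly one capture and no leftover strip.

```python
from typing import Tuple


def plan_captures(page_height: int, view_height: int) -> Tuple[int, int]:
    """Return (full_captures, leftover_height) for a page of the given height."""
    full_captures, leftover = divmod(page_height, view_height)
    if full_captures == 0:
        # Assumption: a page shorter than the window is covered by one capture.
        return 1, 0
    return full_captures, leftover


print(plan_captures(3000, 1370))  # (2, 260)
print(plan_captures(2740, 1370))  # (2, 0)
print(plan_captures(900, 1370))   # (1, 0)
```

For a 3000-pixel page in a 1370-pixel window, two full captures cover 2740 pixels and the remaining 260 pixels are taken from the bottom of a third capture.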

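3: exe_command controls the number of captures and scrolls the page down by one window height after each capture; once every part of the page has been captured, it merges the images.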

    def exe_command(self):
        if self.page > 0:
            # full-window capture, then scroll down by one window height
            self.screen_shot()
            self.run_js()
        elif self.page < 0:
            # everything has been captured: stop, merge, and close
            self.timer.stop()
            self.ssm.image_merge()
            self.close()
        elif self.over_flow_size > 0:
            # page == 0: capture the leftover strip at the bottom of the page
            self.screen_shot()
        self.page -= 1

    def run_js(self):
        script = """
            var scroll = function (dHeight) {
                var t = document.documentElement.scrollTop
                var h = document.documentElement.scrollHeight
                dHeight = dHeight || 0
                var current = t + dHeight
                if (current > h) {
                    window.scrollTo(0, document.documentElement.clientHeight)
                } else {
                    window.scrollTo(0, current)
                }
            }
        """
        # scroll by the height of the window
        command = script + '\n scroll({})'.format(self.height())
        self.browser.page().runJavaScript(command)
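The branching in exe_command is easier to follow when the timer ticks are replayed in plain Python. This is a minimal sketch that only records what each tick would do; simulate_ticks and the action labels are hypothetical names, and no Qt objects are involved.

```python
from typing import List


def simulate_ticks(full_pages: int, overflow: int) -> List[str]:
    """Replay the exe_command branches and list the action taken on each tick."""
    actions = []
    page = full_pages
    while True:
        if page > 0:
            actions.append("shot+scroll")    # full-window capture, then scroll
        elif page < 0:
            actions.append("merge")          # timer stops, images are merged
            break
        elif overflow > 0:
            actions.append("overflow shot")  # page == 0: leftover strip
        page -= 1
    return actions


print(simulate_ticks(2, 260))
# ['shot+scroll', 'shot+scroll', 'overflow shot', 'merge']
print(simulate_ticks(1, 0))
# ['shot+scroll', 'merge']
```

When there is no leftover strip, the tick at page == 0 does nothing, so the merge happens one timer interval later than strictly necessary.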

4: screen_shot saves each capture to disk and hands it to the merge tool, which keeps the image object in a list.

    def screen_shot(self):
        screen = QApplication.primaryScreen()
        winid = self.browser.winId()
        # grab the browser widget as it is currently rendered
        pix = screen.grabWindow(int(winid))
        name = '{}/temp.png'.format(self.ssm.root_path)
        pix.save(name)
        self.ssm.add_im(name)

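5: The merge tool stores the image object after each capture, crops the capture that contains the leftover strip, and stitches all captures into one image.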

from pathlib import Path

from PIL import Image


class ScreenShotMerge:

    def __init__(self, page, over_flow_size):
        self.im_list = []
        self.page = page
        self.over_flow_size = over_flow_size
        self.get_path()

    def get_path(self):
        # captures and the merged result go into ./temp next to this script
        self.root_path = Path(__file__).parent.joinpath('temp')
        if not self.root_path.exists():
            self.root_path.mkdir(parents=True)
        self.save_path = self.root_path.joinpath('merge.png')

    def add_im(self, path):
        if len(self.im_list) == self.page:
            # all full captures are stored, so this one holds the leftover strip
            im = self.reedit_image(path)
        else:
            im = Image.open(path)
        im.save('{}/{}.png'.format(self.root_path, len(self.im_list) + 1))
        self.im_list.append(im)

    def get_new_size(self):
        max_width = 0
        total_height = 0
        # size of the merged image: width of the widest capture, sum of all heights
        for img in self.im_list:
            width, height = img.size
            if width > max_width:
                max_width = width
            total_height += height
        return max_width, total_height

    def image_merge(self):
        if len(self.im_list) > 1:
            max_width, total_height = self.get_new_size()
            # create a blank canvas; 15 px are trimmed from the width (presumably the scrollbar)
            new_img = Image.new('RGB', (max_width - 15, total_height), 255)
            x = y = 0
            for img in self.im_list:
                width, height = img.size
                new_img.paste(img, (x, y))
                y += height
            new_img.save(self.save_path)
            print('Screenshot saved:', self.save_path)
        else:
            obj = self.im_list[0]
            width, height = obj.size
            left, top, right, bottom = 0, 0, width, height
            box = (left, top, right, bottom)
            region = obj.crop(box)
            new_img = Image.new('RGB', (width, height), 255)
            new_img.paste(region, box)
            new_img.save(self.save_path)
            print('Screenshot saved:', self.save_path)

    def reedit_image(self, path):
        # keep only the bottom over_flow_size pixels of the last capture
        obj = Image.open(path)
        width, height = obj.size
        left, top, right, bottom = 0, height - int(self.over_flow_size), width, height
        box = (left, top, right, bottom)
        region = obj.crop(box)
        return region
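The geometry used by the merge tool can be checked without Pillow. The sketch below computes the crop box for the leftover strip and the canvas size and paste offsets for vertical stitching; overflow_box and stitch_layout are hypothetical helper names, and the 15-pixel width trim from image_merge is deliberately left out.

```python
from typing import List, Tuple


def overflow_box(width: int, height: int, overflow: int) -> Tuple[int, int, int, int]:
    """Crop box (left, top, right, bottom) for the bottom `overflow` pixels."""
    return (0, height - overflow, width, height)


def stitch_layout(sizes: List[Tuple[int, int]]) -> Tuple[Tuple[int, int], List[int]]:
    """Return the canvas size and the top offset at which each strip is pasted.

    `sizes` is a non-empty list of (width, height) pairs, top strip first.
    """
    canvas_width = max(w for w, _ in sizes)
    offsets = []
    y = 0
    for _, h in sizes:
        offsets.append(y)  # each strip starts where the previous one ended
        y += h
    return (canvas_width, y), offsets


print(overflow_box(750, 1370, 260))
# (0, 1110, 750, 1370)
print(stitch_layout([(750, 1370), (750, 1370), (750, 260)]))
# ((750, 3000), [0, 1370, 2740])
```

With Pillow, these values would correspond to the box passed to crop and the (0, y) positions passed to paste.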

Complete code for the screenshot feature

#!/usr/bin/env python
# -*- coding:UTF-8 -*-
# Author:Leslie-x

import sys
from pathlib import Path

from PIL import Image
from PyQt5.QtCore import Qt, QRect, QTimer, QUrl
from PyQt5.QtWidgets import QApplication, QMainWindow
from PyQt5.QtWebEngineWidgets import QWebEngineView


class ScreenShotMerge:

    def __init__(self, page, over_flow_size):
        self.im_list = []
        self.page = page
        self.over_flow_size = over_flow_size
        self.get_path()

    def get_path(self):
        # captures and the merged result go into ./temp next to this script
        self.root_path = Path(__file__).parent.joinpath('temp')
        if not self.root_path.exists():
            self.root_path.mkdir(parents=True)
        self.save_path = self.root_path.joinpath('merge.png')

    def add_im(self, path):
        if len(self.im_list) == self.page:
            # all full captures are stored, so this one holds the leftover strip
            im = self.reedit_image(path)
        else:
            im = Image.open(path)
        im.save('{}/{}.png'.format(self.root_path, len(self.im_list) + 1))
        self.im_list.append(im)

    def get_new_size(self):
        max_width = 0
        total_height = 0
        # size of the merged image: width of the widest capture, sum of all heights
        for img in self.im_list:
            width, height = img.size
            if width > max_width:
                max_width = width
            total_height += height
        return max_width, total_height

    def image_merge(self):
        if len(self.im_list) > 1:
            max_width, total_height = self.get_new_size()
            # create a blank canvas; 15 px are trimmed from the width (presumably the scrollbar)
            new_img = Image.new('RGB', (max_width - 15, total_height), 255)
            x = y = 0
            for img in self.im_list:
                width, height = img.size
                new_img.paste(img, (x, y))
                y += height
            new_img.save(self.save_path)
            print('Screenshot saved:', self.save_path)
        else:
            obj = self.im_list[0]
            width, height = obj.size
            left, top, right, bottom = 0, 0, width, height
            box = (left, top, right, bottom)
            region = obj.crop(box)
            new_img = Image.new('RGB', (width, height), 255)
            new_img.paste(region, box)
            new_img.save(self.save_path)
            print('Screenshot saved:', self.save_path)

    def reedit_image(self, path):
        # keep only the bottom over_flow_size pixels of the last capture
        obj = Image.open(path)
        width, height = obj.size
        left, top, right, bottom = 0, height - int(self.over_flow_size), width, height
        box = (left, top, right, bottom)
        region = obj.crop(box)
        return region


class MainWindow(QMainWindow):

    def __init__(self, parent=None):
        super(MainWindow, self).__init__(parent)
        self.setWindowTitle('易哈佛')
        self.temp_height = 0
        self.setWindowFlag(Qt.WindowMinMaxButtonsHint, False)  # disable the maximize/minimize buttons
        # self.setWindowFlag(Qt.WindowStaysOnTopHint, True)  # keep the window on top
        self.setWindowFlag(Qt.FramelessWindowHint, True)  # frameless window

    def urlScreenShot(self, url):
        self.browser = QWebEngineView()
        self.browser.load(QUrl(url))
        geometry = self.chose_screen()
        self.setGeometry(geometry)
        # check_page runs once the page has finished loading
        self.browser.loadFinished.connect(self.check_page)
        self.setCentralWidget(self.browser)

    def get_page_size(self):
        # contentsSize() returns a QSizeF, so convert to whole pixels
        size = self.browser.page().contentsSize()
        self.set_height = int(size.height())
        self.set_width = int(size.width())
        return self.set_width, self.set_height

    def chose_screen(self):
        # use the first screen whose available area is larger than the window
        width, height = 750, 1370
        desktop = QApplication.desktop()
        screen_count = desktop.screenCount()
        for i in range(0, screen_count):
            rect = desktop.availableGeometry(i)
            s_width, s_height = rect.width(), rect.height()
            if s_width > width and s_height > height:
                return QRect(rect.left(), rect.top(), width, height)
        return QRect(0, 0, width, height)

    def check_page(self):
        p_width, p_height = self.get_page_size()
        # number of full-window captures and height of the leftover strip
        self.page, self.over_flow_size = divmod(p_height, self.height())
        if self.page == 0:
            # the page is shorter than the window: one capture covers it
            self.page = 1
            self.over_flow_size = 0
        self.ssm = ScreenShotMerge(self.page, self.over_flow_size)
        self.timer = QTimer(self)
        self.timer.timeout.connect(self.exe_command)
        self.timer.setInterval(400)  # milliseconds between captures
        self.timer.start()

    def exe_command(self):
        if self.page > 0:
            # full-window capture, then scroll down by one window height
            self.screen_shot()
            self.run_js()
        elif self.page < 0:
            # everything has been captured: stop, merge, and close
            self.timer.stop()
            self.ssm.image_merge()
            self.close()
        elif self.over_flow_size > 0:
            # page == 0: capture the leftover strip at the bottom of the page
            self.screen_shot()
        self.page -= 1

    def run_js(self):
        script = """
            var scroll = function (dHeight) {
                var t = document.documentElement.scrollTop
                var h = document.documentElement.scrollHeight
                dHeight = dHeight || 0
                var current = t + dHeight
                if (current > h) {
                    window.scrollTo(0, document.documentElement.clientHeight)
                } else {
                    window.scrollTo(0, current)
                }
            }
        """
        # scroll by the height of the window
        command = script + '\n scroll({})'.format(self.height())
        self.browser.page().runJavaScript(command)

    def screen_shot(self):
        screen = QApplication.primaryScreen()
        winid = self.browser.winId()
        # grab the browser widget as it is currently rendered
        pix = screen.grabWindow(int(winid))
        name = '{}/temp.png'.format(self.ssm.root_path)
        pix.save(name)
        self.ssm.add_im(name)


if __name__ == '__main__':
    url = 'http://blog.sina.com.cn/lm/rank/focusbang//'
    app = QApplication(sys.argv)
    win = MainWindow()
    win.urlScreenShot(url)
    win.show()
    sys.exit(app.exec_())
