Chromium Graphics: Compositor Thread Architecture
Compositor Thread Architecture
<jamesr, enne, vangelis, nduca> @chromium.org

Goals
The main render thread is a pretty scary place. This is where HTML, CSS, JavaScript and pretty much everything on the web platform runs... or originates. It routinely stalls for tens to hundreds of milliseconds. On ARM, stalls can be seconds long. Sadly, it is not feasible to prevent all these stalls: style recalculation, synchronous network requests, long painting times and garbage collection all have content-dependent costs. The compositor thread architecture lets us snapshot a version of the page and let the user scroll and see animations directly on the snapshot, presenting the illusion that the page is running smoothly.

Background
Some background on the basic frontend compositor architecture, as well as Chrome’s GPU architecture, can be found here: http://dev.chromium.org/developers/design-documents/gpu-accelerated-compositing-in-chrome

Basic Approach
The compositor is architected into two halves: the main thread half, and the “impl thread” half. The word “impl” is horribly chosen, sorry! :)
The main thread half of the world is a typical layer tree. A layer has transformation, size, and content. Layers are filled in on demand: layers can be damaged (setNeedsDisplayInRect), and the compositor decides when to run the layer delegate to tell it to paint. This is similar to the InvalidateRect/Paint model you see in most operating systems, but with layers. Layers have children, and can clip, reflect, etc., allowing all sorts of neat visual effects to be created.
The impl side of the compositor is hidden from users of the layer tree. It is a nearly complete clone of the main thread tree: when we have a layer on the main thread, it has a corresponding layer on the impl thread. Our naming is a little strange but:
The main thread tree is a model of what WebKit wants to draw. The main thread paints layer contents into textures. These are handed to the impl tree. The impl tree is what actually gets drawn to the screen. We can draw the impl tree at any time, even while the main thread is blocked.
Users of the LayerChromium tree can specify that layers are scrollable. By routing all input events to the impl thread before passing them to the main thread, we can scroll and redraw the tree without ever consulting the main thread. This allows us to implement “smooth scrolling” even when the main thread is blocked.
Users of the LayerChromium tree can add animations to the tree. We can run those animations on the impl tree, allowing hitch-free animations.

Tree Synchronization, Hosts and Commits
Every tab in Chromium has a different layer tree. Each tab has a layer tree host, which manages the tab-specific state for the tree. Again:
These two trees, the main thread tree and the impl tree, are completely isolated from one another. The impl tree is effectively a copy of the main thread tree, although with a different data type. We manually synchronize the impl tree to the main thread tree periodically, a process we call “commit”. A commit is a recursive walk over the main tree’s layers in which each layer pushes its properties (“pushPropertiesTo”) to its impl-side equivalent. We do this on the impl thread with the main thread completely blocked.
Commits are deferred rather than performed immediately. When the main tree changes, we simply make a note that a commit is needed (setNeedsCommit). When a layer’s contents change, e.g. the text of an HTML div changes somehow, we likewise treat it as needing a commit. Later, at the discretion of a scheduler (discussed below), we decide to perform the commit. A commit is a blocking operation but still very cheap: it typically takes no more than a few milliseconds.
An aside on our primitive thread model: we assume that both the main thread and the impl thread are message loops, i.e. they have postTask and postDelayedTask primitives. We try to keep both threads idle as often as possible and prefer async programming to taking a lock and blocking the thread.
The commit flow is as follows (see CCThreadProxy for implementation):
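As an illustration of the deferred-commit idea only, here is a minimal single-threaded Python sketch. The class and method names (`Layer`, `ImplLayer`, `LayerTreeHost`, `commit_if_needed`) are hypothetical stand-ins rather than Chromium's actual types, and the sketch ignores threading entirely: it only shows that mutations record a pending commit, and that the commit itself is a recursive property push from the main tree to the impl tree.

```python
class ImplLayer:
    """Impl-side counterpart of a layer (hypothetical name)."""

    def __init__(self):
        self.opacity = 1.0
        self.children = []


class Layer:
    """Main-thread layer (hypothetical name)."""

    def __init__(self, host, opacity=1.0):
        self.host = host
        self.opacity = opacity
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        self.host.set_needs_commit()

    def set_opacity(self, opacity):
        self.opacity = opacity
        # Only note that a commit is needed; nothing is pushed yet.
        self.host.set_needs_commit()

    def push_properties_to(self, impl_layer):
        """Copy this layer's state to its impl twin, then recurse."""
        impl_layer.opacity = self.opacity
        # Make the impl child list the same length as the main one.
        while len(impl_layer.children) < len(self.children):
            impl_layer.children.append(ImplLayer())
        del impl_layer.children[len(self.children):]
        for child, impl_child in zip(self.children, impl_layer.children):
            child.push_properties_to(impl_child)


class LayerTreeHost:
    """Owns both trees and the pending-commit flag (hypothetical name)."""

    def __init__(self):
        self.needs_commit = False
        self.root = Layer(self)
        self.impl_root = ImplLayer()

    def set_needs_commit(self):
        self.needs_commit = True

    def commit_if_needed(self):
        """Would be called by a scheduler; returns True if a commit ran."""
        if not self.needs_commit:
            return False
        self.root.push_properties_to(self.impl_root)
        self.needs_commit = False
        return True


host = LayerTreeHost()
child = Layer(host)
host.root.add_child(child)
child.set_opacity(0.5)   # the impl tree is still unchanged here
host.commit_if_needed()  # now the impl tree mirrors the main tree
```

Several mutations between two calls to `commit_if_needed` collapse into a single push, which is the point of deferring the commit.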
At this point, the impl tree can draw as often as it wants without consulting the main thread. Similarly, the main thread (and thus JavaScript, etc.) can mutate the main thread tree as much as it wants without consulting the impl thread.
We have one very important rule in the CCThreadProxy architecture: the main thread can make blocking calls to the impl thread, but the impl thread cannot make a blocking call to the main thread. Breaking this rule can lead to deadlocks.

CCProxy
To allow development of the threaded compositor while still shipping a single-threaded compositor, we have made it possible to run the same basic two-tree architecture in both single-threaded and threaded modes. In single-threaded mode, we still have two trees and delayed commits, but we run a different synchronization/scheduling algorithm and host the impl tree on the main thread. This is implemented by the CCProxy interface, which abstracts the types of communication that go on between the main thread and the impl thread. For instance:
Thus, there are two subclasses of CCProxy:
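To make the role of such a proxy concrete, the following Python sketch shows one abstract interface with a single-threaded and a threaded implementation. All names here (`Proxy`, `SingleThreadProxy`, `ThreadedProxy`, `commit`, `stop`) are assumptions made for illustration and are not Chromium's real API. The threaded variant models the impl thread as a simple task loop and follows the rule stated earlier: the main thread blocks on the impl thread, never the other way around.

```python
import queue
import threading
from abc import ABC, abstractmethod


class Proxy(ABC):
    """Hypothetical stand-in for the role CCProxy plays."""

    def __init__(self, do_commit):
        self._do_commit = do_commit  # callback that performs the tree sync

    @abstractmethod
    def commit(self):
        """Run the commit and return once it has finished."""


class SingleThreadProxy(Proxy):
    """Both trees live on the calling thread; commit is a direct call."""

    def commit(self):
        self._do_commit()


class ThreadedProxy(Proxy):
    """Commit runs on a separate 'impl' thread modelled as a task loop."""

    def __init__(self, do_commit):
        super().__init__(do_commit)
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            task = self._tasks.get()
            if task is None:  # sentinel posted by stop()
                break
            task()

    def commit(self):
        done = threading.Event()

        def task():
            try:
                self._do_commit()
            finally:
                done.set()  # always release the waiting main thread

        self._tasks.put(task)
        done.wait()  # main thread blocks on impl thread, never the reverse

    def stop(self):
        self._tasks.put(None)
        self._thread.join()
```

Code that drives the compositor can hold a `Proxy` and call `commit()` without knowing which mode is active, which is the property the text attributes to the CCProxy interface.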

CCScheduler
In addition to synchronizing trees, we have a lot of logic in the compositor that deals with when to commit, when to draw, whether to run animations, when to upload textures, and so on. This logic does not depend on whether the impl side is running on the compositor thread or the main thread, so it is put inside a standalone class called the CCScheduler. The scheduler exists logically as part of the impl side of the tree, and thus in threaded mode lives on the impl thread. The scheduler itself is a very simple class that glues together two key systems:

Input Handling
Once on the impl thread, input events hit the WebCompositorInputHandler. This handler looks at the events and can ask the impl tree to try to scroll particular layers. However, scrolls can sometimes fail: WebKit does not give every scrollable area a layer (and associated clip objects). Therefore, on the impl tree, we track on each layer the areas that cannot be impl-side scrolled. If the scroll request from the WebCompositorInputHandler fails because it hits one of these areas, we post the scrolling event to the main thread for normal processing. We call main-thread-handled scrolls “slow scrolls” and impl-thread-side scrolls “fast scrolls.”

Memory Management
At the compositor level, each LayerTreeHost/Impl pair gets an allocation from the GPU process for a certain memory budget and is expected to do its best not to exceed it. We do this by prioritizing all the tiles on all layers, and then giving out memory budget to each tile in descending priority order until we hit our limit. Prioritization includes things like visibility, distance from the viewport, whether the tile is on an animating layer, and whether the current layer velocity is likely to bring the tile onscreen.

Texture Upload
One challenge with all these textures is that we rasterize them on the main thread of the renderer process, but need to actually get them into GPU memory. This requires handing information about these textures (and their contents) to the impl thread, then to the GPU process, and once there, into the GL/D3D driver. Done naively, this causes us to copy a single texture over and over again, something we definitely don't want to do.
We have two tricks that we use right now to make this a bit faster. To understand them, an aside on “painting” versus “rasterization.”
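Returning briefly to the budget scheme described under Memory Management, the Python sketch below shows one way to hand out memory in descending priority order until the limit is reached. The `Tile` fields, the ordering used in `priority_key`, and the function names are all assumptions made for illustration; the velocity-based criterion mentioned in the text is omitted, and the real prioritization is more involved.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    """Hypothetical tile record; field names are illustrative only."""
    name: str
    size_bytes: int
    visible: bool
    distance_from_viewport: float
    on_animating_layer: bool = False


def priority_key(tile):
    """Smaller key means higher priority.

    Assumed ordering: visible tiles first, then tiles on animating
    layers, then whichever tile is closest to the viewport.
    """
    return (not tile.visible,
            not tile.on_animating_layer,
            tile.distance_from_viewport)


def assign_budget(tiles, budget_bytes):
    """Grant memory to tiles in priority order until the limit is hit."""
    granted = []
    remaining = budget_bytes
    for tile in sorted(tiles, key=priority_key):
        if tile.size_bytes > remaining:
            break  # budget exhausted; lower-priority tiles get nothing
        granted.append(tile)
        remaining -= tile.size_bytes
    return granted
```

Stopping at the first tile that does not fit mirrors the "until we hit our limit" wording; a variant could keep scanning for smaller tiles that still fit, at the cost of granting memory out of priority order.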
With these definitions in mind, we deal with texture upload with the following tricks:
The holy grail of texture upload is “zero copy” upload. With such a scheme, we would get a raw pointer inside the renderer process’ sandbox to GPU memory and software-rasterize directly into it. We can’t yet do this anywhere, but it is something we fantasize about.

Animation
We allow animations to be added to layers. They allow you to fade or translate layers using “curves,” which are keyframed representations of the position or opacity of a layer over time. Although animations are added on the main thread, they are executed on the impl thread. Animations done with the compositor are thus “hitch free.”

Terminology
Threads:
“Impl thread” is a term we use often. The compositor can operate in either single-threaded or threaded mode. “Impl thread” merely means “this lives on the impl half of the system.” Seeing the term does not mean that the code only runs on the compositor thread; it just means that the code handles data that is part of the impl half of the architecture. Suffixes indicate which thread data lives on:
We use words that are ordinarily synonyms to mean very important and distinct steps in the updating of the screen: