Discussion about z pre-pass
Z pre-pass
In the rendering process, the first pass renders only to the depth buffer to capture the front-most layer of depth. The second pass then uses this depth layer to cull whatever lies behind it, so a lot of shading work is skipped.
This technique also works well when we render transparent objects. Because of the depth culling, the unordered internal structure of a transparent object does not show through.
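To make the two passes concrete, here is a minimal CPU-side sketch in Python, not GPU code. It assumes each pixel is just a list of fragment depths in submission order (smaller means nearer); the function names and the sample frame are hypothetical.

```python
import math

def shaded_without_prepass(pixels):
    """Count shaded fragments with a single pass and a LESS depth test."""
    shaded = 0
    for depths in pixels:
        nearest = math.inf
        for z in depths:
            if z < nearest:   # fragment passes the depth test, so it is shaded
                nearest = z
                shaded += 1
    return shaded

def shaded_with_prepass(pixels):
    """Count shaded fragments when a depth-only pass runs first."""
    shaded = 0
    for depths in pixels:
        if not depths:
            continue
        front = min(depths)   # pass 1: depth only, nothing is shaded
        # pass 2: only fragments matching the stored front depth are shaded
        shaded += sum(1 for z in depths if z == front)
    return shaded

# Hypothetical frame: three pixels, depths listed in draw order.
pixels = [[0.9, 0.5, 0.2], [0.3, 0.7], []]
print(shaded_without_prepass(pixels), shaded_with_prepass(pixels))
```

The sketch counts shading work only. It deliberately ignores the cost of submitting the geometry twice.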
In practice, however, the performance benefit of a z pre-pass does not look very promising.
http://casual-effects.blogspot.hk/2013/08/z-prepass-considered-irrelevant.html
http://www.gamedev.net/topic/641257-depth-pre-pass-worth-it/
Both authors tested performance with and without a z pre-pass. Their conclusion was that it brought no efficiency improvement.
The overshading saved in the second pass is offset by the cost of transformation, tessellation and rasterizer setup in the first pass.
I think this may be the real reason AC2 dropped the z pre-pass, although dropping it leads to ordering issues when rendering transparent objects.
Front-to-back
Render the opaque objects from front to back, so that occluded objects are rejected by the depth test against the surfaces already drawn in front of them. This improves efficiency.
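A small Python sketch of why the order matters, looking at a single pixel. The object list and distances are made up for illustration, and the depth test is assumed to be LESS.

```python
import math

def shaded_at_pixel(depths_in_draw_order):
    """Count fragments that pass a LESS depth test at one pixel."""
    shaded = 0
    nearest = math.inf
    for z in depths_in_draw_order:
        if z < nearest:       # nearer than anything drawn so far: shade it
            nearest = z
            shaded += 1
    return shaded

# Hypothetical opaque objects covering the same pixel: (name, distance from camera).
objects = [("wall", 9.0), ("crate", 5.0), ("player", 2.0)]

submission_order = [dist for _, dist in objects]   # happens to be back to front
front_to_back = [dist for _, dist in sorted(objects, key=lambda o: o[1])]

print(shaded_at_pixel(submission_order))   # every fragment gets shaded
print(shaded_at_pixel(front_to_back))      # only the nearest one gets shaded
```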
Pack multiple batches together
Submit draw calls that share the same render states in one batch instead of in many small batches.
https://www.nvidia.com/docs/IO/8228/BatchBatchBatch.pdf
(This document from NVIDIA explains batching in detail. Batching is a CPU bottleneck, not a GPU one. It showed me many new viewpoints, some of which conflict with my earlier ideas. I am now less confident in my opinion about packing batches.)
But there is a conflict between front-to-back ordering and packing batches that share the same render state. Suppose, for example, we want to render grass scattered all over the scene. Rendering it strictly from front to back would switch render states repeatedly, and the batches could not be merged.
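The conflict can be seen by counting render state binds for the same draw list under the two orderings. This is only a sketch: the draw tuples, state names and depths are hypothetical, and a real engine would track far more state than one label.

```python
def count_state_binds(draws):
    """Count how often the render state changes for draws in the given order."""
    binds = 0
    current = None
    for state, _depth in draws:
        if state != current:  # a new state has to be bound before this draw
            current = state
            binds += 1
    return binds

# Hypothetical scene: grass scattered between rocks, as (render_state, depth).
draws = [("grass", 1.0), ("rock", 2.0), ("grass", 3.0),
         ("rock", 4.0), ("grass", 5.0)]

front_to_back = sorted(draws, key=lambda d: d[1])
# Group by state first, then near to far inside each group.
by_state = sorted(draws, key=lambda d: (d[0], d[1]))

print(count_state_binds(front_to_back))
print(count_state_binds(by_state))
```

Sorting by state first and by depth only within each group keeps the batches mergeable, but gives up the strict front-to-back order across groups.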
In response to this problem, Zhangxiaoyu and Chenzhe suggested that if you do a z pre-pass, you do not need front-to-back ordering, so you can pack batches.
We all agreed. But after reading the two articles above, I became aware of the following issues:
1. In the first pass of the z pre-pass, rendering front to back still improves efficiency. This was overlooked during the discussion. Z pre-pass and front-to-back are not mutually exclusive.
2. The discussion ignored the cost of the first pass of the z pre-pass, from the vertex shader through rasterization. Although no pixel shader runs, the two tests above show that getting as far as rasterization already costs a lot.
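The trade-off in these two points can be written as a toy cost model. This is my own simplification, not something taken from the two articles: it assumes the geometry work (vertex shading, tessellation, rasterizer setup) costs the same in both passes, that shading cost is linear in the number of shaded fragments, and the numbers below are arbitrary units, not measurements.

```python
def prepass_pays_off(geometry_cost, shade_cost, shaded_without, shaded_with):
    """Return True when the modelled frame is cheaper with a z pre-pass.

    Without pre-pass: geometry_cost + shaded_without * shade_cost
    With pre-pass:    2 * geometry_cost + shaded_with * shade_cost
    """
    without = geometry_cost + shaded_without * shade_cost
    with_prepass = 2 * geometry_cost + shaded_with * shade_cost
    return with_prepass < without

# Cheap geometry, heavy overdraw: the pre-pass wins in this model.
print(prepass_pays_off(100, 1, 400, 200))
# Expensive geometry (e.g. heavy tessellation): the extra pass costs more than it saves.
print(prepass_pays_off(300, 1, 400, 200))
# Front-to-back sorting already removed most overdraw: little is left to save.
print(prepass_pays_off(100, 1, 250, 200))
```

In this model the pre-pass only pays off when the shading it saves, (shaded_without - shaded_with) * shade_cost, exceeds one extra geometry_cost.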
In summary, z pre-pass plus packed batches does not look promising. I will test this myself once my deferred and forward demos are set up, to get further insight.
Quoted from Morgan McGuire (the author of G3D):
In other words, the z-prepass may be irrelevant in modern rendering systems that submit many draw calls for well-sorted objects,
and is potentially harmful as tessellation (and thus rasterizer setup) and skinning workloads increase.