Z pre-pass

In the rendering process, the first pass renders only to the depth buffer to capture the front-most layer of depth. The second pass then uses this depth layer to cull whatever lies behind it, so a lot of drawing work is skipped.
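As a rough illustration of the two passes, here is a minimal software sketch in Python. It is not how a GPU implements this: `Fragment`, `shade_without_prepass` and `shade_with_prepass` are hypothetical names made up for this example, and a smaller depth value is assumed to mean closer to the camera.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    pixel: tuple   # (x, y) screen position
    depth: float   # assumption: smaller means closer to the camera
    name: str      # which object produced this fragment


def shade_without_prepass(fragments):
    """Single pass with a less-than depth test; returns the fragments that get shaded."""
    depth_buffer = {}
    shaded = []
    for f in fragments:
        if f.depth < depth_buffer.get(f.pixel, float("inf")):
            depth_buffer[f.pixel] = f.depth
            shaded.append(f.name)  # the expensive pixel shader would run here
    return shaded


def shade_with_prepass(fragments):
    """Depth-only pass first, then shade only the front-most fragment per pixel."""
    # Pass 1: fill the depth buffer, no shading at all.
    depth_buffer = {}
    for f in fragments:
        if f.depth < depth_buffer.get(f.pixel, float("inf")):
            depth_buffer[f.pixel] = f.depth
    # Pass 2: shade only fragments whose depth equals the stored front layer.
    shaded = []
    for f in fragments:
        if f.depth == depth_buffer[f.pixel]:
            shaded.append(f.name)
    return shaded


# Three objects cover the same pixel and arrive back to front (the worst case).
frags = [
    Fragment((0, 0), 0.9, "far"),
    Fragment((0, 0), 0.5, "middle"),
    Fragment((0, 0), 0.1, "near"),
]
print(shade_without_prepass(frags))  # ['far', 'middle', 'near']
print(shade_with_prepass(frags))     # ['near']
```

Note that `shade_with_prepass` walks the whole fragment list twice, which mirrors the fact that the geometry has to be submitted in both passes.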

This technique works well when we render transparent objects. The disordered internal structure of a transparent object does not show up, because depth culling removes it.

In practice, however, the efficiency gain from a z pre-pass does not look very promising.

http://casual-effects.blogspot.hk/2013/08/z-prepass-considered-irrelevant.html

http://www.gamedev.net/topic/641257-depth-pre-pass-worth-it/

These two authors tested performance with and without a z pre-pass. Their conclusion is that there was no efficiency improvement.

The cost saved by avoiding overshading in the second pass is offset by the cost of transformation, tessellation and rasterizer setup in the first pass.

I think this may be the real reason AC2 cut the z pre-pass, but doing so leads to rendering-order issues for transparent objects.

Front-to-back

Render the opaque objects from front to back, so that obscured objects are rejected by the depth test against the surfaces already drawn in front of them. This improves efficiency.
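To make the effect of draw order concrete, here is a small Python sketch that looks at a single pixel and counts how many draws pass a less-than depth test, and would therefore be shaded. The function name `count_shaded` and the depth values are made up for illustration.

```python
def count_shaded(depths):
    """Count draws that pass a less-than depth test at one pixel, in submission order."""
    nearest = float("inf")
    shaded = 0
    for d in depths:
        if d < nearest:
            nearest = d
            shaded += 1
    return shaded


# Hypothetical distances from the camera for four opaque objects on the same pixel.
object_depths = [0.7, 0.2, 0.9, 0.4]

print(count_shaded(object_depths))                        # 2 (arbitrary order)
print(count_shaded(sorted(object_depths)))                # 1 (front to back)
print(count_shaded(sorted(object_depths, reverse=True)))  # 4 (back to front)
```

With front-to-back order only the nearest object is shaded; with back-to-front order every object is shaded and then overwritten.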

Pack multiple batches together

Submit draw calls that share the same render states in one go, instead of submitting many small batches.
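A minimal sketch of the grouping step, in Python. Each draw is assumed to be a `(render_state, mesh)` pair, and `pack_batches` is a hypothetical helper; how each group is actually submitted to the graphics API is outside this sketch.

```python
from itertools import groupby


def pack_batches(draws):
    """Group draws by render state; each group could then be submitted together."""
    # sorted() is stable, so meshes keep their original order inside a group.
    ordered = sorted(draws, key=lambda d: d[0])
    return [
        (state, [mesh for _, mesh in group])
        for state, group in groupby(ordered, key=lambda d: d[0])
    ]


# Hypothetical draw list with two render states interleaved.
draws = [
    ("grass", "g1"),
    ("rock", "r1"),
    ("grass", "g2"),
    ("rock", "r2"),
    ("grass", "g3"),
]
batches = pack_batches(draws)
print(batches)  # [('grass', ['g1', 'g2', 'g3']), ('rock', ['r1', 'r2'])]
```

Five small submissions collapse into two groups, one per render state.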

https://www.nvidia.com/docs/IO/8228/BatchBatchBatch.pdf

(This document from Nvidia explains batching in detail. Batching is a CPU bottleneck, not a GPU one. It showed me many new viewpoints, some of which even conflict with my earlier ideas. Now I am not very confident in my opinion about packing batches.)

But there is a conflict between front-to-back ordering and packing batches that share the same render state. For example, suppose we want to render grass scattered all over the scene. If we render strictly from front to back, we will switch render states repeatedly and cannot merge batches.
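The conflict can be seen in a small Python sketch that counts render-state switches under the two orderings. The scene, the `(state, depth)` tuple layout and `count_state_switches` are all made up for this example.

```python
def count_state_switches(draws):
    """Count how often the render state changes, including the initial set."""
    switches = 0
    current = None
    for state, _depth in draws:
        if state != current:
            switches += 1
            current = state
    return switches


# Hypothetical scene: grass and rocks interleaved in depth.
scene = [
    ("grass", 1.0),
    ("rock", 2.0),
    ("grass", 3.0),
    ("rock", 4.0),
    ("grass", 5.0),
]

front_to_back = sorted(scene, key=lambda d: d[1])       # strict depth order
by_state = sorted(scene, key=lambda d: (d[0], d[1]))    # state first, then depth

print(count_state_switches(front_to_back))  # 5
print(count_state_switches(by_state))       # 2
```

Strict front-to-back order changes state on every draw here, while sorting by state first needs only one state set per material but gives up the global depth order.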

In response to this question, Zhangxiaoyu and Chenzhe discussed the idea that if you do a z pre-pass, you do not need front-to-back ordering, so you can pack batches.

We all agreed. But then I read the two articles above and became aware of the following problems:

1. In the first pass of the z pre-pass, rendering front to back still improves efficiency. This was neglected during the discussion. Z pre-pass and front-to-back are not mutually exclusive.

2. The discussion ignored the cost of the first pass of the z pre-pass, from the vertex shader through rasterization. Although no pixel shader runs, getting as far as rasterization costs a lot, according to the two tests above.

In summary, z pre-pass plus packed batches does not look promising. I will test this myself once my deferred and forward demos are up and running, to gain further insight.

Quoted from Morgan McGuire (the author of G3D):

In other words, the z-prepass may be irrelevant in modern rendering systems that submit many draw calls for well-sorted objects, and is potentially harmful as tessellation (and thus rasterizer setup) and skinning workloads increase.
