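Engine Design Tracking (9.14.3): Deferred Shading Preparation

Some preparation work done so far:

1. Depth prepass for forward shading

The reason for doing a depth prepass is to round out the rendering pipeline: the architecture supports multiple passes, but this had never actually been tested. Once the pipeline is mostly complete, a custom rendering pipeline should in theory only require changes to the XML configuration and the shaders, with no changes to the C++ code.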

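2. Linear depth

By default, after a vertex goes through the MVP transform and the perspective divide into NDC, its depth is non-linear in view-space z: it follows a hyperbolic curve of the form a + b/z, so z precision is high near the near clip plane and low in the distance. With linear depth, depth changes uniformly across the whole range.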
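
To make the precision difference concrete, here is a minimal C++ sketch that compares the two mappings. It assumes a D3D-style [0,1] depth range and takes z as the positive distance along the view axis; the function names `perspectiveDepth` and `linearDepth` are illustrative and not part of Blade.

```cpp
#include <cassert>
#include <cmath>

// D3D-style perspective depth in [0,1] for a point at positive distance z
// along the view axis: depth = f/(f-n) - f*n/((f-n)*z).
inline double perspectiveDepth(double z, double nearDist, double farDist)
{
    assert(nearDist > 0.0 && farDist > nearDist && z > 0.0);
    return farDist / (farDist - nearDist)
         - farDist * nearDist / ((farDist - nearDist) * z);
}

// Linear depth: distance along the view axis divided by the far clip distance.
inline double linearDepth(double z, double farDist)
{
    assert(farDist > 0.0);
    return z / farDist;
}
```

With near = 1 and far = 1000, `perspectiveDepth(2.0, 1.0, 1000.0)` is already above 0.5, so half of the depth range is spent on the first unit behind the near plane, while `linearDepth(2.0, 1000.0)` is only 0.002.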

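Linear depth is usually the view-space z divided by the maximum view distance (the far clip distance), which gives a normalized z. This value can be written directly into the z-buffer and used for the depth test, but every shader has to output linear depth for this to work. Only objects that need neither depth writes nor depth tests can be left unmodified.

All depth in forward shading has now been switched to linear depth for depth testing.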

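Also note that in a right-handed coordinate system the view vector points down the -z axis, so view-space z values are all negative. Comparing the negative values directly would invert the depth test, so the shader has to negate them.
In addition, linear depth can be linearly interpolated, so it can be computed in the vertex shader and written to depth in the pixel shader.
Another problem is that depth bias breaks, because the hardware bias formula assumes non-linear depth. The fix: since depth is output by the shader anyway, the offset can be applied to depth directly in the shader.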
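
The sign flip and the shader-side bias can be sketched on the CPU as follows. This is only an illustration in C++ rather than shader code; `linearDepthRH`, `biasedLinearDepth` and the `biasInViewUnits` parameter are hypothetical names, and expressing the bias in view-space units is an assumption.

```cpp
#include <cassert>
#include <cmath>

// Normalized linear depth for a right-handed view space, where visible points
// have negative view-space z. The negation makes the stored depth positive.
inline float linearDepthRH(float viewZ, float farClipDistance)
{
    assert(farClipDistance > 0.0f);
    return -viewZ / farClipDistance;
}

// Depth bias applied directly to the linear depth instead of relying on the
// hardware bias. The bias is given in view-space units (an assumption) and
// scaled into the normalized range.
inline float biasedLinearDepth(float viewZ, float farClipDistance, float biasInViewUnits)
{
    return linearDepthRH(viewZ, farClipDistance) + biasInViewUnits / farClipDistance;
}
```

For example, a point at view-space z = -50 with a far clip distance of 100 gets depth 0.5, and a bias of one view-space unit pushes it to about 0.51.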

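3. Without linear depth, the NDC coordinate has to be unprojected to world space, i.e. multiplied by invViewProjection (followed by a divide by w): one matrix operation to get the world-space position.

With linear depth, the normalized depth z * far clip distance gives the view-space depth of the target point, i.e. its distance from the eye along the view direction. Another benefit of linear depth is that a ray vector multiplied by the depth gives the position.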

This is usually computed in view space, i.e.
viewPos = [view space origin +] view space ray * viewDepth
viewPos = ray * viewDistance (see the note further on)

Since the view-space origin is (0,0,0), the ray alone is enough.

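For historical reasons, however, Blade's forward shading uses world-space normals and world-space positions throughout. To avoid large changes to the existing shaders, I want deferred shading to compute in world space rather than view space, i.e. transform the view-space ray into world space:

worldPos = eyePos + worldRay * viewDistance
In fact the view-space ray does not need to be transformed into world space at all. Lighting is computed in world space and the vertex shader already computes the world position, so

worldRay = normalize(worldPos - eyePos);

The vertex shader only needs one extra uniform carrying the camera's eye position. The pixel shader then takes the same uniform and restores the world position: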

float3 worldRay : TEXCOORD1
uniform float4 eye_position : EYE_POS    //eye pos in world space
...
//expand INTZ
float depth = dot(tex2D(depthBuffer, depthUV).arg, float3(0.996093809371817670572857294849, 0.0038909914428586627756752238080039, 1.5199185323666651467481343000015e-5));
float3 worldPos = eye_position.xyz + worldRay * depth;
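
The reconstruction can be checked on the CPU with a small C++ sketch. Note that with a normalized ray the scalar has to be the distance along the ray (viewDistance), not the view-space z; a depth value only works directly if the ray is scaled accordingly. The `Vec3` type and the helper names below are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float length(Vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }

// Unit ray from the eye through a world-space point:
// worldRay = normalize(worldPos - eyePos).
inline Vec3 worldRayTo(Vec3 eyePos, Vec3 worldPos)
{
    Vec3 d = worldPos - eyePos;
    float len = length(d);
    assert(len > 0.0f);
    return d * (1.0f / len);
}

// worldPos = eyePos + worldRay * viewDistance, where viewDistance is the
// distance from the eye along the unit ray.
inline Vec3 reconstructWorldPos(Vec3 eyePos, Vec3 unitWorldRay, float viewDistance)
{
    return eyePos + unitWorldRay * viewDistance;
}
```

For an eye at (1, 2, 3) and a point at (4, 6, 3), the ray is (0.6, 0.8, 0) and the distance is 5, so `reconstructWorldPos` returns the original point up to floating-point rounding.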

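Directional light optimization

With many directional lights, drawing a separate quad (one pass) per light costs too much IO. The current approach is to pay the IO once and iterate over all directional lights.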

uniform int directional_count;
uniform float4 light_directions[MAX_DIR_LIGHT_COUNT];
uniform float4 directional_diffuses[MAX_DIR_LIGHT_COUNT];
uniform float4 directional_speculars[MAX_DIR_LIGHT_COUNT];

for(int i = 0; i < directional_count; ++i)
{
    color += shading(light_directions[i], directional_diffuses[i], directional_speculars[i], normal, depth);
}

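This way, for multiple directional lights, the quad is drawn only once and the GBuffer normal is sampled only once, and the shader then iterates over the lights.

Directional lights are also a special case: the light direction does not depend on the world position, and the view direction needed for the half vector has already been computed by the vertex shader and interpolated. So the world position is not needed at all, and neither is sampling the depth buffer: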

for(int i = 0; i < light_directional_count; ++i)
{
    float3 light_dir = light_directions[i].xyz;
    //directional lights don't need the world position, and thus no depth sampling:
    //the depthBuffer sampling will be optimized away by the compiler.
    float3 half_vec = normalize(light_dir + (-worldRay));
    ... // calculate lighting
}
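
As a CPU-side illustration of the single-pass loop, here is a minimal C++ sketch that accumulates all directional lights for one pixel. It assumes Blinn-Phong shading, light directions stored as unit vectors pointing from the surface toward the light, and a `shininess` exponent; none of these details are confirmed by the shader above, and all type and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns the unit vector, or zero if the input has no length.
inline Vec3 normalizeOrZero(Vec3 a)
{
    float len = std::sqrt(dot(a, a));
    return len > 0.0f ? a * (1.0f / len) : Vec3{0.0f, 0.0f, 0.0f};
}

struct DirectionalLight
{
    Vec3 direction; // unit vector from the surface toward the light (assumption)
    Vec3 diffuse;
    Vec3 specular;
};

// One "pass": the normal is fetched once by the caller, then every light is
// accumulated. Only the view ray is needed, never the world position.
inline Vec3 shadeDirectionalLights(const std::vector<DirectionalLight>& lights,
                                   Vec3 normal, Vec3 unitWorldRay, float shininess)
{
    Vec3 color{0.0f, 0.0f, 0.0f};
    Vec3 viewDir = unitWorldRay * -1.0f; // from the surface back to the eye
    for (const DirectionalLight& light : lights)
    {
        float nDotL = std::max(dot(normal, light.direction), 0.0f);
        Vec3 halfVec = normalizeOrZero(light.direction + viewDir);
        float nDotH = std::max(dot(normal, halfVec), 0.0f);
        float spec = nDotL > 0.0f ? std::pow(nDotH, shininess) : 0.0f;
        color = color + light.diffuse * nDotL + light.specular * spec;
    }
    return color;
}
```

The per-light cost is a handful of arithmetic operations, while the quad draw and the GBuffer fetch happen once regardless of the light count.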

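This completes a deferred shading path that is compatible with forward shading and computes lighting in world space.

The deferred shading pipeline is still being tidied up. Deferred directional lights already work, and the final result matches forward shading. The editor cannot add lights dynamically yet, so point and spot light support will be added gradually later.

Problem: the forward shading vertex shader now has to compute both the view-space position (needed for linear depth) and the world-space position (needed for lighting), which is too many matrix multiplications. It would be better to move forward shading's lighting into view space, in which case deferred shading would have to move to view space as well.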

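Update: linear viewDepth = mul(pos, worldViewMatrix).z / farClipDist
The shader actually only needs the third column of worldViewMatrix, so if only that column is passed in, then

viewDepth = dot(pos, worldViewMatrixCol3) / farClipDist. Moreover, worldViewMatrixCol3 can be pre-multiplied by 1/farClipDist on the CPU, so computing the normalized view-space depth takes a single instruction. With no redundant computation left, there is no need to consider moving lighting into view space.
Below is how the view depth vector is computed on the CPU; the shader then needs just one dot instruction. With "mul(pos, worldViewMatrix).z / farClipDist", the instruction count is 3 even with optimization at O3, apparently transpose + dot + divide.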

//////////////////////////////////////////////////////////////////////////
//vector to calculate view space normalized depth
//this is an optimization for shader, use dot(v, viewdepthvector) instead of mul(v,world_view_matrix).z / far_clip_distance
class ViewDepthVectorUpdater : public InstanceVariableUpdater
{
public:
    ViewDepthVectorUpdater(WorldViewMatrixUpdater* worldViewUpdater)
        :InstanceVariableUpdater(SCT_FLOAT4, )
        ,mWorldViewUpdater(worldViewUpdater)
        ,mWorldViewMatrixCol3(Vector4::ZERO)
    {
    }

    /** @brief  */
    virtual const void* updateData() const
    {
        const ICamera* camera = IShaderVariableSource::getSingleton().getCamera();
        if( camera == NULL )
            return &Vector4::ZERO;

        const Matrix44& worldViewMat = *(const Matrix44*)mWorldViewUpdater->getVariable()->getData();
        //copy the third column (index 2); assumes Matrix44 is indexed as [row][column]
        mWorldViewMatrixCol3[0] = worldViewMat[0][2];
        mWorldViewMatrixCol3[1] = worldViewMat[1][2];
        mWorldViewMatrixCol3[2] = worldViewMat[2][2];
        mWorldViewMatrixCol3[3] = worldViewMat[3][2];
        //note: since we're using Right Hand axis, the view direction points at -Z,
        //view space position's z values are negative.
        //use this negative value to generate a positive z value into z buffer
        mWorldViewMatrixCol3 *= -1.0f / camera->getFarClipDistance();
        return &mWorldViewMatrixCol3;
    }

protected:
    WorldViewMatrixUpdater* mWorldViewUpdater;
    mutable Vector4    mWorldViewMatrixCol3;
};
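
The equivalence between the single dot product and the full transform can be verified with a standalone C++ sketch. It does not use Blade's types: `Vec4`, `Mat4`, `mul` and `makeViewDepthVector` are hypothetical stand-ins, and the matrix is assumed to be stored as m[row][column] with the row-vector convention of mul(pos, M).

```cpp
#include <cassert>
#include <cmath>

struct Vec4 { float v[4]; };
struct Mat4 { float m[4][4]; }; // m[row][column], used as mul(pos, M)

inline float dot(const Vec4& a, const Vec4& b)
{
    return a.v[0] * b.v[0] + a.v[1] * b.v[1] + a.v[2] * b.v[2] + a.v[3] * b.v[3];
}

// Row vector times matrix, matching HLSL's mul(pos, M).
inline Vec4 mul(const Vec4& pos, const Mat4& M)
{
    Vec4 out{};
    for (int c = 0; c < 4; ++c)
        for (int r = 0; r < 4; ++r)
            out.v[c] += pos.v[r] * M.m[r][c];
    return out;
}

// Third column of the world-view matrix, pre-scaled by -1/far so that a single
// dot product yields the positive normalized depth in a right-handed view space.
inline Vec4 makeViewDepthVector(const Mat4& worldView, float farClipDistance)
{
    assert(farClipDistance > 0.0f);
    const float scale = -1.0f / farClipDistance;
    Vec4 out{};
    for (int r = 0; r < 4; ++r)
        out.v[r] = worldView.m[r][2] * scale;
    return out;
}
```

For any position, `dot(pos, makeViewDepthVector(worldView, far))` should match `-mul(pos, worldView).v[2] / far` up to rounding, which is the whole point of the optimization.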

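Update 2:

If depth is output from the pixel shader, the pipeline has to finish the pixel shader before it can run the depth test. Early Z is therefore completely disabled and overdraw cannot really be avoided, because every pixel must be shaded before it is depth tested. At that point the depth prepass is pointless.

So I tried outputting depth directly from the vertex shader, i.e. through POSITION.z.
The result, however, is wrong in some cases (some meshes close to the near plane; no problems seen on terrain).

I switched to this method: https://www.mvps.org/directx/articles/linear_z/linearz.htm but the result is the same. (The idea is much the same; the difference is that in Blade view.z = 0 maps to depth 0, while in the article view.z = znear maps to depth 0.)

In the end it turned out that although each vertex outputs a linear depth, the interpolation between neighboring vertices is the problem:

http://www.yosoygames.com.ar/wp/2014/01/linear-depth-buffer-my-ass/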
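
The interpolation problem can be illustrated numerically. The rasterizer interpolates z/w linearly in screen space, but under a perspective projection it is 1/viewDepth, not viewDepth, that varies linearly in screen space. The C++ sketch below compares both along one edge; the function names are hypothetical and the setup is a simplified one-dimensional model, not Blade code.

```cpp
#include <cassert>
#include <cmath>

// What the depth buffer receives at screen-space parameter t in [0,1] when
// both vertices output their linear depth through POSITION.z: a plain
// linear blend of the two vertex values.
inline double rasterizedDepth(double d0, double d1, double t)
{
    return d0 + (d1 - d0) * t;
}

// The actual linear depth of the surface point seen at the same screen-space
// parameter: 1/depth is what interpolates linearly in screen space.
inline double trueLinearDepth(double d0, double d1, double t)
{
    assert(d0 > 0.0 && d1 > 0.0);
    const double invDepth = 1.0 / d0 + (1.0 / d1 - 1.0 / d0) * t;
    return 1.0 / invDepth;
}
```

For an edge running from depth 0.01 to depth 1.0, the screen-space midpoint gets 0.505 from the rasterizer while the surface there is really at about 0.0198. The two agree only at the vertices, and the error is largest for triangles that start close to the camera, which is consistent with the artifacts seen near the near plane.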

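The article suggests that if depth really must be written in the pixel shader, a high-precision logarithmic z-buffer is the better choice. For deferred shading, though, the original depth has to be recovered, which means inverting the encoding. That looks doable, and would need pow.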
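
A possible shape of that inversion, sketched in C++. It assumes one commonly used logarithmic encoding, d = log(C * w + 1) / log(C * far + 1), where w is the positive view depth and C is a tuning constant; the linked article may use a different variant, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Logarithmic depth in [0,1] for a positive view depth.
inline double encodeLogDepth(double viewDepth, double farDist, double C)
{
    assert(viewDepth >= 0.0 && farDist > 0.0 && C > 0.0);
    return std::log(C * viewDepth + 1.0) / std::log(C * farDist + 1.0);
}

// Inverse of encodeLogDepth: solving for viewDepth gives
// viewDepth = (pow(C * far + 1, d) - 1) / C.
inline double decodeLogDepth(double d, double farDist, double C)
{
    assert(farDist > 0.0 && C > 0.0);
    return (std::pow(C * farDist + 1.0, d) - 1.0) / C;
}
```

The decode step costs one pow per pixel in the deferred pass, which is the extra work mentioned above.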

I will look into whether there is a better approach before continuing.

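Update 3:

Since linear depth cannot be output from the vertex shader, and I do not want to output depth from the pixel shader (for performance reasons), I have switched back to conventional depth. In the deferred shading stage the z-buffer z value is converted to view-space z, which can be derived by inverting the projection matrix. Note that z sampled from the z-buffer is in [0,1], which matches D3D's NDC range, whereas OGL's NDC range is [-1,1]. The conversion parameters can be computed on the CPU, so the shader is not affected.

For the exact calculation, this link follows the same reasoning as mine: http://www.derschmale.com/2014/01/26/reconstructing-positions-from-the-depth-buffer/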
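
One way to set up the conversion is sketched below in C++. It assumes a standard D3D-style perspective projection, depth = f/(f-n) - f*n/((f-n)*z), and returns the positive view depth (negate it for a right-handed view space). Inverting that formula gives z = 1 / (a + b * depth) with a = 1/n and b = 1/f - 1/n, so the two parameters can be computed on the CPU. The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Parameters computed once per camera on the CPU.
struct DepthUnprojectParams
{
    double a; // 1/near
    double b; // 1/far - 1/near
};

inline DepthUnprojectParams makeDepthUnprojectParams(double nearDist, double farDist)
{
    assert(nearDist > 0.0 && farDist > nearDist);
    return {1.0 / nearDist, 1.0 / farDist - 1.0 / nearDist};
}

// Per-pixel work in the shader: one multiply-add and one reciprocal.
// zBufferDepth is the [0,1] value sampled from the depth buffer.
inline double viewDepthFromZBuffer(double zBufferDepth, const DepthUnprojectParams& p)
{
    return 1.0 / (p.a + p.b * zBufferDepth);
}

// OpenGL NDC z is in [-1,1]; remap it to [0,1] before using the same formula.
inline double zBufferFromGLNdc(double zNdc)
{
    return zNdc * 0.5 + 0.5;
}
```

With near = 0.5 and far = 100, a depth of 0 maps back to 0.5 and a depth of 1 maps back to 100, as expected.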

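RAWZ vs. INTZ

According to the G80 doc, sampling INTZ returns the depth directly; only RAWZ requires reconstructing the depth. (http://developer.download.nvidia.com/GPU_Programming_Guide/GPU_Programming_Guide_G80.pdf)

So the INTZ expansion shown earlier is unnecessary; any single channel can be used directly.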
