Major improvements from Direct3D 11 to Direct3D 12

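The Direct3D 12 programming model differs significantly from that of Direct3D 11. In Direct3D 12, the app sits closer to the hardware than ever before, which makes Direct3D 12 faster and more efficient. The price of that speed and efficiency is that, compared with Direct3D 11, the app has to handle more tasks itself.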

  • Explicit synchronization
  • Physical memory residency management
  • Pipeline state objects
  • Command lists and bundles
  • Descriptor heaps and tables
  • Porting from Direct3D 11

Direct3D 12 is a return to low-level programming; it gives you more control over the graphical elements of your games and apps by introducing these new features: objects to represent the overall state of the pipeline, command lists and bundles for work submission, and descriptor heaps and tables for resource access.

Your app has increased speed and efficiency with Direct3D 12, but you are responsible for more tasks than you were with Direct3D 11.

Explicit synchronization

  • In Direct3D 12, CPU-GPU synchronization is the explicit responsibility of the app and is no longer performed implicitly by the runtime, as it is in Direct3D 11. This also means that Direct3D 12 performs no automatic checking for pipeline hazards, so that too is the app's responsibility.
  • In Direct3D 12, apps are responsible for pipelining data updates. That is, the "Map/Lock-DISCARD" pattern in Direct3D 11 must be performed manually in Direct3D 12. In Direct3D 11, if the GPU is still using the buffer when you call ID3D11DeviceContext::Map with D3D11_MAP_WRITE_DISCARD, the runtime returns a pointer to a new region of memory instead of the old buffer data. This allows the GPU to continue using the old data while the app places data in the new buffer. No additional memory management is required in the app; the old buffer is reused or destroyed automatically when the GPU is finished with it.
  • In Direct3D 12, all dynamic updates (including constant buffers, dynamic vertex buffers, dynamic textures, and so on) are explicitly controlled by the app. These dynamic updates include any required GPU fences or buffering. The app is responsible for keeping the memory available until it is no longer needed.
  • Direct3D 12 uses COM-style reference counting only for the lifetimes of interfaces (by using the weak reference model of Direct3D tied to the lifetime of the device). All resource and description memory lifetimes are the sole responsibility of the app to maintain for the proper duration, and are not reference counted. Direct3D 11 uses reference counting to manage the lifetimes of interface dependencies as well.
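The manual replacement for "Map/Lock-DISCARD" described above is usually some form of ring of upload slots guarded by fence values. The following is a minimal CPU-only sketch of that idea in standard C++; it does not call Direct3D. `FakeFence`, `UploadRing`, and `countStalls` are hypothetical names invented for illustration, and the "GPU" is simulated by a counter that trails the CPU by `gpuLag` frames. In a real app the fence role would be played by ID3D12Fence together with ID3D12CommandQueue::Signal.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// CPU-only stand-in for a GPU fence (hypothetical type, not ID3D12Fence).
struct FakeFence {
    std::uint64_t completed = 0;  // highest value the simulated GPU has reached
};

// Ring of N upload slots. Each slot remembers the fence value the GPU must
// reach before the CPU is allowed to overwrite that slot again.
template <std::size_t N>
class UploadRing {
public:
    // Fence value guarding the slot that would be written next.
    std::uint64_t requiredFence() const { return pending_[next_ % N]; }

    // True when the GPU has finished with the next slot.
    bool canWrite(const FakeFence& fence) const {
        return fence.completed >= requiredFence();
    }

    // Hands the next slot to the GPU; the slot becomes reusable once the
    // fence reaches fenceValue. Returns the slot index that was used.
    std::size_t submit(std::uint64_t fenceValue) {
        const std::size_t slot = next_ % N;
        pending_[slot] = fenceValue;
        ++next_;
        return slot;
    }

private:
    std::array<std::uint64_t, N> pending_{};  // 0 means "never submitted"
    std::size_t next_ = 0;
};

// Simulates `frames` frames with a two-slot ring and a GPU that finishes
// frame f only after the CPU has submitted frame f + gpuLag.
// Returns how many times the CPU had to wait for the GPU.
inline int countStalls(int frames, int gpuLag) {
    assert(frames >= 0 && gpuLag >= 0);
    FakeFence fence;
    UploadRing<2> ring;
    int stalls = 0;
    for (int frame = 1; frame <= frames; ++frame) {
        if (!ring.canWrite(fence)) {
            // A real app would block here until the fence is signaled.
            fence.completed = ring.requiredFence();
            ++stalls;
        }
        ring.submit(static_cast<std::uint64_t>(frame));
        if (frame > gpuLag) {
            fence.completed = std::max(
                fence.completed, static_cast<std::uint64_t>(frame - gpuLag));
        }
    }
    return stalls;
}
```

With two slots, a GPU that is one frame behind never forces a wait (`countStalls(5, 1)` returns 0), while a GPU three frames behind makes the CPU stall on every frame from the third onward (`countStalls(5, 3)` returns 3). Adding slots trades memory for fewer stalls.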

Physical memory residency management

A Direct3D 12 application must prevent race conditions between multiple queues, multiple adapters, and the CPU threads. Direct3D 12 no longer synchronizes the CPU and GPU for you, nor does it provide convenient mechanisms for resource renaming or multi-buffering. Fences must be used to prevent one processing unit from overwriting memory before another processing unit has finished using it.

The Direct3D 12 application must ensure data is resident in memory while the GPU reads it. Memory used by each object is made resident during the creation of the object. Applications that manage residency themselves afterwards, through ID3D12Device::Evict and ID3D12Device::MakeResident, must use fences to ensure the GPU doesn't access objects that have been evicted.
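One way to honor the rule that the GPU must never touch an evicted object is to defer each eviction until a fence shows that the last GPU use has completed. The sketch below models only that bookkeeping in standard C++; `DeferredEvictor` and its members are hypothetical names, object ids are plain integers, and the actual eviction call is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// An object waiting to be evicted, plus the fence value that marks the end
// of its last use on the GPU.
struct PendingEvict {
    int objectId;
    std::uint64_t lastUseFence;
};

// Hypothetical helper: queues eviction requests and releases them only when
// the completed fence value proves the GPU is done with the object.
class DeferredEvictor {
public:
    // Assumption: requests arrive in non-decreasing fence order, as they
    // would if each request is tagged with the most recently signaled value.
    void request(int objectId, std::uint64_t lastUseFence) {
        assert(pending_.empty() || pending_.back().lastUseFence <= lastUseFence);
        pending_.push_back(PendingEvict{objectId, lastUseFence});
    }

    // Returns the ids that are now safe to evict and removes them from the queue.
    std::vector<int> collect(std::uint64_t completedFence) {
        std::vector<int> safe;
        while (!pending_.empty() &&
               pending_.front().lastUseFence <= completedFence) {
            safe.push_back(pending_.front().objectId);
            pending_.pop_front();
        }
        return safe;
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::deque<PendingEvict> pending_;
};
```

A caller would run `collect` once per frame with the fence's completed value and evict only the ids it returns; anything still queued may be in use by the GPU.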

Resource barriers are another required form of synchronization. They synchronize resource and subresource state transitions at a very granular level.
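Because the runtime does not track hazards, many engines keep their own record of each resource's current state and emit a transition only when the state changes. The following is a rough CPU-only model of that bookkeeping; `ResourceState`, `TrackedResource`, and `BarrierRecorder` are hypothetical names, and the four states are a simplified subset chosen for illustration. In Direct3D 12 the recorded transitions would be passed to ID3D12GraphicsCommandList::ResourceBarrier.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified, illustrative subset of resource states.
enum class ResourceState { Common, RenderTarget, ShaderResource, Present };

// The app-side record of what state a resource is currently in.
struct TrackedResource {
    ResourceState state = ResourceState::Common;
};

// One recorded transition: which resource, from which state, to which state.
struct Transition {
    const TrackedResource* resource;
    ResourceState before;
    ResourceState after;
};

class BarrierRecorder {
public:
    // Records a transition only if the state actually changes.
    // Returns true when a barrier was emitted.
    bool transition(TrackedResource& resource, ResourceState after) {
        if (resource.state == after) return false;  // redundant, skip it
        barriers_.push_back(Transition{&resource, resource.state, after});
        resource.state = after;
        return true;
    }

    // A use is valid only when the resource is in the state the operation needs.
    static bool canUseAs(const TrackedResource& resource, ResourceState needed) {
        return resource.state == needed;
    }

    std::size_t barrierCount() const { return barriers_.size(); }

private:
    std::vector<Transition> barriers_;
};
```

A typical frame would transition the back buffer to `RenderTarget` before drawing and to `Present` afterwards, which this model records as exactly two barriers.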

Refer to Memory Management in Direct3D 12.

Pipeline state objects

Direct3D 11 allows pipeline state manipulation through a large set of independent objects. For example, input assembler state, pixel shader state, rasterizer state, and output merger state can all be independently modified. This design provides a convenient and relatively high-level representation of the graphics pipeline, but it doesn’t utilize the capabilities of modern hardware, primarily because the various states are often interdependent. For example, many GPUs combine pixel shader and output merger state into a single hardware representation. But because the Direct3D 11 API allows these pipeline stages to be set separately, the display driver can't resolve issues of pipeline state until the state is finalized, which isn’t until draw time. This scheme delays hardware state setup, which means extra overhead and fewer maximum draw calls per frame.

Direct3D 12 addresses this problem by unifying much of the pipeline state into immutable pipeline state objects (PSOs), which are finalized upon creation. Hardware and drivers can then immediately convert the PSO into whatever hardware-native instructions and state are required to execute GPU work. You can still dynamically change which PSO is in use, but to do so, the hardware only needs to copy the minimal amount of pre-computed state directly to the hardware registers, rather than computing the hardware state on the fly. By using PSOs, draw call overhead is reduced significantly, and many more draw calls can occur per frame. For more information about PSOs, see Managing graphics pipeline state in Direct3D 12.
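The difference between resolving state at draw time and finalizing it at creation can be illustrated with a toy model. The sketch below is plain C++ and makes strong simplifying assumptions: `resolveHardwareState` stands in for the driver's translation work and merely packs a few fields into an integer, and the late-resolve path pessimistically redoes that work on every draw, which real Direct3D 11 drivers mitigate with caching. All names are hypothetical. In Direct3D 12 the corresponding calls are ID3D12Device::CreateGraphicsPipelineState and ID3D12GraphicsCommandList::SetPipelineState.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative subset of the state a pipeline is built from.
struct PipelineDesc {
    std::uint32_t vertexShader;
    std::uint32_t pixelShader;
    bool blendEnabled;
    bool depthTest;
};

// Stand-in for the driver's expensive API-state to hardware-state translation.
// resolveCount records how many times that work is performed.
inline std::uint32_t resolveHardwareState(const PipelineDesc& desc,
                                          int& resolveCount) {
    ++resolveCount;
    return (desc.vertexShader << 16) ^ (desc.pixelShader << 2) ^
           (desc.blendEnabled ? 2u : 0u) ^ (desc.depthTest ? 1u : 0u);
}

// PSO-like object: the hardware state is computed once and never changes.
class PipelineState {
public:
    PipelineState(const PipelineDesc& desc, int& resolveCount)
        : hardwareState_(resolveHardwareState(desc, resolveCount)) {}

    std::uint32_t hardwareState() const { return hardwareState_; }

private:
    const std::uint32_t hardwareState_;
};

// Worst-case late resolve: the translation runs again for every draw.
inline int resolvesAtDrawTime(const PipelineDesc& desc, int draws) {
    assert(draws >= 0);
    int resolves = 0;
    std::uint32_t bound = 0;
    for (int i = 0; i < draws; ++i) bound = resolveHardwareState(desc, resolves);
    (void)bound;
    return resolves;
}

// PSO path: the translation runs once, each draw just copies the result.
inline int resolvesWithPso(const PipelineDesc& desc, int draws) {
    assert(draws >= 0);
    int resolves = 0;
    const PipelineState pso(desc, resolves);
    std::uint32_t bound = 0;
    for (int i = 0; i < draws; ++i) bound = pso.hardwareState();
    (void)bound;
    return resolves;
}
```

For 1000 draws with the same state, the late-resolve path in this model performs 1000 translations and the PSO path performs one.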

Command lists and bundles

In Direct3D 11, all work submission is done via the immediate context, which represents a single stream of commands that go to the GPU. To achieve multithreaded scaling, games also have deferred contexts available to them. Deferred contexts in Direct3D 11 don't map perfectly to hardware, so relatively little work can be done in them.

Direct3D 12 introduces a new model for work submission based on command lists that contain the entirety of information needed to execute a particular workload on the GPU. Each new command list contains information such as which PSO to use, what texture and buffer resources are needed, and the arguments to all draw calls. Because each command list is self-contained and inherits no state, the driver can pre-compute all necessary GPU commands up-front and in a free-threaded manner. The only serial process necessary is the final submission of command lists to the GPU via the command queue.

In addition to command lists, Direct3D 12 also introduces a second level of work pre-computation: bundles. Unlike command lists, which are completely self-contained and are typically constructed, submitted once, and discarded, bundles provide a form of state inheritance that permits reuse. For example, if a game wants to draw two character models with different textures, one approach is to record a command list with two sets of identical draw calls. But another approach is to "record" one bundle that draws a single character model, then "play back" the bundle twice on the command list using different resources. In the latter case, the display driver only has to compute the appropriate instructions once, and creating the command list essentially amounts to two low-cost function calls.
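The two-character example can be modeled without any graphics API. In the sketch below, written in standard C++, a bundle records mesh draws once and inherits whatever texture the calling command list has bound at playback time. `Bundle`, `CommandList`, `CommandQueue`, and their members are hypothetical stand-ins; textures and meshes are plain integers. The real counterparts are ID3D12GraphicsCommandList::ExecuteBundle and ID3D12CommandQueue::ExecuteCommandLists.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// What ends up on the GPU timeline: a mesh drawn with a particular texture.
struct DrawRecord {
    int texture;
    int mesh;
};

// Recorded once; carries no texture of its own.
class Bundle {
public:
    void drawMesh(int mesh) { meshes_.push_back(mesh); }
    const std::vector<int>& meshes() const { return meshes_; }

private:
    std::vector<int> meshes_;
};

// Self-contained list of work; supplies the state that bundles inherit.
class CommandList {
public:
    void setTexture(int texture) { texture_ = texture; }

    // Plays the bundle back using the texture bound on this list right now.
    void executeBundle(const Bundle& bundle) {
        assert(texture_ >= 0 && "bind a texture before playing back a bundle");
        for (int mesh : bundle.meshes()) {
            recorded_.push_back(DrawRecord{texture_, mesh});
        }
    }

    const std::vector<DrawRecord>& recorded() const { return recorded_; }

private:
    int texture_ = -1;  // -1 means nothing bound yet
    std::vector<DrawRecord> recorded_;
};

// The only serial step: finished lists are submitted in order.
class CommandQueue {
public:
    void executeCommandLists(const std::vector<const CommandList*>& lists) {
        for (const CommandList* list : lists) {
            for (const DrawRecord& draw : list->recorded()) {
                submitted_.push_back(draw);
            }
        }
    }

    const std::vector<DrawRecord>& submitted() const { return submitted_; }

private:
    std::vector<DrawRecord> submitted_;
};
```

Recording a bundle with two mesh draws and playing it back twice, once after `setTexture(10)` and once after `setTexture(20)`, yields four draws on the queue: the first two use texture 10 and the last two use texture 20. Since each `CommandList` owns its own storage, separate lists could be recorded on separate threads before the single submission step.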

For more information about command lists and bundles, see Work Submission in Direct3D 12.

Descriptor heaps and tables

Resource binding in Direct3D 11 is highly abstracted and convenient, but leaves many modern hardware capabilities underutilized. In Direct3D 11, games create view objects of resources, then bind those views to several slots at various shader stages in the pipeline. Shaders, in turn, read data from those explicit bind slots, which are fixed at draw time. This model means that whenever a game will draw using different resources, it must re-bind different views to different slots, and call draw again. This case also represents overhead that can be eliminated by fully utilizing modern hardware capabilities.

Direct3D 12 changes the binding model to match modern hardware and significantly improves performance. Instead of requiring standalone resource views and explicit mapping to slots, Direct3D 12 provides a descriptor heap into which games create their various resource views. This scheme provides a mechanism for the GPU to directly write the hardware-native resource description (descriptor) to memory up-front. To declare which resources are to be used by the pipeline for a particular draw call, games specify one or more descriptor tables that represent sub-ranges of the full descriptor heap. As the descriptor heap has already been populated with the appropriate hardware-specific descriptor data, changing descriptor tables is an extremely low-cost operation.

In addition to the improved performance offered by descriptor heaps and tables, Direct3D 12 also allows resources to be dynamically indexed in shaders, which provides unprecedented flexibility and unlocks new rendering techniques. As an example, modern deferred rendering engines typically encode a material or object identifier of some kind to the intermediate g-buffer. In Direct3D 11, these engines must be careful to avoid using too many materials, as including too many in one g-buffer can significantly slow down the final render pass. With dynamically indexable resources, a scene with a thousand materials can be finalized just as quickly as one with only ten.
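The relationship between a descriptor heap, a descriptor table, and dynamic indexing can be sketched as an array, a sub-range of that array, and an index into the sub-range. The model below is standard C++ only; `DescriptorHeap`, `DescriptorTable`, and `fetch` are hypothetical names, and a descriptor is reduced to an integer resource id. Unlike a real shader, this model throws on an out-of-range index. The Direct3D 12 counterparts are ID3D12DescriptorHeap and ID3D12GraphicsCommandList::SetGraphicsRootDescriptorTable.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// A descriptor reduced to the id of the resource it describes.
struct Descriptor {
    int resourceId;
};

// Fixed-capacity array of descriptors, filled up-front.
class DescriptorHeap {
public:
    explicit DescriptorHeap(std::size_t capacity) : capacity_(capacity) {
        slots_.reserve(capacity);
    }

    // Writes a descriptor into the next free slot and returns its heap index.
    std::size_t create(int resourceId) {
        if (slots_.size() == capacity_) {
            throw std::length_error("descriptor heap is full");
        }
        slots_.push_back(Descriptor{resourceId});
        return slots_.size() - 1;
    }

    const Descriptor& at(std::size_t index) const { return slots_.at(index); }

private:
    std::size_t capacity_;
    std::vector<Descriptor> slots_;
};

// A table is just a sub-range of the heap; switching tables copies two numbers.
struct DescriptorTable {
    std::size_t offset;
    std::size_t count;
};

// "Shader-side" dynamic indexing: pick a resource by a runtime index,
// for example a material id read from a g-buffer.
inline int fetch(const DescriptorHeap& heap, const DescriptorTable& table,
                 std::size_t index) {
    if (index >= table.count) {
        throw std::out_of_range("index outside descriptor table");
    }
    return heap.at(table.offset + index).resourceId;
}
```

With six descriptors holding ids 100 through 105 and two tables covering slots 0 to 2 and 3 to 5, index 1 resolves to 101 through the first table and 104 through the second. No descriptor is rewritten when the table changes, which is the property the text above relies on.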

For more information about descriptor heaps and tables, see Resource Binding, and Differences in the Binding Model from Direct3D 11.

Porting from Direct3D 11

Porting from Direct3D 11 is an involved process, described in Porting from Direct3D 11 to Direct3D 12. Also refer to the range of options in Working with Direct3D 11, Direct3D 10 and Direct2D.
