OpenGL extension: NV_shader_buffer_load
http://www.opengl.org/registry/specs/NV/shader_buffer_load.txt
Overview
At a very coarse level, GL has evolved in a way that allows
applications to replace many of the original state machine variables
with blocks of user-defined data. For example, the current vertex
state has been augmented by vertex buffer objects, fixed-function
shading state and parameters have been replaced by shaders/programs
and constant buffers, etc. Applications switch between coarse sets
of state by binding objects to the context or to other container
objects (e.g. vertex array objects) instead of manipulating state
variables of the context. In terms of the number of GL commands
required to draw an object, modern applications are orders of
magnitude more efficient than legacy applications, but this explosion
of objects bound to other objects has led to a new bottleneck:
pointer chasing and CPU L2 cache misses in the driver, along with
general L2 cache pollution.
This extension provides a mechanism to read from a flat, 64-bit GPU
address space from programs/shaders, to query GPU addresses of buffer
objects at the API level, and to bind buffer objects to the context in
such a way that they can be accessed via their GPU addresses in any
shader stage.
The intent is that applications can avoid re-binding buffer objects
or updating constants between each Draw call and instead simply use
a VertexAttrib (or TexCoord, or InstanceID, or...) to "point" to the
new object's state. In this way, one of the cheapest "state" updates
(from the CPU's point of view) can be used to effect a significant
state change in the shader, much as changing a pointer does on the
CPU. At the same time, this relieves the limits on how many
buffer objects can be accessed at once by shaders, and allows these
buffer object accesses to be exposed as C-style pointer dereferences
in the shading language.
As a very simple example, imagine packing a group of similar objects'
constants into a single buffer object and pointing your program
at object <i> by calling "glVertexAttribI1iEXT(attrLoc, i);"
and using a shader such as:
    struct MyObjectType {
        mat4x4 modelView;
        vec4 materialPropertyX;
        // etc.
    };
    uniform MyObjectType *allObjects;
    in int objectID; // bound to attrLoc
    ...
    mat4x4 thisObjectsMatrix = allObjects[objectID].modelView;
    // do transform, shading, etc.
This is beneficial in much the same way that texture arrays allow
choosing between similar, but independent, texture maps with a single
coordinate identifying which slice of the texture to use. It also
resembles instancing, where a lightweight change (incrementing the
instance ID) can be used to generate a different and interesting
result, but with additional flexibility over instancing because the
values are app-controlled and not a single incrementing counter.
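The indexing idea can be tried on the CPU first. The following is only a minimal sketch in plain C, not GL code: `MyObjectType` mirrors the shader struct above, while `OBJECT_COUNT`, `init_objects` and `model_view_for` are hypothetical names introduced for illustration. It ignores whatever layout or alignment rules apply to real buffer contents and simply shows that one small integer selects a whole block of per-object state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU-side mirror of the shader's MyObjectType. */
typedef struct {
    float modelView[16];          /* 4x4 matrix, column-major */
    float materialPropertyX[4];
} MyObjectType;

enum { OBJECT_COUNT = 3 };        /* assumed object count for this sketch */

/* One contiguous array: the data that would be packed into a single
 * buffer object. */
static MyObjectType allObjects[OBJECT_COUNT];

/* Give object i an identity matrix with an x-translation of i. */
static void init_objects(void)
{
    for (size_t i = 0; i < OBJECT_COUNT; ++i) {
        for (size_t k = 0; k < 16; ++k)
            allObjects[i].modelView[k] = (k % 5 == 0) ? 1.0f : 0.0f;
        allObjects[i].modelView[12] = (float)i;
        allObjects[i].materialPropertyX[0] = 0.5f * (float)i;
    }
}

/* CPU analogue of "allObjects[objectID].modelView" in the shader. */
static const float *model_view_for(int objectID)
{
    assert(objectID >= 0 && objectID < OBJECT_COUNT);
    return allObjects[objectID].modelView;
}
```

After calling `init_objects()`, `model_view_for(2)` returns the matrix of the third object. Switching objects changes only the integer argument, which is the role `objectID` plays in the shader.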
Dependent pointer fetches are allowed, so more complex scene graph
structures can be built into buffer objects providing significant new
flexibility in the use of shaders. Another simple example, showing
something you can't do with existing functionality, is to do dependent
fetches into many buffer objects:
    GenBuffers(N, dataBuffers);
    GenBuffers(1, &pointerBuffer);
    GLuint64EXT gpuAddrs[N];
    for (i = 0; i < N; ++i) {
        BindBuffer(target, dataBuffers[i]);
        BufferData(target, size[i], myData[i], STATIC_DRAW);
        // get the address of this buffer and make it resident.
        GetBufferParameterui64vNV(target, BUFFER_GPU_ADDRESS_NV,
                                  &gpuAddrs[i]);
        MakeBufferResidentNV(target, READ_ONLY);
    }
    // store the N addresses in a buffer of their own
    GLuint64EXT pointerBufferAddr;
    BindBuffer(target, pointerBuffer);
    BufferData(target, sizeof(GLuint64EXT)*N, gpuAddrs, STATIC_DRAW);
    GetBufferParameterui64vNV(target, BUFFER_GPU_ADDRESS_NV,
                              &pointerBufferAddr);
    MakeBufferResidentNV(target, READ_ONLY);

    // now in the shader, we can use a double indirection
    // (pointerBufferAddr is assumed to have been passed to the shader)
    vec4 **ptrToBuffers = pointerBufferAddr;
    vec4 *ptrToBufferI = ptrToBuffers[i];
This allows simultaneous access to more buffers than
EXT_bindable_uniform (MAX_VERTEX_BINDABLE_UNIFORMS, etc.) and each
can be larger than MAX_BINDABLE_UNIFORM_SIZE.
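The double indirection can also be mimicked on the CPU. This is only a rough sketch in plain C, with ordinary host pointers standing in for GPU addresses. The `vec4` struct, the three differently sized arrays, and the helpers `build_pointer_table` and `fetch` are all assumptions made for illustration; `gpuAddrs` plays the role of the pointer buffer's contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { float x, y, z, w; } vec4;

enum { N = 3 };                      /* assumed number of data buffers */

/* Data "buffers" of different sizes. */
static vec4 buf0[2], buf1[4], buf2[1];

/* Stand-in for the pointer buffer: one 64-bit address per data buffer. */
static uint64_t gpuAddrs[N];

/* Fill each buffer with recognizable values and record its address. */
static void build_pointer_table(void)
{
    vec4 *buffers[N] = { buf0, buf1, buf2 };
    const size_t counts[N] = { 2, 4, 1 };

    for (size_t i = 0; i < N; ++i) {
        for (size_t j = 0; j < counts[i]; ++j)
            buffers[i][j].x = (float)(10 * i + j);
        gpuAddrs[i] = (uint64_t)(uintptr_t)buffers[i];
    }
}

/* First fetch reads an address from the table; the second, dependent
 * fetch reads element j from the buffer that address points to. */
static vec4 fetch(const uint64_t *table, size_t i, size_t j)
{
    const vec4 *ptrToBufferI = (const vec4 *)(uintptr_t)table[i];
    return ptrToBufferI[j];
}
```

After `build_pointer_table()`, `fetch(gpuAddrs, 1, 3)` reads the last element of the second buffer through the table. The table holds only addresses, so each data buffer can have its own size, and adding a buffer costs one more 64-bit entry.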