Asm Shader Reference --- Shader Model 2.0 part
Pixel shader (ps) section
ps_2_0
Overview
Instruction set
| Name | Description | Instruction slots | Setup | Arithmetic | Texture | New |
|---|---|---|---|---|---|---|
| abs - ps | Absolute value | 1 | | x | | x |
| add - ps | Add | 1 | | x | | |
| cmp - ps | Compare source to 0 | 1 | | x | | |
| crs - ps | Cross product | 2 | | x | | x |
| dcl_samplerType (sm2, sm3 - ps asm) | Declare the texture dimension for a sampler | 0 | x | | | x |
| dcl - (sm2, sm3 - ps asm) | Declare the association between vertex shader output registers and pixel shader input registers | 0 | x | | | x |
| def - ps | Define constants | 0 | x | | | |
| dp2add - ps | 2D dot product followed by an add | 2 | | x | | x |
| dp3 - ps | 3D dot product | 1 | | x | | |
| dp4 - ps | 4D dot product | 1 | | x | | |
| exp - ps | Full-precision 2^x | 1 | | x | | x |
| frc - ps | Fractional part | 1 | | x | | x |
| log - ps | Full-precision log₂(x) | 1 | | x | | x |
| lrp - ps | Linear interpolation | 2 | | x | | |
| m3x2 - ps | 3x2 multiply | 2 | | x | | x |
| m3x3 - ps | 3x3 multiply | 3 | | x | | x |
| m3x4 - ps | 3x4 multiply | 4 | | x | | x |
| m4x3 - ps | 4x3 multiply | 3 | | x | | x |
| m4x4 - ps | 4x4 multiply | 4 | | x | | x |
| mad - ps | Multiply followed by an add | 1 | | x | | |
| max - ps | Maximum | 1 | | x | | x |
| min - ps | Minimum | 1 | | x | | x |
| mov - ps | Move (assignment) | 1 | | x | | |
| mul - ps | Multiply | 1 | | x | | |
| nop - ps | No operation | 1 | | x | | |
| nrm - ps | Normalize | 3 | | x | | x |
| pow - ps | Power | 3 | | x | | x |
| ps | Version | 0 | x | | | |
| rcp - ps | Reciprocal | 1 | | x | | x |
| rsq - ps | Reciprocal square root | 1 | | x | | x |
| sincos - ps | Sine and cosine | 8 | | x | | x |
| sub - ps | Subtract | 1 | | x | | |
| texkill - ps | Kill the pixel render | 1 | | | x | |
| texld - ps_2_0 and up | Sample a texture | 1 | | | x | x |
| texldb - ps | Sample a texture with a level-of-detail bias taken from the w component | 1 | | | x | x |
| texldp - ps | Sample a texture with a projective divide by the w component | 1 | | | x | x |
Selected instruction details
crs
Syntax
crs dst, src0, src1
Computes the cross product of the 3D vectors src0 and src1.
Algorithm
dest.x = src0.y * src1.z - src0.z * src1.y;
dest.y = src0.z * src1.x - src0.x * src1.z;
dest.z = src0.x * src1.y - src0.y * src1.x;
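The three assignments above can be checked on the CPU with a small Python sketch. The function name `crs` and the tuple representation of registers are illustrative assumptions, not part of any Direct3D API.

```python
def crs(src0, src1):
    """Emulate the crs algorithm: cross product of the xyz components.

    Registers are modelled as plain tuples (an assumption for illustration).
    """
    x0, y0, z0 = src0[:3]
    x1, y1, z1 = src1[:3]
    return (
        y0 * z1 - z0 * y1,  # dest.x
        z0 * x1 - x0 * z1,  # dest.y
        x0 * y1 - y0 * x1,  # dest.z
    )


# The x axis crossed with the y axis gives the z axis.
print(crs((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```

Swapping the two operands negates every component, which is a quick way to sanity-check operand order in a shader.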
dcl_samplerType
Syntax
dcl_samplerType s#
Declares a pixel shader sampler. The _samplerType suffix gives the texture dimension and is one of:
· _2d
· _cube
· _volume
Example
dcl t0.xyz          // Declare the interpolated texture coordinate input.
dcl_cube s0         // Declare sampler s0 as a cube map.
add r0, r0, t0      // Perturb the texture coordinates (r0 must have been written earlier).
texld r0, r0, s0    // Load r0 with a color sampled from s0 at the perturbed
                    // coordinates r0. This is a dependent texture read.
dp2add
Syntax
dp2add dst, src0, src1, src2.{x|y|z|w}
Computes a 2D dot product of src0 and src1 and adds one scalar component of src2.
Algorithm
dest = src0.r * src1.r + src0.g * src1.g + src2.replicate_swizzle
// The scalar result is replicated to the write mask components
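As a rough illustration of the replicated result, here is a Python sketch. The function name, the `write_mask` string, and the dictionary return value are assumptions made for this example only.

```python
def dp2add(src0, src1, src2_scalar, write_mask="xyzw"):
    """Emulate dp2add: 2D dot product of src0 and src1 plus a scalar.

    src2_scalar stands for the single component chosen by the replicate
    swizzle on src2. The scalar result is copied to every component named
    in write_mask (hypothetical representation of a destination register).
    """
    result = src0[0] * src1[0] + src0[1] * src1[1] + src2_scalar
    return {component: result for component in write_mask}


# Only x and y of the sources take part; z and w are ignored.
print(dp2add((1.0, 2.0, 9.0, 9.0), (3.0, 4.0, 9.0, 9.0), 0.5, "xy"))
```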
nrm
Syntax
nrm dst, src
Normalizes a 3D vector. In the algorithm, src0 refers to the src operand.
Algorithm
squareRootOfTheSum = sqrt(src0.x*src0.x + src0.y*src0.y + src0.z*src0.z);
dest.x = src0.x * (1 / squareRootOfTheSum);
dest.y = src0.y * (1 / squareRootOfTheSum);
dest.z = src0.z * (1 / squareRootOfTheSum);
dest.w = src0.w * (1 / squareRootOfTheSum);
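The algorithm above uses only x, y and z to compute the length, yet scales w by the same factor. A minimal Python sketch makes this visible; the function name and tuple layout are assumptions, and the zero-length case is deliberately left unhandled because the text above does not define it.

```python
import math


def nrm(src0):
    """Emulate the nrm algorithm as written above.

    The length comes from xyz only, but all four components are scaled.
    Assumes a nonzero xyz length (a zero vector raises ZeroDivisionError).
    """
    x, y, z, w = src0
    square_root_of_the_sum = math.sqrt(x * x + y * y + z * z)
    scale = 1.0 / square_root_of_the_sum
    return (x * scale, y * scale, z * scale, w * scale)


print(nrm((3.0, 0.0, 4.0, 10.0)))
```

If w carries unrelated data, write the result with a `.xyz` mask so that the scaled w is not stored.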
sincos
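Syntax
ps_2_0 and ps_2_x
sincos dst.{x|y|xy}, src0.{x|y|z|w}, src1, src2
In ps_2_0 and ps_2_x, src1 and src2 must be two different constant registers holding D3DSINCOSCONST1 and D3DSINCOSCONST2.
ps_3_0
sincos dst.{x|y|xy}, src0.{x|y|z|w}
Algorithm
V is the scalar value selected from src0 by its replicate swizzle.
ps_2_0 and ps_2_x
If the write mask is .x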
dest.x = cos(V)
dest.y is undefined when the instruction completes
dest.z is undefined when the instruction completes
dest.w is not touched by the instruction
If the write mask is .y
dest.x is undefined when the instruction completes
dest.y = sin(V)
dest.z is undefined when the instruction completes
dest.w is not touched by the instruction
If the write mask is .xy
dest.x = cos(V)
dest.y = sin(V)
dest.z is undefined when the instruction completes
dest.w is not touched by the instruction
ps_3_0
If the write mask is .x
dest.x = cos(V)
dest.y is undefined when the instruction completes
dest.z is undefined when the instruction completes
dest.w is not touched by the instruction
If the write mask is .y
dest.x is undefined when the instruction completes
dest.y = sin(V)
dest.z is undefined when the instruction completes
dest.w is not touched by the instruction
If the write mask is .xy
dest.x = cos(V)
dest.y = sin(V)
dest.z is undefined when the instruction completes
dest.w is not touched by the instruction
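Example: mapping an angle in radians into the range [-pi, +pi]
In ps_2_0 and ps_2_x, sincos expects its input angle in [-pi, +pi], so an arbitrary angle has to be wrapped first. Note that this wraps the angle; it does not convert degrees to radians.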
def c0, pi, 0.5, 2*pi, 1/(2*pi)
mad r0.x, input_angle, c0.w, c0.y
frc r0.x, r0.x
mad r0.x, r0.x, c0.z, -c0.x
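The same three steps (mad, frc, mad) can be traced in Python to confirm that the result stays in range and still has the same sine and cosine. The function name `wrap_to_pi` is hypothetical, and `frc` is modelled as `x - floor(x)`.

```python
import math


def wrap_to_pi(input_angle):
    """Mirror the mad / frc / mad sequence above on the CPU.

    c0 holds the same four constants as the def instruction.
    """
    c0 = (math.pi, 0.5, 2.0 * math.pi, 1.0 / (2.0 * math.pi))
    r0 = input_angle * c0[3] + c0[1]  # mad: angle / (2*pi) + 0.5
    r0 = r0 - math.floor(r0)          # frc: keep the fractional part, in [0, 1)
    r0 = r0 * c0[2] - c0[0]           # mad: scale to [0, 2*pi), shift by -pi
    return r0


for angle in (7.0, -10.0, 100.0):
    wrapped = wrap_to_pi(angle)
    print(angle, wrapped, math.sin(angle), math.sin(wrapped))
```

Adding 0.5 before frc and subtracting pi afterwards is what centres the output range on zero instead of producing [0, 2*pi).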
Vertex shader (vs) section
vs_2_0
Overview
Instruction set
| Name | Description | Instruction slots | Setup | Arithmetic | Flow control | New |
|---|---|---|---|---|---|---|
| abs - vs | Absolute value | 1 | | x | | x |
| add - vs | Add | 1 | | x | | |
| call - vs | Call a subroutine | 2 | | | x | x |
| callnz bool - vs | Call a subroutine if a Boolean register is nonzero | 3 | | | x | x |
| crs - vs | Cross product | 2 | | x | | x |
| dcl_usage input (sm1, sm2, sm3 - vs asm) | Declare input vertex registers (see Registers - vs_2_0) | 0 | x | | | |
| def - vs | Define constants | 0 | x | | | |
| defb - vs | Define a Boolean constant | 0 | x | | | x |
| defi - vs | Define an integer constant | 0 | x | | | x |
| dp3 - vs | 3D dot product | 1 | | x | | |
| dp4 - vs | 4D dot product | 1 | | x | | |
| dst - vs | Calculate the distance vector | 1 | | x | | |
| else - vs | Begin an else block | 1 | | | x | x |
| endif - vs | End an if...else block | 1 | | | x | x |
| endloop - vs | End a loop block | 2 | | | x | x |
| endrep - vs | End a repeat block | 2 | | | x | x |
| exp - vs | Full-precision 2^x | 1 | | x | | |
| expp - vs | Partial-precision 2^x | 1 | | x | | |
| frc - vs | Fractional part | 1 | | x | | |
| if bool - vs | Begin an if block | 3 | | | x | x |
| label - vs | Label | 0 | | | x | x |
| lit - vs | Partial lighting calculation | 3 | | x | | |
| log - vs | Full-precision log₂(x) | 1 | | x | | |
| logp - vs | Partial-precision log₂(x) | 1 | | x | | |
| loop - vs | Loop | 3 | | | x | x |
| lrp - vs | Linear interpolation | 2 | | x | | x |
| m3x2 - vs | 3x2 multiply | 2 | | x | | |
| m3x3 - vs | 3x3 multiply | 3 | | x | | |
| m3x4 - vs | 3x4 multiply | 4 | | x | | |
| m4x3 - vs | 4x3 multiply | 3 | | x | | |
| m4x4 - vs | 4x4 multiply | 4 | | x | | |
| mad - vs | Multiply followed by an add | 1 | | x | | |
| max - vs | Maximum | 1 | | x | | |
| min - vs | Minimum | 1 | | x | | |
| mov - vs | Move (assignment) | 1 | | x | | |
| mova - vs | Move data from a floating-point register to the address register (a0) | 1 | | x | | x |
| mul - vs | Multiply | 1 | | x | | |
| nop - vs | No operation | 1 | | x | | |
| nrm - vs | Normalize | 3 | | x | | x |
| pow - vs | Power | 3 | | x | | x |
| rcp - vs | Reciprocal | 1 | | x | | |
| rep - vs | Repeat | 3 | | | x | x |
| ret - vs | Return from the main function or from a subroutine (end of a subroutine) | 1 | | | x | x |
| rsq - vs | Reciprocal square root | 1 | | x | | |
| sge - vs | Greater-than-or-equal compare | 1 | | x | | |
| sgn - vs | Sign | 3 | | x | | x |
| sincos - vs | Sine and cosine | 8 | | x | | x |
| slt - vs | Less-than compare | 1 | | x | | |
| sub - vs | Subtract | 1 | | x | | |
| vs | Version | 0 | x | | | |
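Selected instruction details
mova
Syntax
mova dst, src
dst must be the address register, a0.
Algorithm
When a floating-point value is written to an integer register, it is converted by rounding to the nearest integer: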
if (dest is an integer register)
{
    int intSrc = RoundToNearest(src);
    dest = intSrc;
}
else
{
    dest = src;
}
In version 2_x and above, the address register is a multi-component vector, so any write mask may be used:
mova a0.xz, r0
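A small Python sketch of a masked write such as `mova a0.xz, r0` follows. The function name and dictionary result are illustrative assumptions. Python's built-in `round` resolves exact .5 ties to the even integer; the text above only says "round to nearest", so how real hardware resolves ties is not claimed here.

```python
def mova(src, write_mask="xyzw"):
    """Emulate mova: round the selected float components to integers.

    src is a 4-tuple standing in for a float register; the result maps each
    written component of a0 to its integer value. Tie handling follows
    Python's round(), which is an assumption, not documented shader behaviour.
    """
    names = "xyzw"
    return {c: round(src[names.index(c)]) for c in write_mask}


# Equivalent in spirit to: mova a0.xz, r0
print(mova((1.4, 2.6, -0.7, 3.0), "xz"))
```

Because the conversion rounds rather than truncates, an index computed as 2.6 addresses element 3, not element 2.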
sge
Syntax
sge dst, src0, src1
Compares src0 with src1 component by component; each result component is 1 if src0 is greater than or equal to src1, otherwise 0.
Algorithm
dest.x = (src0.x >= src1.x) ? 1.0f : 0.0f;
dest.y = (src0.y >= src1.y) ? 1.0f : 0.0f;
dest.z = (src0.z >= src1.z) ? 1.0f : 0.0f;
dest.w = (src0.w >= src1.w) ? 1.0f : 0.0f;
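The per-component comparison can be sketched in Python as follows; the function name and tuple registers are assumptions for illustration.

```python
def sge(src0, src1):
    """Emulate sge: 1.0 where src0 >= src1, else 0.0, per component."""
    return tuple(1.0 if a >= b else 0.0 for a, b in zip(src0, src1))


print(sge((1.0, 2.0, 3.0, 4.0), (1.0, 3.0, 2.0, 4.0)))
```

Since the result is 0.0 or 1.0, it can be multiplied into another value as a branch-free mask.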
sgn
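Syntax
sgn dst, src0, src1, src2
Returns the sign of src0.
src1 and src2 are temporary registers used to hold intermediate results; their contents are undefined after the instruction executes.
Algorithm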
for each component in src0
{
    if (src0.component < 0)
        dest.component = -1;
    else if (src0.component == 0)
        dest.component = 0;
    else
        dest.component = 1;
}
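The loop above translates directly to Python. The function name is an assumption, and the two scratch registers src1 and src2 are omitted because they only matter to the hardware implementation.

```python
def sgn(src0):
    """Emulate sgn: -1.0, 0.0 or 1.0 per component, following the loop above."""
    result = []
    for component in src0:
        if component < 0:
            result.append(-1.0)
        elif component == 0:
            result.append(0.0)
        else:
            result.append(1.0)
    return tuple(result)


print(sgn((-2.5, 0.0, 3.0, -0.0)))
```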
slt
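Syntax
slt dst, src0, src1
The opposite of sge: compares src0 with src1 component by component; each result component is 1 if src0 is less than src1, otherwise 0.
Algorithm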
dest.x = (src0.x < src1.x) ? 1.0f : 0.0f;
dest.y = (src0.y < src1.y) ? 1.0f : 0.0f;
dest.z = (src0.z < src1.z) ? 1.0f : 0.0f;
dest.w = (src0.w < src1.w) ? 1.0f : 0.0f;
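To illustrate the "opposite of sge" relationship, this Python sketch defines both comparisons and checks that, for ordinary (non-NaN) inputs, their results add up to 1 in every component. Function names and tuple registers are assumptions for illustration.

```python
def slt(src0, src1):
    """Emulate slt: 1.0 where src0 < src1, else 0.0, per component."""
    return tuple(1.0 if a < b else 0.0 for a, b in zip(src0, src1))


def sge(src0, src1):
    """Emulate sge: 1.0 where src0 >= src1, else 0.0, per component."""
    return tuple(1.0 if a >= b else 0.0 for a, b in zip(src0, src1))


a = (1.0, 2.0, 3.0, 4.0)
b = (1.0, 3.0, 2.0, 4.0)
less = slt(a, b)
greater_equal = sge(a, b)
print(less, greater_equal)
# For non-NaN inputs exactly one of the two tests holds per component.
print(tuple(x + y for x, y in zip(less, greater_equal)))
```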
-----wolf96 2017/1/3