Pitfalls of using OpenCV GpuMat data in CUDA kernel code
Note that cv::cuda::GpuMat and cv::Mat allocate memory differently. A cv::cuda::GpuMat keeps its data in NVIDIA GPU memory, while a cv::Mat keeps its data in ordinary host RAM.
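The memory allocated by a cv::Mat is normally continuous, but a cv::cuda::GpuMat may have a gap between the end of one row and the start of the next. This is because cv::cuda::GpuMat allocates with the CUDA function cudaMallocPitch, which can pad each row for alignment, so the step (the distance from the start of one row to the start of the next) can differ from COLS.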
So when passing the row data of a cv::cuda::GpuMat to a CUDA kernel, also pass the step, so the kernel can locate each row correctly. Indexing rows with COLS instead of step makes the kernel read and write the wrong elements, and the resulting bug is a headache to track down.
For example:
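__global__
void kernel_select_cmp_point(
    float* dMap,
    float* dPhase,
    uint8_t* matResult,
    uint32_t step,   // row pitch in elements; note that GpuMat::step itself is in bytes
    const int ROWS,
    const int COLS,
    const int span) {
    // Grid-stride loop: each thread handles rows start, start + stride, ...
    int start = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = blockDim.x * gridDim.x;
    for (int row = start; row < ROWS; row += stride) {
        // Use step, not COLS, so the padding at the end of each row is skipped.
        // This assumes the result matrix has the same element pitch as the inputs;
        // if it does not, pass a separate step for each matrix.
        int offsetOfInput = row * step;
        int offsetOfResult = row * step;
        // (per-row processing omitted in the original)
    }
}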
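The effect of indexing with COLS instead of step can be reproduced on the host without a GPU. The following is only a minimal sketch in plain C++, not OpenCV or CUDA code: PitchedImage, makePitched, stepElems, fill, readWithStep and readWithCols are hypothetical names made up for illustration, and the alignment argument is an arbitrary stand-in for whatever pitch cudaMallocPitch actually picks on a given device. It also shows the conversion from a byte step (which is what GpuMat::step holds) to an element step suitable for indexing a float pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side stand-in for a pitched float image (hypothetical type).
struct PitchedImage {
    int rows;
    int cols;
    std::size_t stepBytes;    // distance between row starts, in bytes
    std::vector<float> data;  // rows * (stepBytes / sizeof(float)) floats, zero-initialized
};

// Element pitch: the byte step divided by the element size.
std::size_t stepElems(const PitchedImage& img) {
    return img.stepBytes / sizeof(float);
}

// Round the row size up to a multiple of `alignment` bytes.
// Assumption: this only imitates the padding; the real pitch is device-dependent.
PitchedImage makePitched(int rows, int cols, std::size_t alignment) {
    assert(alignment % sizeof(float) == 0);
    std::size_t rowBytes = cols * sizeof(float);
    std::size_t stepBytes = (rowBytes + alignment - 1) / alignment * alignment;
    std::vector<float> data(rows * (stepBytes / sizeof(float)), 0.0f);
    return PitchedImage{rows, cols, stepBytes, data};
}

// Store row * 100 + col in every valid element, addressing rows by step.
void fill(PitchedImage& img) {
    for (int r = 0; r < img.rows; ++r)
        for (int c = 0; c < img.cols; ++c)
            img.data[r * stepElems(img) + c] = static_cast<float>(r * 100 + c);
}

// Correct: the row offset is row * step.
float readWithStep(const PitchedImage& img, int row, int col) {
    return img.data[row * stepElems(img) + col];
}

// Wrong: treats the buffer as tightly packed, COLS floats per row.
float readWithCols(const PitchedImage& img, int row, int col) {
    return img.data[row * img.cols + col];
}
```

With makePitched(3, 5, 32) each row holds 20 bytes of data in a 32-byte step, so the element step is 8. After fill, readWithStep(img, 2, 3) returns 203, while readWithCols(img, 2, 3) reads index 13, which falls in the padding of row 1. Only row 0 happens to give the same result with both functions, which is part of why this kind of bug is easy to miss.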