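Radius-Independent Fast Gaussian Blur (with Complete C Code)
I've posted quite a few fast Gaussian blur algorithms before.
My rule of thumb: any algorithm that needs more than 1 second, single core and single thread on a 2.2 GHz CPU, to process a 16-megapixel color image is not fast.
The ones I posted earlier all take more than 1 second on my 2.2 GHz CPU.
As is well known, there are many ways to implement a fast Gaussian blur: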
1.FIR (Finite impulse response)
https://zh.wikipedia.org/wiki/%E9%AB%98%E6%96%AF%E6%A8%A1%E7%B3%8A
2.SII (Stacked integral images)
http://dx.doi.org/10.1109/ROBOT.2010.5509400
http://arxiv.org/abs/1107.4958
3.Vliet-Young-Verbeek (Recursive filter)
http://dx.doi.org/10.1016/0165-1684(95)00020-E
http://dx.doi.org/10.1109/ICPR.1998.711192
4.DCT (Discrete Cosine Transform)
http://dx.doi.org/10.1109/78.295213
5.box (Box filter)
http://dx.doi.org/10.1109/TPAMI.1986.4767776
6.AM(Alvarez, Mazorra)
http://www.jstor.org/stable/2158018
7.Deriche (Recursive filter)
http://hal.inria.fr/docs/00/07/47/78/PDF/RR-1893.pdf
8.ebox (Extended Box)
http://dx.doi.org/10.1007/978-3-642-24785-9_38
9.IIR (Infinite Impulse Response)
https://software.intel.com/zh-cn/articles/iir-gaussian-blur-filter-implementation-using-intel-advanced-vector-extensions
10.FA (Fast Anisotropic)
http://mathinfo.univ-reims.fr/IMG/pdf/Fast_Anisotropic_Gquss_Filtering_-_GeusebroekECCV02.pdf
......
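There are many ways to implement Gaussian blur, but for an algorithm what matters most is being simple and efficient.
In my own tests so far, IIR strikes a good balance between quality and performance, and it is radius-independent (the running time stays essentially the same whatever the blur strength).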
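Intel's official implementation:
IIR Gaussian Blur Filter Implementation using Intel® Advanced Vector Extensions [PDF 513KB]
source: gaussian_blur.cpp [36KB]
It uses the SIMD instructions of Intel processors, and it is extremely fast.
I like my algorithms clean, simple, and efficient; in other words, no hardware-specific acceleration, so the code adapts to different hardware.
So I rewrote and optimized Intel's code.
In the end, on my 2.20 GHz CPU, single core, single thread, without SIMD instructions, it processes a 16-megapixel color photo in about 700 ms.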
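If you want to check a figure like this on your own machine, a rough single-thread measurement could look like the sketch below. `time_call_ms` and `sample_workload` are hypothetical names, not part of the post's code; `clock()` reports CPU time for the process, which is a reasonable proxy for a single-threaded routine.

```c
#include <time.h>

/* Hypothetical harness: run fn(ctx) once and return the CPU time it
   consumed, in milliseconds, as reported by clock(). */
double time_call_ms(void (*fn)(void *), void *ctx)
{
    clock_t start = clock();
    fn(ctx);
    clock_t end = clock();
    return 1000.0 * (double)(end - start) / (double)CLOCKS_PER_SEC;
}

/* Stand-in workload so the sketch is self-contained. In practice, replace
   it with a wrapper that calls the blur on a 16-megapixel buffer. */
void sample_workload(void *ctx)
{
    long *counter = (long *)ctx;
    for (long i = 0; i < 1000000; ++i)
        *counter += 1;
}
```

Running the call several times and taking the minimum usually gives a steadier number than a single run.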
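As usual, a result image makes this easier to see.
Some readers have asked before how this algorithm is implemented.
After some thought, I decided to share the code for reference and study.
Complete code: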
#include <math.h>   /* exp */
#include <stdlib.h> /* malloc, free */

/* Compute the recursive-filter coefficients for the given sigma. */
void CalGaussianCoeff(float sigma, float * a0, float * a1, float * a2, float * a3, float * b1, float * b2, float * cprev, float * cnext) {
    float alpha, lamma, k;
    if (sigma < 0.5f)
        sigma = 0.5f;
    alpha = (float)exp((0.726) * (0.726)) / sigma;
    lamma = (float)exp(-alpha);
    *b2 = (float)exp(-2 * alpha);
    k = (1 - lamma) * (1 - lamma) / (1 + 2 * alpha * lamma - (*b2));
    *a0 = k;
    *a1 = k * (alpha - 1) * lamma;
    *a2 = k * (alpha + 1) * lamma;
    *a3 = -k * (*b2);
    *b1 = -2 * lamma;
    /* Steady-state gains used to start the forward and backward sweeps. */
    *cprev = (*a0 + *a1) / (1 + *b1 + *b2);
    *cnext = (*a2 + *a3) / (1 + *b1 + *b2);
}

/* Horizontal pass over one row: a forward sweep into bufferPerLine, then a
   backward sweep that is added to it. The result is written transposed into
   lpColumn (one pixel every HeightStep bytes). */
void gaussianHorizontal(unsigned char * bufferPerLine, unsigned char * lpRowInitial, unsigned char * lpColumn, int width, int height, int Channels, int Nwidth, float a0a1, float a2a3, float b1b2, float cprev, float cnext)
{
    int HeightStep = Channels*height;
    int WidthSubOne = width - 1;
    if (Channels == 3)
    {
        float prevOut[3];
        prevOut[0] = (lpRowInitial[0] * cprev);
        prevOut[1] = (lpRowInitial[1] * cprev);
        prevOut[2] = (lpRowInitial[2] * cprev);
        /* Forward sweep, left to right. */
        for (int x = 0; x < width; ++x) {
            prevOut[0] = ((lpRowInitial[0] * (a0a1)) - (prevOut[0] * (b1b2)));
            prevOut[1] = ((lpRowInitial[1] * (a0a1)) - (prevOut[1] * (b1b2)));
            prevOut[2] = ((lpRowInitial[2] * (a0a1)) - (prevOut[2] * (b1b2)));
            bufferPerLine[0] = prevOut[0];
            bufferPerLine[1] = prevOut[1];
            bufferPerLine[2] = prevOut[2];
            bufferPerLine += Channels;
            lpRowInitial += Channels;
        }
        /* Step back to the last pixel and the last transposed slot. */
        lpRowInitial -= Channels;
        lpColumn += HeightStep * WidthSubOne;
        bufferPerLine -= Channels;
        prevOut[0] = (lpRowInitial[0] * cnext);
        prevOut[1] = (lpRowInitial[1] * cnext);
        prevOut[2] = (lpRowInitial[2] * cnext);
        /* Backward sweep, right to left. */
        for (int x = WidthSubOne; x >= 0; --x) {
            prevOut[0] = ((lpRowInitial[0] * (a2a3)) - (prevOut[0] * (b1b2)));
            prevOut[1] = ((lpRowInitial[1] * (a2a3)) - (prevOut[1] * (b1b2)));
            prevOut[2] = ((lpRowInitial[2] * (a2a3)) - (prevOut[2] * (b1b2)));
            bufferPerLine[0] += prevOut[0];
            bufferPerLine[1] += prevOut[1];
            bufferPerLine[2] += prevOut[2];
            lpColumn[0] = bufferPerLine[0];
            lpColumn[1] = bufferPerLine[1];
            lpColumn[2] = bufferPerLine[2];
            lpRowInitial -= Channels;
            lpColumn -= HeightStep;
            bufferPerLine -= Channels;
        }
    }
    else if (Channels == 4)
    {
        float prevOut[4];
        prevOut[0] = (lpRowInitial[0] * cprev);
        prevOut[1] = (lpRowInitial[1] * cprev);
        prevOut[2] = (lpRowInitial[2] * cprev);
        prevOut[3] = (lpRowInitial[3] * cprev);
        for (int x = 0; x < width; ++x) {
            prevOut[0] = ((lpRowInitial[0] * (a0a1)) - (prevOut[0] * (b1b2)));
            prevOut[1] = ((lpRowInitial[1] * (a0a1)) - (prevOut[1] * (b1b2)));
            prevOut[2] = ((lpRowInitial[2] * (a0a1)) - (prevOut[2] * (b1b2)));
            prevOut[3] = ((lpRowInitial[3] * (a0a1)) - (prevOut[3] * (b1b2)));
            bufferPerLine[0] = prevOut[0];
            bufferPerLine[1] = prevOut[1];
            bufferPerLine[2] = prevOut[2];
            bufferPerLine[3] = prevOut[3];
            bufferPerLine += Channels;
            lpRowInitial += Channels;
        }
        lpRowInitial -= Channels;
        lpColumn += HeightStep * WidthSubOne;
        bufferPerLine -= Channels;
        prevOut[0] = (lpRowInitial[0] * cnext);
        prevOut[1] = (lpRowInitial[1] * cnext);
        prevOut[2] = (lpRowInitial[2] * cnext);
        prevOut[3] = (lpRowInitial[3] * cnext);
        for (int x = WidthSubOne; x >= 0; --x) {
            prevOut[0] = ((lpRowInitial[0] * a2a3) - (prevOut[0] * b1b2));
            prevOut[1] = ((lpRowInitial[1] * a2a3) - (prevOut[1] * b1b2));
            prevOut[2] = ((lpRowInitial[2] * a2a3) - (prevOut[2] * b1b2));
            prevOut[3] = ((lpRowInitial[3] * a2a3) - (prevOut[3] * b1b2));
            bufferPerLine[0] += prevOut[0];
            bufferPerLine[1] += prevOut[1];
            bufferPerLine[2] += prevOut[2];
            bufferPerLine[3] += prevOut[3];
            lpColumn[0] = bufferPerLine[0];
            lpColumn[1] = bufferPerLine[1];
            lpColumn[2] = bufferPerLine[2];
            lpColumn[3] = bufferPerLine[3];
            lpRowInitial -= Channels;
            lpColumn -= HeightStep;
            bufferPerLine -= Channels;
        }
    }
    else if (Channels == 1)
    {
        float prevOut = (lpRowInitial[0] * cprev);
        for (int x = 0; x < width; ++x) {
            prevOut = ((lpRowInitial[0] * (a0a1)) - (prevOut * (b1b2)));
            bufferPerLine[0] = prevOut;
            bufferPerLine += Channels;
            lpRowInitial += Channels;
        }
        lpRowInitial -= Channels;
        lpColumn += HeightStep*WidthSubOne;
        bufferPerLine -= Channels;
        prevOut = (lpRowInitial[0] * cnext);
        for (int x = WidthSubOne; x >= 0; --x) {
            prevOut = ((lpRowInitial[0] * a2a3) - (prevOut * b1b2));
            bufferPerLine[0] += prevOut;
            lpColumn[0] = bufferPerLine[0];
            lpRowInitial -= Channels;
            lpColumn -= HeightStep;
            bufferPerLine -= Channels;
        }
    }
}

/* Vertical pass over one column. The input (lpRowInitial) is one contiguous
   column of the transposed intermediate buffer; the result is written back
   in row-major order into lpColInitial (one pixel every WidthStep bytes). */
void gaussianVertical(unsigned char * bufferPerLine, unsigned char * lpRowInitial, unsigned char * lpColInitial, int height, int width, int Channels, float a0a1, float a2a3, float b1b2, float cprev, float cnext)
{
    int WidthStep = Channels*width;
    int HeightSubOne = height - 1;
    if (Channels == 3)
    {
        float prevOut[3];
        prevOut[0] = (lpRowInitial[0] * cprev);
        prevOut[1] = (lpRowInitial[1] * cprev);
        prevOut[2] = (lpRowInitial[2] * cprev);
        /* Forward sweep, top to bottom. */
        for (int y = 0; y < height; y++) {
            prevOut[0] = ((lpRowInitial[0] * a0a1) - (prevOut[0] * b1b2));
            prevOut[1] = ((lpRowInitial[1] * a0a1) - (prevOut[1] * b1b2));
            prevOut[2] = ((lpRowInitial[2] * a0a1) - (prevOut[2] * b1b2));
            bufferPerLine[0] = prevOut[0];
            bufferPerLine[1] = prevOut[1];
            bufferPerLine[2] = prevOut[2];
            bufferPerLine += Channels;
            lpRowInitial += Channels;
        }
        lpRowInitial -= Channels;
        bufferPerLine -= Channels;
        lpColInitial += WidthStep * HeightSubOne;
        prevOut[0] = (lpRowInitial[0] * cnext);
        prevOut[1] = (lpRowInitial[1] * cnext);
        prevOut[2] = (lpRowInitial[2] * cnext);
        /* Backward sweep, bottom to top. */
        for (int y = HeightSubOne; y >= 0; y--) {
            prevOut[0] = ((lpRowInitial[0] * a2a3) - (prevOut[0] * b1b2));
            prevOut[1] = ((lpRowInitial[1] * a2a3) - (prevOut[1] * b1b2));
            prevOut[2] = ((lpRowInitial[2] * a2a3) - (prevOut[2] * b1b2));
            bufferPerLine[0] += prevOut[0];
            bufferPerLine[1] += prevOut[1];
            bufferPerLine[2] += prevOut[2];
            lpColInitial[0] = bufferPerLine[0];
            lpColInitial[1] = bufferPerLine[1];
            lpColInitial[2] = bufferPerLine[2];
            lpRowInitial -= Channels;
            lpColInitial -= WidthStep;
            bufferPerLine -= Channels;
        }
    }
    else if (Channels == 4)
    {
        float prevOut[4];
        prevOut[0] = (lpRowInitial[0] * cprev);
        prevOut[1] = (lpRowInitial[1] * cprev);
        prevOut[2] = (lpRowInitial[2] * cprev);
        prevOut[3] = (lpRowInitial[3] * cprev);
        for (int y = 0; y < height; y++) {
            prevOut[0] = ((lpRowInitial[0] * a0a1) - (prevOut[0] * b1b2));
            prevOut[1] = ((lpRowInitial[1] * a0a1) - (prevOut[1] * b1b2));
            prevOut[2] = ((lpRowInitial[2] * a0a1) - (prevOut[2] * b1b2));
            prevOut[3] = ((lpRowInitial[3] * a0a1) - (prevOut[3] * b1b2));
            bufferPerLine[0] = prevOut[0];
            bufferPerLine[1] = prevOut[1];
            bufferPerLine[2] = prevOut[2];
            bufferPerLine[3] = prevOut[3];
            bufferPerLine += Channels;
            lpRowInitial += Channels;
        }
        lpRowInitial -= Channels;
        bufferPerLine -= Channels;
        lpColInitial += WidthStep*HeightSubOne;
        prevOut[0] = (lpRowInitial[0] * cnext);
        prevOut[1] = (lpRowInitial[1] * cnext);
        prevOut[2] = (lpRowInitial[2] * cnext);
        prevOut[3] = (lpRowInitial[3] * cnext);
        for (int y = HeightSubOne; y >= 0; y--) {
            prevOut[0] = ((lpRowInitial[0] * a2a3) - (prevOut[0] * b1b2));
            prevOut[1] = ((lpRowInitial[1] * a2a3) - (prevOut[1] * b1b2));
            prevOut[2] = ((lpRowInitial[2] * a2a3) - (prevOut[2] * b1b2));
            prevOut[3] = ((lpRowInitial[3] * a2a3) - (prevOut[3] * b1b2));
            bufferPerLine[0] += prevOut[0];
            bufferPerLine[1] += prevOut[1];
            bufferPerLine[2] += prevOut[2];
            bufferPerLine[3] += prevOut[3];
            lpColInitial[0] = bufferPerLine[0];
            lpColInitial[1] = bufferPerLine[1];
            lpColInitial[2] = bufferPerLine[2];
            lpColInitial[3] = bufferPerLine[3];
            lpRowInitial -= Channels;
            lpColInitial -= WidthStep;
            bufferPerLine -= Channels;
        }
    }
    else if (Channels == 1)
    {
        float prevOut = 0;
        prevOut = (lpRowInitial[0] * cprev);
        for (int y = 0; y < height; y++) {
            prevOut = ((lpRowInitial[0] * a0a1) - (prevOut * b1b2));
            bufferPerLine[0] = prevOut;
            bufferPerLine += Channels;
            lpRowInitial += Channels;
        }
        lpRowInitial -= Channels;
        bufferPerLine -= Channels;
        lpColInitial += WidthStep*HeightSubOne;
        prevOut = (lpRowInitial[0] * cnext);
        for (int y = HeightSubOne; y >= 0; y--) {
            prevOut = ((lpRowInitial[0] * a2a3) - (prevOut * b1b2));
            bufferPerLine[0] += prevOut;
            lpColInitial[0] = bufferPerLine[0];
            lpRowInitial -= Channels;
            lpColInitial -= WidthStep;
            bufferPerLine -= Channels;
        }
    }
}
// Author's blog: http://tntmonks.cnblogs.com/ Please credit the source when reposting.
/* Entry point. Stride is the number of bytes per row; the channel count is
   derived as Stride / Width, so rows must be tightly packed. */
void GaussianBlurFilter(unsigned char * input, unsigned char * output, int Width, int Height, int Stride, float GaussianSigma)
{
    int Channels = Stride / Width;
    float a0, a1, a2, a3, b1, b2, cprev, cnext;
    CalGaussianCoeff(GaussianSigma, &a0, &a1, &a2, &a3, &b1, &b2, &cprev, &cnext);
    float a0a1 = (a0 + a1);
    float a2a3 = (a2 + a3);
    float b1b2 = (b1 + b2);
    /* One scratch line, large enough for either a row or a column. */
    int bufferSizePerThread = (Width > Height ? Width : Height) * Channels;
    unsigned char * bufferPerLine = (unsigned char*)malloc(bufferSizePerThread);
    /* Transposed intermediate image produced by the horizontal pass. */
    unsigned char * tempData = (unsigned char*)malloc(Height * Stride);
    if (bufferPerLine == NULL || tempData == NULL)
    {
        if (tempData)
        {
            free(tempData);
        }
        if (bufferPerLine)
        {
            free(bufferPerLine);
        }
        return;
    }
    /* Horizontal pass: one call per row, output written transposed. */
    for (int y = 0; y < Height; ++y) {
        unsigned char * lpRowInitial = input + Stride * y;
        unsigned char * lpColInitial = tempData + y * Channels;
        gaussianHorizontal(bufferPerLine, lpRowInitial, lpColInitial, Width, Height, Channels, Width, a0a1, a2a3, b1b2, cprev, cnext);
    }
    int HeightStep = Height*Channels;
    /* Vertical pass: one call per column, output written back row-major. */
    for (int x = 0; x < Width; ++x) {
        unsigned char * lpColInitial = output + x*Channels;
        unsigned char * lpRowInitial = tempData + HeightStep * x;
        gaussianVertical(bufferPerLine, lpRowInitial, lpColInitial, Height, Width, Channels, a0a1, a2a3, b1b2, cprev, cnext);
    }
    free(bufferPerLine);
    free(tempData);
}
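As far as I can tell from the pointer arithmetic, the horizontal pass writes its result transposed into `tempData`, so that the vertical pass can read each column as one contiguous run of memory. The sketch below spells out that index mapping; `transposed_offset` and `row_major_offset` are hypothetical helpers for illustration only and are not called by the code above.

```c
#include <stddef.h>

/* Hypothetical helper: byte offset of channel c of pixel (x, y) in the
   transposed intermediate buffer. Pixels of one column are adjacent. */
size_t transposed_offset(int x, int y, int c, int height, int channels)
{
    return ((size_t)x * (size_t)height + (size_t)y) * (size_t)channels
         + (size_t)c;
}

/* Hypothetical helper: byte offset of the same sample in an ordinary
   row-major image whose rows are stride bytes long. */
size_t row_major_offset(int x, int y, int c, int stride, int channels)
{
    return (size_t)y * (size_t)stride + (size_t)x * (size_t)channels
         + (size_t)c;
}
```

For example, with height 4 and 3 channels, moving one pixel down a column advances the transposed offset by 3 bytes, while moving one pixel to the right advances it by 12 bytes. Sequential reads in the vertical pass are the likely reason for this layout, since striding through a row-major image column by column tends to be unfriendly to the cache.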
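Usage:
GaussianBlurFilter(input image data, output image data, width, height, stride (bytes per row = width × channel count), sigma (blur strength))
Note: 1, 3, and 4 channels are supported.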
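Because the routine derives the channel count as `Stride / Width`, a padded row or an unsupported pixel format would select the wrong branch, or no branch at all. A small pre-check along these lines may help before calling it. `derive_channels` is a hypothetical helper, not part of the original code, and it only catches mismatches that show up in the division.

```c
/* Hypothetical helper: returns the channel count that Stride / Width
   would yield (1, 3, or 4), or 0 if the arguments look unusable. */
int derive_channels(int width, int height, int stride)
{
    if (width <= 0 || height <= 0 || stride <= 0)
        return 0;
    if (stride % width != 0)   /* rows are padded or stride is wrong */
        return 0;
    int channels = stride / width;
    if (channels == 1 || channels == 3 || channels == 4)
        return channels;
    return 0;                  /* e.g. 2-channel data is not handled */
}
```

A caller might skip the blur, or repack the image into tightly packed rows first, when this returns 0.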
For background on IIR, see the Baidu Baike entry "IIR数字滤波器" (IIR digital filter)
http://baike.baidu.com/view/3088994.htm
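Of all the martial arts under heaven, only speed is unbeatable.
This post is only meant to get the discussion going; if you have related questions or needs, feel free to email me.
Email address: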
gaozhihan@vip.qq.com
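Off topic:
Many readers keep recommending OpenCV. OpenCV is indeed very powerful, but if you want more room to grow and more creative freedom,
you still need to implement the most basic algorithms yourself, one step at a time; a solid foundation is what everything above it is built on.
For now I only treat OpenCV as a reference library, and I don't think it is suitable for the vast majority of commercial projects.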