Rarely executed and almost empty if statement drastically reduces performance in C++
Question:
Editor's clarification: When this was originally posted, there were two issues:
- Test performance drops by a factor of three if seemingly inconsequential statement added
- Time taken to complete the test appears to vary randomly
The second issue has been solved: the randomness only occurs when running under the debugger.
The remainder of this question should be understood as being about the first bullet point above, and in the context of running in VC++ 2010 Express's Release Mode with optimizations "Maximize Speed" and "favor fast code".
There are still some Comments in the comment section talking about the second point but they can now be disregarded.
I have a simulation where if I add a simple if statement into the while loop that runs the actual simulation, the performance drops about a factor of three (and I run a lot of calculations in the while loop, n-body gravity for the solar system besides other
things) even though the if statement is almost never executed:
if (time - cb_last_orbital_update > 5000000)
{
cb_last_orbital_update = time;
}
with time and cb_last_orbital_update being both of type
double and defined in the beginning of the main function, where this if statement is too. Usually there are computations I want to run there too, but it makes no difference if I delete them. The if statement as it is above has the same effect on
the performance.
The variable time is the simulation time, it increases in 0.001 steps in the beginning so it takes a really long time until the if statement is executed for the first time (I also included printing a message to see if it is being executed, but
it is not, or at least only when it's supposed to). Regardless, the performance drops by a factor of 3 even in the first minutes of the simulation when it hasn't been executed once yet. If I comment out the line
cb_last_orbital_update = time;
then it runs faster again, so it's not the check for
time - cb_last_orbital_update > 5000000
either, it's definitely the simple act of writing current simulation time into this variable.
Also, if I write the current time into another variable instead of cb_last_orbital_update, the performance does not drop. So this might be an issue with assigning a new value to a variable that is used to check if the "if" should be executed?
These are all shots in the dark though.
Disclaimer: I am pretty new to programming, and sorry for all that text.
I am using Visual C++ 2010 Express, deactivating the stdafx.h precompiled header function didn't make a difference either.
EDIT: Basic structure of the program. Note that nowhere besides at the end of the while loop (time += time_interval;) is
time changed. Also, cb_last_orbital_update has only 3 occurrences: Declaration / initialization, plus the two times in the if statement that is causing the problem.
int main(void)
{
...
double time = 0;
double time_interval = 0.001;
double cb_last_orbital_update = 0;
F_Rocket_Preset(time, time_interval, ...);
while(conditions)
{
Rocket[active].Stage[Rocket[active].r_stage].F_Update_Stage_Performance(time, time_interval, ...);
Rocket[active].F_Calculate_Aerodynamic_Variables(time);
Rocket[active].F_Calculate_Gravitational_Forces(cb_mu, cb_pos_d, time);
Rocket[active].F_Update_Rotation(time, time_interval, ...);
Rocket[active].F_Update_Position_Velocity(time_interval, time, ...);
Rocket[active].F_Calculate_Orbital_Elements(cb_mu);
F_Update_Celestial_Bodies(time, time_interval, ...);
if (time - cb_last_orbital_update > 5000000.0)
{
cb_last_orbital_update = time;
}
Rocket[active].F_Check_Apoapsis(time, time_interval);
Rocket[active].F_Status_Check(time, ...);
Rocket[active].F_Update_Mass (time_interval, time);
Rocket[active].F_Staging_Check (time, time_interval);
time += time_interval;
if (time > 3.1536E8)
{
std::cout << "\n\nBreak main loop! Sim Time: " << time << std::endl;
break;
}
}
...
}
EDIT 2:
Here is the difference in the assembly code. On the left is the fast code with the line
cb_last_orbital_update = time;
outcommented, on the right the slow code with the line.
EDIT 4:
So, i found a workaround that seems to work just fine so far:
int cb_orbit_update_counter = 1; // before while loop
if(time - cb_orbit_update_counter * 5E6 > 0)
{
cb_orbit_update_counter++;
}
EDIT 5:
While that workaround does work, it only works in combination with using . I just removed those from the function declarations again to see if that changes anything, and it does.
__declspec(noinline)
EDIT 6: Sorry this is getting confusing. I tracked down the culprit for the lower performance when removing
__declspec(noinline) to this function, that is being executed inside the
if:
__declspec(noinline) std::string F_Get_Body_Name(int r_body)
{
switch (r_body)
{
case 0:
{
return ("the Sun");
}
case 1:
{
return ("Mercury");
}
case 2:
{
return ("Venus");
}
case 3:
{
return ("Earth");
}
case 4:
{
return ("Mars");
}
case 5:
{
return ("Jupiter");
}
case 6:
{
return ("Saturn");
}
case 7:
{
return ("Uranus");
}
case 8:
{
return ("Neptune");
}
case 9:
{
return ("Pluto");
}
case 10:
{
return ("Ceres");
}
case 11:
{
return ("the Moon");
}
default:
{
return ("unnamed body");
}
}
}
The if also now does more than just increase the counter:
if(time - cb_orbit_update_counter * 1E7 > 0)
{
F_Update_Orbital_Elements_Of_Celestial_Bodies(args);
std::cout << F_Get_Body_Name(3) << " SMA: " << cb_sma[3] << "\tPos Earth: " << cb_pos_d[3][0] << " / " << cb_pos_d[3][1] << " / " << cb_pos_d[3][2] <<
"\tAlt: " << sqrt(pow(cb_pos_d[3][0] - cb_pos_d[0][0],2) + pow(cb_pos_d[3][1] - cb_pos_d[0][1],2) + pow(cb_pos_d[3][2] - cb_pos_d[0][2],2)) << std::endl;
std::cout << "Time: " << time << "\tcb_o_h[3]: " << cb_o_h[3] << std::endl;
cb_orbit_update_counter++;
}
I remove __declspec(noinline) from the function F_Get_Body_Name alone, the code gets slower. Similarly, if i remove the execution of this function or add
__declspec(noinline) again, the code runs faster. All other functions still have
__declspec(noinline).
EDIT 7:So i changed the switch function to
const std::string cb_names[] = {"the Sun","Mercury","Venus","Earth","Mars","Jupiter","Saturn","Uranus","Neptune","Pluto","Ceres","the Moon","unnamed body"}; // global definition
const int cb_number = 12; // global definition
std::string F_Get_Body_Name(int r_body)
{
if (r_body >= 0 && r_body < cb_number)
{
return (cb_names[r_body]);
}
else
{
return (cb_names[cb_number]);
}
}
and also made another part of the code slimmer. The program now runs fast without any
__declspec(noinline). As ElderBug suggested, an issue with the CPU instruction cache then / the code getting too big?
Answer:
I'd put my money on Intel's branch predictor. http://en.wikipedia.org/wiki/Branch_predictor
The processor assumes (time - cb_last_orbital_update > 5000000) to be false most of the time and loads up the execution pipeline accordingly.
Once the condition (time - cb_last_orbital_update > 5000000) comes true. The misprediction delay is hitting you. You may loose 10 to 20 cycles.
if (time - cb_last_orbital_update > 5000000)
{
cb_last_orbital_update = time;
}
Rarely executed and almost empty if statement drastically reduces performance in C++的更多相关文章
- Following a Select Statement Through Postgres Internals
This is the third of a series of posts based on a presentation I did at the Barcelona Ruby Conferenc ...
- 对PostgreSQL的prepared statement的深入理解
看官方文档: http://www.postgresql.org/docs/current/static/sql-prepare.html PREPARE creates a prepared sta ...
- verilog behavioral modeling--branch statement
conditional statement case statement 1. conditional statement if(expression) statement_o ...
- MySQL 5.6 Reference Manual-14.6 InnoDB Table Management
14.6 InnoDB Table Management 14.6.1 Creating InnoDB Tables 14.6.2 Moving or Copying InnoDB Tables to ...
- Introduction to ASP.NET Web Programming Using the Razor Syntax (C#)
1, http://www.asp.net/web-pages/overview/getting-started/introducing-razor-syntax-c 2, Introduction ...
- CoreCLR源码探索(八) JIT的工作原理(详解篇)
在上一篇我们对CoreCLR中的JIT有了一个基础的了解, 这一篇我们将更详细分析JIT的实现. JIT的实现代码主要在https://github.com/dotnet/coreclr/tree/m ...
- Practical Go: Real world advice for writing maintainable Go programs
转自:https://dave.cheney.net/practical-go/presentations/qcon-china.html?from=timeline 1. Guiding pri ...
- SAP NOTE 1999997 - FAQ: SAP HANA Memory
Symptom You have questions related to the SAP HANA memory. You experience a high memory utilization ...
- 【ruby】ruby基础知识
Install Ruby(安装) For windows you can download Ruby from http://rubyforge.org/frs/?group_id=167 for L ...
随机推荐
- Android开发 - 解决DialogFragment在全屏时View被状态栏遮住的问题
我的上一篇文章:设置DialogFragment全屏显示 可以设置对话框的内容全屏显示,但是存在在某些机型上顶部的View被状态栏遮住的问题.经过测试,发现了一种解决办法,在DialogFragmen ...
- OCP 12c考试题,062题库出现大量新题-第20道
choose three Your database is configured for ARCHIVELOG mode, and a daily full database backup is ta ...
- springmvc接受及响应ajax请求。 以及@RequestBody 和@ResponseBody注解的使用
1.发送ajax请求 $.ajax({ url:"user/testAjax", contentType:"application/json;charset=UTF-8& ...
- [SRC初探]手持新手卡挖SRC逻辑漏洞心得分享
文章来源i春秋 本文适合新手参阅,大牛笑笑就好了,嘿嘿末尾有彩蛋!!!!!!!!!!!!!!!!!本人参加了本次"i春秋部落守卫者联盟"活动,由于经验不足,首次挖SRC,排名不是那 ...
- 【安全狗SRC】抗D设备哪家强?你来!大佬告诉你答案
上周,安全狗SRC联合SRC部落,携手推出了爆款话题:传统抗D设备 vs 新兴CDN抗D:抗D效果哪个好? 一经发布简直好评如潮,热评无数,四方雷动(?)原帖在此,错过的吃瓜表哥们可以再围观一下~ht ...
- Spring Caching集成Ehcache
Ehcache可以对页面.对象.数据进行缓存,同时支持集群/分布式缓存.在应用中用于常常需要读取的数据交换,而不是通过DB DAO数据交换(cache不占用DB宝贵的NIO,直接交换堆内存). 整合S ...
- [CTSC1999] 家园
使用并查集判断无解. 令月球是n+1,地球是0 枚举时长t,将点(地球.月球以及太空站)i拆为t个点(i,j)表示第j时刻的点i. 对于太空船云云建图,容量是h[i]. 源点S和(0,0)连边,容量k ...
- docker 日志方案
docker logs默认会显示命令的标准输出(STDOUT)和标准错误(STDERR).下面使用echo.sh和Dockerfile创建一个名为echo.v1的镜像,echo.sh会一直输出”hel ...
- 图像处理池化层pooling和卷积核
1.池化层的作用 在卷积神经网络中,卷积层之间往往会加上一个池化层.池化层可以非常有效地缩小参数矩阵的尺寸,从而减少最后全连层中的参数数量.使用池化层即可以加快计算速度也有防止过拟合的作用. 2.为什 ...
- java 访问剪切板(读取与设置)
设置文本到剪切板 public void setIntoClipboard(String data) { Clipboard clipboard = Toolkit.getDefaultToolkit ...