Basic Information

Authors: Jooyong Yi, Shin Hwei Tan, Sergey Mechtaev, Marcel Böhme, Abhik Roychoudhury
Publication: EMSE'17
Conclusion: In general, with the increase of traditional test suite metrics, the reliability of repairs tend to increase. In particular, such a trend is most strongly observed in statement coverage. Their results imply that the traditional test suite metrics proposed for software testing can also be used for automated program repair to improve the reliability of repairs.

Interesting Points

Correlation between Mutation Testing and Automated Program Repair:

To some extent, automated program repair and mutation testing are very similar. It can be viewed that automated program repair “mutates" the original program, this time in an attempt to find a repair. As in mutation testing, mutants that fail to pass all tests in the provided test-suite are considered buggy (hence, incorrect repairs). This conceptual similarity between mutation testing and automated program repair suggests the plausibility of using the mutation score to measure the quality of a test-suite not only for mutation testing but also for automated program repair. Just as a higher mutation score is associated with a better fault-detection ability in mutation testing, it appears plausible to associate a higher mutation score with a better ability to guide a reliable repair.

There is not only similarity but also duality between mutation testing and automated program repair. As pointed out by Weimer et al (2013), “our confidence in mutant testing increases with the set of non-redundant mutants considered, but our confidence in the quality of a program repair gains increases with the set of non-redundant tests." Note that mutation score measures the non-redundancy of killed mutants, not the non-redundancy of tests capable of killing mutants. We introduce a new metric called capable-tests ratio in the next section that measures the non-redundancy of capable tests.

Measure quality of Test-suite quality and Repair

This paper mainly explore the correlation between quality of automated program repair (APR) and test-suite.

To evaluate quality of APR, traditional metrics (i.e., 1) statement coverage, 2) branch coverage, 3) test-suite size, 4) mutation score) and capable-tests ratio are used.

RQs and Results

RQ1: : Is there a negative correlation between the metrics of a testsuite and the regression ratio of automatically generated repairs? In other words, are generated repairs less likely to cause regressions, as test-suite metrics increase?

As the traditional test-suite metrics (statement coverage, branch coverage, test-suite size, and mutation score) increase, the regression rate of automatically generated repairs generally decreases, showing the promise of using the traditional test-suite metrics to control the regression ratio of automatically generated repairs. Capable-tests ratio does not seem as useful as the traditional metrics in controlling the quality of generated repairs.

RQ2: Which test-suite metric is most strongly correlated with the regression ratio of automatically generated repairs?

In our experiments, statement coverage is, on average, more strongly correlated with regression ratio than other metrics we investigate. Our results suggest that to reduce the regression ratio, increasing statement coverage is more promising than improving the other test-suite metrics.

RQ3: Is there a negative correlation between the metrics of a test-suite and the repairability of automated program repair? In other words, should repairability be sacrificed in an attempt to obtain a higher-quality repair via a higher-quality testsuite?

Our experimental results are inconclusive about the correlation between test-suites and repairability. However, we note that increasing test-suite metric does not always decrease repairability. Im some subjects, positive correlations were observed between test-suite metrics and repairability, indicating that as the test-suite metrics increase, repairability tends to increase.

RQ4: Is there a negative correlation between the metrics of a test-suite and repair time? In other words, should more time be spent in an attempt to obtain a higher-quality repair via a higher-quality test-suite?

Our experimental results are inconlusive about the correlation between test-suites and repair time. However, we note that increasing test-suite metric does not always increase repair time. In some subjects, negative correlations were observed between test-suite metrics and repair time, indicating that as the test-suite metrics increase, repair time tends to decrease.

Different Repair Algorithm: SEMFIX

Our experimental results from SEMFIX generally coincide with our finding from the GENPROG experiment, despite the differences in repair algorithms and fault localization techniques. The traditional test-suite metrics are, overall, negatively correlated with regression ratio, similar to our GENPROG experimental results. In particular, **statement coverage** is again shown to be most strongly correlated with regression ratio.

[EMSE'17] A Correlation Study between Automated Program Repair and Test-Suite Metrics的更多相关文章

Reading List on Automated Program Repair
Some resources: https://www.monperrus.net/martin/automatic-software-repair 2017 [ ] DeepFix: Fixing ...
[Benchmark] Codeflaws: A Programming Competition Benchmark for Evaluating Automated Program Repair Tools
Basic Information Publication: ICSE'17 Authors: Shin Hwei Tan, Jooyong Yi, Yulis, Sergey Mechtaev, A ...
One example to understand SemFix: Program Repair via Semantic Analysis
One example to understand SemFix: Program Repair via Semantic Analysis Basic Information Authors: Ho ...
paho_c_pub 使用方法
Latest Paho Status (2) 摘自:http://modelbasedtesting.co.uk/ I last wrote about the state of Paho in Oc ...
A Great List of Windows Tools
Windows is an extremely effective and a an efficient operating system. Like any other operating syst ...
docker入门级详解
Docker 1 docker安装 yum install docker [root@topcheer ~]# systemctl start docker [root@topcheer ~]# mk ...
C#Lambda表达式演变和Linq的深度解析
Lambda 一.Lambda的演变 Lambda的演变,从下面的类中可以看出,.Net Framwork1.0时还是用方法实例化委托的,2.0的时候出现了匿名方法,3.0的时候出现了Lambda. ...
hadoop 2.7.3本地环境运行官方wordcount
hadoop 2.7.3本地环境运行官方wordcount 基本环境: 系统:win7 虚机环境:virtualBox 虚机:centos 7 hadoop版本:2.7.3 本次先以独立模式(本地模式 ...
Manual——Test （翻译1）
LTE Manual ——Logging(翻译) (本文为个人学习笔记,如有不当的地方,欢迎指正!) 1.17.3 Testing framework(测试框架) ns-3 包含一个仿真核心引擎. ...

随机推荐

小甲鱼Python第十讲课后题---
0. 下边的列表分片操作会打印什么内容? >>> list1 = [1, 3, 2, 9, 7, 8]>>> list1[2:5] [2,9,7] 1.请问 lis ...
Linux命令之rpm篇
作业五:rpm命令 1) 挂载光盘文件到/media目录 [root@localhost 桌面]# mount /dev/sr0 /media mount: /dev/sr0 写保护,将以只读方式 ...
咏南新BS开发框架
咏南新BS开发框架咏南WEB框架支持负载均衡群集. 咏南WEB桌面框架演示:47.106.93.126:9999 咏南WEB手机框架本地:47.106.93.126:8077 咏南CS框架下载:ht ...
下一代微服务 ~ Service Mesh
微服务(Microservices) 微服务(Microservices)是一种架构风格,一个大型复杂软件应用由一个或多个微服务组成.系统中的各个微服务可被独立部署,各个微服务之间是松耦合的.每个微服 ...
EasyMock 简单使用
参考案例:(本位使用markdown编写)https://www.ibm.com/developerworks/cn/opensource/os-cn-easymock/https://www.yii ...
fontawesome图标字体库组件在服务器上显示不出来图标的解决
这个组件在我所开发的网站中被大量使用,为网站增色不少.在本地测试的时候所有图标都能显示出来,可一到服务器上就显示不出来了.网上查列出了可能的原因.其一,IIS没有注册字体类型.经过检查,不存在这个问题 ...
ajax的请求，参数怎么管理？
一般发送一条ajax 然后点击界面需要更改查询条件,第一种是做一个form表单比较合适的设计.更改了参数回收表单然后重新发送ajax: 还有一种是把参数缓存到变量中,然后更改了条件修改变量再次重发aj ...
（原）tensorflow使用eager在mnist上训练的简单例子
转载请注明出处: https://www.cnblogs.com/darkknightzh/p/9989586.html 代码网址: https://github.com/darkknightzh/t ...
GitHub删除已有仓库
之前都只是创建,还没试过删除,讲道理,如果第一次找删除按钮,还是有点小曲折的,特记录如下: 1.先找到你要删除的仓库 2.点进去,到具体项目地址,找到setting 3.点进去,一直往下翻,会看到红色 ...
30天自制操作系统 - 来一个hello world
helloos.nas 源码: ; hello-os ; TAB= ; 以下这段是标准的FAT12格式软盘专用代码 DB 0xeb, 0x4e, 0x90 DB "HELLOIPL" ...

[EMSE'17] A Correlation Study between Automated Program Repair and Test-Suite Metrics

Basic Information

Interesting Points

RQs and Results

[EMSE'17] A Correlation Study between Automated Program Repair and Test-Suite Metrics的更多相关文章

随机推荐

热门专题