Basic Information

  • Authors: Jooyong Yi, Shin Hwei Tan, Sergey Mechtaev, Marcel Böhme, Abhik Roychoudhury
  • Publication: EMSE'17
  • Conclusion: In general, as traditional test-suite metrics increase, the reliability of repairs tends to increase. This trend is most strongly observed for statement coverage. The results imply that the traditional test-suite metrics proposed for software testing can also be used in automated program repair to improve the reliability of repairs.

Interesting Points

Correlation between Mutation Testing and Automated Program Repair:

To some extent, automated program repair and mutation testing are very similar. Automated program repair can be viewed as "mutating" the original program, this time in an attempt to find a repair. As in mutation testing, mutants that fail to pass all tests in the provided test-suite are considered buggy (hence, incorrect repairs). This conceptual similarity suggests that the mutation score could measure the quality of a test-suite not only for mutation testing but also for automated program repair. Just as a higher mutation score is associated with a better fault-detection ability in mutation testing, it appears plausible to associate a higher mutation score with a better ability to guide a reliable repair.
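To make the mutation score concrete, here is a minimal sketch using the standard definition (killed mutants divided by all mutants considered). The kill-matrix representation, the function name `mutation_score`, and the test and mutant names are hypothetical choices for illustration, not taken from the paper.

```python
from typing import Dict, Set


def mutation_score(kills: Dict[str, Set[str]], mutants: Set[str]) -> float:
    """Return the fraction of mutants killed by at least one test.

    `kills` maps each test name to the set of mutants that the test kills
    (hypothetical representation chosen for this sketch).
    """
    if not mutants:
        raise ValueError("need at least one mutant")
    # A mutant counts as killed if any test in the suite kills it.
    killed = set().union(*kills.values()) & mutants
    return len(killed) / len(mutants)


# Toy data: three tests, four mutants; m3 and m4 survive.
kills = {"t1": {"m1", "m2"}, "t2": {"m2"}, "t3": set()}
mutants = {"m1", "m2", "m3", "m4"}
score = mutation_score(kills, mutants)
```

Note that `t2` adds nothing here: it kills only a mutant that `t1` already kills, so removing it would leave the score unchanged.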

There is not only similarity but also duality between mutation testing and automated program repair. As pointed out by Weimer et al. (2013), "our confidence in mutant testing increases with the set of non-redundant mutants considered, but our confidence in the quality of a program repair gains increases with the set of non-redundant tests." Note that the mutation score measures the non-redundancy of killed mutants, not the non-redundancy of tests capable of killing mutants. The paper therefore introduces a new metric, the capable-tests ratio, which measures the non-redundancy of capable tests.
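These notes do not reproduce the paper's formula for the capable-tests ratio, so the following is only a sketch under a stated assumption: a test is "capable" if it kills at least one mutant, and the ratio is the number of capable tests divided by the number of all tests. The function name and the toy data are hypothetical.

```python
from typing import Dict, Set


def capable_tests_ratio(kills: Dict[str, Set[str]]) -> float:
    """Return the share of tests that kill at least one mutant.

    ASSUMPTION: "capable" means "kills at least one mutant"; the paper's
    exact definition may differ.
    """
    if not kills:
        raise ValueError("need at least one test")
    capable = [test for test, killed in kills.items() if killed]
    return len(capable) / len(kills)


# Toy data: t1 and t2 each kill something, t3 and t4 kill nothing.
kills = {"t1": {"m1", "m2"}, "t2": {"m2"}, "t3": set(), "t4": set()}
ratio = capable_tests_ratio(kills)
```

Under this reading the metric looks at the suite from the tests' side: adding tests that kill nothing lowers the ratio even though the set of killed mutants, and hence the mutation score, stays the same.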

Measuring the Quality of Test-Suites and Repairs

This paper mainly explores the correlation between the quality of automated program repair (APR) and the quality of the test-suite.

To evaluate the quality of a test-suite, four traditional metrics (statement coverage, branch coverage, test-suite size, and mutation score) and the capable-tests ratio are used. The quality of the generated repairs is assessed through their regression ratio.

RQs and Results

RQ1: Is there a negative correlation between the metrics of a test-suite and the regression ratio of automatically generated repairs? In other words, are generated repairs less likely to cause regressions, as test-suite metrics increase?

As the traditional test-suite metrics (statement coverage, branch coverage, test-suite size, and mutation score) increase, the regression ratio of automatically generated repairs generally decreases, showing the promise of using the traditional test-suite metrics to control the regression ratio of automatically generated repairs. The capable-tests ratio does not seem as useful as the traditional metrics in controlling the quality of generated repairs.

RQ2: Which test-suite metric is most strongly correlated with the regression ratio of automatically generated repairs?

In our experiments, statement coverage is, on average, more strongly correlated with regression ratio than other metrics we investigate. Our results suggest that to reduce the regression ratio, increasing statement coverage is more promising than improving the other test-suite metrics.
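To illustrate what "more strongly correlated" means in practice, the sketch below computes a rank correlation between each test-suite metric and the regression ratio, then picks the metric with the most negative coefficient. Spearman's rank correlation is used only as one common choice; these notes do not say which statistic the paper uses. The names `metric_a` and `metric_b` and all numbers are made-up placeholders, not results from the paper.

```python
from math import sqrt
from typing import Dict, List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (dev_x * dev_y)


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Placeholder data: one entry per sampled test-suite (NOT from the paper).
regression_ratio = [0.40, 0.35, 0.30, 0.20, 0.10]
metrics: Dict[str, List[float]] = {
    "metric_a": [0.2, 0.4, 0.6, 0.8, 1.0],
    "metric_b": [5, 20, 10, 40, 30],
}

correlations = {name: spearman(vals, regression_ratio) for name, vals in metrics.items()}
# The most negative coefficient marks the strongest negative correlation.
strongest = min(correlations, key=correlations.get)
```

With these placeholder values, `metric_a` falls in perfect rank order against the regression ratio while `metric_b` does not, so `metric_a` is selected.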

RQ3: Is there a negative correlation between the metrics of a test-suite and the repairability of automated program repair? In other words, should repairability be sacrificed in an attempt to obtain a higher-quality repair via a higher-quality test-suite?

Our experimental results are inconclusive about the correlation between test-suite metrics and repairability. However, we note that increasing the test-suite metrics does not always decrease repairability. In some subjects, positive correlations were observed between test-suite metrics and repairability, indicating that as the test-suite metrics increase, repairability tends to increase.

RQ4: Is there a negative correlation between the metrics of a test-suite and repair time? In other words, should more time be spent in an attempt to obtain a higher-quality repair via a higher-quality test-suite?

Our experimental results are inconclusive about the correlation between test-suite metrics and repair time. However, we note that increasing the test-suite metrics does not always increase repair time. In some subjects, negative correlations were observed between test-suite metrics and repair time, indicating that as the test-suite metrics increase, repair time tends to decrease.

Different Repair Algorithm: SEMFIX

Our experimental results from SEMFIX generally coincide with our findings from the GENPROG experiment, despite the differences in repair algorithms and fault localization techniques. The traditional test-suite metrics are, overall, negatively correlated with regression ratio, similar to our GENPROG experimental results. In particular, **statement coverage** is again shown to be most strongly correlated with regression ratio.

[EMSE'17] A Correlation Study between Automated Program Repair and Test-Suite Metrics
