Basic Information

  • Publication: ICSE'17
  • Authors: Shin Hwei Tan, Jooyong Yi, Yulis, Sergey Mechtaev, Abhik Roychoudhury
  • Language: C Program
  • Source: Codeforces Programming Contest (Reject/Accept)
  • Description: a set of 3902 defects from 7436 programs automatically classified across 39 defect classes
  • Dataset Homepage

Summary

Existing benchmarks (like ManyBugs and IntroClass) on automated program repairs do not allow thorough investigation of the relationship between fault types and the effectiveness of repair tools.
Four criterias for a benchmark that allows extensive evaluation of repair tools:

  • C1: Diverse types of real defects.
  • C2: Large number of defects.
  • C3: Large number of programs.
  • C4: Programs that are algorithmically complex
  • C5: Large held-out test suite for patch correctness verification

Overall, author crawled over 10000 webpages from Codeforces programming contest. For each rejected submission r, they find another accepted submission a by the same user for the same programming problem in the crawled data. Each fault is represented by the submission pair (r, a). In total, they obtain 5544 defects. Then they further exclude 924 defects due to inadequate held-out tests, 677 defects due to non-reproducible bugs, and 41 defects due to a known CIL bugs2 in handling variable sized multidimensional array.

All defects are divided into 39 classes by using Gumtree on AST-level syntactic differences between buggy program and patched program.

Structure

codeflaws
|> 1-A-bug-18353198-18353306 (<contestid>-<problem>-bug-<buggy-submisionid>-<accepted-submissionid>)
|===> 1-A-18353198.c (<contestid>-<problem>-<buggy-submisionid>.c)
|===> 1-A-18353306.c (<contestid>-<problem>-<accepted-submissionid>.c)
|===> input-neg1 (Test input files: input[0-9]+ file used by Test suite (i))
|===> output-neg1 (Test output files: output[0-9]+ file used by Test suite (i))
|===> heldout-input-pos1 (heldout-input[0-9]+ file used by Test suite (ii))
|===> heldout-output-pos1 (heldout-output[0-9]+ file used by Test suite (ii))
|===> 1-A-18353198.c.revlog(Test configuration for SPR that specify the name for pass/fail test: --.c.revlog)
|===> test-genprog.sh (Repair Test script (test suite given to repair tools for generating repair), test-genprog.sh is for search-based repair tools (GenProg, SPR, Prophet))
|===> test-angelix.sh (Repair Test script (test suite given to repair tools for generating repair), test-angelix.sh is for Angelix as it requires inserting special instrumentation)
|===> test-valid.sh(Test script for patch validation (held-out test suite): test-valid.sh is for validating the correctness of patches)
|===> Makefile (Makefile for compiling the buggy submission. This contains the CFLAGS options recommended by Codeforces. To compile the accepted submission, use the command make FILENAME=10-A-13543524)
|===> Makefile.genprog (Makefile.genprog for compiling the buggy submission using cilly. This is for GenProg experiments as GenProg works on CIL representation.)

[Benchmark] Codeflaws: A Programming Competition Benchmark for Evaluating Automated Program Repair Tools的更多相关文章

  1. Benchmark result without MONITOR running: Benchmark result with MONITOR running (redis-cli monitor > /dev/null): 吞吐量 下降约1半 Redis监控工具,命令和调优

    https://redis.io/commands/monitor In this particular case, running a single MONITOR client can reduc ...

  2. 2050 Programming Competition

    http://2050.acmclub.cn/contests/contest_show.php?cid=3 开场白 Time Limit: 2000/1000 MS (Java/Others)    ...

  3. 2050 Programming Competition (CCPC)

    Pro&Sol 链接: https://pan.baidu.com/s/17Tt3EPKEQivP2-3OHkYD2A 提取码: wbnu 复制这段内容后打开百度网盘手机App,操作更方便哦 ...

  4. 2019 China Collegiate Programming Contest Qinhuangdao Onsite F. Forest Program(DFS计算图中所有环的长度)

    题目链接:https://codeforces.com/gym/102361/problem/F 题意 有 \(n\) 个点和 \(m\) 条边,每条边属于 \(0\) 或 \(1\) 个环,问去掉一 ...

  5. Reading List on Automated Program Repair

    Some resources: https://www.monperrus.net/martin/automatic-software-repair 2017 [ ] DeepFix: Fixing ...

  6. Azure Redis Cache (3) 在Windows 环境下使用Redis Benchmark

    <Windows Azure Platform 系列文章目录> 熟悉Redis环境的读者都知道,我们可以在Linux环境里,使用Redis Benchmark,测试Redis的性能. ht ...

  7. MYSQL BENCHMARK函数的使用

    MYSQL BENCHMARK函数是最重要的函数之一,下文对该函数的使用进行了详尽的分析,如果您对此感兴趣的话,不妨一看. 下文为您介绍的是MYSQL BENCHMARK函数的语法,及一些MYSQL  ...

  8. Benchmark与Profiler---性能调优得力助手

    转载请注明出处:http://blog.csdn.net/gaoyanjie55/article/details/34981077 性能优化.它是一种诊断性能瓶颈,能问题点进行优化的过程.前两天听完s ...

  9. c++性能测试工具:google benchmark入门(一)

    如果你正在寻找一款c++性能测试工具,那么这篇文章是不容错过的. 市面上的benchmark工具或多或少存在一些使用上的不便,那么是否存在一个使用简便又功能强大的性能测试工具呢?答案是google/b ...

随机推荐

  1. 如何实现织梦dedecms表单提交时发送邮箱功能【已解决】

    我们通过织梦系统制作网站时,很多客户需要有在线留言功能,这时就会用到自定义表单.但是很多用户觉得经常登陆后台查看留言信息太麻烦了,于是想能否在提交留言是直接把内容发送到指定邮箱.网站经过测试终于实现了 ...

  2. ES6_函数方法

    //2017/7/15 //Javascript 中的方法:在一个对象中绑定函数,称为这个对象的方法. */ var boy={ name:'xiaoming', birth:2007, age:fu ...

  3. poj2385 Apple Catching(dp状态转移方程推导)

    https://vjudge.net/problem/POJ-2385 猛刷简单dp的第一天的第一题. 状态:dp[i][j]表示第i秒移动j次所得的最大苹果数.关键要想到移动j次,根据j的奇偶判断人 ...

  4. JavaScirpt对象原生方法

    Object.assign() Object.assign()方法用于合并对象,只会合并可枚举的属性 const obj1= {a: 1} const obj2 = Object.assign({}, ...

  5. HRMS(人力资源管理系统)-从单机应用到SaaS应用-系统介绍

    上周发布的<2018,全新出发(全力推动实现住有所居)>文章,其中记录了个人在这5年过程中的成长和收获,有幸认识了不少博客园的朋友,大家一起学习交流,在这个过程当中好多朋友提出SaaS系统 ...

  6. jsp中添加过滤器,实现校验用户身份

    我现在需要实现一个功能,就是用户登录前不允许访问系统,我使用的是jsp的过滤器来实现的. 先把filter过滤器的代码粘出来: package com.day8.filter; import java ...

  7. Visual Studio 2017 扩展

    Visual Studio 2017 扩展 Visual Studio 2017 15.4.4 : 目前是最新的版本号,所有的工具&插件都支持这个版本号.所以请对号入座. ReSharper  ...

  8. Quartz小记(一):Elastic-Job - 分布式定时任务框架

    Elastic-Job是ddframe中dd-job的作业模块中分离出来的分布式弹性作业框架.去掉了和dd-job中的监控和ddframe接入规范部分.该项目基于成熟的开源产品Quartz和Zooke ...

  9. Effective Java 第三版——70. 对可恢复条件使用检查异常,对编程错误使用运行时异常

    Tips 书中的源代码地址:https://github.com/jbloch/effective-java-3e-source-code 注意,书中的有些代码里方法是基于Java 9 API中的,所 ...

  10. 使用viewport中的vm来适配移动端页面

    前言 作为一个小前端,经常要和H5打交道,这就面临着不同终端的适配问题. Flexible方案通过Hack手段来根据设备的dpr值相应改变<meta>标签中viewport的值,给我更贴切 ...