Basic Information

  • Publication: ICSE'17
  • Authors: Shin Hwei Tan, Jooyong Yi, Yulis, Sergey Mechtaev, Abhik Roychoudhury
  • Language: C Program
  • Source: Codeforces Programming Contest (Reject/Accept)
  • Description: a set of 3902 defects from 7436 programs automatically classified across 39 defect classes
  • Dataset Homepage

Summary

Existing benchmarks (like ManyBugs and IntroClass) on automated program repairs do not allow thorough investigation of the relationship between fault types and the effectiveness of repair tools.
Four criterias for a benchmark that allows extensive evaluation of repair tools:

  • C1: Diverse types of real defects.
  • C2: Large number of defects.
  • C3: Large number of programs.
  • C4: Programs that are algorithmically complex
  • C5: Large held-out test suite for patch correctness verification

Overall, author crawled over 10000 webpages from Codeforces programming contest. For each rejected submission r, they find another accepted submission a by the same user for the same programming problem in the crawled data. Each fault is represented by the submission pair (r, a). In total, they obtain 5544 defects. Then they further exclude 924 defects due to inadequate held-out tests, 677 defects due to non-reproducible bugs, and 41 defects due to a known CIL bugs2 in handling variable sized multidimensional array.

All defects are divided into 39 classes by using Gumtree on AST-level syntactic differences between buggy program and patched program.

Structure

codeflaws
|> 1-A-bug-18353198-18353306 (<contestid>-<problem>-bug-<buggy-submisionid>-<accepted-submissionid>)
|===> 1-A-18353198.c (<contestid>-<problem>-<buggy-submisionid>.c)
|===> 1-A-18353306.c (<contestid>-<problem>-<accepted-submissionid>.c)
|===> input-neg1 (Test input files: input[0-9]+ file used by Test suite (i))
|===> output-neg1 (Test output files: output[0-9]+ file used by Test suite (i))
|===> heldout-input-pos1 (heldout-input[0-9]+ file used by Test suite (ii))
|===> heldout-output-pos1 (heldout-output[0-9]+ file used by Test suite (ii))
|===> 1-A-18353198.c.revlog(Test configuration for SPR that specify the name for pass/fail test: --.c.revlog)
|===> test-genprog.sh (Repair Test script (test suite given to repair tools for generating repair), test-genprog.sh is for search-based repair tools (GenProg, SPR, Prophet))
|===> test-angelix.sh (Repair Test script (test suite given to repair tools for generating repair), test-angelix.sh is for Angelix as it requires inserting special instrumentation)
|===> test-valid.sh(Test script for patch validation (held-out test suite): test-valid.sh is for validating the correctness of patches)
|===> Makefile (Makefile for compiling the buggy submission. This contains the CFLAGS options recommended by Codeforces. To compile the accepted submission, use the command make FILENAME=10-A-13543524)
|===> Makefile.genprog (Makefile.genprog for compiling the buggy submission using cilly. This is for GenProg experiments as GenProg works on CIL representation.)

[Benchmark] Codeflaws: A Programming Competition Benchmark for Evaluating Automated Program Repair Tools的更多相关文章

  1. Benchmark result without MONITOR running: Benchmark result with MONITOR running (redis-cli monitor > /dev/null): 吞吐量 下降约1半 Redis监控工具,命令和调优

    https://redis.io/commands/monitor In this particular case, running a single MONITOR client can reduc ...

  2. 2050 Programming Competition

    http://2050.acmclub.cn/contests/contest_show.php?cid=3 开场白 Time Limit: 2000/1000 MS (Java/Others)    ...

  3. 2050 Programming Competition (CCPC)

    Pro&Sol 链接: https://pan.baidu.com/s/17Tt3EPKEQivP2-3OHkYD2A 提取码: wbnu 复制这段内容后打开百度网盘手机App,操作更方便哦 ...

  4. 2019 China Collegiate Programming Contest Qinhuangdao Onsite F. Forest Program(DFS计算图中所有环的长度)

    题目链接:https://codeforces.com/gym/102361/problem/F 题意 有 \(n\) 个点和 \(m\) 条边,每条边属于 \(0\) 或 \(1\) 个环,问去掉一 ...

  5. Reading List on Automated Program Repair

    Some resources: https://www.monperrus.net/martin/automatic-software-repair 2017 [ ] DeepFix: Fixing ...

  6. Azure Redis Cache (3) 在Windows 环境下使用Redis Benchmark

    <Windows Azure Platform 系列文章目录> 熟悉Redis环境的读者都知道,我们可以在Linux环境里,使用Redis Benchmark,测试Redis的性能. ht ...

  7. MYSQL BENCHMARK函数的使用

    MYSQL BENCHMARK函数是最重要的函数之一,下文对该函数的使用进行了详尽的分析,如果您对此感兴趣的话,不妨一看. 下文为您介绍的是MYSQL BENCHMARK函数的语法,及一些MYSQL  ...

  8. Benchmark与Profiler---性能调优得力助手

    转载请注明出处:http://blog.csdn.net/gaoyanjie55/article/details/34981077 性能优化.它是一种诊断性能瓶颈,能问题点进行优化的过程.前两天听完s ...

  9. c++性能测试工具:google benchmark入门(一)

    如果你正在寻找一款c++性能测试工具,那么这篇文章是不容错过的. 市面上的benchmark工具或多或少存在一些使用上的不便,那么是否存在一个使用简便又功能强大的性能测试工具呢?答案是google/b ...

随机推荐

  1. __getitem__函数

    主要是为了探究第三行为什么打印出很多提示信息,然后探究了下为什么有第三行这种写法,是因为 这个类中定义了def __getitem__(self, query),这样就可以类似于list那种用法了.但 ...

  2. Some interesting facts about static member functions in C++

    Ref http://www.geeksforgeeks.org/some-interesting-facts-about-static-member-functions-in-c/ 1) stati ...

  3. Vue(八)发送跨域请求

    使用vue-resource发送跨域请求 axios不支持跨域 1 安装vue-resource并引入 cnpm install vue-resource -S 2 基本用法 使用this.$http ...

  4. 【模拟】[NOIP2011]铺地毯[c++]

    题目描述 为了准备一个独特的颁奖典礼,组织者在会场的一片矩形区域(可看做是平面直角坐标系的第一象限)铺上一些矩形地毯,一共有n张地毯,编号从 1 到n.现在将这些地毯按照编号从小到大的顺序平行于坐标轴 ...

  5. Codeforces909C Python Indentation(动态规划)

    http://codeforces.com/problemset/problem/909/C dp[i][j]表示第i行缩进j的方案数. 当第i-1行为f时,无论当前行是f或s都必须缩进,即dp[i] ...

  6. E. Thematic Contests 二分,离散化

    题目意思是给你n个问题即数字,n的大小代表问题所在的话题,题目要求举办多场比赛,每场比赛的只能一种问题,且后一场比赛的问题必须是前一场的两倍,求举办比赛可能最多的问题总数 传送门 解题思路:将出现每种 ...

  7. 论YUV422(YUYV)与YUV420相互转换

    Example 2.13. V4L2_PIX_FMT_YUYV 4 × 4 pixelimage start + 0: Y'00 Cb00 Y'01 Cr00 Y'02 Cb01 Y'03 Cr01 ...

  8. Reactor反应器模式 (epoll)

    1. 背景 最近在看redis源码,主体流程看完了. 在网上看到了reactor模式,看了一下,其实我们经常使用这种模式. 2. 什么是reactor模式 反应器设计模式(Reactor patter ...

  9. Office365 OneDrive Geo Move

    Issue Description: 1. Connect to SPO Service. 2. Validate SPO Service OneDrive Geo move compatibilit ...

  10. shell编程学习笔记(四):Shell中转义字符的输出

    通过echo可以输出字符串,下面看一下怎么输出特殊转义字符,首先我先列出来echo的转义字符: \\ 输入\ \a 输出警告音 \b 退格,即向左删除一个字符 \c 取消输出行末的换行符,和-n选项一 ...