We often come across 'ablation study' in machine learning papers, for example, in this paper with the original R-CNN, it has a section of ablation studies. But what does this means?

Well, we know that when we build a model, we usually have different components of the model. If we remove some component of the model, what's the effect on the model? This is a very coarse definition of ablation study - we want to see the contributions of some proposed components in the model by comparing the model including this component with that without this component.

In the above paper, in order to see the effect of fine-tuning of the CNN, the authors analyzed the performance of the model with the fine-tuning and the performance of it without the fine-tuning. This way, we can easily see the effect of the fine-tuning.

The following I copied from the answer of Jonathan Uesato on Quora, it explains very well:

An ablation study typically refers to removing some “feature” of the model or algorithm and seeing how that affects performance.
Examples:
    • An LSTM has 4 gates: feature, input, output, forget. We might ask: are all 4 necessary? What if I remove one? Indeed, lots of experimentation has gone into LSTM variants, the GRU being a notable example (which is simpler).
    • If certain tricks are used to get an algorithm to work, it’s useful to know whether the algorithm is robust to removing these tricks. For example, DeepMind’s original DQN paper reports using (1) only periodically updating the reference network and (2) using a replay buffer rather than updating online. It’s very useful for the research community to know that both these tricks are necessary, in order to build on top of these results.
    • If an algorithm is a modification of a previous work, and has multiple differences, researchers want to know what the key difference is.
    • Simpler is better (inductive prior towards simpler model classes). If you can get the same performance with two models, prefer the simpler one.

Ablation Study的更多相关文章

  1. 深度学习研究理解5:Visualizing and Understanding Convolutional Networks(转)

    Visualizing and understandingConvolutional Networks 本文是Matthew D.Zeiler 和Rob Fergus于(纽约大学)13年撰写的论文,主 ...

  2. 《DSOD:Learning Deeply Supervised Object Detectors from Scratch》翻译

    原文地址:https://arxiv.org/pdf/1708.01241 DSOD:从零开始学习深度有监督的目标检测器 Abstract摘要: 我们提出了深入的监督对象检测器(DSOD),一个框架, ...

  3. 论文笔记(2):Deep Crisp Boundaries: From Boundaries to Higher-level Tasks

    ---------------------------------------------------------------------------------------------------- ...

  4. SCNN车道线检测--(SCNN)Spatial As Deep: Spatial CNN for Traffic Scene Understanding(论文解读)

    Spatial As Deep: Spatial CNN for Traffic Scene Understanding 收录:AAAI2018 (AAAI Conference on Artific ...

  5. [Arxiv1706] Few-Example Object Detection with Model Communication 论文笔记

    p.p1 { margin: 0.0px 0.0px 0.0px 0.0px; font: 13.0px "Helvetica Neue"; color: #042eee } p. ...

  6. [论文解读]CNN网络可视化——Visualizing and Understanding Convolutional Networks

    概述 虽然CNN深度卷积网络在图像识别等领域取得的效果显著,但是目前为止人们对于CNN为什么能取得如此好的效果却无法解释,也无法提出有效的网络提升策略.利用本文的反卷积可视化方法,作者发现了AlexN ...

  7. (转)The Evolved Transformer - Enhancing Transformer with Neural Architecture Search

    The Evolved Transformer - Enhancing Transformer with Neural Architecture Search 2019-03-26 19:14:33 ...

  8. Dual Attention Network for Scene Segmentation

    Dual Attention Network for Scene Segmentation 原始文档 https://www.yuque.com/lart/papers/onk4sn 在本文中,我们通 ...

  9. 【中文版 | 论文原文】BERT:语言理解的深度双向变换器预训练

    BERT:Pre-training of Deep Bidirectional Transformers for Language Understanding 谷歌AI语言组论文<BERT:语言 ...

随机推荐

  1. CentOS6.7编译安装mysql5.5(详解编译选项)

    注意!  mysql5.5之前一般都是用make编译 mysql5.5 -5.6 一般都是用cmake编译 cmake : 跨平台编译器, mysql官方提供的rpm包 mysql-client :提 ...

  2. django bms

    1. 创建模型 一对多: 需要在""多""的表创建一个""关键字段"" 关联  就像在mysql的哪项少的比如(书与出版 ...

  3. centos7下安装docker 以及简单使用

    一 环境准备1.虚拟机or物理机 2.centos7系统(稳定,对docker支持友好) 二 安装过程step1:使用yum命令进行安装 yum install -y docker备注:-y 表示不询 ...

  4. 谷歌学术出现We're sorry解决办法

    出现这个的原因应该是同ip段的或者就是这个ip曾经是个google的黑名单ip,因为恶意爬取谷歌学术了.解决办法就是申请Hurricane Electric Free IPv6 Tunnel Brok ...

  5. Python实战之ATM+购物车

    ATM + 购物车 需求分析 ''' - 额度 15000或自定义 - 实现购物商城,买东西加入 购物车,调用信用卡接口结账 - 可以提现,手续费5% - 支持多账户登录 - 支持账户间转账 - 记录 ...

  6. vue-cli3.0启动项目,在局域网内其他电脑通过自己ip访问

    最近一直在使用vue-cli3.0做项目, package.json中配置后,自启动项目,也就没留意过小黑窗, "scripts": { "serve": &q ...

  7. TCP/IP协议总结

    TCP/IP网络协议栈分为四层, 从下至上依次是: 链路层 其实在链路层下面还有物理层, 指的是电信号的传输方式, 比如常见的双绞线网线, 光纤, 以及早期的同轴电缆等, 物理层的设计决定了电信号传输 ...

  8. Qt常用类——Qpoint

    QPoint 类代表一个坐标点,实现在 QtCore 共享库中.它可以认为是一个整型的横坐标和一个整型的纵坐标的组合. 构造 QPoint 类支持以下两种构造方式: QPoint(); // 构造横纵 ...

  9. 常见网页编辑器(富文本,Markdown,代码编辑等)

    编辑器:网页不常用的功能,但却又是不可少的功能,如果要造个编辑器轮子,它可以把人玩死!!前端几大禁忌就有富文本, 为什么都说富文本编辑器是天坑? 下面记录一下常见的一些编辑器,该文随时更新: 富文本编 ...

  10. Cannot read lifecycle mapping metadata for artifact org.apache.maven.plugins:mav问题

    1.导致问题原因:从装系统,从win7改到win10 由于重装了系统,打开eclipse时,maven验证会出错,点击pom文件,会发现有红色的Cannot read lifecycle mappin ...