We often come across 'ablation study' in machine learning papers, for example, in this paper with the original R-CNN, it has a section of ablation studies. But what does this means?

Well, we know that when we build a model, we usually have different components of the model. If we remove some component of the model, what's the effect on the model? This is a very coarse definition of ablation study - we want to see the contributions of some proposed components in the model by comparing the model including this component with that without this component.

In the above paper, in order to see the effect of fine-tuning of the CNN, the authors analyzed the performance of the model with the fine-tuning and the performance of it without the fine-tuning. This way, we can easily see the effect of the fine-tuning.

The following I copied from the answer of Jonathan Uesato on Quora, it explains very well:

An ablation study typically refers to removing some “feature” of the model or algorithm and seeing how that affects performance.
Examples:
    • An LSTM has 4 gates: feature, input, output, forget. We might ask: are all 4 necessary? What if I remove one? Indeed, lots of experimentation has gone into LSTM variants, the GRU being a notable example (which is simpler).
    • If certain tricks are used to get an algorithm to work, it’s useful to know whether the algorithm is robust to removing these tricks. For example, DeepMind’s original DQN paper reports using (1) only periodically updating the reference network and (2) using a replay buffer rather than updating online. It’s very useful for the research community to know that both these tricks are necessary, in order to build on top of these results.
    • If an algorithm is a modification of a previous work, and has multiple differences, researchers want to know what the key difference is.
    • Simpler is better (inductive prior towards simpler model classes). If you can get the same performance with two models, prefer the simpler one.

Ablation Study的更多相关文章

  1. 深度学习研究理解5:Visualizing and Understanding Convolutional Networks(转)

    Visualizing and understandingConvolutional Networks 本文是Matthew D.Zeiler 和Rob Fergus于(纽约大学)13年撰写的论文,主 ...

  2. 《DSOD:Learning Deeply Supervised Object Detectors from Scratch》翻译

    原文地址:https://arxiv.org/pdf/1708.01241 DSOD:从零开始学习深度有监督的目标检测器 Abstract摘要: 我们提出了深入的监督对象检测器(DSOD),一个框架, ...

  3. 论文笔记(2):Deep Crisp Boundaries: From Boundaries to Higher-level Tasks

    ---------------------------------------------------------------------------------------------------- ...

  4. SCNN车道线检测--(SCNN)Spatial As Deep: Spatial CNN for Traffic Scene Understanding(论文解读)

    Spatial As Deep: Spatial CNN for Traffic Scene Understanding 收录:AAAI2018 (AAAI Conference on Artific ...

  5. [Arxiv1706] Few-Example Object Detection with Model Communication 论文笔记

    p.p1 { margin: 0.0px 0.0px 0.0px 0.0px; font: 13.0px "Helvetica Neue"; color: #042eee } p. ...

  6. [论文解读]CNN网络可视化——Visualizing and Understanding Convolutional Networks

    概述 虽然CNN深度卷积网络在图像识别等领域取得的效果显著,但是目前为止人们对于CNN为什么能取得如此好的效果却无法解释,也无法提出有效的网络提升策略.利用本文的反卷积可视化方法,作者发现了AlexN ...

  7. (转)The Evolved Transformer - Enhancing Transformer with Neural Architecture Search

    The Evolved Transformer - Enhancing Transformer with Neural Architecture Search 2019-03-26 19:14:33 ...

  8. Dual Attention Network for Scene Segmentation

    Dual Attention Network for Scene Segmentation 原始文档 https://www.yuque.com/lart/papers/onk4sn 在本文中,我们通 ...

  9. 【中文版 | 论文原文】BERT:语言理解的深度双向变换器预训练

    BERT:Pre-training of Deep Bidirectional Transformers for Language Understanding 谷歌AI语言组论文<BERT:语言 ...

随机推荐

  1. The Preliminary Contest for ICPC Asia Shenyang 2019 F. Honk's pool

    题目链接:https://nanti.jisuanke.com/t/41406 思路:如果k的天数足够大,那么所有水池一定会趋于两种情况: ① 所有水池都是一样的水位,即平均水位 ② 最高水位的水池和 ...

  2. Java期末复习——主观题

    JDK 和 JRE 有什么区别? JDK:Java Development Kit 的简称,java 开发工具包,提供了 java 的开发环境和运行环境. JRE:Java Runtime Envir ...

  3. TL-WDN5200H无线usb网卡在Linux上的使用

    买了个TL-WDN5200H无线usb网卡,但是发现它居然不支持Linux,但是我有时需要在Linux上使用,这就尴尬了.于是到网上搜索资料,终于解决了这个问题. 首先编译安装:https://git ...

  4. mybatis 模糊查询 mapper.xml的写法

    1. sql中字符串拼接 SELECT * FROM tableName WHERE name LIKE CONCAT(CONCAT('%', #{text}), '%'); 2. 使用 ${...} ...

  5. C#的介绍

    C#是一种面向对象的.运行于.net框架上的一种高级程序设计语言. 它的优点在于简单,类型安全,垃圾回收器自动回收内存,封装了许多常用的类,适合快速开发. 它的缺点在于依赖.net框架,跨平台支持有限 ...

  6. adb 命令之push pull

    C:\Users\ceshi>adb pull /storage/emulated/legacy/00001.vcf D:/E:\eclipse\Demo1>adb push E:\ecl ...

  7. Python并发编程之进程通信

    ''' 进程间的通信 ''' """ multiprocessing模块支持进程间通信的两种主要形式:管道和队列 都是基于消息传递实现的, ""&qu ...

  8. 9.consul获取服务实例,调用测试

    package main import ( "context" "fmt" "github.com/go-kit/kit/endpoint" ...

  9. ESA2GJK1DH1K基础篇: 阿里云物联网平台: 测试云平台显示MQTT客户端发过来的消息

    现在这里空空如也 咱自定义的也没有数据 现在就是传上来温度数据,让这里显示温度数据 你发布的主题  /sys/a1m7er1nJbQ/Mqtt/thing/event/property/post 发布 ...

  10. 转载:深度学习在NLP中的应用

    之前研究的CRF算法,在中文分词,词性标注,语义分析中应用非常广泛.但是分词技术只是NLP的一个基础部分,在人机对话,机器翻译中,深度学习将大显身手.这篇文章,将展示深度学习的强大之处,区别于之前用符 ...