We often come across 'ablation study' in machine learning papers, for example, in this paper with the original R-CNN, it has a section of ablation studies. But what does this means?

Well, we know that when we build a model, we usually have different components of the model. If we remove some component of the model, what's the effect on the model? This is a very coarse definition of ablation study - we want to see the contributions of some proposed components in the model by comparing the model including this component with that without this component.

In the above paper, in order to see the effect of fine-tuning of the CNN, the authors analyzed the performance of the model with the fine-tuning and the performance of it without the fine-tuning. This way, we can easily see the effect of the fine-tuning.

The following I copied from the answer of Jonathan Uesato on Quora, it explains very well:

An ablation study typically refers to removing some “feature” of the model or algorithm and seeing how that affects performance.
Examples:
    • An LSTM has 4 gates: feature, input, output, forget. We might ask: are all 4 necessary? What if I remove one? Indeed, lots of experimentation has gone into LSTM variants, the GRU being a notable example (which is simpler).
    • If certain tricks are used to get an algorithm to work, it’s useful to know whether the algorithm is robust to removing these tricks. For example, DeepMind’s original DQN paper reports using (1) only periodically updating the reference network and (2) using a replay buffer rather than updating online. It’s very useful for the research community to know that both these tricks are necessary, in order to build on top of these results.
    • If an algorithm is a modification of a previous work, and has multiple differences, researchers want to know what the key difference is.
    • Simpler is better (inductive prior towards simpler model classes). If you can get the same performance with two models, prefer the simpler one.

Ablation Study的更多相关文章

  1. 深度学习研究理解5:Visualizing and Understanding Convolutional Networks(转)

    Visualizing and understandingConvolutional Networks 本文是Matthew D.Zeiler 和Rob Fergus于(纽约大学)13年撰写的论文,主 ...

  2. 《DSOD:Learning Deeply Supervised Object Detectors from Scratch》翻译

    原文地址:https://arxiv.org/pdf/1708.01241 DSOD:从零开始学习深度有监督的目标检测器 Abstract摘要: 我们提出了深入的监督对象检测器(DSOD),一个框架, ...

  3. 论文笔记(2):Deep Crisp Boundaries: From Boundaries to Higher-level Tasks

    ---------------------------------------------------------------------------------------------------- ...

  4. SCNN车道线检测--(SCNN)Spatial As Deep: Spatial CNN for Traffic Scene Understanding(论文解读)

    Spatial As Deep: Spatial CNN for Traffic Scene Understanding 收录:AAAI2018 (AAAI Conference on Artific ...

  5. [Arxiv1706] Few-Example Object Detection with Model Communication 论文笔记

    p.p1 { margin: 0.0px 0.0px 0.0px 0.0px; font: 13.0px "Helvetica Neue"; color: #042eee } p. ...

  6. [论文解读]CNN网络可视化——Visualizing and Understanding Convolutional Networks

    概述 虽然CNN深度卷积网络在图像识别等领域取得的效果显著,但是目前为止人们对于CNN为什么能取得如此好的效果却无法解释,也无法提出有效的网络提升策略.利用本文的反卷积可视化方法,作者发现了AlexN ...

  7. (转)The Evolved Transformer - Enhancing Transformer with Neural Architecture Search

    The Evolved Transformer - Enhancing Transformer with Neural Architecture Search 2019-03-26 19:14:33 ...

  8. Dual Attention Network for Scene Segmentation

    Dual Attention Network for Scene Segmentation 原始文档 https://www.yuque.com/lart/papers/onk4sn 在本文中,我们通 ...

  9. 【中文版 | 论文原文】BERT:语言理解的深度双向变换器预训练

    BERT:Pre-training of Deep Bidirectional Transformers for Language Understanding 谷歌AI语言组论文<BERT:语言 ...

随机推荐

  1. linux的性能调优

    单机调优: 分析性能瓶颈的原因,解决它.   cpu子系统 内存子系统 IO子系统 网络系统       @cpu子系统调优 cpu技术指标 xeon E5520 2.27GHz 8192kb # c ...

  2. jquery-ui提供的拖拽方法

    项目当中遇到了任意拖动div标签的功能,找到了jqueryui提供的draggable的插件,这个插件可以实现任意的div的移动,也可以移动到整个屏幕或者在父元素的范围内进行移动. 插件的api    ...

  3. eslint的语法配置项

    其实我并不反对这些语法检测,但是像许多反个人意愿的那就真的不得不吐槽了,比如vue-cli脚手架创建的默认eslint规则: 代码末尾不能加分号 ; 代码中不能存在多行空行 tab键不能使用,必须换成 ...

  4. django rest framework 解析器组件 接口设计,视图组件 (1)

    一.解析器组件 -解析器组件是用来解析用户请求数据的(application/json), content-type 将客户端发来的json数据进行解析 -必须适应APIView -request.d ...

  5. mybatis 模糊查询 mapper.xml的写法

    1. sql中字符串拼接 SELECT * FROM tableName WHERE name LIKE CONCAT(CONCAT('%', #{text}), '%'); 2. 使用 ${...} ...

  6. udp,select超时和recvfrom收不到数据原因

    wirshark抓包,发现有数据.但是select超时,直接recvfrom又失败. 代码中需要改进:select超时后,会移除fd_set集合中超时的那个句柄,所以每次要重新进行FD_SET,然后再 ...

  7. C语言常用库函数实现

    1.memcpy函数 memcpy 函数用于 把资源内存(src所指向的内存区域) 拷贝到目标内存(dest所指向的内存区域):拷贝多少个?有一个size变量控制拷贝的字节数: 函数原型:void * ...

  8. css字体名称中英文对照表

    华文细黑:STHeiti Light [STXihei] 华文黑体:STHeiti 华文楷体:STKaiti 华文宋体:STSong 华文仿宋:STFangsong 俪黑 Pro:LiHei Pro ...

  9. CNCF LandScape Summary

    CNCF Cloud Native Interactive Landscape 1. App Definition and Development 1. Database Vitess:itess i ...

  10. MySql数据库中的datediff函数

    MySql数据库中的datediff函数:主要是用来返回两个日期之间相隔的天数   一半情况下是大日期在前,小日期在后的 这样写也是能够运行的 要注意查询结果: