Neural Architecture Search — Limitations and Extensions

2019-09-16 07:46:09

This post is reproduced from: https://towardsdatascience.com/neural-architecture-search-limitations-and-extensions-8141bec7681f

For the past couple of years, researchers and companies have been trying to make deep learning more accessible to non-experts by providing access to pre-trained computer vision or machine translation models. Using a pre-trained model for another task is known as transfer learning, but it still requires sufficient expertise to fine-tune the model on another dataset. Fully automating this procedure allows even more users to benefit from the great progress that has been made in ML to date.

This is called AutoML, and it can cover many parts of predictive modelling such as architecture search and hyperparameter optimization. In this post, I focus on the former, as there has been a recent explosion of methods that search for the “best” architecture for a given dataset. The results presented are based on joint work with Jonathan Lorraine.

Importance of Neural Architecture Search

As a side note, keep in mind that a neural network with just a single hidden layer and a non-linear activation function can approximate any continuous function to arbitrary accuracy, provided that the layer has enough neurons (the Universal Approximation Theorem). However, such a simple architecture, though theoretically able to learn any function, does not capture the hierarchical processing that occurs in the human visual cortex. The architecture of a neural network gives it an inductive bias: shallow, wide networks that do not use convolutions are significantly worse at image classification than deep, convolutional networks. Thus, for neural networks to generalize rather than overfit the training dataset, it is important to find architectures with the right inductive bias (regardless of whether those architectures are inspired by the brain).
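To make the inductive-bias point concrete, here is a minimal sketch contrasting the two kinds of architecture (assuming PyTorch; the layer sizes are illustrative choices of mine, not taken from any paper):

```python
import torch
import torch.nn as nn

# Shallow, wide network: one hidden layer, no spatial structure.
mlp = nn.Sequential(
    nn.Flatten(),                      # 32x32x3 image -> 3072-dim vector
    nn.Linear(3 * 32 * 32, 4096),      # single wide hidden layer
    nn.ReLU(),
    nn.Linear(4096, 10),
)

# Deep convolutional network: weight sharing and locality give it
# an inductive bias suited to images.
cnn = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 10),
)

count = lambda m: sum(p.numel() for p in m.parameters())
print(f"MLP parameters: {count(mlp):,}")   # ~12.6M
print(f"CNN parameters: {count(cnn):,}")   # ~0.06M

x = torch.randn(1, 3, 32, 32)
print(mlp(x).shape, cnn(x).shape)          # both produce (1, 10) logits
```

Despite having roughly 200x fewer parameters, the convolutional network is the far stronger image classifier in practice, which is exactly the inductive-bias effect described above.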

Neural Architecture Search (NAS) Overview

NAS was an inspiring work out of Google that led to several follow-up works such as ENAS, PNAS, and DARTS. It involves training a recurrent neural network (RNN) controller with reinforcement learning (RL) to automatically generate architectures. Each generated architecture has its weights trained and is then evaluated on a validation set. Validation performance is the reward signal for the controller, which increases its probability of generating architectures that did well and decreases its probability of generating architectures that did poorly. For non-technical readers: this takes the process of a human manually tweaking a neural network and learning what works well, and automates it. NAS did not coin the idea of automatically creating NN architectures; approaches based on genetic algorithms, among others, existed long before. But NAS used RL to efficiently search a space that is prohibitively large to search exhaustively. Below, the components of NAS are analyzed in more depth before I discuss the method's limitations, its more efficient successor ENAS, and an interesting failure mode. The next two subsections are best understood alongside the figure below, which shows how architectures are sampled and trained:

[Figure: Diagram of how NAS works]
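Before diving into those components, here is a compact sketch of the feedback loop in the figure, assuming PyTorch. The controller is reduced to a simple table of per-decision logits, and `train_child_and_get_val_acc` is a hypothetical stub standing in for the expensive step of actually training a sampled architecture:

```python
import torch

NUM_DECISIONS, NUM_OPTIONS = 24, 4      # e.g. 6 layers x 4 decisions each

# Simplest possible "controller": independent logits per decision.
logits = torch.zeros(NUM_DECISIONS, NUM_OPTIONS, requires_grad=True)
opt = torch.optim.Adam([logits], lr=0.05)

def train_child_and_get_val_acc(arch):
    # Hypothetical placeholder: in real NAS this builds and trains the
    # child network, then returns its validation accuracy (the reward).
    return torch.rand(()).item()

baseline = 0.0                          # moving average; reduces gradient variance
for step in range(100):
    dist = torch.distributions.Categorical(logits=logits)
    arch = dist.sample()                # one choice per decision
    reward = train_child_and_get_val_acc(arch)
    baseline = 0.95 * baseline + 0.05 * reward
    # REINFORCE: raise log-probability of architectures that beat the baseline.
    loss = -(reward - baseline) * dist.log_prob(arch).sum()
    opt.zero_grad(); loss.backward(); opt.step()
```

The baseline term is a standard variance-reduction trick; the original NAS paper uses an exponential moving average of previous rewards in the same spirit.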

LSTM Controller

The controller generates architectures by making a series of choices over a pre-defined number of time steps. For example, when generating a convolutional architecture, the controller begins by creating architectures with only 6 layers. For each layer, the controller makes just 4 decisions: filter height, filter width, number of filters, and stride (so 24 time steps in total). Assuming the first layer is numbered 0, the decisions for layer i are made at time steps 4i through 4i + 3. A sketch of this sampling procedure is given below.
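Here is a sketch of that LSTM controller, again assuming PyTorch; the hidden size is illustrative, and the candidate values follow the choices described above. Each sampled decision is embedded and fed back in as the next input, so the controller can condition later choices on earlier ones:

```python
import torch
import torch.nn as nn

CHOICES = {                             # candidate values per decision type
    "filter_height": [1, 3, 5, 7],
    "filter_width":  [1, 3, 5, 7],
    "num_filters":   [24, 36, 48, 64],
    "stride":        [1, 2, 3],
}
NUM_LAYERS, HIDDEN = 6, 64

class LSTMController(nn.Module):
    def __init__(self):
        super().__init__()
        self.cell = nn.LSTMCell(HIDDEN, HIDDEN)
        # one output head and one input embedding per decision type
        self.heads = nn.ModuleDict(
            {k: nn.Linear(HIDDEN, len(v)) for k, v in CHOICES.items()})
        self.embeds = nn.ModuleDict(
            {k: nn.Embedding(len(v), HIDDEN) for k, v in CHOICES.items()})

    def sample(self):
        h = c = torch.zeros(1, HIDDEN)
        inp = torch.zeros(1, HIDDEN)    # "start of sequence" input
        arch, log_prob = [], 0.0
        for layer in range(NUM_LAYERS):
            spec = {}
            for name, values in CHOICES.items():   # 4 decisions per layer
                h, c = self.cell(inp, (h, c))
                dist = torch.distributions.Categorical(
                    logits=self.heads[name](h))
                idx = dist.sample()
                log_prob = log_prob + dist.log_prob(idx)
                spec[name] = values[idx.item()]
                inp = self.embeds[name](idx)       # feed the choice back in
            arch.append(spec)
        return arch, log_prob           # log_prob feeds the REINFORCE update

arch, lp = LSTMController().sample()
print(arch[0])  # e.g. {'filter_height': 3, 'filter_width': 5, ...}
```

The returned `log_prob` is exactly the quantity the policy-gradient update in the previous sketch scales by the advantage (reward minus baseline).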
