Neural Architecture Search — Limitations and Extensions

2019-09-16 07:46:09

This blog is from: https://towardsdatascience.com/neural-architecture-search-limitations-and-extensions-8141bec7681f

For the past couple of years, researchers and companies have been trying to make deep learning more accessible to non-experts by providing access to pre-trained computer vision or machine translation models. Using a pre-trained model for another task is known as transfer learning, but it still requires sufficient expertise to fine-tune the model on another dataset. Fully automating this procedure allows even more users to benefit from the great progress that has been made in ML to date.

This is called AutoML, and it can cover many parts of predictive modelling such as architecture search and hyperparameter optimization. In this post, I focus on the former, as there has been a recent explosion of methods that search for the “best” architecture for a given dataset. The results presented are based on joint work with Jonathan Lorraine.

Importance of Neural Architecture Search

As a side note, keep in mind that a neural network with just a single hidden layer and non-linear activation function is able to represent any function possible, provided that there are sufficient neurons in that layer (UAT). However, such a simple architecture, though it is theoretically able to learn any function, does not represent the hierarchical processing that occurs in the human visual cortex. The architecture of a neural network gives it an inductive bias, and shallow, wide networks that do not use convolutions are significantly worse at image classification tasks than deep, convolutional networks. Thus, in order for neural networks to generalize and not overfit the training dataset, it’s important to find architectures with the right inductive bias (regardless if those architectures are inspired by the brain or not).

Neural Architecture Search (NAS) Overview

NAS was an inspiring work out of Google that lead to several follow up works such as ENASPNAS, and DARTS. It involves training a recurrent neural network (RNN) controller using reinforcement learning (RL) to automatically generate architectures. These architectures then have their weights trained and are evaluated on a validation set. Their performance on the validation set is the reward signal for the controller which then increases its probability of generating architectures that have done well, and decreases the probability of generating architectures that have done poorly. For non-technical readers, this essentially takes the process of a human manually tweaking a neural network and learning what works well, and automating it. The idea of automatically creating NN architectures was not coined by NAS, as other approaches using methods such as genetic algorithms existed long before, but NAS effectively used RL to efficiently search a space that is prohibitively large to search exhaustively. Below, the components of NAS are analyzed in a bit more depth, before I go on to discuss the limitations of the method as well as its more efficient successor ENAS, as well as an interesting failure mode. The next 2 subsections are best understood while comparing the text again the below figure showing how architectures are sampled and trained:

 

Diagram of how NAS Works

LSTM Controller

The controller generates architectures by making a series of choices for a pre-defined amount of time steps. For example, when generating a convolutional architecture, the controller begins by only creating architectures with 6 layers in them. For each layer, just 4 decisions are made by the controller: filter height, filter width, number of filters, and stride (so 24 time steps). Assuming that the first layer is numbered 0, then the decisions

Neural Architecture Search — Limitations and Extensions的更多相关文章

  1. 论文笔记:Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells

    Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells 2019-04- ...

  2. (转)Illustrated: Efficient Neural Architecture Search ---Guide on macro and micro search strategies in ENAS

    Illustrated: Efficient Neural Architecture Search --- Guide on macro and micro search strategies in  ...

  3. (转)The Evolved Transformer - Enhancing Transformer with Neural Architecture Search

    The Evolved Transformer - Enhancing Transformer with Neural Architecture Search 2019-03-26 19:14:33 ...

  4. 论文笔记:ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware

    ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware 2019-03-19 16:13:18 Pape ...

  5. 论文笔记:Progressive Neural Architecture Search

    Progressive Neural Architecture Search 2019-03-18 20:28:13 Paper:http://openaccess.thecvf.com/conten ...

  6. 论文笔记:Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation

    Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation2019-03-18 14:4 ...

  7. 论文笔记系列-Neural Architecture Search With Reinforcement Learning

    摘要 神经网络在多个领域都取得了不错的成绩,但是神经网络的合理设计却是比较困难的.在本篇论文中,作者使用 递归网络去省城神经网络的模型描述,并且使用 增强学习训练RNN,以使得生成得到的模型在验证集上 ...

  8. 小米造最强超分辨率算法 | Fast, Accurate and Lightweight Super-Resolution with Neural Architecture Search

    本篇是基于 NAS 的图像超分辨率的文章,知名学术性自媒体 Paperweekly 在该文公布后迅速跟进,发表分析称「属于目前很火的 AutoML / Neural Architecture Sear ...

  9. Research Guide for Neural Architecture Search

    Research Guide for Neural Architecture Search 2019-09-19 09:29:04 This blog is from: https://heartbe ...

随机推荐

  1. SQL必知必会实践--mysql

    -- mysql安装 --   https://www.mysql.com/downloads/

  2. node.js 学习一

    Node.js 是单进程单线程应用程序,但是通过事件和回调支持并发,所以性能非常高. 与PHP 相似 都是单进程. Node.js 的每一个 API 都是异步的,并作为一个独立线程运行,使用异步函数调 ...

  3. 免费的天气API测试接口

    网上几乎所有的天气接口都需要注册key,然后还各种频率限制,每天调用次数才几百次? 太坑爹了吧 一个简单的天气预报功能, 为什么要搞的这么复杂, 收什么费? 推荐一个真正免费的天气API接口, 返回j ...

  4. 使用Arduino连接HC-SR04超声波距离传感器的方法

    距离传感器是机器人项目最有用的传感器之一. HC-SR04是一种便宜的超声波距离传感器,可以帮助您的机器人在房间周围导航.通过一些努力和一个额外的组件,它也可以用作测量设备.在这篇文章中,您将学习到通 ...

  5. test20190925 老L

    100+0+0=100.概率题套路见的太少了,做题策略不是最优的. 排列 给出 n 个二元组,第 i 个二元组为(ai,bi). 将 n 个二元组按照一定顺序排成一列,可以得到一个排列.显然,这样的排 ...

  6. Cocos2d-x学习小结 开始篇

    Cocos2d-x学习小结 开始篇 想要学习Cocos2d-x,是因为在高中物理课上找不到某些物理定律的证明,例如欧姆定律. 为此,我翻阅了稍高等级的物理教材,其中关于欧姆定律\(R=\frac{U} ...

  7. Python中如何使用线程池和进程池?

    进程池的使用 为什么要有进程池?进程池的概念. 在程序实际处理问题过程中,忙时会有成千上万的任务需要被执行,闲时可能只有零星任务. 那么在成千上万个任务需要被执行的时候,我们就需要去创建成千上万个进程 ...

  8. 使用go初步调用etcd

    使用go初步調用etcd package main import ( "context" "go.etcd.io/etcd/clientv3" "ti ...

  9. vue的操作:使用手机访问电脑的页面做法如下:

    安装好node.js 和 npm 后,配置好路由,且可以在电脑中正常访问页面时, 只要修改我的项目:my-project 下面的config--->index.js--->里面的 host ...

  10. IDEA设置类注释和方法注释模板

    背景 在日常开发中,类和方法上希望有属于自己风格的注释模板,此文将记录如何设置IDEA类和方法注释模板. 注意:如果公司有统一的规范模板,请按照公司提供的规范模板去设置,这样可以统一代码注释风格.当然 ...