Reposted from: http://handong1587.github.io/deep_learning/2015/10/09/training-dnn.html

Training Deep Neural Networks

 Published: 09 Oct 2015  Category: deep_learning

Tutorials

Popular Training Approaches of DNNs — A Quick Overview

https://medium.com/@asjad/popular-training-approaches-of-dnns-a-quick-overview-26ee37ad7e96#.pqyo039bb

Activation functions

Rectified Linear Units Improve Restricted Boltzmann Machines (ReLU)

Rectifier Nonlinearities Improve Neural Network Acoustic Models (leaky-ReLU, aka LReLU)

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification (PReLU)

Empirical Evaluation of Rectified Activations in Convolutional Network (ReLU/LReLU/PReLU/RReLU)

Deep Learning with S-shaped Rectified Linear Activation Units (SReLU)

Parametric Activation Pools greatly increase performance and consistency in ConvNets

Noisy Activation Functions
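
The variants above differ mainly in how they treat negative inputs. As a rough reference (not tied to any one paper's tuned settings), a minimal NumPy sketch of the four rectifiers compared in the empirical evaluation above:

```python
import numpy as np

def relu(x):
    # max(0, x): zero out negative inputs
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.01):
    # LReLU: fixed small slope for negative inputs
    return np.where(x > 0, x, slope * x)

def prelu(x, a):
    # PReLU: the negative slope `a` is a learned parameter
    return np.where(x > 0, x, a * x)

def rrelu(x, lower=1/8, upper=1/3, training=True, rng=np.random.default_rng(0)):
    # RReLU: negative slope sampled uniformly per element during training;
    # fixed to the midpoint of the range at test time (common library convention)
    a = rng.uniform(lower, upper, size=x.shape) if training else (lower + upper) / 2.0
    return np.where(x > 0, x, a * x)

x = np.linspace(-2, 2, 5)
print(relu(x), leaky_relu(x), prelu(x, a=0.25), rrelu(x), sep="\n")
```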

Weight Initialization

An Explanation of Xavier Initialization

Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?

All you need is a good init

Data-dependent Initializations of Convolutional Neural Networks

What are good initial weights in a neural network?

RandomOut: Using a convolutional gradient norm to win The Filter Lottery
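
A minimal sketch of the Xavier (Glorot) scheme discussed in the explanation above, alongside the He variant from the PReLU paper in the previous section; the layer sizes below are illustrative:

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng=np.random.default_rng(0)):
    # Glorot/Xavier uniform: Var(W) = 2 / (fan_in + fan_out),
    # chosen so activation variance is roughly preserved layer to layer
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def he_normal(fan_in, fan_out, rng=np.random.default_rng(0)):
    # He initialization: Var(W) = 2 / fan_in, compensating for
    # ReLU zeroing roughly half the activations
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

W = xavier_uniform(784, 256)
print(W.std(), np.sqrt(2.0 / (784 + 256)))  # empirical std vs. target std
```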

Batch Normalization

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (ImageNet top-5 error: 4.82%)

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Normalization Propagation: A Parametric Technique for Removing Internal Covariate Shift in Deep Networks
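
A minimal sketch of the training-time forward pass from the batch normalization paper above (at inference, running averages of the batch statistics are used instead of per-batch ones):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    # x: (batch, features). Normalize each feature over the mini-batch,
    # then apply the learned scale (gamma) and shift (beta).
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(5.0, 3.0, size=(64, 10))  # badly scaled activations
y = batch_norm_forward(x, gamma=np.ones(10), beta=np.zeros(10))
print(y.mean(axis=0).round(6), y.std(axis=0).round(3))  # ~0 mean, ~1 std
```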

Loss Function

The Loss Surfaces of Multilayer Networks

Optimization Methods

On Optimization Methods for Deep Learning

On the importance of initialization and momentum in deep learning

Invariant backpropagation: how to train a transformation-invariant neural network

A practical theory for designing very deep convolutional neural network
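
The momentum paper above contrasts classical momentum with Nesterov's variant. A minimal sketch of both update rules on a toy quadratic (learning rate and momentum values are illustrative):

```python
import numpy as np

def momentum_step(w, v, grad_fn, lr=0.01, mu=0.9):
    # classical momentum: v is a decaying accumulation of past gradients
    v = mu * v - lr * grad_fn(w)
    return w + v, v

def nesterov_step(w, v, grad_fn, lr=0.01, mu=0.9):
    # Nesterov momentum: evaluate the gradient at the look-ahead point w + mu*v
    v = mu * v - lr * grad_fn(w + mu * v)
    return w + v, v

grad = lambda w: 2 * w  # gradient of f(w) = w^2
w, v = np.array([5.0]), np.zeros(1)
for _ in range(100):
    w, v = nesterov_step(w, v, grad)
print(w)  # converges toward the minimum at 0
```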

Stochastic Optimization Techniques

Alec Radford’s animations for optimization algorithms

http://www.denizyuret.com/2015/03/alec-radfords-animations-for.html

Faster Asynchronous SGD (FASGD)

An overview of gradient descent optimization algorithms (★★★★★)

Exploiting the Structure: Stochastic Gradient Methods Using Raw Clusters

Writing fast asynchronous SGD/AdaGrad with RcppParallel
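
As a concrete instance of the per-parameter methods surveyed in the overview above, a minimal AdaGrad sketch (AdaGrad being the variant named in the RcppParallel post; hyperparameters are illustrative):

```python
import numpy as np

def adagrad_step(w, g_sq, grad, lr=0.1, eps=1e-8):
    # AdaGrad: per-coordinate learning rates, scaled down by the
    # accumulated squared gradients of each coordinate
    g_sq = g_sq + grad ** 2
    w = w - lr * grad / (np.sqrt(g_sq) + eps)
    return w, g_sq

# minimize f(w) = w0^2 + 10*w1^2 (a badly conditioned quadratic)
grad = lambda w: np.array([2 * w[0], 20 * w[1]])
w, g_sq = np.array([5.0, 5.0]), np.zeros(2)
for _ in range(500):
    w, g_sq = adagrad_step(w, g_sq, grad(w))
print(w)  # both coordinates shrink despite the 10x curvature gap
```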

Regularization

DisturbLabel: Regularizing CNN on the Loss Layer [University of California & MSR] (2016)
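
DisturbLabel regularizes at the loss layer by occasionally replacing training labels with random ones. A minimal sketch, assuming the uniform-replacement reading of the paper; the noise rate alpha is illustrative:

```python
import numpy as np

def disturb_labels(labels, num_classes, alpha=0.1, rng=np.random.default_rng(0)):
    # With probability alpha, replace each label by one drawn uniformly
    # from all classes (the true label may be redrawn by chance).
    mask = rng.random(labels.shape) < alpha
    noise = rng.integers(0, num_classes, size=labels.shape)
    return np.where(mask, noise, labels)

labels = np.array([3, 1, 4, 1, 5, 9, 2, 6])
print(disturb_labels(labels, num_classes=10))
```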

Dropout

Improving neural networks by preventing co-adaptation of feature detectors (Dropout)

Regularization of Neural Networks using DropConnect

Regularizing neural networks with dropout and with DropConnect

Fast dropout training

Dropout as data augmentation

A Theoretically Grounded Application of Dropout in Recurrent Neural Networks

Improved Dropout for Shallow and Deep Learning
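
A minimal sketch of dropout in the inverted form most libraries use (the original paper instead rescales the weights at test time); the drop probability is illustrative:

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=np.random.default_rng(0)):
    # Inverted dropout: zero each unit with probability p during training
    # and rescale survivors by 1/(1-p), so test-time code is unchanged.
    if not training:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

x = np.ones((2, 8))
print(dropout(x))                  # roughly half the units zeroed, rest = 2.0
print(dropout(x, training=False))  # identity at test time
```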

Gradient Descent

Fitting a model via closed-form equations vs. Gradient Descent vs. Stochastic Gradient Descent vs. Mini-Batch Learning. What is the difference? (Normal Equations vs. GD vs. SGD vs. MB-GD)

http://sebastianraschka.com/faq/docs/closed-form-vs-gd.html

An Introduction to Gradient Descent in Python

Train faster, generalize better: Stability of stochastic gradient descent

A Variational Analysis of Stochastic Gradient Algorithms

The vanishing gradient problem: Oh no — an obstacle to deep learning!

Gradient Descent For Machine Learning

http://machinelearningmastery.com/gradient-descent-for-machine-learning/

Revisiting Distributed Synchronous SGD
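
To make the closed-form vs. GD vs. SGD vs. mini-batch comparison from the first link concrete, a minimal sketch on least-squares linear regression (learning rate, batch size, and epoch count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=200)

# Closed form (normal equations): solve X^T X w = X^T y directly
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Mini-batch SGD: one noisy gradient step per batch
w = np.zeros(3)
lr, batch = 0.05, 16
for epoch in range(50):
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch):
        b = idx[start:start + batch]
        grad = 2 * X[b].T @ (X[b] @ w - y[b]) / len(b)
        w -= lr * grad
# batch GD is the same loop with b = all rows; pure SGD uses batch = 1

print(w_closed.round(3), w.round(3))  # both close to w_true
```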

Accelerate Training

Acceleration of Deep Neural Network Training with Resistive Cross-Point Devices

Image Data Augmentation

DataAugmentation ver1.0: an image data augmentation tool for training image recognition algorithms

Caffe-Data-Augmentation: a branch of Caffe supporting data augmentation, using a configurable stochastic combination of 7 data augmentation techniques
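
Both tools compose simple stochastic, label-preserving transforms. A minimal sketch of the idea for images as (H, W, C) NumPy arrays; the transform set and probabilities here are illustrative, not the seven techniques from Caffe-Data-Augmentation:

```python
import numpy as np

def augment(img, rng=np.random.default_rng(0)):
    # Stochastically compose simple label-preserving transforms.
    if rng.random() < 0.5:
        img = img[:, ::-1]  # horizontal flip
    if rng.random() < 0.5:
        h, w = img.shape[:2]
        dy, dx = rng.integers(0, h // 8), rng.integers(0, w // 8)
        img = np.roll(img, (dy, dx), axis=(0, 1))  # crude random shift
    if rng.random() < 0.5:
        img = np.clip(img * rng.uniform(0.8, 1.2), 0, 255)  # brightness jitter
    return img

img = np.random.default_rng(1).integers(0, 256, size=(32, 32, 3)).astype(np.float64)
print(augment(img).shape)
```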

Papers

Scalable and Sustainable Deep Learning via Randomized Hashing

Tools

pastalog: Simple, realtime visualization of neural network training performance

torch-pastalog: A Torch interface for pastalog - simple, realtime visualization of neural network training performance
