(转) How to Train a GAN? Tips and tricks to make GANs work
How to Train a GAN? Tips and tricks to make GANs work
转自:https://github.com/soumith/ganhacks
While research in Generative Adversarial Networks (GANs) continues to improve the fundamental stability of these models, we use a bunch of tricks to train them and make them stable day to day.
Here are a summary of some of the tricks.
Here's a link to the authors of this document
If you find a trick that is particularly useful in practice, please open a Pull Request to add it to the document. If we find it to be reasonable and verified, we will merge it in.
1. Normalize the inputs
- normalize the images between -1 and 1
- Tanh as the last layer of the generator output
2: A modified loss function
In GAN papers, the loss function to optimize G is min (log 1-D)
, but in practice folks practically use max log D
- because the first formulation has vanishing gradients early on
- Goodfellow et. al (2014)
In practice, works well:
- Flip labels when training generator: real = fake, fake = real
3: Use a spherical Z
- Dont sample from a Uniform distribution
- Sample from a gaussian distribution
- When doing interpolations, do the interpolation via a great circle, rather than a straight line from point A to point B
- Tom White's Sampling Generative Networks has more details
4: BatchNorm
- Construct different mini-batches for real and fake, i.e. each mini-batch needs to contain only all real images or all generated images.
- when batchnorm is not an option use instance normalization (for each sample, subtract mean and divide by standard deviation).
5: Avoid Sparse Gradients: ReLU, MaxPool
- the stability of the GAN game suffers if you have sparse gradients
- LeakyReLU = good (in both G and D)
- For Downsampling, use: Average Pooling, Conv2d + stride
- For Upsampling, use: PixelShuffle, ConvTranspose2d + stride
- PixelShuffle: https://arxiv.org/abs/1609.05158
6: Use Soft and Noisy Labels
- Label Smoothing, i.e. if you have two target labels: Real=1 and Fake=0, then for each incoming sample, if it is real, then replace the label with a random number between 0.7 and 1.2, and if it is a fake sample, replace it with 0.0 and 0.3 (for example).
- Salimans et. al. 2016
- make the labels the noisy for the discriminator: occasionally flip the labels when training the discriminator
7: DCGAN / Hybrid Models
- Use DCGAN when you can. It works!
- if you cant use DCGANs and no model is stable, use a hybrid model : KL + GAN or VAE + GAN
8: Use stability tricks from RL
- Experience Replay
- Keep a replay buffer of past generations and occassionally show them
- Keep checkpoints from the past of G and D and occassionaly swap them out for a few iterations
- All stability tricks that work for deep deterministic policy gradients
- See Pfau & Vinyals (2016)
9: Use the ADAM Optimizer
- optim.Adam rules!
- See Radford et. al. 2015
- Use SGD for discriminator and ADAM for generator
10: Track failures early
- D loss goes to 0: failure mode
- check norms of gradients: if they are over 100 things are screwing up
- when things are working, D loss has low variance and goes down over time vs having huge variance and spiking
- if loss of generator steadily decreases, then it's fooling D with garbage (says martin)
11: Dont balance loss via statistics (unless you have a good reason to)
- Dont try to find a (number of G / number of D) schedule to uncollapse training
- It's hard and we've all tried it.
- If you do try it, have a principled approach to it, rather than intuition
For example
while lossD > A:
train D
while lossG > B:
train G
12: If you have labels, use them
- if you have labels available, training the discriminator to also classify the samples: auxillary GANs
13: Add noise to inputs, decay over time
- Add some artificial noise to inputs to D (Arjovsky et. al., Huszar, 2016)
- adding gaussian noise to every layer of generator (Zhao et. al. EBGAN)
- Improved GANs: OpenAI code also has it (commented out)
14: [notsure] Train discriminator more (sometimes)
- especially when you have noise
- hard to find a schedule of number of D iterations vs G iterations
15: [notsure] Batch Discrimination
- Mixed results
16: Discrete variables in Conditional GANs
- Use an Embedding layer
- Add as additional channels to images
- Keep embedding dimensionality low and upsample to match image channel size
Authors
- Soumith Chintala
- Emily Denton
- Martin Arjovsky
- Michael Mathieu
(转) How to Train a GAN? Tips and tricks to make GANs work的更多相关文章
- Matlab tips and tricks
matlab tips and tricks and ... page overview: I created this page as a vectorization helper but it g ...
- LoadRunner AJAX TruClient协议Tips and Tricks
LoadRunner AJAX TruClient协议Tips and Trickshttp://automationqa.com/forum.php?mod=viewthread&tid=2 ...
- Android Studio tips and tricks 翻译学习
Android Studio tips and tricks 翻译 这里是原文的链接. 正文: 如果你对Android Studio和IntelliJ不熟悉,本页提供了一些建议,让你可以从最常见的任务 ...
- Tips and Tricks for Debugging in chrome
Tips and Tricks for Debugging in chrome Pretty print On sources panel ,clicking on the {} on the bot ...
- [转]Tips——Chrome DevTools - 25 Tips and Tricks
Chrome DevTools - 25 Tips and Tricks 原文地址:https://www.keycdn.com/blog/chrome-devtools 如何打开? 1.从浏览器菜单 ...
- Nginx and PHP-FPM Configuration and Optimizing Tips and Tricks
原文链接:http://www.if-not-true-then-false.com/2011/nginx-and-php-fpm-configuration-and-optimizing-tips- ...
- 10 Essential TypeScript Tips And Tricks For Angular Devs
原文: https://www.sitepoint.com/10-essential-typescript-tips-tricks-angular/ ------------------------- ...
- WWDC笔记:2011 Session 125 UITableView Changes, Tips and Tricks
What’s New Automatic Dimensions - (CGFloat)tableView:(UITableView *)tableView heightForHeaderInSect ...
- C++ Tips and Tricks
整理了下在C++工程代码中遇到的技巧与建议. 0x00 巧用宏定义. 经常看见程序员用 enum 值,打印调试信息的时候又想打印数字对应的字符意思.见过有人写这样的代码 if(today == MON ...
随机推荐
- VM虚拟主机怎么设置网络
VMware是很受欢迎的虚拟机,在我们平时的工作中需要经常用到,此文简单总结了平时使用的三种网络配置方式,具体的原理没有去深究.我估计咱也研究不懂! 虚拟主机安装很简单,网上教程有很多,但是有很多新手 ...
- jquery.cookie.js 操作cookie实现记住密码功能的实现代码
jquery.cookie.js操作cookie实现记住密码功能,很简单很强大,喜欢的朋友可以参考下. 复制代码代码如下: //初始化页面时验证是否记住了密码 $(document).ready( ...
- 谈谈黑客攻防技术的成长规律(aullik5)
黑莓末路 昨晚听FM里谈到了RIM这家公司,有分析师认为它需要很悲催的裁员90%,才能保证活下去.这是一个意料之中,但又有点兔死狐悲的消息.可能在不久的将来,RIM这家公司就会走到尽头,或被收购,或申 ...
- 电子表格控件Spreadsheet 对象方法事件详细介绍
1.ActiveCell:返回代表活动单元格的Range只读对象.2.ActiveSheet:返回代表活动工作表的WorkSheet只读对象.3.ActiveWindow:返回表示当前窗口的Windo ...
- MySQL中的while、repeat、loop循环
循环一般在存储过程和存储函数中使用频繁,这里只给出最简单的示例 while delimiter $$ create procedure test_while() begin declare sum i ...
- HTML教程-各窗口间相互操作(Frame Target)
由Frames分出来的几个窗口的内容并不是静止不变的,往往一个窗口的内容随着另一个窗口的要求而不断变化,这就提高了Frames的利用价值.为了完成各窗口之间的相互操作,我们必须为每一个窗口起一个名字, ...
- poj1741 (点分治)
Problem Tree 题目大意 给一棵树,有边权.求树上距离小于等于K的点对有多少. 解题分析 点分治.对每一棵子树进行dfs,求出每棵子树的重心,继而转化为子问题. 对于经过根的路径i--j,令 ...
- css实现隐藏显示
<head> <meta http-equiv="content-type" content="text/html;charset=utf-8" ...
- Sqlserver2005附加数据库时出错提示操作系统错误5(拒绝访问)错误5120的解决办法
Sqlserver2005附加数据库时出错提示操作系统错误5(拒绝访问)错误5120的解决办法 最近几天从网上找了几个asp.net的登录案例想要研究研究代码,结果在用 Sql Server2005附 ...
- JS中直接从java后台获得对象的值(数组的值)
这里举得例子是:JS直接从后台Contorller中(SpringMVC中的model中)获得数值的值 Contorller 此处将 talentIntegralRecordsDay talentIn ...