[综述]Deep Compression/Acceleration深度压缩/加速/量化
Survey
Recent Advances in Efficient Computation of Deep Convolutional Neural Networks, [arxiv '18]
A Survey of Model Compression and Acceleration for Deep Neural Networks [arXiv '17]
Quantization
- The ZipML Framework for Training Models with End-to-End Low Precision: The Cans, the Cannots, and a Little Bit of Deep Learning [ICML'17]
- Compressing Deep Convolutional Networks using Vector Quantization [arXiv'14]
- Quantized Convolutional Neural Networks for Mobile Devices [CVPR '16]
- Fixed-Point Performance Analysis of Recurrent Neural Networks [ICASSP'16]
- Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations [arXiv'16]
- Loss-aware Binarization of Deep Networks [ICLR'17]
- Towards the Limit of Network Quantization [ICLR'17]
- Deep Learning with Low Precision by Half-wave Gaussian Quantization [CVPR'17]
- ShiftCNN: Generalized Low-Precision Architecture for Inference of Convolutional Neural Networks [arXiv'17]
- Training and Inference with Integers in Deep Neural Networks [ICLR'18]
- Deep Learning with Limited Numerical Precision[ICML'2015]
- Model compression via distillation and quantization [ICLR '18]
- Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy [ICLR '18]
- On the Universal Approximability of Quantized ReLU Neural Networks [arXiv '18]
- Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference [CVPR '18]
- Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 [NIPS '16]
- XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks [ECCV '16]
- Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration [CVPR '17]
- Maxout Networks
- BinaryConnect: Training Deep Neural Networks with binary weights during propagations
- Ternary weight networks
- From Hashing to CNNs: Training Binary Weight Networks via Hashing
- Learning Accurate Low-Bit Deep Neural Networks with Stochastic Quantization
- TRAINED TERNARY QUANTIZATION
- DOREFA-NET: TRAINING LOW BITWIDTH CONVOLUTIONAL NEURAL NETWORKS WITH LOW BITWIDTH GRADIENTS
- Two-Step Quantization for Low-bit Neural Networks
- LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks
- Fixed-point Factorized Networks
- INCREMENTAL NETWORK QUANTIZATION: TOWARDS LOSSLESS CNNS WITH LOW-PRECISION WEIGHTS
- Network Sketching: Exploiting Binary Structure in Deep CNNs
- Towards Effective Low-bitwidth Convolutional Neural Networks
- SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks
- Very deep convolutional networks for large-scale image recognition
- Towards Accurate Binary Convolutional Neural Network
- Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation
Pruning
- Learning both Weights and Connections for Efficient Neural Networks [NIPS'15]
- Pruning Filters for Efficient ConvNets [ICLR'17]
- Pruning Convolutional Neural Networks for Resource Efficient Inference [ICLR'17]
- Soft Weight-Sharing for Neural Network Compression [ICLR'17]
- Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding [ICLR'16]
- Dynamic Network Surgery for Efficient DNNs [NIPS'16]
- Designing Energy-Efficient Convolutional Neural Networks using Energy-Aware Pruning [CVPR'17]
- ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression [ICCV'17]
- To prune, or not to prune: exploring the efficacy of pruning for model compression [ICLR'18]
- Data-Driven Sparse Structure Selection for Deep Neural Networks [arXiv '17]
- Learning Structured Sparsity in Deep Neural Networks [NIPS '16]
- Scalpel: Customizing DNN Pruning to the Underlying Hardware Parallelism [ISCA '17]
- Channel Pruning for Accelerating Very Deep Neural Networks [ICCV '17]
- Learning Efficient Convolutional Networks through Network Slimming [ICCV '17]
- NISP: Pruning Networks using Neuron Importance Score Propagation [CVPR '18]
- Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers [ICLR '18]
- MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks [arXiv '17]
- Efficient Sparse-Winograd Convolutional Neural Networks [ICLR '18]
Low-rank Approximation
- Efficient and Accurate Approximations of Nonlinear Convolutional Networks [CVPR'15]
- Accelerating Very Deep Convolutional Networks for Classification and Detection (Extended version of above one)
- Convolutional neural networks with low-rank regularization [arXiv'15]
- Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation [NIPS'14]
- Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications [ICLR'16]
- High performance ultra-low-precision convolutions on mobile devices [NIPS'17]
- Speeding up convolutional neural networks with low rank expansions
- Coordinating Filters for Faster Deep Neural Networks [ICCV '17]
Knowledge Distillation
- Dark knowledge
- FitNets: Hints for Thin Deep Nets [ICLR '15]
- Net2net: Accelerating learning via knowledge transfer [ICLR '16]
- Distilling the Knowledge in a Neural Network [NIPS '15]
- MobileID: Face Model Compression by Distilling Knowledge from Neurons [AAAI '16]
- DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer [arXiv '17]
- Deep Model Compression: Distilling Knowledge from Noisy Teachers [arXiv '16]
- Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer [ICLR '17]
- Like What You Like: Knowledge Distill via Neuron Selectivity Transfer [arXiv '17]
- Learning Efficient Object Detection Models with Knowledge Distillation [NIPS '17]
- Data-Free Knowledge Distillation For Deep Neural Networks [NIPS '17]
- A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learnin [CVPR '17]
- Moonshine: Distilling with Cheap Convolutions [arXiv '17]
- Model compression via distillation and quantization [ICLR '18]
- Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy [ICLR '18]
Miscellaneous
- Beyond Filters: Compact Feature Map for Portable Deep Model [ICML '17]
- SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization [ICML '17]
Reference
- [1] http://chenrudan.github.io/blog/2018/10/02/networkquantization.html
- [2] https://github.com/TerryLoveMl/Model-Compression-Papers
[综述]Deep Compression/Acceleration深度压缩/加速/量化的更多相关文章
- Deep Learning(深度学习)学习笔记整理
申明:本文非笔者原创,原文转载自:http://www.sigvc.org/bbs/thread-2187-1-3.html 4.2.初级(浅层)特征表示 既然像素级的特征表示方法没有作用,那怎样的表 ...
- 【转载】Deep Learning(深度学习)学习笔记整理
http://blog.csdn.net/zouxy09/article/details/8775360 一.概述 Artificial Intelligence,也就是人工智能,就像长生不老和星际漫 ...
- Deep Learning(深度学习)学习笔记整理系列之(八)
Deep Learning(深度学习)学习笔记整理系列 zouxy09@qq.com http://blog.csdn.net/zouxy09 作者:Zouxy version 1.0 2013-04 ...
- DEEP COMPRESSION小记
2016ICLR最佳论文 Deep Compression: Compression Deep Neural Networks With Pruning, Trained Quantization A ...
- [转载]Deep Learning(深度学习)学习笔记整理
转载自:http://blog.csdn.net/zouxy09/article/details/8775360 感谢原作者:zouxy09@qq.com 八.Deep learning训练过程 8. ...
- 【转】Deep Learning(深度学习)学习笔记整理系列之(八)
十.总结与展望 1)Deep learning总结 深度学习是关于自动学习要建模的数据的潜在(隐含)分布的多层(复杂)表达的算法.换句话来说,深度学习算法自动的提取分类需要的低层次或者高层次特征. 高 ...
- Deep Learning(深度学习)学习系列
目录: 一.概述 二.背景 三.人脑视觉机理 四.关于特征 4.1.特征表示的粒度 4.2.初级(浅层)特征表示 4.3.结构性特征表示 4.4 ...
- CUDA上深度学习模型量化的自动化优化
CUDA上深度学习模型量化的自动化优化 深度学习已成功应用于各种任务.在诸如自动驾驶汽车推理之类的实时场景中,模型的推理速度至关重要.网络量化是加速深度学习模型的有效方法.在量化模型中,数据和模型参数 ...
- Deep Learning(深度学习)整理,RNN,CNN,BP
申明:本文非笔者原创,原文转载自:http://www.sigvc.org/bbs/thread-2187-1-3.html 4.2.初级(浅层)特征表示 既然像素级的特征表示方法没有作用,那怎 ...
随机推荐
- Python中eval函数的作用
eval eval函数就是实现list.dict.tuple与str之间的转化str函数把list,dict,tuple转为为字符串# 字符串转换成列表a = "[[1,2], [3,4], ...
- 您必须知道的 Git 分支开发规范
Git 是目前最流行的源代码管理工具. 为规范开发,保持代码提交记录以及 git 分支结构清晰,方便后续维护,现规范 git 的相关操作. 分支管理 分支命名 master 分支 master 为主分 ...
- MUI框架 按钮点击响应不好的问题解决办法
MUI框架 按钮点击响应不好的问题 实际例子: $(function (){ mui(document.body).on('tap', '.bindchk', function(e) { //触发一次 ...
- thinkphp 5.0 在appache下隐藏index.php入口代码
一.在appache的配置文件httpd.conf中开启rewrite_module 二.启用.htaccess的配置 启用.htaccess,需要修改httpd.conf,启用AllowOverri ...
- 第五周博客作业<西北师范大学|李晓婷>
1.助教博客链接:https://home.cnblogs.com/u/lxt-/ 2.作业要求链接:https://www.cnblogs.com/nwnu-daizh/p/10527959.htm ...
- 转载:Centos升级gcc
一.检查centos 里面是否安装了gcc g++ 输入命令:rpm -qa|grep gcc*有看到就出来gcc的东西就是装了没有的话就yum install gcc* -y 二.升级gcc 对于C ...
- 初识Kotlin之变量
用Java开发了很多年,因为工作的需要学习Kotlin.初识Kotlin时是各种不习惯,觉得这个语言相对于Java而言并不够严谨.随着不断的深入,最终还是逃不过"真香定理".我一直 ...
- 分布式协调服务Zookeeper扫盲篇
分布式协调服务Zookeeper扫盲篇 作者:尹正杰 版权声明:原创作品,谢绝转载!否则将追究法律责任. 身为运维工程师对kubernetes(k8s)可能比较熟,那么etcd(go语言实现)分布式协 ...
- Helm包管理工具(简介、安装、方法)
认识Helm 每次我们要部署一个应用都需要写一个配置清单(维护一套yaml文件),但是每个环境又不一样.部署一套新的环境成本是真的很高.如果我们能够使用类似于yum的工具来安装我们的应用的话那就太好了 ...
- Java基础知识拾遗(二)
Lambda表达式 lambda表达式本质上就是一个匿名方法.但是这个方法不是独立执行的,而是构成了一个函数式接口定义的抽象方法的实现,该函数式接口定义了它的目标类型. 只有在定义了lambda表达式 ...