(转)Awesome Knowledge Distillation
Awesome Knowledge Distillation
2018-07-19 10:38:40
Reference:https://github.com/dkozlov/awesome-knowledge-distillation
Papers
- Combining labeled and unlabeled data with co-training, A. Blum, T. Mitchell, 1998
- Model Compression, Rich Caruana, 2006
- Dark knowledge, Geoffrey Hinton , OriolVinyals & Jeff Dean, 2014
- Learning with Pseudo-Ensembles, Philip Bachman, Ouais Alsharif, Doina Precup, 2014
- Distilling the Knowledge in a Neural Network, Hinton, J.Dean, 2015
- Cross Modal Distillation for Supervision Transfer, Saurabh Gupta, Judy Hoffman, Jitendra Malik, 2015
- Heterogeneous Knowledge Transfer in Video Emotion Recognition, Attribution and Summarization, Baohan Xu, Yanwei Fu, Yu-Gang Jiang, Boyang Li, Leonid Sigal, 2015
- Distilling Model Knowledge, George Papamakarios, 2015
- Unifying distillation and privileged information, David Lopez-Paz, Léon Bottou, Bernhard Schölkopf, Vladimir Vapnik, 2015
- Learning Using Privileged Information: Similarity Control and Knowledge Transfer, Vladimir Vapnik, Rauf Izmailov, 2015
- Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks, Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, Ananthram Swami, 2016
- Do deep convolutional nets really need to be deep and convolutional?, Gregor Urban, Krzysztof J. Geras, Samira Ebrahimi Kahou, Ozlem Aslan, Shengjie Wang, Rich Caruana, Abdelrahman Mohamed, Matthai Philipose, Matt Richardson, 2016
- Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, Sergey Zagoruyko, Nikos Komodakis, 2016
- FitNets: Hints for Thin Deep Nets, Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, Yoshua Bengio, 2015
- Deep Model Compression: Distilling Knowledge from Noisy Teachers, Bharat Bhusan Sau, Vineeth N. Balasubramanian, 2016
- Knowledge Distillation for Small-footprint Highway Networks, Liang Lu, Michelle Guo, Steve Renals, 2016
- Sequence-Level Knowledge Distillation, deeplearning-papernotes, Yoon Kim, Alexander M. Rush, 2016
- MobileID: Face Model Compression by Distilling Knowledge from Neurons, Ping Luo, Zhenyao Zhu, Ziwei Liu, Xiaogang Wang and Xiaoou Tang, 2016
- Recurrent Neural Network Training with Dark Knowledge Transfer, Zhiyuan Tang, Dong Wang, Zhiyong Zhang, 2016
- Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, Sergey Zagoruyko, Nikos Komodakis, 2016
- Adapting Models to Signal Degradation using Distillation, Jong-Chyi Su, Subhransu Maji,2016
- Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results, Antti Tarvainen, Harri Valpola, 2017
- Data-Free Knowledge Distillation For Deep Neural Networks, Raphael Gontijo Lopes, Stefano Fenu, 2017
- Like What You Like: Knowledge Distill via Neuron Selectivity Transfer, Zehao Huang, Naiyan Wang, 2017
- Learning Loss for Knowledge Distillation with Conditional Adversarial Networks, Zheng Xu, Yen-Chang Hsu, Jiawei Huang, 2017
- DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang, 2017
- Knowledge Projection for Deep Neural Networks, Zhi Zhang, Guanghan Ning, Zhihai He, 2017
- Moonshine: Distilling with Cheap Convolutions, Elliot J. Crowley, Gavin Gray, Amos Storkey, 2017
- Local Affine Approximators for Improving Knowledge Transfer, Suraj Srinivas and Francois Fleuret, 2017
- Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model, Jiasen Lu1, Anitha Kannan, Jianwei Yang, Devi Parikh, Dhruv Batra 2017
- Learning Efficient Object Detection Models with Knowledge Distillation, Guobin Chen, Wongun Choi, Xiang Yu, Tony Han, Manmohan Chandraker, 2017
- Model Distillation with Knowledge Transfer from Face Classification to Alignment and Verification, Chong Wang, Xipeng Lan and Yangang Zhang, 2017
- Learning Transferable Architectures for Scalable Image Recognition, Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc V. Le, 2017
- Revisiting knowledge transfer for training object class detectors, Jasper Uijlings, Stefan Popov, Vittorio Ferrari, 2017
- A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning, Junho Yim, Donggyu Joo, Jihoon Bae, Junmo Kim, 2017
- Rocket Launching: A Universal and Efficient Framework for Training Well-performing Light Net, Zihao Liu, Qi Liu, Tao Liu, Yanzhi Wang, Wujie Wen, 2017
- Data Distillation: Towards Omni-Supervised Learning, Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, Kaiming He, 2017
- Interpreting Deep Classifiers by Visual Distillation of Dark Knowledge, Kai Xu, Dae Hoon Park, Chang Yi, Charles Sutton, 2018
- Efficient Neural Architecture Search via Parameters Sharing, Hieu Pham, Melody Y. Guan, Barret Zoph, Quoc V. Le, Jeff Dean, 2018
- Transparent Model Distillation, Sarah Tan, Rich Caruana, Giles Hooker, Albert Gordo, 2018
- Defensive Collaborative Multi-task Training - Defending against Adversarial Attack towards Deep Neural Networks, Derek Wang, Chaoran Li, Sheng Wen, Yang Xiang, Wanlei Zhou, Surya Nepal, 2018
- Deep Co-Training for Semi-Supervised Image Recognition, Siyuan Qiao, Wei Shen, Zhishuai Zhang, Bo Wang, Alan Yuille, 2018
- Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples, Zihao Liu, Qi Liu, Tao Liu, Yanzhi Wang, Wujie Wen, 2018
- Multimodal Recurrent Neural Networks with Information Transfer Layers for Indoor Scene Labeling, Abrar H. Abdulnabi, Bing Shuai, Zhen Zuo, Lap-Pui Chau, Gang Wang, 2018
Videos
- Dark knowledge, Geoffrey Hinton, 2014
- Model Compression, Rich Caruana, 2016
Implementations
MXNet
PyTorch
- Attention Transfer
- Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model
- Interpreting Deep Classifier by Visual Distillation of Dark Knowledge
- A PyTorch implementation for exploring deep and shallow knowledge distillation (KD) experiments with flexibility
- Mean teachers are better role models
Lua
Torch
- Distilling knowledge to specialist ConvNets for clustered classification
- Sequence-Level Knowledge Distillation, Neural Machine Translation on Android
- cifar.torch distillation
Theano
- FitNets: Hints for Thin Deep Nets
- Transfer knowledge from a large DNN or an ensemble of DNNs into a small DNN
Lasagne + Theano
Tensorflow
- Deep Model Compression: Distilling Knowledge from Noisy Teachers
- Distillation
- An example application of neural network distillation to MNIST
- Data-free Knowledge Distillation for Deep Neural Networks
- Inspired by net2net, network distillation
- Deep Reinforcement Learning, knowledge transfer
- Knowledge Distillation using Tensorflow
Caffe
- Face Model Compression by Distilling Knowledge from Neurons
- KnowledgeDistillation Layer (Caffe implementation)
- Knowledge distillation, realized in caffe
- Cross Modal Distillation for Supervision Transfer
Keras
- Knowledge distillation with Keras
- keras google-vision's distillation
- Distilling the knowledge in a Neural Network
(转)Awesome Knowledge Distillation的更多相关文章
- Focal and Global Knowledge Distillation for Detectors
一. 概述 论文地址:链接 代码地址:链接 论文简介: 此篇论文是在CGNet上增加部分限制loss而来 核心部分是将gt框变为mask进行蒸馏 注释:仅为阅读论文和代码,未进行试验,如有漏错请不吝指 ...
- Feature Fusion for Online Mutual Knowledge Distillation (CVPR 2019)
一.解决问题 如何将特征融合与知识蒸馏结合起来,提高模型性能 二.创新点 支持多子网络分支的在线互学习 子网络可以是相同结构也可以是不同结构 应用特征拼接.depthwise+pointwise,将特 ...
- The Brain as a Universal Learning Machine
The Brain as a Universal Learning Machine This article presents an emerging architectural hypothesis ...
- Classifying plankton with deep neural networks
Classifying plankton with deep neural networks The National Data Science Bowl, a data science compet ...
- [综述]Deep Compression/Acceleration深度压缩/加速/量化
Survey Recent Advances in Efficient Computation of Deep Convolutional Neural Networks, [arxiv '18] A ...
- Bag of Tricks for Image Classification with Convolutional Neural Networks论文笔记
一.高效的训练 1.Large-batch training 使用大的batch size可能会减小训练过程(收敛的慢?我之前训练的时候挺喜欢用较大的batch size),即在相同的迭代次数 ...
- 论文笔记:Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells
Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells 2019-04- ...
- ICML 2018 | 从强化学习到生成模型:40篇值得一读的论文
https://blog.csdn.net/y80gDg1/article/details/81463731 感谢阅读腾讯AI Lab微信号第34篇文章.当地时间 7 月 10-15 日,第 35 届 ...
- 【转载】NeurIPS 2018 | 腾讯AI Lab详解3大热点:模型压缩、机器学习及最优化算法
原文:NeurIPS 2018 | 腾讯AI Lab详解3大热点:模型压缩.机器学习及最优化算法 导读 AI领域顶会NeurIPS正在加拿大蒙特利尔举办.本文针对实验室关注的几个研究热点,模型压缩.自 ...
随机推荐
- win10系统进入BIOS
按住shift+重启,在重启过程中界面会出现“疑难解答”,点击后,在新的界面点击“高级选项”,之后在新界面上点击“UEFI固件设置”,最后点击重启,重启过程中点击Delete键,就进入了BIOS界面了 ...
- Centos 6.5初始化配置
安装好centos 6.5 # -*- coding:utf-8 -*- import win32api import time import os from Tkinter import * are ...
- uva11990 动态逆序对
这题说的是给了一个数组,按照他给的顺序依次删除数,在删除之前输出此时的逆序对个数 我们用Fenwick树 维护这整个数列, C[i]是一个 treap的头, 管理了在树状数组中 能影响他的点,然后我们 ...
- 【Hadoop学习之一】Hadoop介绍
一.概念 Hadoop是一个能够对大量数据进行分布式处理的软件框架,充分利用集群的威力进行高速运算和存储. 二.主要模块Hadoop Common:支持其他Hadoop模块的常用实用程序.Hadoop ...
- mysql 问题:Unknown system variable 'query_cache_size'
报错:Unknown system variable 'query_cache_size' mysql 的 java 驱动等级比较低,与mysql 数据库不匹配.
- Hive中实现group concat功能(不用udf)
在 Hive 中实现将一个字段的多条记录拼接成一个记录: hive> desc t; OK id string str string Time taken: 0.249 seconds hive ...
- 调用微信JS-SDK配置签名
前后端进行分开开发: 1:后端实现获取 +++接口凭证:access_token (公众号的全局唯一接口调用凭据) ** GET 获取:https://api.weixin.qq.com/cgi-bi ...
- linux系统Centos环境下搭建SVN服务器及权限配置
linux系统Centos环境下如何搭建SVN服务器以及svnserve.conf.authz.passwd配置文件详细介绍 至于svn的概念,这里就不做详细阐述了,可以自行百度.简单来讲就是一个 ...
- slideDown留言板
<!doctype html> <html lang="en"> <head> <meta http-equiv="Conten ...
- The Little Prince-11/28
The Little Prince-11/28 Today I find some beautiful words from the book. You know -- one loves the s ...