Contents

Attention

  • Recurrent Models of Visual Attention [NIPS 2014, DeepMind]
  • Neural Machine Translation by Jointly Learning to Align and Translate [ICLR 2015]

OverallSurvey

  • Efficient Transformers: A Survey [paper]
  • A Survey on Visual Transformer [paper]
  • Transformers in Vision: A Survey [paper]

NLP

Language

  • Sequence to Sequence Learning with Neural Networks [NIPS 2014] [paper] [code]
  • End-To-End Memory Networks [NIPS 2015] [paper] [code]
  • Attention is all you need [NIPS 2017] [paper] [code]
  • Bidirectional Encoder Representations from Transformers: BERT [paper] [code] [pretrained-models]
  • Reformer: The Efficient Transformer [ICLR2020] [paper] [code]
  • Linformer: Self-Attention with Linear Complexity [AAAI2020] [paper] [code]
  • GPT-3: Language Models are Few-Shot Learners [NIPS 2020] [paper] [code]

Speech

  • Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation [INTERSPEECH 2020] [paper] [code]

CV

Backbone_Classification

Papers and Codes

  • CoaT: Co-Scale Conv-Attentional Image Transformers [arxiv 2021] [paper] [code]
  • SiT: Self-supervised vIsion Transformer [arxiv 2021] [paper] [code]
  • ViT: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale [ICLR 2021] [paper] [code]
    • Trained with extra private data; does not generalize well when trained on insufficient amounts of data
  • DeiT: Data-efficient Image Transformers [arxiv2021] [paper] [code]
    • Uses a token-based distillation strategy and builds upon ViT and convolutional models
  • Transformer in Transformer [arxiv 2021] [paper] [code1] [code-official]
  • OmniNet: Omnidirectional Representations from Transformers [arxiv2021] [paper]
  • Gaussian Context Transformer [CVPR 2021] [paper]
  • General Multi-Label Image Classification With Transformers [CVPR 2021] [paper] [code]
  • Scaling Local Self-Attention for Parameter Efficient Visual Backbones [CVPR 2021] [paper]
  • T2T-ViT: Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet [ICCV 2021] [paper] [code]
  • Swin Transformer: Hierarchical Vision Transformer using Shifted Windows [ICCV 2021] [paper] [code]
  • Bias Loss for Mobile Neural Networks [ICCV 2021] [paper] [code]
  • Vision Transformer with Progressive Sampling [ICCV 2021] [paper] [code: https://github.com/yuexy/PS-ViT]
  • Rethinking Spatial Dimensions of Vision Transformers [ICCV 2021] [paper] [code]
  • Rethinking and Improving Relative Position Encoding for Vision Transformer [ICCV 2021] [paper] [code]

Interesting Repos

Self-Supervised

  • Emerging Properties in Self-Supervised Vision Transformers [ICCV 2021] [paper] [code]
  • An Empirical Study of Training Self-Supervised Vision Transformers [ICCV 2021] [paper] [code]

Interpretability and Robustness

  • Transformer Interpretability Beyond Attention Visualization [CVPR 2021] [paper] [code]
  • On the Adversarial Robustness of Visual Transformers [arxiv 2021] [paper]
  • Robustness Verification for Transformers [ICLR 2020] [paper] [code]
  • Pretrained Transformers Improve Out-of-Distribution Robustness [ACL 2020] [paper] [code]

Detection

  • DETR: End-to-End Object Detection with Transformers [ECCV2020] [paper] [code]
  • Deformable DETR: Deformable Transformers for End-to-End Object Detection [ICLR2021] [paper] [code]
  • End-to-End Object Detection with Adaptive Clustering Transformer [arxiv2020] [paper]
  • UP-DETR: Unsupervised Pre-training for Object Detection with Transformers [arxiv2020] [paper]
  • Rethinking Transformer-based Set Prediction for Object Detection [arxiv2020] [paper] [zhihu]
  • End-to-end Lane Shape Prediction with Transformers [WACV 2021] [paper] [code]
  • ViT-FRCNN: Toward Transformer-Based Object Detection [arxiv2020] [paper]
  • Line Segment Detection Using Transformers [CVPR 2021] [paper] [code]
  • Facial Action Unit Detection With Transformers [CVPR 2021] [paper] [code]
  • Adaptive Image Transformer for One-Shot Object Detection [CVPR 2021] [paper] [code]
  • Self-attention based Text Knowledge Mining for Text Detection [CVPR 2021] [paper] [code]
  • Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions [ICCV 2021] [paper] [code]
  • Group-Free 3D Object Detection via Transformers [ICCV 2021] [paper] [code]
  • Fast Convergence of DETR with Spatially Modulated Co-Attention [ICCV 2021] [paper] [code]

HOI

  • End-to-End Human Object Interaction Detection with HOI Transformer [CVPR 2021] [paper] [code]
  • HOTR: End-to-End Human-Object Interaction Detection with Transformers [CVPR 2021] [paper] [code]

Tracking

  • Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking [CVPR 2021] [paper] [code]
  • TransTrack: Multiple-Object Tracking with Transformer [CVPR 2021] [paper] [code]
  • Transformer Tracking [CVPR 2021] [paper] [code]
  • Learning Spatio-Temporal Transformer for Visual Tracking [ICCV 2021] [paper] [code]

Segmentation

  • SETR: Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers [CVPR 2021] [paper] [code]
  • Trans2Seg: Transparent Object Segmentation with Transformer [arxiv2021] [paper] [code]
  • End-to-End Video Instance Segmentation with Transformers [arxiv2020] [paper] [zhihu]
  • MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers [CVPR 2021] [paper] [official-code] [unofficial-code]
  • Medical Transformer: Gated Axial-Attention for Medical Image Segmentation [arxiv 2020] [paper] [code]
  • SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation [CVPR 2021] [paper] [code]

Reid

  • Diverse Part Discovery: Occluded Person Re-Identification With Part-Aware Transformer [CVPR 2021] [paper] [code]

Localization

  • LoFTR: Detector-Free Local Feature Matching with Transformers [CVPR 2021] [paper] [code]
  • MIST: Multiple Instance Spatial Transformer [CVPR 2021] [paper] [code]

Generation

Inpainting

  • STTN: Learning Joint Spatial-Temporal Transformations for Video Inpainting [ECCV 2020] [paper] [code]

Image enhancement

  • Pre-Trained Image Processing Transformer [CVPR 2021] [paper]
  • TTSR: Learning Texture Transformer Network for Image Super-Resolution [CVPR2020] [paper] [code]

Pose Estimation

  • Pose Recognition with Cascade Transformers [CVPR 2021] [paper] [code]
  • TransPose: Towards Explainable Human Pose Estimation by Transformer [arxiv 2020] [paper] [code]
  • Hand-Transformer: Non-Autoregressive Structured Modeling for 3D Hand Pose Estimation [ECCV 2020] [paper]
  • HOT-Net: Non-Autoregressive Transformer for 3D Hand-Object Pose Estimation [ACMMM 2020] [paper]
  • End-to-End Human Pose and Mesh Reconstruction with Transformers [CVPR 2021] [paper] [code]
  • 3D Human Pose Estimation with Spatial and Temporal Transformers [arxiv 2020] [paper] [code]
  • End-to-End Trainable Multi-Instance Pose Estimation with Transformers [arxiv 2020] [paper]

Face

  • Robust Facial Expression Recognition with Convolutional Visual Transformers [arxiv 2020] [paper]
  • Clusformer: A Transformer Based Clustering Approach to Unsupervised Large-Scale Face and Visual Landmark Recognition [CVPR 2021] [paper] [code]

Video Understanding

  • Is Space-Time Attention All You Need for Video Understanding? [arxiv 2020] [paper] [code]
  • Temporal-Relational CrossTransformers for Few-Shot Action Recognition [CVPR 2021] [paper] [code]
  • Self-Supervised Video Hashing via Bidirectional Transformers [CVPR 2021] [paper]
  • SSAN: Separable Self-Attention Network for Video Representation Learning [CVPR 2021] [paper]

Depth-Estimation

  • AdaBins: Depth Estimation using Adaptive Bins [CVPR 2021] [paper] [code]

Prediction

  • Multimodal Motion Prediction with Stacked Transformers [CVPR 2021] [paper] [code]
  • Deep Transformer Models for Time Series Forecasting: The Influenza Prevalence Case [paper]
  • Transformer networks for trajectory forecasting [ICPR 2020] [paper] [code]
  • Spatial-Channel Transformer Network for Trajectory Prediction on the Traffic Scenes [arxiv 2021] [paper] [code]
  • Pedestrian Trajectory Prediction using Context-Augmented Transformer Networks [ICRA 2020] [paper] [code]
  • Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction [ECCV 2020] [paper] [code]
  • Hierarchical Multi-Scale Gaussian Transformer for Stock Movement Prediction [paper]
  • Single-Shot Motion Completion with Transformer [arxiv2021] [paper] [code]

NAS

PointCloud

  • Multi-Modal Fusion Transformer for End-to-End Autonomous Driving [CVPR 2021] [paper] [code]
  • Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos [CVPR 2021] [paper]

Fashion

  • Kaleido-BERT: Vision-Language Pre-training on Fashion Domain [CVPR 2021] [paper] [code]

Medical

  • Lesion-Aware Transformers for Diabetic Retinopathy Grading [CVPR 2021] [paper]

Cross-Modal

  • Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers [CVPR 2021] [paper]
  • Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning [CVPR2021] [paper] [code]
  • Topological Planning With Transformers for Vision-and-Language Navigation [CVPR 2021] [paper]
  • Multi-Stage Aggregated Transformer Network for Temporal Language Localization in Videos [CVPR 2021] [paper]
  • VLN BERT: A Recurrent Vision-and-Language BERT for Navigation [CVPR 2021] [paper] [code]
  • Less Is More: ClipBERT for Video-and-Language Learning via Sparse Sampling [CVPR 2021] [paper] [code]

Reference
