
Paper information

Title: Adaptive prototype and consistency alignment for semi-supervised domain adaptation
Authors: Jihong Ouyang, Zhengjie Zhang, Qingyi Meng
Source: arXiv, 2023
Paper: download
Code: download
Video walkthrough: click

1 Introduction

2 Problem definition

　　Formally, the semi-supervised domain adaptation (SSDA) setting consists of a labeled source domain $\mathcal{D}_{s}=\left\{\left(x_{i}^{s}, y_{i}^{s}\right)\right\}_{i=1}^{n_{s}}$ drawn from the distribution $P$. For the target domain, a labeled set $\mathcal{D}_{t}=\left\{\left(x_{i}^{t}, y_{i}^{t}\right)\right\}_{i=1}^{n_{t}}$ and an unlabeled set $\mathcal{D}_{u}=\left\{x_{i}^{u}\right\}_{i=1}^{n_{u}}$ drawn from the distribution $Q$ are given. The source and target domains share the same label space $\mathcal{Y}=\{1,2, \ldots, K\}$. Usually, the number of labeled samples in $\mathcal{D}_{t}$ is very small, e.g., one or three samples per class. SSDA aims to train a model on $\mathcal{D}_{s}$, $\mathcal{D}_{t}$ and $\mathcal{D}_{u}$ that correctly predicts the labels of the samples in $\mathcal{D}_{u}$.

3 Method

3.1 Model framework

3.2 Supervised training

　　Prototype classifier (shallow):

　　　　$\mathbf{p}(\mathbf{x})=\sigma\left(\frac{\mathbf{W}^{\mathrm{T}} \ell_{2}(F(\mathbf{x}))}{T}\right) \quad\quad(1)$
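　　A minimal sketch of Eq. (1) in plain Python, for illustration only. It assumes $\sigma$ is the softmax, that each row of `W` holds the weight vector of one class, and that $T=0.05$; the temperature value and the helper names are assumptions of this sketch, not taken from the notes above.

```python
import math

def l2_normalize(v):
    """Scale a feature vector to unit L2 norm (the l_2(.) in Eq. 1)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    s = sum(exps)
    return [e / s for e in exps]

def prototype_classifier(feature, W, T=0.05):
    """Class probabilities from similarity between the normalized feature
    and each class weight vector, sharpened by temperature T (assumed value)."""
    f = l2_normalize(feature)
    logits = [sum(w_d * f_d for w_d, f_d in zip(w, f)) / T for w in W]
    return softmax(logits)

# Toy usage: a 2-D feature and two class weight vectors.
probs = prototype_classifier([3.0, 4.0], [[1.0, 0.0], [0.0, 1.0]], T=0.05)
```

　　Because the feature is normalized, only its direction matters; a small $T$ makes the resulting distribution sharper.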

　　Supervised training on the labeled source and target data:

　　　　$\mathcal{L}_{C E}=-\mathbb{E}_{(\mathbf{x}, y) \in \mathcal{D}_{s}, \mathcal{D}_{t}} y \log (\mathbf{p}(\mathbf{x})) \quad\quad(2)$
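　　Eq. (2) can be sketched as an average negative log-likelihood over the labeled source and labeled target samples taken together. The function name and the toy probabilities below are hypothetical; labels are assumed to be 0-based class indices.

```python
import math

def cross_entropy(probs_batch, labels):
    """Mean of -log p(x)[y] over a batch of (probability vector, label) pairs."""
    losses = [-math.log(p[y]) for p, y in zip(probs_batch, labels)]
    return sum(losses) / len(losses)

# Toy usage: one labeled source sample and one labeled target sample.
src_probs, src_labels = [[0.5, 0.5]], [0]
tgt_probs, tgt_labels = [[0.25, 0.75]], [1]
l_ce = cross_entropy(src_probs + tgt_probs, src_labels + tgt_labels)
```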

3.3 Adaptive prototype alignment

　　Compute the prototypes from the labeled target data:

　　　　$\mathbf{c}_{k}^{\mathcal{T}}=\frac{1}{\left|\mathcal{D}_{k}\right|} \sum_{\left(x_{i}^{t}, y_{i}^{t}\right) \in \mathcal{D}_{k}} F\left(x_{i}^{t}\right)\quad\quad(3)$

　　Compute the prototypes from the unlabeled target data (at the mini-batch level):

　　　　$c_{k}^{u}=\frac{\sum_{i \in B_{t}} \mathbb{1}_{\left[k=\hat{y}_{i}\right]} F\left(x_{i}^{u}\right)}{\sum_{i \in B_{t}} \mathbb{1}_{\left[k=\hat{y}_{i}\right]}}\quad\quad(4)$

　　Note: the unlabeled target samples take their pseudo-labels $\hat{y}_{i}$ from the classifier.
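　　The class-mean computation of Eq. (4) can be sketched as follows. This is an illustration under two assumptions: the pseudo-label is the argmax of the classifier output, and a class with no pseudo-labeled sample in the batch returns `None` (how such classes are handled is not stated above). Eq. (3) is the same averaging with ground-truth labels in place of pseudo-labels.

```python
def argmax(p):
    """Index of the largest entry in a probability vector."""
    return max(range(len(p)), key=lambda k: p[k])

def batch_prototypes(features, probs, num_classes):
    """Per-class mean of the features in one mini-batch, grouped by pseudo-label.
    Returns None for classes that received no sample (assumed convention)."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for f, p in zip(features, probs):
        k = argmax(p)  # hard pseudo-label from the classifier
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += f[d]
    return [
        [s / counts[k] for s in sums[k]] if counts[k] > 0 else None
        for k in range(num_classes)
    ]

# Toy usage: three unlabeled samples, three classes, class 2 is never predicted.
feats = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
preds = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0], [0.3, 0.7, 0.0]]
protos = batch_prototypes(feats, preds, num_classes=3)
```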

　　The prototypes computed from the unlabeled target samples are then smoothed across iterations with an exponential moving average (EMA), where $m$ indexes the iteration and $\eta$ is the update rate:

　　　　$c_{k(m)}^{\mathcal{U}}=\eta c_{k}^{u}+(1-\eta) c_{k(m-1)}^{\mathcal{U}}\quad\quad(6)$

　　The overall target prototype:

　　　　$c_{k}=\frac{\mathbf{c}_{k}^{\mathcal{T}}+c_{k(m)}^{\mathcal{U}}}{2}\quad\quad(7)$
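　　The EMA update and the averaging of the two prototypes can be sketched as below. The default $\eta=0.7$ and the handling of a missing previous or batch prototype are assumptions of this sketch.

```python
def ema_update(prev, batch_proto, eta=0.7):
    """Eq. (6): blend the current mini-batch prototype into the running one.
    If the class is absent from the batch, keep the previous value; if there
    is no previous value yet, start from the batch prototype (assumed rules)."""
    if batch_proto is None:
        return prev
    if prev is None:
        return list(batch_proto)
    return [eta * b + (1.0 - eta) * p for b, p in zip(batch_proto, prev)]

def fuse(c_labeled, c_unlabeled):
    """Eq. (7): element-wise average of the labeled and unlabeled prototypes."""
    return [(a + b) / 2.0 for a, b in zip(c_labeled, c_unlabeled)]

# Toy usage for a single class k.
running = ema_update([0.0, 0.0], [1.0, 2.0], eta=0.5)
c_k = fuse([1.0, 1.0], [3.0, 5.0])
```

　　A larger $\eta$ tracks the current batch more closely, while a smaller $\eta$ gives a smoother estimate that is less sensitive to noisy pseudo-labels in any single batch.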

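　　For the labeled source data, a probability distribution over classes is obtained from the distance $d$ between a source feature and each target class prototype: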

　　　　$p(y \mid x)=\frac{e^{-d\left(F(x), c_{y}\right)}}{\sum_{k} e^{-d\left(F(x), c_{k}\right)}}\quad\quad(8)$

　　The prototype loss over all source samples is then:

　　　　$\mathcal{L}_{A P A}=-\mathbb{E}_{\left(x_{i}^{s}, y_{i}^{s}\right) \in \mathcal{D}_{s}} \log p\left(y_{i}^{s} \mid x_{i}^{s}\right)\quad\quad(9)$

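　　Summary: the target prototypes are computed from the target data (labeled and unlabeled), then used to predict the classes of the source samples, with the source labels as supervision.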
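　　Eqs. (8) and (9) can be sketched as a softmax over negative distances followed by a negative log-likelihood. The distance function $d$ is not specified above; squared Euclidean distance is assumed here purely for illustration.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance (an assumed choice for d in Eq. 8)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def apa_loss(features, labels, prototypes):
    """Mean of -log p(y | x) over source samples, where p is a softmax
    over negative distances to the target prototypes (Eqs. 8 and 9)."""
    total = 0.0
    for f, y in zip(features, labels):
        logits = [-sq_dist(f, c) for c in prototypes]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += -(logits[y] - log_z)
    return total / len(features)

# Toy usage: two target prototypes, one source sample sitting on prototype 0.
protos = [[0.0, 0.0], [2.0, 0.0]]
loss_correct = apa_loss([[0.0, 0.0]], [0], protos)
loss_wrong = apa_loss([[0.0, 0.0]], [1], protos)
```

　　Minimizing this loss pulls each source feature toward the target prototype of its own class and away from the other prototypes.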

3.4 Consistency alignment

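　　As the framework figure shows, each unlabeled target sample is augmented twice, once weakly and once strongly. The classifier's prediction $\mathbf{p}_{w}$ on the weakly augmented view yields a hard label $\hat{p}=\arg\max(\mathbf{p}_{w})$, and the cross-entropy is computed on the strongly augmented view $\mathcal{S}(x_{i}^{u})$, but only when the confidence exceeds the threshold $\tau$: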

　　　　$\ell_{c r}=-\mathbb{1}\left(\max \left(\mathbf{p}_{w}\right)>\tau\right) \log \mathbf{p}\left(y=\hat{p} \mid \mathcal{S}\left(x_{i}^{u}\right)\right)\quad\quad(10)$

　　To avoid overfitting, a diversity loss is added ($C$ here is the number of classes, written $K$ in Section 2):

　　　　$\ell_{k l d}=-\mathbb{1}\left(\max \left(\mathbf{p}_{w}\right)>\tau\right) \sum_{k=1}^{C} \frac{1}{C} \log \mathbf{p}\left(y=k \mid \mathcal{S}\left(x_{i}^{u}\right)\right)\quad\quad(11)$

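　　Note: the KLD regularizer pushes the predictions toward the uniform distribution, so that they do not overfit the pseudo-labels.

　　The overall loss of the consistency alignment module is therefore: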

　　　　$\mathcal{L}_{C O N}=\mathbb{E}_{x_{i}^{u} \in \mathcal{D}_{u}}\left(\ell_{c r}+\lambda_{k l d} \ell_{k l d}\right)\quad\quad(12)$
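　　Eqs. (10) to (12) can be sketched together as below. The defaults $\tau=0.95$ and $\lambda_{kld}=0.1$ are assumed values, and the loss is averaged over the whole batch including masked samples, which is also an assumption. All entries of the strong-view predictions are assumed to be strictly positive so that the logarithms are defined.

```python
import math

def consistency_loss(p_weak_batch, p_strong_batch, tau=0.95, lambda_kld=0.1):
    """Thresholded pseudo-label cross-entropy (Eq. 10) plus the uniform-prior
    term (Eq. 11), combined as in Eq. 12 and averaged over the batch."""
    total = 0.0
    for p_w, p_s in zip(p_weak_batch, p_strong_batch):
        if max(p_w) <= tau:
            continue  # indicator is 0: the weak-view prediction is not confident
        y_hat = max(range(len(p_w)), key=lambda k: p_w[k])  # hard pseudo-label
        l_cr = -math.log(p_s[y_hat])
        l_kld = -sum(math.log(q) for q in p_s) / len(p_s)
        total += l_cr + lambda_kld * l_kld
    return total / len(p_weak_batch)

# Toy usage: the first sample passes the threshold, the second is masked out.
p_weak = [[0.99, 0.01], [0.6, 0.4]]
p_strong = [[0.8, 0.2], [0.5, 0.5]]
loss_no_kld = consistency_loss(p_weak, p_strong, tau=0.95, lambda_kld=0.0)
loss_with_kld = consistency_loss(p_weak, p_strong, tau=0.95, lambda_kld=0.1)
```

　　The two terms pull in opposite directions: the first sharpens the strong-view prediction toward the pseudo-label, while the second penalizes any class probability that collapses toward zero.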

3.5 Overall framework and training objective

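　　The method builds on MME [45], which uses adversarial learning to improve the alignment of sample features across domains. The entropy loss from MME [45] is incorporated into the loss function. The overall objective is the sum of the losses above: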

　　　　$\theta_{\mathcal{F}}=\underset{\theta_{\mathcal{F}}}{\arg \min } \mathcal{L}_{C E}+\mathcal{L}_{H}+\lambda_{1} \mathcal{L}_{A P A}+\lambda_{2} \mathcal{L}_{C O N}\quad\quad(13)$

　　　　$\theta_{\mathcal{C}}=\underset{\theta_{\mathcal{C}}}{\arg \min } \mathcal{L}_{C E}-\mathcal{L}_{H}+\lambda_{1} \mathcal{L}_{A P A}+\lambda_{2} \mathcal{L}_{C O N}$

　　where:

　　　　$\mathcal{L}_{H}=-\mathbb{E}_{x_{i}^{u} \in \mathcal{D}_{u}} \sum_{k=1}^{K} p\left(y=k \mid x_{i}^{u}\right) \log p\left(y=k \mid x_{i}^{u}\right)$
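　　The entropy term and the two opposing objectives can be sketched as follows. The function names are hypothetical, and the sketch only evaluates the scalar objectives; it does not show how the two parameter sets are actually optimized.

```python
import math

def entropy_loss(probs_batch):
    """L_H: mean Shannon entropy of the predictions on unlabeled target samples.
    Zero-probability entries contribute 0 by the usual convention."""
    per_sample = [
        -sum(p * math.log(p) for p in probs if p > 0.0)
        for probs in probs_batch
    ]
    return sum(per_sample) / len(per_sample)

def objectives(l_ce, l_h, l_apa, l_con, lambda1, lambda2):
    """Return (feature-extractor objective, classifier objective).
    Both share every term except the sign of the entropy term."""
    shared = l_ce + lambda1 * l_apa + lambda2 * l_con
    return shared + l_h, shared - l_h

# Toy usage with made-up loss values.
l_h = entropy_loss([[0.5, 0.5]])
obj_f, obj_c = objectives(1.0, 0.5, 2.0, 3.0, lambda1=1.0, lambda2=1.0)
```

　　The feature extractor minimizes the entropy while the classifier maximizes it, which is the adversarial element inherited from MME.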

3.6 Algorithm

4 Experiments

Classification accuracy

Parameter sensitivity

Ablation study
