[Deep Learning Papers 03-2] Pitfalls Encountered When Building the SSD Model in PyTorch

Source code: http://github.com/amdegroot/ssd.pytorch

Environment 1: torch 1.9.0+CPU

Environment 2: torch 1.8.1+cu102, torchvision 0.9.1+cu102

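1. StopIteration. With batch_size set to 32, training fails with this error at iteration 60 and stops; with batch_size changed to 8, it fails at iteration 240.

Cause and fix: both failures occur after about 32 × 60 = 8 × 240 = 1920 samples, which suggests one full pass over the dataset. batch_iterator is created once from data_loader, so once it is exhausted the next call to next() raises StopIteration. The fix is to recreate the iterator when that happens. train.py line 165:

# Before
images, targets = next(batch_iterator)

# After
try:
    images, targets = next(batch_iterator)
except StopIteration:
    # The iterator is exhausted at the end of a pass; restart it.
    batch_iterator = iter(data_loader)
    images, targets = next(batch_iterator)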
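
The restart pattern above can be tried without PyTorch. The following is a minimal sketch, assuming a plain list as a stand-in for data_loader; the helper name next_batch is hypothetical and not part of ssd.pytorch.

```python
def next_batch(batch_iterator, data_loader):
    """Return (batch, iterator), restarting the iterator once it is exhausted."""
    try:
        batch = next(batch_iterator)
    except StopIteration:
        # One full pass is done: build a fresh iterator and continue.
        batch_iterator = iter(data_loader)
        batch = next(batch_iterator)
    return batch, batch_iterator


# Stand-in for a DataLoader: any re-iterable object works for this demo.
data_loader = [[0, 1], [2, 3], [4]]  # three "batches"

batch_iterator = iter(data_loader)
seen = []
for iteration in range(7):  # more iterations than batches in one pass
    batch, batch_iterator = next_batch(batch_iterator, data_loader)
    seen.append(batch)

print(seen)
```

Catching StopIteration specifically, rather than using a bare except, keeps unrelated errors raised while loading data visible.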

2. UserWarning: volatile was removed and now has no effect. Use 'with torch.no_grad():' instead.

Cause and fix: a PyTorch version issue; the volatile flag no longer has any effect. ssd.py line 34:

# Before
self.priors = Variable(self.priorbox.forward(), volatile=True)

# After
with torch.no_grad():
    self.priors = torch.autograd.Variable(self.priorbox.forward())

3. UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.

Cause and fix: nn.init.xavier_uniform is the name used in older versions; change it to nn.init.xavier_uniform_.

4. VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.

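Cause and fix: a version issue. The warning is triggered at augmentation.py line 238, mode = random.choice(self.sample_options). I changed it to mode = np.array(self.sample_options, dtype=object), but that did not help (note that this assigns the whole array to mode instead of picking one option). Since it is only a warning, I did not pursue it further.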
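
One possible way to avoid this warning is to pick an index rather than pass the options tuple to the chooser. This is a sketch under two assumptions: that random in augmentation.py is numpy.random (so random.choice first converts the ragged tuple into an array), and that the options look roughly like the illustrative tuple below. The helper pick_option is hypothetical, and the sketch uses the standard-library random module.

```python
import random

# Illustrative stand-in for self.sample_options: None plus (min, max) pairs,
# i.e. entries of different shapes (a "ragged" sequence).
sample_options = (None, (0.1, None), (0.3, None), (0.7, None), (None, None))


def pick_option(options, rng=random):
    """Pick one entry by index so the ragged tuple is never turned into an array."""
    return options[rng.randrange(len(options))]


mode = pick_option(sample_options)
print(mode)
```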

5. AssertionError: Must define a window to update

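Cause and fix: the error is raised when updating the visdom window (train.py line 153):

# Failing code (line 153)
update_vis_plot(epoch, loc_loss, conf_loss, epoch_plot, None, 'append', epoch_size)

Moving epoch += 1 from line 158 to before the failing call resolves the problem.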

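6. KeyError: "filename 'storages' not found". Raised when running the evaluation script eval.py and the test script test.py.

Cause and fix: the .pth model file being loaded is corrupted; replace it with an intact copy.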

7. UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.

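Cause and fix: a version issue. In newer versions the size_average and reduce arguments of the loss functions are deprecated; set reduction instead. The warning is emitted from _reduction.py line 90, and the call that triggers it is the smooth L1 loss in the project's loss code (layers/modules/multibox_loss.py in ssd.pytorch). Change it as follows:

# Before
loss_l = F.smooth_l1_loss(loc_p, loc_t, size_average=False)

# After
loss_l = F.smooth_l1_loss(loc_p, loc_t, reduction='sum')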
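
As a quick check of what the new argument does, here is a minimal sketch with made-up tensors (the values are illustrative, not taken from the SSD training run). reduction='sum' plays the role of the old size_average=False, while reduction='mean' corresponds to the old default.

```python
import torch
import torch.nn.functional as F

loc_p = torch.tensor([[0.0, 0.5], [2.0, -1.0]])  # made-up predictions
loc_t = torch.zeros(2, 2)                        # made-up targets

# Sum over all elements (old: size_average=False).
loss_sum = F.smooth_l1_loss(loc_p, loc_t, reduction='sum')

# Average over all elements (old default: size_average=True).
loss_mean = F.smooth_l1_loss(loc_p, loc_t, reduction='mean')

print(loss_sum.item(), loss_mean.item())
```

The same replacement applies to any other loss call in the project that still passes size_average=False.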

8. RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

Cause and fix: eval.py line 425. When running on a CPU, the load call has to map the weights to the CPU explicitly:

# Before
net.load_state_dict(torch.load(args.trained_model))

# After
net.load_state_dict(torch.load(args.trained_model, map_location='cpu'))
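
If the same script has to run on both GPU and CPU machines, the target device can be chosen at run time instead of hard-coding 'cpu'. The following is a minimal sketch, assuming a tiny nn.Linear as a stand-in for the SSD network and an in-memory buffer instead of a real .pth file.

```python
import io

import torch
import torch.nn as nn

net = nn.Linear(4, 2)  # tiny stand-in for the SSD network

# Save a state_dict to an in-memory buffer instead of a .pth file.
buffer = io.BytesIO()
torch.save(net.state_dict(), buffer)
buffer.seek(0)

# Choose the device at run time and remap all stored tensors onto it.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
state_dict = torch.load(buffer, map_location=device)
net.load_state_dict(state_dict)
net.to(device)

print({name: tensor.device for name, tensor in state_dict.items()})
```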

9. RuntimeError: Legacy autograd function with non-static forward method is deprecated. Please use new-style autograd function with static forward method.

Occurs in both eval.py and train.py ★★★★★★

(Example: https://pytorch.org/docs/stable/autograd.html#torch.autograd.Function)

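Cause: PyTorch 1.3 and later require the forward method of a custom autograd Function to be a static method, so the original code fails on those versions.

Official recommendation: add @staticmethod before forward and backward in the custom autograd.Function.

Fixes:

Method 1: downgrade PyTorch to a version earlier than 1.3.

Method 2: following the official recommendation, add @staticmethod before forward in ssd.py. This produced a different error.

Next, I changed eval.py line 385 from detections = net(x).data to detections = net.apply(x).data, which raised yet another error.

After that, I added forward (or apply) to the call at ssd.py line 100:

output = self.detect.forward(loc.view(loc.size(0), -1, 4),
                             self.softmax(conf.view(conf.size(0), -1, self.num_classes)),
                             self.priors.type(type(x.data)))

This still raised the same error as above, so I gave up on this approach.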

I then found the following in the project's issues:

It has a class named 'Detect' which is inheriting torch.autograd.Function but it implements the forward method in an old deprecated way, so you need to restructure it i.e. you need to define the forward method with @staticmethod decorator and use .apply to call it from your SSD class.

Also, as you are going to use decorator, you need to ensure that the forward method doesn't use any Detect class constructor variables.

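In other words, add @staticmethod before the definition of forward and call it through .apply. Being a static method means the Function can no longer use the instance's methods and attributes, so __init__() has to be removed and replaced by something else.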

Final solution (Method 3):

Change detection.py as follows, i.e. merge the arguments of __init__() into the forward function:

def forward(self, num_classes, bkg_label, top_k, conf_thresh,
            nms_thresh, loc_data, conf_data, prior_data):

Then change the corresponding code in ssd.py as follows:

# Before (line 46)
# if phase == 'test':
#     self.softmax = nn.Softmax(dim=-1)
#     self.detect = Detect(num_classes, 0, 200, 0.01, 0.45)

# After
if phase == 'test':
    self.softmax = nn.Softmax(dim=-1)  # keep dim=-1 to avoid the implicit-dim warning
    self.detect = Detect()

# Before (line 99)
# if self.phase == "test":
#     output = self.detect(
#         loc.view(loc.size(0), -1, 4),                   # loc preds
#         self.softmax(conf.view(conf.size(0), -1,
#                      self.num_classes)),                # conf preds
#         self.priors.type(type(x.data))                  # default boxes
#     )

# After
# The leading 2 is num_classes in this setup; the remaining constants are the
# former Detect(...) constructor arguments.
if self.phase == "test":
    output = self.detect.apply(2, 0, 200, 0.01, 0.45,
                               loc.view(loc.size(0), -1, 4),    # loc preds
                               self.softmax(conf.view(-1, 2)),  # conf preds
                               self.priors.type(type(x.data))   # default boxes
                               )

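Note: in Method 3, do not add @staticmethod before the forward method in ssd.py, otherwise the same error as in Method 2 is raised. Adding or omitting @staticmethod before the forward method in detection.py makes no difference.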
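
The overall shape of Method 3 can be sketched in a few lines. This is a minimal illustration, not the Detect code from ssd.pytorch: ScaleAndClamp and TinyNet are hypothetical stand-ins, and the computation is a toy one. It shows the three points that matter: the Function has no __init__, its settings are passed as arguments to a static forward, and the module calls it through .apply from an ordinary (non-static) forward.

```python
import torch
from torch.autograd import Function


class ScaleAndClamp(Function):
    """Hypothetical stand-in for Detect: settings are forward() arguments."""

    @staticmethod
    def forward(ctx, scale, max_value, data):
        # No __init__ and no self: everything the computation needs is passed in.
        return (data * scale).clamp(max=max_value)


class TinyNet(torch.nn.Module):
    """Hypothetical stand-in for the SSD module."""

    def __init__(self):
        super().__init__()
        self.post = ScaleAndClamp  # keep the class; no constructor arguments

    def forward(self, x):  # regular nn.Module method, not a static method
        # Former constructor settings go first, then the tensors.
        return self.post.apply(2.0, 5.0, x)


with torch.no_grad():
    out = TinyNet()(torch.tensor([1.0, 2.0, 3.0]))
print(out)
```

No backward is defined here, so the sketch is only suitable for inference, which matches the way the detection step is used in the test phase.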

10. cv2.error: OpenCV(4.5.5) :-1: error: (-5:Bad argument) in function 'rectangle'

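Cause and fix: the installed OpenCV version is too new and incompatible; reinstalling version 4.1.2.30 solved the problem.

Summary: when you hit an error, do not rush to ask for help. Read the error message carefully and first work out for yourself why it occurs; if you are reasonably familiar with the code, you can usually find the cause. Only when you really cannot solve it should you turn to Baidu or Google. The Issues page of the source repository is also well worth consulting.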

References:

1. https://blog.csdn.net/qq_39506912/article/details/116926504 (the main reference for this post)

2. http://github.com/amdegroot/ssd.pytorch/issues/234
