0. References and Preface

Full title: Dynamics-Aware Spatiotemporal Occupancy Prediction in Urban Environments

Code link: none

Abbreviations: occupancy grid map (OGM), sensor grid maps (SGMs), residual grid maps (RGMs)

1. Motivation

Task: detection and segmentation of moving obstacles

The paper does two things in a single framework: ① detect and segment the dynamic obstacles in the scene; ② predict the spatiotemporal evolution of the environment.

Representation: occupancy-based environment representations

The OGMs discretize the environment into grid cells and consider the binary free or occupied hypotheses. Each cell in the OGMs contains the belief of its respective occupancy probability

For occluded regions there is the evidential occupancy grid map (eOGM) [2], where each cell carries additional information on the occluded occupancy hypothesis in its information channel, in addition to the occupied and free channels.

Because OGMs share a discretized spatial structure with RGB images, most image-based prediction methods can be applied to them directly.

Related work

[3] repurposes the predictive coding network (PredNet), which is built on convolutional LSTMs. It captures the static and dynamic parts of the scene well but suffers from vanishing dynamic objects in the predictions at longer time horizons.

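[6] builds on the PredNet of [3] to develop a double-prong model: one prong handles the static OGMs (its motion model learns the relative motion of the static environment), and the other mainly receives the dynamic OGMs. The output combines the two prongs. The drawback is that it needs fairly accurate object detection and tracking information.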

This paper mainly extends [6]: it performs static-dynamic object segmentation together with prediction, so prior detection and tracking results are no longer needed.

[11] Range images are used as an intermediate representation of the point clouds to reduce computational complexity

Contribution

Method: segmentation with SalsaNext [11, 12]

We develop a method that integrates static-dynamic object segmentation and local environment prediction together, without assuming knowledge of static and dynamic objects in the scene.
We propose using an occupancy-based environment representation across the entire system to enable direct integration.

2. Method

Unlike [11], which uses range images, the input here is the SGMs and RGMs.

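First, the ground is removed with a Markov random field, as in [3].
The pipeline then has two parts: a static-dynamic object segmentation module, which separates static from dynamic objects,
and a prediction module, which predicts the future OGMs.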

2.1 Framework

The OGMs are generated following [3].

2.2 Static-Dynamic Object Segmentation

Input: SGMs and RGMs; output: discretized dynamic masks.

Each SGM is in \(\mathbb{R}^{W \times H}\), and each cell takes one of three states: free, occupied, or occluded. Ray tracing determines the free-space and occupied classes.

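Each RGM is in \(\mathbb{R}^{W \times H}\) and is generated from the current and past SGMs. The past SGMs are first transformed into the current coordinate frame using the ego motion, and then the cell states are compared: a cell is set to 1 if it changes from one known class to another, and 0 otherwise. Occluded cells are not considered here.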

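The RGMs are then concatenated to the current SGMs as an extra channel. Overall, the SGM provides the spatial information and the RGM, being built from state changes over time, provides the temporal information.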
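The RGM construction described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the integer state encoding (`FREE`, `OCCUPIED`, `OCCLUDED`) and the function name `residual_grid_map` are hypothetical, and the past SGM is assumed to be already warped into the current frame.

```python
import numpy as np

# Hypothetical integer encoding of the three SGM cell states.
FREE, OCCUPIED, OCCLUDED = 0, 1, 2

def residual_grid_map(sgm_now: np.ndarray, sgm_past_aligned: np.ndarray) -> np.ndarray:
    """Binary RGM: 1 where a cell switched between the two known classes, else 0.

    sgm_past_aligned is assumed to be already transformed into the current
    frame with the ego motion; that warping step is not shown here.
    """
    known_now = sgm_now != OCCLUDED
    known_past = sgm_past_aligned != OCCLUDED
    changed = sgm_now != sgm_past_aligned
    # Cells that are occluded in either map are ignored.
    return (known_now & known_past & changed).astype(np.uint8)

sgm_past = np.array([[FREE, OCCUPIED, OCCLUDED],
                     [FREE, FREE,     OCCUPIED]])
sgm_now = np.array([[OCCUPIED, OCCUPIED, FREE],
                    [FREE,     OCCLUDED, FREE]])

rgm = residual_grid_map(sgm_now, sgm_past)

# Stack the current SGM and the RGM as two channels (channel-first layout
# is an arbitrary choice for this sketch).
net_input = np.stack([sgm_now, rgm], axis=0)
print(rgm)
print(net_input.shape)
```

Only the two cells that flip between free and occupied are marked; the cells that are occluded at either time stay 0.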

Output: a binary mask \(M_d \in \mathbb{R}^{W \times H}\), where 1 denotes dynamic and 0 denotes static.

2.3 Environment Prediction

The dynamic mask is \(M_d\); the static mask is \(1-M_d\).
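As a small illustration of how the two masks could be used, the sketch below splits one OGM into a dynamic and a static part by element-wise multiplication. The element-wise masking is an assumption made for this example; the notes only state that the masks are \(M_d\) and \(1-M_d\).

```python
import numpy as np

# Toy occupancy probabilities and a toy dynamic mask (1 = dynamic, 0 = static).
ogm = np.array([[0.9, 0.1],
                [0.8, 0.0]])
m_d = np.array([[1, 0],
                [0, 0]])

# Assumed element-wise masking: each cell goes to exactly one of the two maps.
dynamic_ogm = ogm * m_d
static_ogm = ogm * (1 - m_d)

print(dynamic_ogm)
print(static_ogm)
```

Because the two masks are complementary, the two parts add back up to the original OGM.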

eOGMs are an alternative representation that uses Dempster–Shafer Theory (DST) to update the grid cells.

Each allowable hypothesis is associated with its corresponding Dempster–Shafer belief mass, which represents the degree of occupancy belief in that cell [10].

eOGMs are in \(\mathbb{R}^{W \times H \times C}\), where the channels hold the Dempster–Shafer belief masses for occupied, \(m(\{O\}) \in [0,1]\), and free, \(m(\{F\}) \in [0,1]\), i.e. two channels.
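To make the two-channel layout concrete, here is a minimal sketch that turns the belief masses into a single occupancy probability with the pignistic transformation, a standard DST conversion that splits the unassigned mass evenly between occupied and free. Whether the paper uses exactly this conversion is not stated in these notes, so treat it as an assumption; the function name `pignistic_occupancy` is hypothetical.

```python
import numpy as np

def pignistic_occupancy(m_occ: np.ndarray, m_free: np.ndarray) -> np.ndarray:
    """Occupancy probability from the occupied and free belief masses.

    The remaining mass 1 - m(O) - m(F) belongs to the unknown set {O, F}
    and is shared equally between the two hypotheses.
    """
    m_unknown = 1.0 - m_occ - m_free
    return m_occ + 0.5 * m_unknown

# A 1 x 3 eOGM with C = 2 channels: [..., 0] is m({O}), [..., 1] is m({F}).
eogm = np.stack([np.array([[0.8, 0.0, 0.1]]),
                 np.array([[0.1, 0.0, 0.7]])], axis=-1)

p_occ = pignistic_occupancy(eogm[..., 0], eogm[..., 1])
print(eogm.shape)
print(p_occ)
```

A cell with no evidence at all (both masses 0) ends up at probability 0.5, which is how an unobserved or occluded cell would look in a plain OGM.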

3. Experiments and Results

Setting

Experimental setup: the grid resolution is 0.33 m and the map covers 42 m × 42 m. The resolution was chosen mainly so that each vehicle is covered by a sufficient number of cells.
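A quick back-of-the-envelope check of what this resolution means in cells. The car footprint of 4.5 m × 1.8 m is an assumed typical value chosen for illustration; it does not come from the paper.

```python
import math

resolution_m = 0.33   # cell size from the setup above
extent_m = 42.0       # side length of the map

# Number of cells along one side of the grid.
cells_per_side = extent_m / resolution_m

# Assumed typical passenger-car footprint (not from the paper).
car_length_m, car_width_m = 4.5, 1.8
cells_long = math.ceil(car_length_m / resolution_m)
cells_wide = math.ceil(car_width_m / resolution_m)

print(round(cells_per_side, 1), cells_long, cells_wide)
```

So the map is roughly 127 cells per side, and a car of that size spans about 14 × 6 cells, which gives a sense of what "a sufficient number of cells" per vehicle looks like.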

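The model first uses the SGMs and RGMs to segment the static and dynamic objects in the environment. The RGMs are generated from the SGMs at t and t-5 (0.5 s earlier). The data is split 0.6, 0.1, 0.3 into train, validation, and test sets.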

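The prediction model takes as input the static and dynamic OGMs obtained with the mask, and outputs complete OGM predictions of the whole environment. Each sequence has 20 consecutive frames, i.e. 2 s of driving data.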

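[6] uses the past 5 OGM frames to predict the next 15. This paper follows the suggestion in [8] and trains in two modes: the first predicts the next OGM frame from the current one; the second finetunes the model to recursively predict the next 15 OGMs, with the weights initialized from the first mode.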
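The recursive mode can be pictured as a rollout loop in which each predicted frame is fed back as input for the next step. The sketch below shows only that feedback loop: `rollout` and `persistence_step` are hypothetical names, and the stand-in "model" simply repeats the last frame, whereas the actual predictor is a learned network.

```python
from typing import Callable, List

import numpy as np

Frame = np.ndarray

def rollout(step: Callable[[List[Frame]], Frame],
            history: List[Frame],
            horizon: int = 15) -> List[Frame]:
    """Recursively predict `horizon` frames, feeding predictions back in."""
    frames = list(history)      # copy so the caller's history is untouched
    predictions = []
    for _ in range(horizon):
        next_frame = step(frames)
        predictions.append(next_frame)
        frames.append(next_frame)   # the prediction becomes part of the input
    return predictions

def persistence_step(frames: List[Frame]) -> Frame:
    """Stand-in one-step model: predict that nothing changes."""
    return frames[-1].copy()

# 5 observed frames followed by 15 predicted ones gives the 20-frame sequence.
history = [np.full((4, 4), 0.1 * t) for t in range(5)]
predictions = rollout(persistence_step, history, horizon=15)
print(len(history) + len(predictions))
```

Initializing the recursive mode from the one-step weights makes sense in this picture: the one-step model is exactly the `step` function, and finetuning teaches it to cope with its own predictions as input.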

Inference takes 82 ms (12 Hz) on an i7-5930K 3.5 GHz CPU with one TITAN X GPU.

Results

Evaluation metrics

MSE metric is used to assess how well the predicted occupancy probability for each cell corresponds to its ground truth value.
IS metric is used to measure how well the structure of the scene is maintained in the OGM predictions. To calculate the IS metric, the minimum Manhattan distance is calculated between two grid cells (one from the target OGM and the other from the predicted OGM) with the same occupancy classes (occupied, free, and unknown)
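Both metrics can be sketched on toy grids. The MSE part is direct. For IS, the sketch follows the description above: for every cell of a class in one map, take the minimum Manhattan distance to a cell of the same class in the other map, average per class, and sum over classes and both directions. The averaging, the brute-force search, the class encoding, and skipping classes that are missing from either map are assumptions of this example, and the function names are hypothetical.

```python
import numpy as np

def mse(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean squared error over per-cell occupancy probabilities."""
    return float(np.mean((pred - target) ** 2))

def _directed(a: np.ndarray, b: np.ndarray, cls: int) -> float:
    """Mean over class-`cls` cells of `a` of the min Manhattan distance
    to a class-`cls` cell of `b`."""
    pa = np.argwhere(a == cls)
    pb = np.argwhere(b == cls)
    if len(pa) == 0 or len(pb) == 0:
        return 0.0  # assumption: skip classes missing from either map
    # Pairwise Manhattan distances, shape (len(pa), len(pb)).
    dists = np.abs(pa[:, None, :] - pb[None, :, :]).sum(axis=2)
    return float(dists.min(axis=1).mean())

def image_similarity(a: np.ndarray, b: np.ndarray, classes=(0, 1, 2)) -> float:
    """Symmetric sum over classes; 0 means identical structure."""
    return sum(_directed(a, b, c) + _directed(b, a, c) for c in classes)

# Hypothetical class encoding: 0 = free, 1 = occupied, 2 = unknown.
target = np.array([[1, 0, 0],
                   [0, 0, 0]])
shifted = np.array([[0, 1, 0],
                    [0, 0, 0]])   # the occupied cell moved by one cell

mse_value = mse(np.array([[0.9, 0.2]]), np.array([[1.0, 0.0]]))
is_same = image_similarity(target, target)
is_shifted = image_similarity(target, shifted)
print(mse_value, is_same, is_shifted)
```

Under this definition a lower IS is better, and a one-cell shift of an obstacle costs only a small amount, whereas a per-cell error like MSE treats the shifted cell as fully wrong in two places.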

Quantitative table

Qualitative analysis

From the paper: Future work will consider incorporating semantic segmentation as well. We hypothesize that the model can perform better if given the ability to learn semantics in the scene, which can help with predicting the motion models of different object types.

Ramblings

IROS 2022: I plan to listen to this talk online, October 26, 2022 15:00-15:10, Paper WeB-3.3

Our IoU metric over the static and moving objects are 99.5% and 54.5%, respectively, and the average IoU is 77.0%.

Question to ask: why is the moving IoU so low, and is the static IoU high simply because most objects are static?

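I asked the authors. The answer seemed to be that the cell size causes some dynamic and static points to fall into the same cell, but because of the Zoom audio I could not catch everything…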

How should concave shapes be computed? Is the IoU computed per cell?

Many papers in this area seem to do it this way.
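If the IoU is indeed computed per cell (an assumption; the notes leave this open), it can be sketched as below on a toy 2 × 4 mask. The function name `iou` and the toy masks are hypothetical. The sketch also shows why the dynamic IoU is so sensitive: the dynamic class has few cells, so one shifted cell already removes most of its overlap.

```python
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray, cls: int) -> float:
    """Cell-wise intersection over union for one class label."""
    p, g = pred == cls, gt == cls
    union = np.logical_or(p, g).sum()
    if union == 0:
        return float("nan")   # class absent from both masks
    return float(np.logical_and(p, g).sum() / union)

# 0 = static, 1 = dynamic; the prediction is shifted by one cell.
gt = np.array([[0, 0, 0, 0],
               [0, 0, 1, 1]])
pred = np.array([[0, 0, 0, 0],
                 [0, 1, 1, 0]])

iou_static = iou(pred, gt, 0)
iou_dynamic = iou(pred, gt, 1)
mean_iou = (iou_static + iou_dynamic) / 2
print(iou_static, iou_dynamic, mean_iou)
```

The same two wrong cells cost the static class 2 of 7 cells in its union but the dynamic class 2 of 3, and on a real grid, where static cells vastly outnumber dynamic ones, the gap widens further. The reported numbers are at least consistent with a plain mean of the two class IoUs: (99.5 + 54.5) / 2 = 77.0.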

Most of the references cited in the paper are open source:

[3] Dynamic Environment Prediction in Urban Scenes using Recurrent Representation Learning https://github.com/mitkina/EnvironmentPrediction
[6] Double-Prong ConvLSTM for Spatiotemporal Occupancy Prediction in Dynamic Environments https://github.com/sisl/Double-Prong-Occupancy

