Paper Reading: Stereo DSO
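For my very first post, I'll write up a paper reading. Switching between Chinese and English input while writing in markdown + vim is a hassle, so in some places I got lazy and just wrote everything in English.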
Stereo DSO: Large-Scale Direct Sparse Visual Odometry with Stereo Cameras
Abstract
Optimization objectives:
- intrinsic/extrinsic parameters of all keyframes
- all selected pixels' depth
Integrate constraints from static stereo (the stereo constraint between the left and right cameras is static, since their relative pose is fixed) into the bundle adjustment pipeline of temporal multi-view stereo.
Fixed-baseline stereo resolves scale drift.
? It also reduces the sensitivity to large optical flow and to the rolling shutter effect, both of which are known shortcomings of direct image alignment methods.
1. Introduction
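stem from: to originate from; to be caused by
heuristically: in a heuristic way (by rule of thumb rather than by rigorous derivation)
hallucinate: to perceive something that is not actually there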
strip down: reduced to its simplest form
Strasdat et al. proposed to expand the concept of keyframes to integrate scale and proposed a double window optimization (TODO: figure out what this is).
Direct methods aim at computing geometry and motion directly from the images thereby skipping the intermediate keypoint selection step.
The key idea of LSD SLAM is to incrementally track the camera and simultaneously perform a pose graph optimization in order to keep the entire camera trajectory globally consistent. The authors argue that this does not reduce the accumulated error; it only spreads it over the whole trajectory. (So what is the point of the pose graph, then?)
Three drawbacks of DSO:
- The reported performance was obtained on a photometrically calibrated dataset; in its absence, the performance would degrade.
- Scale drift
- DSO is quite sensitive to geometric distortion such as that induced by fast motion and rolling shutter. While techniques for calibrating rolling shutter exist for direct SLAM algorithms, these are often quite involved and far from real-time capable.
Contribution:
- A stereo version of DSO, with a detailed description of the proposed combination of temporal multi-view stereo and static stereo.
- Stereo DSO is good.
2. Direct Sparse VO with Stereo Camera
- Absolute scale can be directly calculated from static stereo from the known baseline of the stereo camera
- Static stereo can provide initial depth estimation for multi-view stereo
- Static Stereo can only accurately triangulate 3D points within a limited depth range while this limit is resolved by temporal multi-view stereo.
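The first and third bullets can be made concrete with the standard rectified-stereo relation z = f * b / d. Below is a minimal sketch, not code from the paper; the function names are my own, and the focal length and baseline in the usage note are made-up values for illustration.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Metric depth of a point seen by a rectified stereo pair: z = f * b / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_error_px=1.0):
    """First-order depth uncertainty for a given disparity error.

    Differentiating z = f * b / d gives |dz| ~= z^2 / (f * b) * |dd|,
    so the error grows quadratically with depth.
    """
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px
```

With a hypothetical rig of f = 700 px and b = 0.5 m, a disparity of 35 px corresponds to a depth of 10 m. Because the known baseline is in metres, the depth comes out in metres too, which is where the absolute scale comes from. The same one-pixel disparity error costs 25 times more depth error at 50 m than at 10 m, which is one way to see why static stereo is only accurate within a limited depth range.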
New stereo frames are first tracked with respect to their reference keyframe in a coarse-to-fine manner.
A joint optimization of their poses, affine brightness parameters (two parameters, a and b), as well as the depths of all the observed 3D points and camera intrinsics, is performed.
2.1 Notation
Nothing important.
2.2 Direct Image Alignment Formulation
\[
E_{ij}=\sum_{p\in P_i}\omega_p \left\| I_j[p'] - I_i[p] \right\|_\gamma
\]
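where \(p'\) is the projection of \(p\) into \(I_j\), \(\|\cdot\|_\gamma\) is the Huber norm, and \(\omega_p\) is the weight defined below. (The larger the gradient, the smaller the weight; I'm not sure why.)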
\[
\omega_p = \frac{c^2}{c^2+\left\| \nabla I_i(p) \right\| ^2_2}
\]
The photometric error is very sensitive to sudden illumination changes.
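To check my understanding of the two formulas above, here is a small sketch of the energy for a list of already-matched pixels. It assumes the projection from \(p\) to \(p'\) has been done elsewhere, and the default values for gamma and c are placeholders I picked, not values taken from the paper.

```python
def huber(r, gamma):
    """Huber norm: quadratic for |r| <= gamma, linear beyond that."""
    a = abs(r)
    if a <= gamma:
        return 0.5 * r * r
    return gamma * (a - 0.5 * gamma)


def gradient_weight(gx, gy, c):
    """w_p = c^2 / (c^2 + ||grad I_i(p)||^2): high-gradient pixels get less weight."""
    return c * c / (c * c + gx * gx + gy * gy)


def photometric_energy(samples, gamma=9.0, c=50.0):
    """Sum of weighted Huber residuals.

    samples: iterable of (I_i[p], I_j[p'], gx, gy), where (gx, gy) is the
    image gradient of I_i at p. gamma and c are placeholder values.
    """
    return sum(gradient_weight(gx, gy, c) * huber(i_j - i_i, gamma)
               for i_i, i_j, gx, gy in samples)
```

A pixel with zero gradient gets weight 1, and one whose gradient magnitude equals c gets weight 0.5. One plausible reading of the down-weighting is that a small geometric error in \(p'\) produces a large intensity error exactly where the gradient is large, so those residuals are trusted less.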
2.3 Tracking
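All the points inside the active window are projected into the new frame. Then the pose of the new frame is optimized by minimizing the energy function.
In the earlier monocular DSO, depths are initialized with random values, so a particular motion pattern is required for initialization. In this paper, because the affine brightness transfer factors of the stereo image pair are still unknown at this stage, correspondences are searched along the horizontal epipolar line using NCC over a 3*5 neighborhood.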
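The NCC search is easy to sketch for a rectified pair stored as lists of rows. This is only an illustration under my own assumptions: I read "3*5" as 3 rows by 5 columns, the disparity range is arbitrary, and there is no sub-pixel refinement or border handling beyond what the loop needs.

```python
import math


def patch(img, row, col, half_h=1, half_w=2):
    """Flatten the (2*half_h+1) x (2*half_w+1) patch centred at (row, col)."""
    return [img[r][c]
            for r in range(row - half_h, row + half_h + 1)
            for c in range(col - half_w, col + half_w + 1)]


def ncc(a, b):
    """Normalized cross-correlation of two equally sized patches, in [-1, 1]."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # flat patch: correlation undefined, treat as no match
    return sum(x * y for x, y in zip(da, db)) / denom


def search_disparity(left, right, row, col, max_disp):
    """Best disparity for left[row][col] along the same row of the right image."""
    ref = patch(left, row, col)
    best_d, best_score = 0, -1.0
    for d in range(max_disp + 1):
        c = col - d
        if c - 2 < 0:  # the 5-pixel-wide patch would leave the image
            break
        score = ncc(ref, patch(right, row, c))
        if score > best_score:
            best_d, best_score = d, score
    return best_d, best_score
```

NCC subtracts the patch mean and divides by the patch norm, so it is unaffected by a gain and offset between the two images. That is why it can be used before the affine brightness parameters between the left and right camera are known.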
2.4 Frame Management
The basic idea is to check if the scene or the illumination has sufficiently changed.
- scene change: measured by the mean squared optical flow and the mean squared optical flow without rotation between the current frame and the last keyframe.
- illumination change: measured by the relative brightness factor \(|a_j - a_i|\).
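The three quantities above have to be turned into a yes/no decision somehow. A minimal sketch, assuming they are combined as a weighted sum compared against a threshold; the weights, the threshold, and the function names are placeholders of mine, not values from the paper.

```python
def mean_squared_flow(points, warped):
    """Mean squared pixel displacement between matched positions."""
    total = sum((x1 - x0) ** 2 + (y1 - y0) ** 2
                for (x0, y0), (x1, y1) in zip(points, warped))
    return total / len(points)


def needs_keyframe(points, warped_full, warped_no_rot, a_i, a_j,
                   w_flow=1.0, w_trans=1.0, w_bright=1.0, threshold=10.0):
    """Decide whether the current frame should become a keyframe.

    points:        pixel positions in the last keyframe
    warped_full:   the same points warped with the full relative pose
    warped_no_rot: the same points warped with the translation only
    a_i, a_j:      affine brightness parameters of keyframe and current frame
    All weights and the threshold are hypothetical.
    """
    score = (w_flow * mean_squared_flow(points, warped_full)
             + w_trans * mean_squared_flow(points, warped_no_rot)
             + w_bright * abs(a_j - a_i))
    return score > threshold
```

The flow without rotation matters because pure rotation moves pixels a lot without creating any parallax, so it says little about whether new structure has come into view.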
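-> A point is selected if its gradient exceeds a threshold and is the largest within its block.
-> Before a candidate point is activated and optimized in the windowed optimization, its inverse depth is constantly refined by the following non-keyframes. (TODO: find out how this is done)
-> Out with the old, in with the new: when points are marginalized, candidate points are added to the joint optimization.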
-> The constraints from static stereo introduce scale information into the system, and they also provide good geometric priors to temporal multi-view stereo.
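The point selection rule in the first item (gradient above a threshold and largest within its block) can be sketched as below. This is a simplified version under my own assumptions: a single fixed block size, one global threshold on the squared gradient magnitude, and central differences for the gradient.

```python
def gradient_magnitude_sq(img, r, c):
    """Squared gradient magnitude at (r, c) using central differences."""
    gx = (img[r][c + 1] - img[r][c - 1]) / 2.0
    gy = (img[r + 1][c] - img[r - 1][c]) / 2.0
    return gx * gx + gy * gy


def select_points(img, block=4, threshold_sq=100.0):
    """Pick at most one pixel per block: the strongest gradient above the threshold.

    block and threshold_sq are hypothetical values. The one-pixel image
    border is skipped so that central differences stay inside the image.
    """
    h, w = len(img), len(img[0])
    selected = []
    for r0 in range(1, h - 1, block):
        for c0 in range(1, w - 1, block):
            best, best_g = None, threshold_sq
            for r in range(r0, min(r0 + block, h - 1)):
                for c in range(c0, min(c0 + block, w - 1)):
                    g = gradient_magnitude_sq(img, r, c)
                    if g > best_g:
                        best, best_g = (r, c), g
            if best is not None:
                selected.append(best)
    return selected
```

Taking only the block maximum spreads the selected points over the image instead of letting them cluster on a few strong edges, while the threshold keeps textureless blocks from contributing points with no usable gradient.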
2.5 Windowed Optimization
-> Temporal Multi-View Stereo: simply the usual stereo between images taken at different times.
-> Static Stereo:
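-> Stereo Coupling: to balance the weights of the two kinds of constraints above, a parameter \(\lambda\) is introduced.
-> Marginalization: before marginalizing a keyframe, all points in the active window that have not been observed by the last two keyframes are marginalized first.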
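For reference, the coupling can be written roughly as follows. This is my own transcription of the idea, so treat the notation as an assumption: \(\mathcal{F}\) is the set of keyframes in the window, \(\mathrm{obs}^t(p)\) the keyframes that observe \(p\) temporally, \(E^p_{ij}\) the per-point term of the temporal energy from section 2.2, and \(E^p_{is}\) the static stereo term between the left and right image of keyframe \(i\).

\[
E = \sum_{i\in\mathcal{F}} \sum_{p\in P_i} \Big( \sum_{j\in \mathrm{obs}^t(p)} E^p_{ij} + \lambda E^p_{is} \Big)
\]

A larger \(\lambda\) puts more trust in the fixed-baseline constraint, while \(\lambda = 0\) falls back to purely temporal multi-view stereo.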
3. Evaluation
Skipped for now.
4. Conclusion
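Two things that could be done in the future:
- Loop closure and a database for map maintenance (already done in LDSO by 半闲居士)
- Dynamic object handling to further boost the VO accuracy and robustness. (Use deep learning to detect dynamic objects and then discard the points on them?)
I still have a lot to learn in SLAM, but it feels like everything in direct methods has already been done? Sad...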