First, a depth spatial-temporal descriptor is developed to extract local regions of interest from the depth image. Then the intensity spatial-temporal descriptor and the depth spatial-temporal descriptor are combined and fed into a linear coding framework to obtain an effective feature vector, which can be used for action classification. Finally, extensive experiments are conducted on a publicly available RGB-D action recognition dataset, and the proposed method shows promising results.

The key contribution is this: a linear coding framework is developed to fuse the intensity spatial-temporal descriptor and the depth spatial-temporal descriptor into a robust feature vector. In addition, the temporal structure of the video sequence is further exploited to design a new pooling technique that improves the descriptive power.

Feature extraction

STIPs (spatio-temporal interest points) extend SIFT (Scale-Invariant Feature Transform)-style interest-point detection into three-dimensional space-time, using Harris3D, Cuboid, or Hessian as the detector.

http://www.di.ens.fr/~laptev/download.html
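To make the detector concrete, below is a minimal sketch of a Harris3D-style space-time corner response in Python (NumPy/SciPy). It only illustrates the idea behind the detector and is not Laptev's released code from the link above; the function name, the scale parameters, and the (T, H, W) grayscale-video layout are my own assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris3d_response(video, sigma=2.0, tau=1.5, s=2.0, k=0.005):
    """Space-time corner response for a grayscale video of shape (T, H, W).

    sigma / tau are the spatial / temporal smoothing scales, k is the Harris
    constant. This is a sketch of the Harris3D idea, not Laptev's detector code.
    """
    v = gaussian_filter(video.astype(np.float64), sigma=(tau, sigma, sigma))
    # Spatio-temporal gradients (axis order: time, y, x)
    Lt, Ly, Lx = np.gradient(v)

    def integ(a):
        # Integrate products of gradients over a larger (integration) scale
        return gaussian_filter(a, sigma=(s * tau, s * sigma, s * sigma))

    Mxx, Myy, Mtt = integ(Lx * Lx), integ(Ly * Ly), integ(Lt * Lt)
    Mxy, Mxt, Myt = integ(Lx * Ly), integ(Lx * Lt), integ(Ly * Lt)
    # det and trace of the 3x3 second-moment matrix
    det = (Mxx * (Myy * Mtt - Myt ** 2)
           - Mxy * (Mxy * Mtt - Myt * Mxt)
           + Mxt * (Mxy * Myt - Myy * Mxt))
    trace = Mxx + Myy + Mtt
    # Local maxima of this response volume are taken as the STIP locations
    return det - k * trace ** 3
```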

The patches are split with overlap.

This can be seen as preprocessing of the depth map.

So the STIP features in the RGB images capture more of the subjects' own appearance details, while in the depth images they capture more of the subjects' shape.

Coding approaches

vector quantization (VQ)

One disadvantage of VQ is that it introduces significant quantization error, since only one element of the codebook is selected to represent each descriptor. To remedy this, one usually has to design a nonlinear SVM classifier that tries to compensate for the quantization error. However, with nonlinear kernels the SVM incurs a high training cost in both computation and storage. Considering these defects, locality-constrained linear coding (LLC), a more accurate and efficient coding approach [9], is adopted to replace VQ in this paper.
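Since the paper adopts LLC, a minimal sketch of the approximated LLC coding step (k nearest codebook atoms plus a small constrained least-squares solve, following Wang et al.'s formulation) may help. The function name, the (N, D) descriptor / (M, D) codebook layout, and the regularization constant are assumptions, not the authors' implementation.

```python
import numpy as np

def llc_code(X, B, knn=5, beta=1e-4):
    """Approximated LLC coding -- a sketch, not the paper's code.

    X : (N, D) local descriptors, B : (M, D) codebook.
    Returns C : (N, M) codes; each row sums to 1 and has `knn` nonzeros.
    """
    N, M = X.shape[0], B.shape[0]
    C = np.zeros((N, M))
    # Squared distances between every descriptor and every codebook atom
    d = ((X[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    for i in range(N):
        idx = np.argsort(d[i])[:knn]            # k nearest codebook atoms
        z = B[idx] - X[i]                       # shift atoms to the descriptor
        G = z @ z.T                             # local covariance (knn x knn)
        G += beta * np.trace(G) * np.eye(knn)   # regularization for stability
        w = np.linalg.solve(G, np.ones(knn))
        C[i, idx] = w / w.sum()                 # enforce the sum-to-one constraint
    return C
```

Because each code comes from a tiny knn-by-knn linear system, the coding step stays cheap, and the resulting codes are known to work well with a plain linear SVM, which is exactly the efficiency argument made above.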

Pooling strategy

As with VQ coding, the LLC coding coefficients c_i must be pooled into a single global representation of the sample for classification.
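The paper designs its own temporally-aware pooling, which is not reproduced here. As a baseline, the standard practice with LLC is max pooling followed by L2 normalization; one plausible (assumed, not the authors') way to use temporal structure is to pool per temporal segment and concatenate, as sketched below. The array names and the segment count are assumptions.

```python
import numpy as np

def max_pool(C):
    """Max-pool LLC codes C (N, M) into one global vector (standard LLC pooling)."""
    v = C.max(axis=0)
    return v / (np.linalg.norm(v) + 1e-12)      # L2 normalization

def temporal_max_pool(C, frame_ids, n_segments=3):
    """Split the video into n_segments along time, max-pool each segment,
    and concatenate -- one plausible temporal pooling, not the paper's exact scheme."""
    t_min, t_max = frame_ids.min(), frame_ids.max() + 1
    bounds = np.linspace(t_min, t_max, n_segments + 1)
    parts = []
    for a, b in zip(bounds[:-1], bounds[1:]):
        mask = (frame_ids >= a) & (frame_ids < b)
        parts.append(C[mask].max(axis=0) if mask.any() else np.zeros(C.shape[1]))
    v = np.concatenate(parts)
    return v / (np.linalg.norm(v) + 1e-12)
```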

Dataset

RGBD-HuDaAct [1] video database

Each video sample consists of a synchronized and calibrated RGB-D frame sequence, in which every frame contains an RGB image and a depth image. The RGB and depth images in each frame have been calibrated with the standard stereo-calibration method available in OpenCV, so that points with the same coordinates in the RGB and depth images correspond to each other.
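Given this pixel-level correspondence, attaching a depth value to an RGB interest point is a direct array lookup. A trivial sketch, with the array names assumed:

```python
import numpy as np

def depth_at_keypoints(depth_frame, keypoints_xy):
    """Because the calibrated RGB and depth frames share pixel coordinates,
    the depth for an RGB keypoint (x, y) is simply depth_frame[y, x]."""
    xy = np.asarray(keypoints_xy, dtype=int)
    return depth_frame[xy[:, 1], xy[:, 0]]
```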

A concise paper that pointed me in the right direction.
