Efficient Action Spotting Based on a Spacetime Oriented Structure Representation
Konstantinos G. Derpanis, Mikhail Sizintsev, Kevin Cannons and Richard P. Wildes
Department of Computer Science and Engineering
York University
Toronto, Ontario, Canada
{kosta,sizints,kcannons,wildes}@cse.yorku.ca
Abstract

This paper addresses action spotting, the spatiotemporal detection and localization of human actions in video. A novel compact local descriptor of video dynamics in the context of action spotting is introduced based on visual spacetime oriented energy measurements. This descriptor is efficiently computed directly from raw image intensity data and thereby forgoes the problems typically associated with flow-based features. An important aspect of the descriptor is that it allows for the comparison of the underlying dynamics of two spacetime video segments irrespective of spatial appearance, such as differences induced by clothing, and with robustness to clutter. An associated similarity measure is introduced that admits efficient exhaustive search for an action template across candidate video sequences. Empirical evaluation of the approach on a set of challenging natural videos suggests its efficacy.
1. Introduction

This paper addresses the problem of detecting and localizing spacetime patterns, as represented by a single query video, in a reference video database. Specifically, patterns of current concern are those induced by human actions. This problem is referred to as action spotting. The term "action" refers to a simple dynamic pattern executed by a person over a short duration of time (e.g., walking and hand waving). Potential applications of the presented approach include video indexing and browsing, surveillance, visually-guided interfaces and tracking initialization.

A key challenge in action spotting arises from the fact that the same underlying pattern dynamics can yield very different image intensities due to spatial appearance differences, as with changes in clothing and live action versus animated cartoon content. Another challenge arises in natural imaging conditions where
scene clutter requires the ability to distinguish relevant

Figure 1. Overview of approach to action spotting. (top) A template (query) and search (database) video serve as input; both the template and search videos depict the action of "jumping jacks" taken from the Weizmann action dataset [11]. (middle) Application of spacetime oriented energy filters decomposes the input videos into a distributed representation according to 3D, (x, y, t), spatiotemporal orientation. (bottom) In a sliding window manner, the distribution of oriented energies of the template is compared to the search distribution at corresponding positions to yield a similarity volume. Finally, significant local maxima in the similarity volume are identified.
pattern information from distractions. Clutter can be of two types: Background clutter arises when actions are depicted in front of complicated, possibly dynamic, backdrops; foreground clutter arises when actions are depicted with distractions superimposed, as with dynamic lighting, pseudo-transparency (e.g., walking behind a chain-link fence), temporal aliasing and weather effects (e.g., rain and snow). It is proposed that the choice of representation is key to meeting these challenges: A representation that is invariant to purely spatial pattern allows actions to be recognized independent of actor appearance; a representation that supports fine delineations of spacetime structure makes it possible to tease action information from clutter.
For present purposes, local spatiotemporal orientation is of fundamental descriptive power, as it captures the first-order correlation structure of the data irrespective of its origin (i.e., irrespective of the underlying visual phenomena), even while distinguishing a wide range of image dynamics (e.g., single motion, multiple superimposed motions, temporal flicker). Correspondingly, visual spacetime will be represented according to its local 3D, (x, y, t), orientation structure: Each point of spacetime will be associated with a distribution of measurements indicating the relative presence of a particular set of spatiotemporal orientations. Comparisons in searching are made between these distributions. Figure 1 provides an overview of the approach.
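The distribution-to-distribution search just described can be sketched in outline. The following is a minimal stand-in, not the paper's implementation: the per-point oriented-energy distributions are assumed to be precomputed (here as arbitrary arrays), and histogram intersection is used as a placeholder pointwise similarity measure.

```python
import numpy as np

def match_template(search, template):
    """Slide a template of per-point orientation distributions over a
    search volume and score each placement.

    search   : (T, Y, X, K) array; at each spacetime point, a
               distribution over K orientation channels (sums to 1).
    template : (t, y, x, K) array in the same format.

    Histogram intersection (sum of pointwise minima) stands in for the
    similarity measure; a score of 1.0 means a perfect distribution
    match at every template point.
    """
    T, Y, X, K = search.shape
    t, y, x, _ = template.shape
    sim = np.zeros((T - t + 1, Y - y + 1, X - x + 1))
    for i in range(sim.shape[0]):
        for j in range(sim.shape[1]):
            for k in range(sim.shape[2]):
                window = search[i:i + t, j:j + y, k:k + x]
                sim[i, j, k] = np.minimum(window, template).sum()
    return sim / (t * y * x)   # normalize by template size
```

Significant local maxima in the resulting similarity volume mark candidate action locations; producing the orientation distributions themselves is the subject of the representation developed in the paper.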
A wealth of work has considered the analysis of human actions from visual data [26]. A brief survey of representative approaches follows.
Tracking-based methods begin by tracking body parts and/or joints and classify actions based on features extracted from the motion trajectories (e.g., [28, 19, 1]). General impediments to fully automated operation include tracker initialization and robustness. Consequently, much of this work has been realized with some degree of human intervention.
Other methods have classified actions based on features extracted from 3D spacetime body shapes as represented by contours or silhouettes, with the motivation that such representations are robust to spatial appearance details [3, 11, 15]. This class of approach relies on figure-ground segmentation across spacetime, with the drawback that robust segmentation remains elusive in uncontrolled settings. Further, silhouettes do not provide information on the human body limbs when they are in front of the body (i.e., inside the silhouette) and thus yield ambiguous information.
Recently, spacetime interest points have emerged as a popular means for action classification [22, 8, 17, 16]. Interest points typically are taken as spacetime loci that exhibit variation along all spatiotemporal dimensions and provide a sparse set of descriptors to guide action recognition. Sparsity is appealing as it yields a significant reduction in computational effort; however, interest point detectors often fire erratically on shadows and highlights [14], and along object occluding boundaries, which casts doubt on their applicability to cluttered natural imagery. Additionally, for actions substantially comprised of smooth motion, important information is ignored in favour of a small number of possibly insufficient interest points.
Most closely related to the approach proposed in the present paper are others that have considered dense templates of image-based measurements to represent actions (e.g., optical flow, spatiotemporal gradients and other filter responses selective to both spatial and temporal orientation), typically matched to a video of interest in a sliding window formulation. Chief advantages of this framework include avoidance of problematic localization, tracking and segmentation preprocessing of the input video; however, such approaches can be computationally intensive. Further limitations are tied to the particulars of the image measurement used to define the template.
Optical flow-based methods (e.g., [9, 14, 20]) suffer as dense flow estimates are unreliable where their local single flow assumption does not hold (e.g., along occluding boundaries and in the presence of foreground clutter). Work using spatiotemporal gradients has encapsulated the measurements in the gradient structure tensor [23, 15]. This tack yields a compact way to characterize visual spacetime locally, with template video matches via dimensionality comparisons; however, the compactness also limits its descriptive power: Areas containing two or more orientations in a region are not readily discriminated, as their dimensionality will be the same; further, the presence of foreground clutter in a video of interest will contaminate dimensionality measurements to yield match failures. Finally, methods based on filter responses selective for both spatial and temporal orientation (e.g., [5, 13, 18]) suffer from their inability to generalize across differences in spatial appearance (e.g., different clothing) of the same action.
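To make the dimensionality limitation concrete, the following illustrative sketch (not the formulation of [23, 15]) counts the significant eigenvalues of a 3D gradient structure tensor, which is the "dimensionality" such a representation reduces local spacetime structure to.

```python
import numpy as np

def st_dimensionality(volume, tol=1e-6):
    """Count significant eigenvalues of the gradient structure tensor
    of a (t, y, x) intensity volume. Illustrative sketch only.
    """
    gt, gy, gx = np.gradient(volume.astype(float))
    # Trim the boundary, where np.gradient falls back to one-sided
    # differences that distort the orientation estimates.
    g = np.stack([gx, gy, gt], axis=-1)[1:-1, 1:-1, 1:-1].reshape(-1, 3)
    S = g.T @ g                      # 3x3 structure tensor
    w = np.linalg.eigvalsh(S)
    return int(np.sum(w > tol * w.max()))
```

For example, a pattern translating purely along x, I(x, y, t) = f(x - t), yields one significant eigenvalue, while two superimposed motions, f(x - t) + h(x + t), yield two. A single translating 2D texture also yields two, so the eigenvalue count alone cannot separate these qualitatively different cases, which is the discrimination limit noted above.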
The spacetime features used in the present work derive from filter techniques that capture dynamic aspects of visual spacetime with robustness to purely spatial appearance, as developed previously (e.g., in application to motion estimation [24] and video segmentation [7]). The present work appears to be the first to apply such filtering to action analysis. Other notable applications of similar spacetime oriented energy filters include pattern categorization [27], tracking [4] and spacetime stereo [25].
In the light of previous work, the major contributions of the present paper are as follows. (i) A novel compact local oriented energy feature set is developed for action spotting. This representation supports fine delineations of visual spacetime structure to capture the rich underlying dynamics of an action from a single query video. (ii) An associated computationally efficient similarity measure and search method are proposed that leverage the structure of the representation. The approach does not require preprocessing in the form of person localization, tracking, motion estimation, figure-ground segmentation or learning. (iii) The approach can accommodate variable appearance of the same action, rapid dynamics, multiple actions in the