Behavior classification by eigendecomposition of periodic motions
Pattern Recognition 38 (2005) 1033–1043

Behavior classification by eigendecomposition of periodic motions

Roman Goldenberg*, Ron Kimmel, Ehud Rivlin, Michael Rudzsky
Computer Science Department, Technion—Israel Institute of Technology, Technion City, Haifa 32000, Israel

Received 23 January 2004; received in revised form 25 October 2004; accepted 8 November 2004

Abstract

We show how periodic motions can be represented by a small number of eigenshapes that capture the whole dynamic mechanism of the motion. Spectral decomposition of a silhouette of a moving object serves as a basis for behavior classification by principal component analysis. The boundary contour of the walking dog, for example, is first computed efficiently and accurately. After normalization, the implicit representation of a sequence of silhouette contours, given by their corresponding binary images, is used for generating eigenshapes for the given motion. Singular value decomposition produces these eigenshapes, which are then used to analyze the sequence. We show examples of object as well as behavior classification based on the eigendecomposition of the binary silhouette sequence.

© 2005 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.patcog.2004.11.024

Keywords: Visual motion; Segmentation and tracking; Object recognition; Activity recognition; Non-rigid motion; Active contours; Periodicity analysis; Motion-based classification; Eigenshapes

* Corresponding author. Tel.: +972 4 829 4940; fax: +972 4 829 3900. E-mail addresses: romang@cs.technion.ac.il (R. Goldenberg), ron@cs.technion.ac.il (R. Kimmel), ehudr@cs.technion.ac.il (E. Rivlin), rudzsky@cs.technion.ac.il (M. Rudzsky).

1. Introduction

Futurism is a movement in art, music, and literature that began in Italy at about 1909, marked especially by an effort to give formal expression to the dynamic energy and movement of mechanical processes. A typical example is the 'Dynamism of a Dog on a Leash' by Giacomo Balla, who lived during the years 1871–1958 in Italy; see Fig. 1 [1]. In this painting one can see how the artist captures in one still image the periodic walking motion of a dog on a leash.

Fig. 1. 'Dynamism of a Dog on a Leash', 1912, by Giacomo Balla, Albright-Knox Art Gallery, Buffalo.

Following a similar philosophy, we show how periodic motions can be represented by a small number of eigenshapes that capture the whole dynamic mechanism of periodic motions. Singular value decomposition of a silhouette of an object serves as a basis for behavior classification by principal component analysis. Fig. 2 presents a running horse video sequence and its eigenshape decomposition. One can see the similarity between the first eigenshapes, Fig. 2(c and d), and another futurism style painting, 'The Red Horseman' by Carlo Carra [1], Fig. 2(e). The boundary contour of the moving non-rigid object is computed efficiently and accurately by the fast geodesic active contours [2]. After normalization, the implicit representation of a sequence of silhouette contours, given by their corresponding binary images, is used for generating eigenshapes for the given motion. Singular value decomposition produces the eigenshapes that are used to analyze the sequence. We show examples of object as well as behavior classification based on the eigendecomposition of the sequence.

Fig. 2. (a) Running horse video sequence, (b) first 10 eigenshapes, (c, d) first and second eigenshapes enlarged, (e) 'The Red Horseman', 1914, by Carlo Carra, Civico Museo d'Arte Contemporanea, Milan.

2. Related work

Motion-based recognition has received a lot of attention in the field of image analysis. The main reason is that direct usage of temporal data can improve our ability to solve a number of basic computer vision problems, such as image segmentation, tracking, and object classification. In general, when analyzing a moving object, one can use two main sources of information: changes of the moving object position (and orientation) in space, and the object's deformations.

Object position is an easy-to-get characteristic, applicable both for rigid and non-rigid bodies. It is provided by most existing target detection and tracking systems. A number of techniques [3–6] were proposed for the detection of motion events and the recognition of various types of motions based on the analysis of the moving object trajectory and its derivatives. Detecting object orientation is a more challenging problem. It is usually solved by fitting a model that may vary from a simple ellipsoid [6] to a complex 3-D vehicle model [7] or a specific aircraft-class model adapted for noisy radar images as in Ref. [8].

While the object orientation characteristic is more applicable to rigid objects, it is object deformation that contains the most essential information about the nature of non-rigid body motion. This is especially true for natural non-rigid objects in locomotion, which exhibit substantial changes in their apparent view. In this case the motion itself is caused by these deformations, e.g. walking, running, hopping, crawling, flying, etc.

There exists a large number of papers dealing with the classification of moving non-rigid objects and their motions based on their appearance. Lipton et al. describe a method for moving target classification based on their static appearance [9] and using their skeletons [10]. Polana and Nelson [11] used local motion statistics computed for image grid cells to classify various types of activities. An approach using temporal templates and motion history images (MHI) for action representation and classification was suggested by Davis and Bobick in Ref. [12]. Cutler and Davis [13] describe a system for real-time moving object classification based on periodicity analysis. Describing the whole spectrum of papers published in this field is beyond the scope of this paper, and we refer the reader to the surveys [14–16].

Related to our approach are the works of Yacoob and Black [17] and Murase and Sakai [18]. In Ref. [17], human activities are recognized using a parameterized representation of measurements collected during one motion period. The measurements are eight motion parameters tracked for five body parts (arm, torso, thigh, calf and foot). Murase and Sakai [18] describe a method for moving object recognition for gait analysis and lip reading. Their method is based on a parametric eigenspace representation of the object silhouettes. Moving object segmentation is performed by simple background subtraction, and the eigenbasis is computed for one class of objects. Murase and Sakai used an optimization technique for finding the shift and scale parameters, which is not trivial to implement.

In this paper we concentrate on the analysis of deformations of moving non-rigid bodies. We thereby attempt to extract characteristics that allow us to distinguish between different types of motions and different classes of objects. Essentially, we build our method on the approaches suggested in Refs. [18,17], revisiting and improving some of the components. Unlike previous related methods, we first introduce an advanced variational model for accurate segmentation and tracking. The proposed segmentation and tracking method allows extraction of moving object contours with high accuracy and yet is computationally efficient enough to be implemented in a real-time system. The algorithm, analyzing high-precision object contours, achieves better classification results and can be adapted more easily to various object classes than methods based on control point tracking [17] or background subtraction [18].

We then compute an eigenbasis for several classes of objects (e.g. humans and animals). Moreover, at this first stage we select the right class by comparing the distance to the feature space in each of the eigenbases. Next, we create a principal basis not only for static object views (single frames), but also for whole period subsequences. For example, we create a principal basis for one human motion step and then project every period onto this basis. Unlike Ref. [18], we are not trying to build a separate basis for every motion type; instead, we build a common basis for the whole human motion class. Different types of human motions are then detected by looking at the projections of motion samples onto this basis. Notably, this type of analysis is only made possible by the high quality of the segmentation and tracking results. We propose an explicit temporal alignment process for finding the shift and scale. Finally, we use a simple classification approach to demonstrate the performance of our framework and test it on various classes of moving objects.

3. Our approach

Our basic assumption is that for any given class of moving objects, like humans, dogs, cats, and birds, the apparent object view in every phase of its motion can be encoded as a combination of several basic body views or configurations. Assuming that a living creature exhibits a pseudo-periodic motion, one motion period can be used as a comparable information unit. The basic views are then extracted from a large training set. An observed sequence of object views collected from one motion period is projected onto this basis.
This forms a parameterized representation of the object's motion that can be used for classification. Unlike Ref. [17], we do not assume an initial segmentation of the body into parts and do not explicitly measure the motion parameters. Instead, we work with the changing apparent view of deformable objects and use the parameterization induced by their form variability. In what follows we describe the main steps of the process, which include:

• Segmentation and tracking of the moving object, yielding an accurate external object boundary in every frame.
• Periodicity analysis, in which we estimate the frequency of the pseudo-periodic motion and split the video sequence into single-period intervals.
• Frame sequence alignment, which brings the single-period sequences to a standardized form by compensating for temporal shift, speed variations, different object sizes and imaging conditions.
• Parameterization, by building an eigenshape basis from a training set of possible object views and projecting the apparent view of a moving body onto this basis.

3.1. Segmentation and tracking

Our approach is based on the analysis of deformations of the moving body. Therefore, the accuracy of the segmentation and tracking algorithm in finding the target outline is crucial for the quality of the final result. This rules out a number of available or easy-to-build tracking systems that provide only a center of mass or a bounding box around the target, and calls for more precise and usually more sophisticated solutions.

We therefore decided to use the geodesic active contour approach [19], and specifically the 'fast geodesic active contour' method described in Ref. [2]. The segmentation problem is expressed as a geometric energy minimization: we search for a curve C that minimizes the functional

S[C] = \int_0^{L(C)} g(C(s)) ds,

where ds is the Euclidean arclength, L(C) is the total Euclidean length of the curve, and g is a positive edge indicator function in a 3-D hybrid spatial-temporal space that depends on the pair of consecutive frames I_{t-1}(x,y) and I_t(x,y), where I_t(x,y): Ω ⊂ R^2 × [0,T] → R^+. It takes small values along the spatial-temporal edges, i.e. moving object boundaries, and higher values elsewhere.

In addition to the scheme described in Ref. [2], we also use background information whenever a static background assumption is valid and a background image B(x,y) is available. In the active contours framework this can be achieved either by modifying the g function to reflect the edges in the difference image D(x,y) = |B(x,y) − I_t(x,y)|, or by introducing additional area integration terms into the functional S[C]:

S[C] = \int_0^{L(C)} g(C(s)) ds + λ_1 \int_{Ω_C} |D(x,y) − c_1|^2 da + λ_2 \int_{Ω\Ω_C} |D(x,y) − c_2|^2 da,

where Ω_C is the area inside the contour C, λ_1 and λ_2 are fixed parameters, and c_1, c_2 are given by

c_1 = average_{Ω_C}[D(x,y)],  c_2 = average_{Ω\Ω_C}[D(x,y)].

The latter approach is inspired by the 'active contours without edges' model proposed by Chan and Vese [20]. It forces the curve C to close on a region whose interior and exterior have approximately uniform D(x,y) values. A different approach that utilizes the region information by coupling between the motion estimation and the tracking problem was suggested by Paragios and Deriche in Ref. [21]. Fig. 3 shows some results of moving object segmentation and tracking using the proposed method.

Fig. 3. Non-rigid moving object segmentation and tracking.

Contours can be represented in various ways. Here, in order to have a unified coordinate system and be able to apply a simple algebraic tool, we use the implicit representation of a simple closed curve as its binary image. That is, the contour is given by an image for which the exterior of the contour is black while its interior is white.

3.2. Periodicity analysis

Here we assume that the majority of non-rigid moving objects are self-propelled living creatures, whose motion is almost periodic. Thus, one motion period, like a step of a walking man or a rabbit hop, can be used as a natural unit of motion. Extracted motion characteristics can be normalized by the period size.

The problem of detection and characterization of periodic activities was addressed by several research groups, and the prevailing technique for periodicity detection and measurement is the analysis of changing 1-D intensity signals along spatio-temporal curves associated with a moving object, or the curvature analysis of feature point trajectories [22–25]. Here we address the problem using global characteristics of motion, such as moving object contour deformations and the trajectory of the center of mass.

By running frequency analysis (e.g. Refs. [13,26]) on such 1-D contour metrics as the contour area, the velocity of the center of mass, the principal axes orientation, etc., we can detect the basic period of the motion. Figs. 4 and 5 present global motion characteristics derived from segmented moving objects in two sequences. One can clearly observe the common dominant frequency in all three graphs.

The period can also be estimated in a straightforward manner by looking for the frame where the external object contour best matches the object contour in the current frame. Fig. 6 shows the deformations of a walking man contour during one motion period (step). Samples from two different steps are presented, and each vertical pair of frames is phase synchronized. One can clearly see the similarity between the corresponding contours. An automated contour matching can be performed in a number of ways, e.g. by comparing contour signatures, or by looking at the correlation between the object silhouettes in different frames, as we chose to do in our work. Fig. 7 shows four graphs of inter-frame silhouette correlation values measured for four different starting frames taken within one motion period. It is clearly visible that all four graphs nearly coincide and the local maxima peaks are approximately evenly spaced. The period, therefore, was estimated as the average distance between the neighboring peaks.
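This correlation-based period estimate can be sketched in a few lines of NumPy. The sketch below is an illustration under our own naming, not the authors' implementation: it correlates every binary silhouette with the first frame and averages the spacing between local correlation maxima, using a deliberately simple peak picker.

```python
import numpy as np

def silhouette_correlation(a, b):
    """Normalized correlation between two binary silhouette images."""
    a = a.astype(float).ravel() - a.mean()
    b = b.astype(float).ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / denom if denom > 0 else 0.0

def estimate_period(frames):
    """Estimate the motion period as the average spacing between
    local maxima of the correlation with the first frame."""
    corr = np.array([silhouette_correlation(frames[0], f) for f in frames])
    # naive local-maximum detection over the interior samples
    peaks = [i for i in range(1, len(corr) - 1)
             if corr[i] >= corr[i - 1] and corr[i] > corr[i + 1]]
    if not peaks:
        return None
    # frame 0 is a trivial "peak" (self-correlation), so include it
    # as the starting reference when averaging the spacings
    return float(np.mean(np.diff([0] + peaks)))
```

A production version would smooth the correlation curve and threshold weak maxima before measuring the spacing, but the structure mirrors the scheme of Fig. 7: evenly spaced correlation peaks whose mean gap is the period.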
Fig. 4. Global motion characteristics measured for the walking man sequence: principal axis inclination angle (rad), contour area (sq. pixels) and vertical velocity of the center of mass (pixels/frame), plotted against the frame number.

Fig. 5. Global motion characteristics measured for the walking cat sequence: principal axis inclination angle (rad), contour area (sq. pixels) and vertical velocity of the center of mass (pixels/frame), plotted against the frame number.

Fig. 6. Deformations of a walking man contour during one motion period (step). Two steps synchronized in phase are shown. One can see the similarity between contours in corresponding phases.

Fig. 7. Inter-frame correlation between object silhouettes as a function of the frame offset. Four graphs show the correlation measured for four initial frames.

3.3. Frame sequence alignment

One of the most desirable features of any classification system is invariance to a set of possible input transformations. As the input in our case is not a static image but a sequence of images, the system should be robust to both spatial and temporal variations. Below we describe the methods for spatial and temporal alignment of the input video sequence, bringing it to a canonical, comparable form that allows later classification.

3.3.1. Spatial alignment

We want to compare every object to a single pattern regardless of its size, viewing distance and other imaging conditions. This is achieved by cropping a square bounding box around the center of mass of the tracked target silhouette and re-scaling it to a predefined size (see Fig. 8). Note that this is not a scale invariant transformation, unlike other reported normalization techniques, e.g. Ref. [27].

Fig. 8. Scale alignment. A minimal square bounding box around the center of the segmented object silhouette (a) is cropped and re-scaled to form a 50 × 50 binary image (b).

One way to achieve orientation invariance is to keep a collection of motion samples for a wide range of possible motion directions and then look for the best match. This approach was used by Yacoob and Black in Ref. [17] to distinguish between different walking directions. Although here we experiment only with motions nearly parallel to the image plane, the system proved to be robust to small variations in orientation. Since we do not want to keep models for both left-to-right and right-to-left motion directions, the right-to-left moving sequences are converted to left-to-right by a horizontal mirror flip.

3.3.2. Temporal alignment

A good estimate of the motion period allows us to compensate for motion speed variations by re-sampling each period subsequence to a predefined duration. This can be done by interpolation between the binary silhouette images themselves or between their parameterized representations, as explained below. Fig. 9 presents an original and a re-sampled one-period subsequence after scaling from 11 to 10 frames.

Fig. 9. Temporal alignment. Top: original 11 frames of a one-period subsequence. Bottom: the re-sampled 10-frame sequence.

Temporal shift is another issue that has to be addressed in order to align the phase of the observed one-cycle sample and the models stored in the training base. In Ref. [17] it was done by solving a minimization problem of finding the optimal parameters of the temporal scaling and time shift transformations so that the observed sequence is best matched to the training samples. Polana and Nelson [11] handled this problem by matching the test one-period subsequence to a reference template at all possible temporal translations. Murase and Sakai [18] used an optimization technique for finding the shift and scale parameters. Here we propose an explicit temporal alignment process. Let us assume that in the training set all the sequences are accurately aligned. Then we find the temporal shift of a test sequence by looking for the starting frame that best matches the generalized (averaged) starting frame of the training samples, as they all look alike. Fig. 10 shows (a) the reference starting frame taken as an average over the temporally aligned training set, (b) a re-sampled single-period test sequence and (c) the correlation between the reference starting frame and the test sequence frames. The maximal correlation is achieved at the seventh frame; therefore the test sequence is aligned by cyclically shifting it 7 frames to the left.

Fig. 10. Temporal shift alignment: (a) average starting frame of all the training set sequences, (b) temporally shifted single-cycle test sequence, (c) correlation between the reference starting frame and the test sequence frames.

3.4. Parameterization

In order to reduce the dimensionality of the problem, we first project the object image in every frame onto a low-dimensional basis. The basis is chosen to represent all possible appearances of objects that belong to a certain class, like humans, four-legged animals, etc. Let n be the number of frames in the training base of a certain class of objects, and let M be a training samples matrix, where each column corresponds to a spatially aligned image of a moving object written as a binary vector. In our experiments we use 50 × 50 normalized images; therefore, M is a 2500 × n matrix. The correlation matrix MM^T is decomposed using singular value decomposition as MM^T = U Σ V^T, where U is an orthogonal matrix of principal directions and Σ is a diagonal matrix of singular values. In practice, the decomposition is performed on M^T M, which is computationally more efficient [28]. The principal basis {U_i, i = 1, ..., k} for the training set is then taken as the k columns of U corresponding to the largest singular values in Σ.

Fig. 11. The principal basis for the 'dogs and cats' training set, formed of the 20 first principal component vectors.

Fig. 11 presents a principal basis for the training set formed of 800 sample images collected from more than 60 sequences showing dogs and cats in motion. The basis is built by taking the k = 20 first principal component vectors. We build such representative bases for every class of objects. Then, we can distinguish between various object classes by finding the basis that best represents a given object image in the distance to the feature space (DTFS) sense. Fig. 12 shows the distances from more than 1000 various images of people, dogs and cats to the feature space of people and to that of dogs and cats. In all cases, images of people were closer to the people feature space than to the animals' feature space, and vice versa. This allows us to distinguish between these two classes. A similar approach was used in Ref. [29] for the detection of pedestrians in traffic scenes.

If the object class is known (e.g. we know that the object is a dog), we can parameterize the moving object silhouette image I in every frame by projecting it onto the class basis. Let B be the basis matrix formed from the basis vectors {U_i, i = 1, ..., k}. Then the parameterized representation of the object image I is given by the vector p of length k as p = B^T v_I, where v_I is the image I written as a vector.
The idea of using a parameterized representation in motion-based recognition context is certainly not a new one.To name a few examples we mention again the works of Yacoob and Black[17],and of Murase and Sakai[18].0510******** 051015202530Distance to the "people" feature spaceDistancetothe"dogs&cats"featurespacedogs and catspeopleFig.12.Distances to the‘people’and‘dogs and cats’feature spaces from more than1000various images of people,dogs and cats.Cootes et al.[30]used similar technique for describing fea-ture point locations by a reduced parameter set.Baumberg and Hogg[31]used PCA to describe a set of admissible B-spline models for deformable object tracking.Chomat and Crowley[32]used PCA-based spatio-temporalfilter for human motion recognition.Fig.13shows several normalized moving object images from the original sequence and their reconstruction from a parameterized representation by back-projection to the im-age space.The numbers below are the norms of differences between the original and the back-projected images.These norms can be used as the DTFS estimation.Now,we can use these parameterized representations to distinguish between different types of motion.The refer-ence base for the activity recognition consists of temporally aligned one-period subsequences.The moving object sil-houette in every frame of these subsequences is represented by its projection to the principal basis.More formally,let {I f:f=1,...,T}be a one-period,temporally aligned set of normalized object images,and p f,f=1,...,T a pro-jection of the image I f onto the principal basis B of size k.Then,the vector P of length kT formed by concatenation of all the vectors p f,f=1,...,T,represent a one-period subsequence.By choosing a basis of size k=20and the nor-malized duration of one-period subsequence to be T=10 frames,every single-period subsequence is represented by a feature point in a200-dimensional feature space.In the following experiment we processed a number of sequences of dogs and 
cats in various types of locomotion. From these sequences we extracted 33 samples of walking dogs, 9 samples of running dogs, 9 samples of galloping dogs and 14 samples of walking cats. Let S_{200×m} be the matrix of projected single-period subsequences, where m is the number of samples, and let the SVD of the correlation matrix be given by S S^T = U_S Σ_S V_S^T. In Fig. 14 we depict the resulting feature points projected for visualization to the 3-D space using the three first principal directions {U_i^S : i = 1, ..., 3}, taken as the column vectors of U_S corresponding to the three largest eigenvalues in Σ_S. One can easily observe four separable clusters corresponding to the four groups.

R. Goldenberg et al. / Pattern Recognition 38 (2005) 1033–1043

Fig. 13. Image sequence parameterization. Top: 11 normalized target images of the original sequence. Bottom: the same images after parameterization using the principal basis and back-projection to the image space. The numbers are the norms of the differences between the original and the back-projected images.

Fig. 14. Feature points extracted from the sequences with walking, running and galloping dogs and walking cats, projected to the 3-D space for visualization.

Another experiment was done over the 'people' class of images. Fig. 15 presents feature points corresponding to several sequences showing people walking and running parallel to the image plane and running at an oblique angle to the camera. Again, all three groups lie in separable clusters. The classification can be performed, for example, using the k-nearest-neighbor algorithm. We conducted the 'leave one out' test for the dogs set above, classifying every sample by taking it out from the training set one at a time. The three-nearest-neighbors strategy resulted in a 100% success rate.

Fig. 15. Feature points extracted from the sequences showing people walking and running parallel to the image plane and at a 45° angle to the camera. Feature points are projected to the 3-D space for visualization.

Another option is to
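The 'leave one out' three-nearest-neighbor test can be sketched as follows. This is a minimal NumPy illustration, not the authors' original code; the feature matrix P (one 200-dimensional row per single-period subsequence) and the class labels are assumed given.

```python
import numpy as np

def loo_knn_accuracy(P, labels, k=3):
    """Leave-one-out k-nearest-neighbor classification accuracy.

    P: (m, d) matrix of subsequence feature vectors.
    labels: length-m array of class labels (e.g. walk/run/gallop).
    """
    P = np.asarray(P, dtype=float)
    labels = np.asarray(labels)
    correct = 0
    for i in range(len(P)):
        dists = np.linalg.norm(P - P[i], axis=1)
        dists[i] = np.inf                    # leave the query sample out
        nearest = labels[np.argsort(dists)[:k]]
        vals, counts = np.unique(nearest, return_counts=True)
        if vals[np.argmax(counts)] == labels[i]:
            correct += 1
    return correct / len(P)
```

Well-separated clusters in the feature space, as observed in Figs. 14 and 15, yield accuracies at or near 100% under this test.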
further reduce the dimensionality of the feature space by projecting the 200-dimensional feature vectors in S onto some principal basis. This can be done using a basis of any size, exactly as we did in 3-D for visualization. Fig. 16 presents learning curves for both the animals and the people data sets. The curves show how the classification error rate achieved by a one-nearest-neighbor classifier changes as a function of the principal basis size. The curves are obtained by averaging over a hundred iterations in which the data set is randomly split into training and testing sets. The principal basis is built on the training set and testing is performed using the 'leave one out' strategy on the testing set.

4. Concluding remarks

We presented a new framework for motion-based classification of moving non-rigid objects. The technique is based on the analysis of the changing appearance of moving objects and relies heavily on high-accuracy segmentation and tracking results obtained by using the fast geodesic contour