当前位置：文档之家› 人体检测与追踪

人体检测与追踪

Human Motion Detection,Tracking and

Analysis For Automated Surveillance

Eyup Gedikli,Murat Ekinci

Computer Vision Lab.

Department of Computer Engineering

Karadeniz Technical University

Trabzon61080,Turkiye,

gediklie,ekinci@https://www.doczj.com/doc/b114758015.html,.tr

Abstract.This paper presents a silhouette based human motion de-

tection and analysis for real-time visual surveillance system.In order to

detect foreground objects,?rst,background scene model is statistically

learned even the background is not completely stationary.A background

maintenance model is also proposed for preventing some kind of falsies.

Then,the candidate foreground regions are detected using thresholding,

noise cleaning and their boundaries extracted using morphological?lters.

For human motion detection,object detection and classi?cation approach

for distinguishing a person,a group of person from detected foreground

objects(e.g.,cars)using silhouette shape and periodic motion cues is

performed.Finally,the trajectory of the people in motion and several

motion parameters produced from the cyclic motion of silhouette of the

object under tracking are implemented for analyzing people activities

such as walking and running,in the video sequences.Experimental re-

sults on the di?erent test image sequences demonstrate that the proposed

algorithm has an encouraging real-time background modeling based hu-

man motion detection and analysis performance with relatively robust

and low computational cost.

1Introduction

Visual analysis of human motion has recently persuaded more studies in the computer vision area.It attempts to detect,track,and identify people,and more generally,to understand human behaviors from image sequences involving humans[1][2],[12].Some companies such as IBM,Microsoft,and Mitsubishi are also investing on research on human motion analysis[5],[4],[6].One of the important research areas is automatically human behavior understanding[7]. Biometrics is also a technology that makes use of the physiological or behavioral characteristics to authenticate the identi?es of people[8].

Moving human detection is the?rst step processes for nearly every system of vision-based human analysis[17].The aim on moving human detection is to segment the regions corresponding to people from the rest of an image sequence. It is known to be a signi?cant and di?cult research problem[13].Background

subtraction is a particularly popular method for motion segmentation[12][10]. W4[2]uses dynamic appearance models to track people.A recursive convex hull algorithm is used to?nd body part locations for single person.Symmetry and periodicity analysis of each silhouette is used to determine if a person is carrying an object.Ricquerberg and Bouthemy[14]proposed tracking people by exploit-ing spatio-temporal slices.Their detection scheme involves the combined use of intensity,temporal di?erences between three successive images and of compari-son of the current image to a background reference image which is reconstructed and updated on line.

The following process after successfully tracked the moving humans from one frame to another in an image sequence in the video surveillance applications is human behaviors understanding from image sequences.Human behavior under-standing is to analyze and recognize the motion region segments by reason of human actions in frames and to produce high-level description of human actions. There has been considerable interest in the area of human motion analysis in recent years[1],[2],[7].Further works are also focused to human identi?cation based on gait analysis[15],[16].

This paper presents a set of techniques integrated into a low-cost PC based real time visual surveillance system for simultaneously human motion detection, tracking people,and analysis their activities in monochromatic video.Human motion detection and classi?cation approach for distinguishing a person,a group of person from detected foreground objects(e.g.,cars)using silhouette shape and periodic motion cues is presented.People tracking in a single camera is performed using background subtraction,followed by region correspondence.This takes into account multiple cues including velocity,sizes,distances of bounding box,and skeleton structures of the silhouette.Objects can be classi?ed based on the type of the cues and their motion characteristics.Finally an algorithm depending on four parameters produced from the skeleton of the silhouette is developed in order to human motion analysis for distinguishing walking and running actions. Then experimental results and discussion are presented in the?nal section.

2Motion Detection and Classi?cation

The aim on moving human detection is to segment the regions corresponding to people from the rest of an image sequence.Background estimation is a partic-ularly popular method for motion segmentation.The background scene model is statistically learned using the redundancy of the pixel intensities in a train-ing stage,even the background is not completely stationary.This redundancy information of the each pixel is separately stored in an history map shows the intensity variations on the pixel locations.Then the highest ratio of the redun-dancy(HRR)on the pixel intensity values in the history map in the training sequence is determined to have initial background model of the scene.Then an adaptive background model is updated to accommodate changes to the back-ground while maintaining the ability to detect independently moving objects (person(s)).More details are given in[11][10].The basis idea on the foreground

3 region detection based on adaptive background subtraction is to maintain a run-ning statistical background value of the intensity at each pixel.When the value of a pixel in a new image di?ers signi?cantly from the background value,the pixel is?agged as potentially containing a foreground region.Then foreground objects are segmented from the background in each frame of the video sequence by a four stage process:thresholding,noise cleaning,morphological?lters,and object detection.Each pixel is the?rst classi?ed as either a background or a foreground pixel using the background model,as shown in?gure1.a-b.

People have very distinctive shape,appearance,and motion patterns com-pared to other objects(car,animal,etc.).One can use a static shape analysis, such as aspect ratio,area,size perimeter,or dynamic motion analysis,such as speed,or periodicity of the motion to distinguish people from other objects.All cues in the static shape analysis can easily be obtained from the binary data pro-duced by thresholding and the bounding box parameters.But the cues can be produced from the dynamic motion analysis presents more reliable information for classi?cation,especially for the motion analysis.A skeleton based approach similar with[3]but more developed is presented in this study for the region classi?cation and for the motion

analysis.

Fig.1.Skeleton production from motion silhouette detected.(a)Detected motion re-gion,(b)Binarized region,(c)Border of the region,and its center of the gravity,(d) local maximal points,then the points projected onto silhouette.

(a)(b)(c)(d)

The process is?rst to binarized to the foreground region then to extract the outline of the silhouette region using morphological processes as shown in ?gure1a-c.Then a star skeleton is produced by detecting extremal points on the boundary of the silhouette detected,as shown in?gure1d,see to[11]for more details.The next goal in this section is to classify each moving object visible in the video sequence as a single person,a group of persons,or a vehicle.One of the advantages of video for classi?cation is its temporal component.To exploit this,the static and dynamic features over time are computed in each bounding box as it is detected through the sequence of frames(3-10frames).

The static shape features are directly measured from the silhouette and its bounding box.They are the bounding box aspect ratio,the axis of second mo-ment of the silhouette,the bounding box dispersedness(perimeter2/area).The dynamic shape features are also produced from the skeleton of the silhouette of

Fig.2.The classi?cation process on detected foreground regions,the dynamic struc-tures of a single human,a group of a person,and a car.

the region detected over time.Both the structure of the skeleton and repetitive changing on the skeleton gives important cues in analyzing di ?erent types of ob-jects.Figure 2shows the structure of the skeleton for di ?erent types of objects (a person,a group of person,and a car).It is considered,the repetitive changing on the structures of the skeletons has another good enough feature for classi?cation of the detected foreground regions in the image sequence.The local maximal of the distances (see ?gure 1d-f)are determined as the structure of the skeleton.Then this feature can be used for human model criteria for deciding whether the object detected is human or not as shown in ?gure 2.To more reliable classi?ca-tion another feature in the natural motion of people is also considered.It is that people exhibit periodic motion while they are moving.A periodic motion can be determined by self-similarity of the characteristics of silhouettes over time using the skeleton features of the silhouettes.As a result,static shape cues with a dynamic periodicity analysis and a periodicity of the motion analysis can be combined to distinguish human from the objects,such as a group of person,a car.

3Human Motion Tracking

It is assumed in this study that the regions can enter and exit the scene and they can also get occluded by other regions.Regions carry informations like shape and size of the silhouette,and colors data on a bounding box location estimated for each person.Each region is de?ned by the 2D coordinates of the centroid,P ,a ratio between the total number of foreground pixels (T)and the size of the bounding box (B),R =T/B ,and the color/gray level characteristic,D .The regions,for which correspondence has been established,have also an associated velocity,V .In frame t of a sequence,there are M regions with centroids P t i (where i number of regions)whose correspondences to previous frame are unknown.

There are K regions with centroids P t?1

L (where L is the label)in frame t-1

whose correspondences have been established with the previous frames.The number of regions in frame t can be less than the number of regions in frame t-1due to entries and it might be less due to exits or occlusion.The task is to establish correspondence between regions in frame t and frame t-1,and to determine entries and exits in these frames.The minimum cost criteria is used to establish correspondence.The cost function between two regions is de?ned as

C Li=P t?1

+V t?1

P t i

t?1

R t i

t?1

D t i

(1)

where L is the labels of region in frame t-1,i is index of non-corresponded region in frame t.The cost is calculated for all(L,i)pairs.Correspondence is established between the pair that gives the lowest cost,with the cost being less than a threshold.The all parameters of each region are updated using linear low pass?lter prediction models.

The process on correspondence continues till no pairs are left or the minimum cost rises above the threshold.In other words,the correspondences have been found between all regions in frames t-1and t,or there might be regions in frame t-1,which have not been corresponded to in frame t due to exist from the scene or due to occlusion,or there may be regions in frame t,which have not been corresponded to regions in frame t-1,because they just entered the frame.The position plus predicted velocity of the region exit/enter from/to scene are easily used for determining to have exited/entered the scene.If this is not the case, then a check for occlusion is made.While an occluded is determined,all the regions in occluded have merged in a single region in frame t.Now we need to update the parameters of the occluded region.For occluded region same cost function is applied for tracking.

Fig.3.Tracking results on our data set(frames2555-2606).Two person moving toward to each other,turn around each other,then go back away.Thick and thin white lines separately represent their trajectories produced by tracking algorithm.

In addition,at the tracking process,the center of the detected foreground region in the sequences is stored in a trajectory map.This is shown in?gure 3as white lines.In?gure3,two people moves toward each other.When they are occluded,they turn around each other,then go back away.The thick and thin white lines separately represent the trajectories of each object.The stored data in the trajectory map is used to implement each foreground region motions in the scene.When any foreground object in tracking is hidden to any non-motion area(like passing at the behind of parked car),or temporarily occluded

(a)Frame809(b)Frame821(c)Frame836

Fig.4.Occlusion example frames.

by other foreground object as they pass,the detected data of that object may not be obtained at the low level processing.At that or similar situations,high level implementation procedure is activated to estimate the possible position of that object using the previous tracked data obtained from trajectory.At the following frames,if the low level data about it is not obtained,its con?dence is reduced.If the con?dence of that object drops below a given threshold,it is considered lost,and is dropped from the tracking list stored in the trajectory map.High con?dence objects(ones that have been tracked for a reasonable period of time)will persist for several frames,so if an object is momentarily occluded but then reappears,the foreground object tracker will reacquire it. An example experimental results on a test image sequence includes occlusion example between the person and a group of person is shown in?gure4.In?gure 4-a,a person and a group of person are tracked in the image sequence.The person under tracking enters to the group of person,another a small group contains two persons exits from the big group,and second person enters to the scene in the following frame sequence,as shown in?gure4-b.Final frame in?gure4-c,?rst and second persons occluded with the big group separately exit from the big group,and the small group is still in scene and tracked.The skeleton structures of the foreground regions are also illustrated in?gure4,respectively.

4Human Motion Analysis

For the human motion analysis,using the geometrical shape models has the ad-vantage that they have much information than directly obtained features.But the di?culty and cost of calculation in extracting the models from the input frames are disadvantage of using shape models for real time video surveillance applications.Those di?culties prevent researches from concentrating on cogni-tion part of motion analysis process.Consequently,an approach depends on the variations of the features produced from the silhouette motions in frames is pre-sented in this study.The features implemented for human motion analysis are

directly obtained from dynamic variations on the star skeleton structures of the silhouette shape.This basic idea behind of the human motion analysis presented is similar to the study in [3],and is an attempt to make motion analysis more robust for real time applications.

The star skeleton consist of the centroid of a motion blob and three local extremal points that are recovered when traversing the boundary (see section 2).The three local extremal points correspond to head,and two legs.A hu-man is moving in an upright position,it can be assumed that the uppermost skeleton segment represents the torso,and the lower segments determined by two extremal points represent two legs.Then the angle θmeasures between the upper-most extremal point and vertical,the angle αmeasures between the lower two extremal points,and the angle βalso measures moving variations in time between end locations of two extremal points in 2D space corresponding to the ankles,as shown in ?gure 5.a.(x c ,y c )is the centroid of the motion blob (silhouette of the object under

tracking).

Fig.5.(a)Determining of posture features from the skeleton,(b)Silhouette and skele-ton motion sequences of a walking and running person,respectively.

(a)

(b)

An approach to distinguish the walking and running actions was developed and tested on the di ?erent test sequences in our database.The most impor-tant features for distinguishing walking and running person can be produced by moving types of the foots in time,the characteristics on the foot cyclic and their speed variations [18].That features can be easily and simply obtained by ma-nipulating the star skeleton properties.Figure 5.b shows silhouette and skeleton motion sequences for walking and running person.Figure 6a-d plots the values θn ,αn ,an acceleration of the centroid of the silhouette,and βn over time.The actions characterized in ?gure 6in the sequence are walking before frames num-bered with 200and after that running,respectively.Examining the cyclic values of each angles shows that each angle has signi?cant meaning for distinguishing both actions (walking and running),but the more robust human behavior un-derstanding issue can be obtained by fusion the results of all that angles rather than implementing of the features alone.That is,the feature produced by the

θangle (represents posture of the person in action)can be manipulated to dis-tinguish the running person from that of the walking person.But not all people lean forward when they run.In other words,there may be no big di ?erenc-ing between on some people lean forward when they run and walk.Figure 6-b plots θangle variations in time on the skeleton motion sequence,some samples are shown in ?gure 5.b,for a walking and then a running people,respectively.There is no big enough signi?cant di ?erencing between two sequences for the posture of the person in action,however,the βangle,as shown in ?gure 6-d,has good enough signi?cant and also presented an encouraged feature to distinguish walking-running actions in the test sequences.The producing of the reliable skeleton from the silhouette is important task because the βangle is directly obtained from both end points correspondence to the location of the ankles in the silhouette.Otherwise,the reliability of the implementation depends on the βangle might be possible reduced.

When the variation in time on the αangle is considered,it is also giving one of good signi?cant feature for distinguishing both actions.The acceleration on the silhouette in time is also producing the other good enough signi?cant characteristics for analyzing both actions,as shown in ?gure 6-c.Consequently,to be able to produce more robust and reliable human motion analysis,a fusion task which takes from each feature characteristics then fuses all the features together into a fused motion analysis using a weighted averaging process might be better [9].This is one of the next studies for human motion analysis in the project presented.Implementation of the signals produced by each features of the skeleton is also another future work.

-0.5

0.5

1.5

2.5

33.5

100

150

200

250

300

A n g e [r a d ]

Time [Frames]

"angle_between_legs.txt"

-0.4

-0.3-0.2-0.1

00.10.2

0.30.40.5050

100

150

200250300

A n g e [r a d ]

Time[Frames]

"torso_angle.txt"

010

100

150

200

250

300

H p o c a t o n s o n y (r o w o f t h e m a g e ) d r e c t o n n t m e

Time [Frames]

"hip_position.txt"

-1.5

-1

-0.5

0.5

1.5

050100

150

200250300

A n g e [r a d ]

Time [Frames]

"angle_on_ankle.txt"

Fig.6.The varations data of the angles produced from star skeleton shape.(a)leg angle α,(b)torso angle θ,(c)acceleration on the centroid point,(d)ankle angle β.

(a)

(b)

(c)(d)

9 5Experimental Results and Discussion

A video surveillance database is established for our experimental results.The database mainly contains video sequences on di?erent days in outdoor and indoor environments.A digital camera(Sony DCR-TRV355E)?xed on a tripod and a CCD camera?xed on a pan-tilt motor platform are used to capture the video sequences.The algorithm for human motion detection,tracking and analysis pre-sented in this paper has been implemented in C++and runs under Windows2K operating system at96/133MByte/MHz RAM,850MHz Celeron PC without using any special hardware.Currently,for240x180resolution gray-scale image sequences,the algorithm code without optimizing runs at13-22fps depending on the number of people in its?eld of view.Tests were performed on several sequences(each at least30minutes or more)representative of situations which might be commonly encountered in surveillance video.

For object tracking(a person and a group of person),after the object is detected,the tracking algorithm calculates the bounding box,the centroid and correspondence of each object over the frames.The tracking algorithm success-fully handled occlusions between people.Entry of a group of people was detected as a single entry,however,as soon as one person separated from the group he was tracked separately as shown in?gure4.

For human motion analysis,an approach similar in[3]but more reliable by adding more important features was presented.Walking and Running actions in the surveillance scene were only considered to di?erentiate from each other. Four main parameters(θ,α,β,acceleration variations on the centroid of the motion blob)are basically implemented to analysis two actions.In addition,the speed of the bounding box surrounding of the silhouette detected could be more considered for analyzing.But the basic approach presented will be developed to extrapolate for the future studies to analysis the other possible people actions such as jumping,sitting,standing,lying,etc.For that reasons,the basic idea on the human motion analysis has been focused to the features of the silhouettes. The four parameters basically involve the features of the silhouette for two action (walking,running)analysis[18].Test results shown in?gures6a-d,encourage to implement this kind of parameters for human motion analysis for real time video surveillance applications.But the shadow is important problem for the silhouette based motion identi?cation and analysis because the structure shape of the silhouette may be?uctuated by the shadow types.Test results produced for human motion detection and analysis were obtained in the surveillance area without having a shadow.

6Acknowledgment

This research work is supported by Karadeniz Technical University Research Foundation grant to(KTU-2002.112.009.1).

References

1.R.T.Collins,A.J.Lipton,H.Fujiyoshi,T.Kanade.:Algorithms for Cooperative

Multi sensor Surveillance.Proc.of IEEE Vol.89.No.10,(2001).

2.I.Haritaoglu,D.Harwood,L.S.Davis.:W4:Real-Time Surveillance of People and

Their Activities.IEEE Trans.on PAMI,Vol.22,No.8,(2000).

3.H.Fujiyoshi,A.J.Lipton,T.Kanade.:Real-Time Human Motion Analysis by Image

Skeletonization.IEICE Trans.Inf.&SYST.,Vol.E87-D,No.1,pp.113-120,January 2004.

4.P.Perez,C.Hue,J.Vermaak,M.Gangnet,Color-Based Probabilistic Tracking,

Proc.of European Conference on Computer Vision,Copenhagen,27May-2June 2002,Denmark.

5.I.Haritaoglu,M.Flickner,Detection and Tracking of Shopping Groups in Stores,

Proceeding of the2001IEEE Computer Vision and Pattern Recognition,Vol.1, 8-14December,2001.

6.Porikli F.,Tuzel O.,Human Body Tracking by Adaptive Background Models and

Mean-Shift Analysis,IEEE Int.Workshop on Performance Evaluation of Tracking and Surveillance,March,2003.

7.Mubarak Shah,Understanding human behavior from motion imagery,Machine

Vision and Applications,Special Issue:Human modeling,analysis,and synthesis, Vol.14,Issue4,pp.210-214,September2003.

8. A.K.Jain,A.Ross,S.Prabhakar,An Introduction to Biometric Recognition,IEEE

Transactions on Circuit and Sytems for Video Technology,Special Issue on Image-and Video-Based Biometrics,Vol.14,No.1,pp.4-20,January2004.

9.M.Ekinci,F.W.Gibbs,B.T.Thomas.:Knowledge-Based Navigation for Au-

tonomous Road Vehicles.Turkish Journal of Electrical Engineering and Computer, Vol.8,No.1,2000.

10.M.Ekinci,E.Gedikli,A New Algorithm of Background Model Initialization and

Maintenance for Real-Time Video Surveillance Applications.To be Appear in Turk-ish Journal Electrical and Computer Science,June,2005.

11.M.Ekinci,E.Gedikli,Background Estimation Based People Detection and Track-

ing for Video Surveillance.Springer LNCS2869,ISCIS2003,Computer and Infor-mation Sciences,18th Int.Symp.,pp.421-429,Turkey,November,2003.

12.L.Wang,T.Tan,H.Ning,W.Hu,Silhouette Analysis-Based Gait Recognition

for Human Identi?cation,IEEE Transactions on Pattern Analysis and Machine Intelligence,Vol.25,No.12,December,2003.

13.K.Toyama,J.Krumn,B.Brumit,B.Meyers.:Wall?ower:Principles and Practice

of Background Maintenance.7th IEEE Inter.Conf.on Computer Vision,Nov., (1999).

14.Y.Ricqueburg and P.Bouthemy,The Recognition of Human Movements Using

Temporal Templates,IEEE Transactions on Pattern Analysis and Machine Intelli-gence,Vol.22,No.8,August2000.

15.L.Wang,W.Hu,T.Tan,Recent developments in human motion analysis,Pattern

Recognition,Vol.36,pp.585-601,2003.

16. B.Bhanu,J,Han,Individual Recognition by Kinematics-Based Gait Analysis,in

Proceeding of International Conference on Pattern Recognition,pp.343-346,2002.

17.J.K.Aggarwal,Q.Cai,Human Motion Analysis:A Review,Computer Vision and

Image Understanding,vol.73,n.3pp.428-440,March1999.

18.T.Mori,K.Tsujioka,T.Sato,Human-like Action Recognition System on Whole

Body Motion-captured File Proc.IEEE/RSJ Int.Conf.on Intelligent Robots and Systems,Mani,Hawai,USA,Oct.29-Nov.03,2001.

行人检测与跟踪国内外研究现状

行人检测与跟踪国内外研究现状 1.2行人检测与跟踪国内外研究现状视觉跟踪和目标检测是计算机视觉领域内较早开始的研究方向。经过几十年的积累，这两个方向已经取得了显著的发展。然而，很多方法只是在相对较好地程度上解决了一些关键问题。并且仍旧有不少一般性的关键问题未得到有效的解决。国内外很多研究机构都在致力于研究和发展这两个方向。近些年这两个方向持续发展，涌现了很多比较优秀的方法。国外的很多大学和研究机构（如卡内基梅隆大学、南加州大学和法国国家计算机科学与控制研究所等）都有计算机视觉小组，长期地研究视频跟踪和目标检测。国内的很多大学和研究所等（如清华大学、上海交大和自动化所等）也有相关的研究小组，并取得了一些优秀的研究成果。 1.2.1行人检测技术国内外研究现状中科院计算机科学重点实验室孙庆杰等人利用基于侧影的人体模型及其对应的概率模型,提出了一种基于矩形拟合的人体检测算法。中科院自动化所谭铁牛等对人运动进行视觉分析,其核心是利用计算机视觉技术从图像序列中检测、跟踪、识别人并对其行为进行理解与描述,它主要应用在视觉监控领域和基于步态的身份鉴定。步态识别就是根据人们走路的姿势进行身份鉴定,依据人体行走运动很大程度上依赖于轮廓随着时间的形状变化的直观想法,提出一种基于时空轮廓分析的步态识别算法；基于行走运动的关节角度变化包含着丰富的个体识别信息的思想,提出一种基于模型的步态识别算法。实验结果表明该算法不仅获得了令人鼓舞的识别性能,而且拥有相对较低的计算代价。但是该方法只能检测出运动的行人。西安交通大学郑南宁等研究了利用支持向量机识别行人的方法,通过稀疏Gabor滤波器提取行人样本图像中行人的特征,然后利用支持向量机来训练所提取的样本特征,并用训练得到的分类器通过遍历图像的方式将图像中可能属于行人的窗口提取出来。尽管用Gabor滤波器提取特征效果相对较好,但耗时很长,不适合于实时图像的处理。上海交通大学田广等提出了一种coarse-to-fine的行人检测方法,将一个人建模成人体自然部位的组装,人体的所有部位包括头肩、躯干和腿、采用绝对值类Haar特征集和Edgelet特征集,在这些特征集上,采用softcascade训练各个部位的检测器和全身检测器。首先采用全身检测器在整个图像中产生候选行人区域,然后用基于贝叶斯决策的组合算法进一步确定候选区域中的行人。实验结果表明该算法有很好的检测性能能在杂乱的自然场景中有效的检测行人。但该方法的识别率是78.3%,识别率不高，且该模型比较难构建,模型求解也比较复杂。目前,在国外许多文献中提出了基于机器视觉的行人检测方法,意大利帕尔玛大学的AlbertoBroggi教授在ARGO项目中采用一种基于外形的行人检测算法。算法首先根据行人相对于垂直轴有很强的垂直边缘对称性、尺寸和外貌比例等在

人体运动的检测和识别研究

模式识别中文核心期刊《微计算机信息》（测控自动＇ｆ：．１５）２００８年第２４卷第２－１期文章编号：１００８—０５７０（２００８）０２…１０２１００２人体运动的检测和识别研究ＳｔｕｄｙｏｎＤｅｔｅｃｔｉｏｎａｎｄＩｄｅｎｔｉｆｉｃａｔｉｏｎｏｆＨｕｍａｎＭｏｖｅｍｅｎｔ（北京榭吏大学）宋修雷王志良ＳＯＮＧＸＩＵＬＥＩＷＡＮＧＺＨＩＬＩＡＮＧ摘要：本文针对人体运动视觉分析中的行为理解和分析等高层视觉问题进行分析，研究了一种静止摄像机条件下的行为理解和分析的算法，它以运动序列中的关键帧为基础，针对关键帧提取人体的骨架信息，然后通过Ｈｕ不变矩来提取特征，最后组成特征向量，通过对ＨＭＭ模型的训练来识别特定运动序列的语义。关键词：运动识别：计算机视觉；Ｉ－ＥＶｌｌＭ中图分类号：ＴＰ３９１．４文献标识码：ＡＡｂｓｔｒａｃｔ：Ｔｈｉｓｐａｐｅｒｆｏｃｕｓｅｓｏｎｖｉｓｕａｌａｎａｌｙｓｉｓｏｆｈｕｍａｎｍｏｖｅｍｅｎｔａｎｄｔｈｅｕｎｄｅｒｓｔａｎｄｉｎｇｏｆｈｉｇｈｌｅｖｅｌｖｉｓｕａｌｐｒｏｂｌｅｍｓ．ＩｔｇｉｖｅｓａｎａｌｇｏｒｉｓｍｔｏｃｏｍｐｒｅｈｅｎｄａｎｄａｎａｌｙｓｉｓｐｅｏｐｌｅＳｍｏｖｅｍｅｎｔｉｆｔｈｅｃａｍｅｒａｉｓｎｏｔｍｏｖｅｄ．Ｗｅｐｒｅｓｅｎｔａｎａｌｇｏｒｉｔｈｍｂａｓｅｄｏｎｋｅｙｆｒａｍｅｓ，ｔｈｉｓｔｅｃｈｎｉｑｕｅｃａｎｂｅｍｏｒｅｅｆｆｅｃｔｉｖｅｉｎｔｈｅｏｒｉｇｉｎａｌｓｅｑｕｅｎｃｅｂｙｒｅｄｕｃｉｎｇｔｈｅｉｎｔｅｒｆｅｒｅｎｃｅｏｆｄｅｔｅｃｔｉｏｎａｎｄｉｄｅｎｔｉｆｉｃａｔｉｏｎ．Ｋｅｙｗｏｒｄｓ：ｍｏｖｅｍｅｎｔｉｄｅｎｔｉｆｉｃａｔｉｏｎ，ｃｏｍｐｕｔｅｒｖｉｓｉｏｎ，ＨＭＭ１引言计算机视觉是计算机科学和人工智能领域的一个重要分支．它研究的主要内容是怎样利用各种成像系统代替视觉器官作为信号输入手段，用计算机代替大脑完成对信息的处理和解释。计算机视觉的最终研究目标，就是使计算机能象人那样通过视觉观察和理解世界。运动物体的检测、跟踪和行为的理解与描述是计算机视觉领域的一个重要课题。也是计算机能否象人那样通过视觉观察和理解世界的关键所在。目前在运动物体检测领域，国内外有关这方面的研究很多，但是目前的许多方法都受到了一定条件的局限性。比如我们在使用背景差对运动目标进行检测时，发现了这种方法受光线亮度变化的影响很大，同时当背景中有物体移人或移出时．这种方法检测的效果正确性受到很大影响。对行为的理解与描述国内外的文献相对较少。可以说是一个比检测和跟踪更加困难的研究领域。针对这些问题，我们提出一种能够在简单背景下对人体行为进行理解和描述的方法。该方法将运动的检测、跟踪和行为的理解和描述联系到一起，使两者相辅相成。解决了两者分离情况下研究中的难点问题。２关键帧算法直接比较的人体的运动来识别运动的语义是不可能的，因为人体区域随着肢体的摆动而呈现非刚性的变化。这里我们对运动序列进行分解，原则是提取运动序列中的关键帧进行分析。关键帧的定义是在运动方向发生变化的时刻对应的图像帧。从运动中来看，运动方向发生变化的时刻，必定是序列图像在水平或者是垂直方向的投影在时间轴上出现了极值。所以我们就可宋修雷：硕士研究生基金项目：本论文得到国家自然科学基金（Ｎ０．６０５７３０５９）、北京市“现代信息科学与网络技术”重点实验室基金（Ｎｏ．ＴＤＸＸ０５０３）和北京科技大学重点基金的支持以根据这条规则来从运动序列中提取关键帧。一个完整的具有具体语义的运动序列可以由一个相对应的关键帧序列来表示。通过关键帧的方法完成了对运动序列的第一次特征提取。图１是对关键帧提取的一个示例。图２是一个行走运动序列的关键帧表示。图１关键帧的提取彳彳ｋｋ图２行走运动序列的关键帧表示３特征提取３．１骨架算法首先要从待处理的序列图像中抽取出目标人体的轮廓，获一２１０—３６０元，年邮局订阅号：８２－９４６　万方数据

基于视频的运动人体检测技术研究

基于视频的运动人体检测技术研究摘要在视频监控领域中，快速准确地检测出运动人体，是后续进行运动分析的初级处理。本文将单摄像头拍摄的视频流，首先转换成静止的图像帧，通过MATLAB利用中值法进行背景模型的重建，背景减除来进行运动目标检测，并通过图像后处理技术，将运动人体快速准确地检测出来。通过对公共视频数据库及自拍视频的检测实验，均得到了较理想的运动人体图像，证明了该技术的可行性。关键词视频序列；运动人体；MATLAB；目标检测前言当今社会，智能视频监控已分布到各行各业，是安全防范系统的重要组成部分。运动人体的检测是视频监控系统进行运动分析的最底层，是后续进行各种高级处理如目标分类、行为理解等的基础。本文以视频监控系统的应用为目的，将单摄像头拍摄的彩色视频流，首先转换成静止的图像帧，通过MATLAB利用中值法进行背景模型的重建，背景减除来进行运动目标检测，并通过图像后处理技术，检测到了较理想的运动人体图像。 1 运动目标检测背景减除是当前最简单也是最常用的一种检测方法。该方法通过将当前图像帧与背景图像的灰度值直接进行相减操作，并将得到的差值与某一阈值T进行比较，大于阈值T的即被认定为是目标点，赋值为1；反之，认定为是背景点，赋值为0，进而检测出运动人体目标。 1.1 中值法背景建模根据视频序列的特点，在时间序列上，运动人体经过视频图像上某一位置的时间是非常短暂的，大部分时间在该位置上显示的都是背景图像，因此本文利用中值法[1-2]来进行背景模型的重新建立，该方法能够利用图像序列中的一部分图像重新构造出精确的背景图像。其思想就是将图像序列中的部分图像按照其中像素的顺序进行排列，然后选出中间的像素值以此作为背景图像中对应位置处的像素值，遍历图像序列中所有的像素，即可以获得精确的背景图像。 1.2 差分及二值化采用前述的背景减除法对图像进行差分操作，得到的差分结果为灰度图像，而在后续的处理过程中，用到更多的是二值化图像。将灰度图像进行二值化的常用方法是阈值分割法，其中阈值的选取至关重要。根据阈值选取的不同一般分为局部阈值算法和全局阈值算法。全局阈值算法

人体目标检测与跟踪算法研究

人体目标检测与跟踪算法研究摘要：近些年以来，基于视频中人体目标的检测与跟踪技术研究越来越被重视。然而，由于受到目标自身特征多样性和目标所处环境的复杂性和不确定性的影响，现存算法的性能受到很大的限制。本文对目前所存在的问题进行了分析，并提出了三帧差分法和改进阈值分割法相结合的运动目标检测算法和多特征融合的改进运动目标跟踪算法。这两种算法不仅可以准确有效的检测出运动目标而且能够满足实时性的要求，有效的解决了因光照变化和目标遮挡等情况造成的运动目标跟踪准确度下降或跟踪目标丢失等问题。关键词：三帧差分，Camshift，阈值分割 Research Based on Human Target Detectionand Tracking Algorithm Abstract: In recent years, human object detection and tracking become more and more important. However the complexity, uncertainty environment and the target’s own diversity limit the performance of existing algorithms. The main works of this paper is to study and analysis the main algorithm of the human object detection and tracking, and proposes a new moving target detection method based on three-frame difference method and threshold segmentation and improved Camshift tracking algorithm based on multi-feature fusion. These algorithm can satisfy the real-time, while accurately and efficiently detect moving targets, and also effectively solves the problem of tracking object lost or misplaced under illumination change or target occlusion. Keywords: three-frame difference, Camshift, threshold segmentation 一、绪论（一）选题的背景和意义人类和动物主要通过眼睛来感受和认知外部世界。人类通过视觉所获取的信息占了60%[1]，因此，在开发和完善人工智能的过程中，赋予机器视觉的功能这一操作极不可缺少。完善上述功能需要以许多技术为基础，特别是运动目标的检测与跟踪技术。近些年以来，此技术受到了越来越多的关注[2]。目前，此技术也在各领域得到了充分的应用，涵盖的领域有智能交通、导航、智能视频监控、精确制导、人机交互和多媒体视频编码压缩技术等。

人体行为识别技术

成误报)。另外一种“智能”是指系统能够监视一定场所中人的活动，并对其行为进行分析和识别，跟踪可疑行为(如经常在重要地点徘徊等等行为)从而采取相应的报警措施。通常把报警系统设置于银行、机场、车站、码头、超市、办公大楼、住宅小区等地，以实现对这些场所的智能监控。 ②虚拟现实跟踪现实世界人的姿态，从而创建一个虚拟的仿真场景，实现人与这个虚拟世界的交互。该领域的具体应用涉及视频游戏、虚拟摄影棚、计算机动画等方面。 ③高级用户接口指可以通过对用户手势的识别来代替传统的鼠标和键盘输入，从而实现人与计算机之间的智能交互。此外，通过对手势语言的理解，还可以进行聋人与计算机之间的手语交流。 ④运动分析人体运动分析可以运用于基于容的视频检索领域。例如可以检索在运动会上单杠比赛中运动员的杠上动作。这样可以节省用户大量的查询视频资料的时间和精力。另外一种应用是用于各种体育项目中，提取运动员的各项技术参数(如关节位置、角度和角速度，等等)，通过分析这些信息，可以为运动员的训练提供指导和建议，有助于提高运动员的训练水平。此外，还可以用于体育舞蹈动作的分析，以及临床矫形术的研究等领域。 ⑤基于模型的视频编码通过提取一定的静态场景中人物的形态特征参数和3D姿态参数，以较低的数据量对视频数据流加以描述，实现视频数据的压缩和低比特率传送。可以用于在因特网上展开远程视频会议以及VOD（Video-On-Demand）视频点播。

机器视觉的辅助驾驶系统的视频中行人检测跟踪

机器视觉的辅助驾驶系统的视频中行人实时检测识别研究文献综述 1机器视觉发展国外机器视觉发展的起点难以准确考证，其大致的发展历程是：20世纪50年代提出机器视觉概念，20世纪70年代真正开始发展，20世纪80年代进入发展正轨，20世纪90年代发展趋于成熟，20世纪90年代后高速发展。在机器视觉发展的历程中，有3个明显的标志点，一是机器视觉最先的应用来自“机器人”的研制，也就是说，机器视觉首先是在机器人的研究中发展起来的；二是20世纪70年代CCD图像传感器的出现，CCD摄像机替代硅靶摄像是机器视觉发展历程中的一个重要转折点；三是20世纪80年代CPU、DSP等图像处理硬件技术的飞速进步，为机器视觉飞速发展提供了基础条件。国内机器视觉发展的大致历程：真正开始起步是20世纪80年代，20世纪90年代进入发展期，加速发展则是近几年的事情。中国正在成为世界机器视觉发展最活跃的地区之一，其中最主要的原因是中国已经成为全球的加工中心，许许多多先进生产线己经或正在迁移至中国，伴随这些先进生产线的迁移，许多具有国际先进水平的机器视觉系统也进入中国。对这些机器视觉系统的维护和提升而产生的市场需求也将国际机器视觉企业吸引而至，国内的机器视觉企业在与国际机器视觉企业的学习与竞争中不断成长。未来机器视觉的发展将呈现下列趋势: （1）技术方面的趋势是数字化、实时化、智能化图像采集与传输的数字化是机器视觉在技术方面发展的必然趋势。更多的数字摄像机，更宽的图像数据传输带宽，更高的图像处理速度，以及更先进的图像处理算法将会推出，将会得到更广泛的应用。这样的技术发展趋势将使机器视觉系统向着实时性更好和智能程度更高的方向不断发展。（2）产品方面：智能摄像机将会占据市场主要地位智能摄像机具有体积小、价格低、使用安装方便、用户二次开发周期短的优点，非常适合生产线安装使用，越来越受到用户的青睐，智能摄像机所采用的许多部件与技术都来自IT行业，其价格会不断降低，逐渐会为最终用户所接受。因此，

行人检测与目标跟踪算法研究

基于opencv中光流法的运动行人目标跟踪与检测一、课题研究背景及方法行人检测具有极其广泛的应用：智能辅助驾驶，智能监控，行人分析以及智能机器人等领域。从2005年以来行人检测进入了一个快速的发展阶段，但是也存在很多问题还有待解决，个人觉得主要还是在性能和速度方面还不能达到一个权衡。早期以静态图像处理中的分割、边缘提取、运动检测等方法为主。例如（1）以Gavrila为代表的全局模板方法：基于轮廓的分层匹配算法，构造了将近2500个轮廓模板对行人进行匹配, 从而识别出行人。为了解决模板数量众多而引起的速度下降问题，采用了由粗到细的分层搜索策略以加快搜索速度。另外，匹配的时候通过计算模板与待检测窗口的距离变换来度量两者之间的相似性。（2）以Broggi为代表的局部模板方法：利用不同大小的二值图像模板来对人头和肩部进行建模，通过将输入图像的边缘图像与该二值模板进行比较从而识别行人，该方法被用到意大利Parma大学开发的ARGO智能车中。（3）以Lipton为代表的光流检测方法：计算运动区域内的残余光流；（4）以Heisele为代表的运动检测方法：提取行人腿部运动特征；（5）以Wohler为代表的神经网络方法：构建一个自适应时间延迟神经网络来判断是否是人体的运动图片序列；以上方法，存在速度慢、检测率低、误报率高的特点。 2、行人检测的研究现状

（1）基于背景建模的方法：分割出前景，提取其中的运动目标，然后进一步提取特征，分类判别；在存在下雨、下雪、刮风、树叶晃动、灯光忽明忽暗等场合，该方法的鲁棒性不高，抗干扰能力较差。且背景建模方法的模型过于复杂，对参数较为敏感。（2）基于统计学习的方法：根据大量训练样本构建行人检测分类器。提取的特征一般有目标的灰度、边缘、纹理、形状、梯度直方图等信息，分类器包括神经网络、SVM，adaboost等。该方法存在以下难点：（a）行人的姿态、服饰各不相同；（b）提取的特征在特征空间中的分布不够紧凑；（c）分类器的性能受训练样本的影响较大；（d）离线训练时的负样本无法涵盖所有真实应用场景的情况；尽管基于统计学习的行人检测方法存在着诸多的缺点，但依然有很多人将注意力集中于此。行人检测国外研究情况：法国研究人员Dalal在2005的CVPR发表的HOG+SVM的行人检测算法(Histograms of Oriented Gradients for Human Detection, Navneet Dalel,Bill Triggs, CVPR2005)。 Dollar 在 2010 年 BMVC 的《The fastest pedestrian detector in the west》一文中提出了一种新的思想，这种思想只需要训练一个标准 model,检测N/K（K ≈10）然后其余的 N-N/K 种大小的图片的特征不需要再进行这种复杂的计算，而是跟据这 N/K 次的结果，由另外一种简单的算法给估计出来，这种思想实现的基础是大小相近的图像的特征可以被足够精确的估计出来；同年，德国

运动人体识别技术

二、运动人体识别技术 1.概念运动人体识别技术是一种以图像处理，模式识别，计算机视觉等技术为基础，为运动人体进行识别处理的一项技术。其中图像处理（影像处理）是用计算机对图像进行分析，以达到所需结果的技术；模式识别是通过计算机用数学技术方法来研究模式的自动处理和判读，其中环境与客体统称为“模式”；计算机视觉技术是一门研究如何使机器看的学科，简单的说，就是用摄影机和电脑代替人眼对目标进行识别、跟踪和测量等机器视觉，并进一步作图像处理，使电脑处理成更适合人眼观察或传送仪器检测的图像。 2.运动人体识别的研究进展与现状运动人体识别的研究主要包括图像处理、多传感技术、虚拟现实、模式识别、计算机视觉和图形学、计算机辅助技术、可视化技术以及智能机器人等。针对人体运动图像系列进行分析处理的运动人体视觉分析技术，一般情况下可分为：运动目标检测，运动目标特征提取以及识别复杂背景下的运动目标身份。其主要的研究方法为结构化分量和动态分量。其算法又分为基于统计的方法和基于模型的方法。现状是运动人体科学由宏观向微观理论研究深入发展，与运动人体科学相关的一些学科，快速成长为深入研究性学科，运动人体科学的竞技体育和体育保健。研究方向为：智能安全监控、人机接口、视频会议等方面，这些方面具有广泛的应用前景和巨大的潜在经济价值。 3.运动人体识别算法概述（1）运动人体识别过程一般分为目标检测和处理、特征提取和分析、模式分类和识别。（2）主要方法：目标检测和处理的特点为检测出原始图像中的运动目标，在一副图像中，局部目标的表象和形状能够被梯度或者边缘的方向密度分布很好的描述；特征提取和分析的特点为通过映射和变换的方法可以将高维空间中的特征描述用低维空间的特征来描述；模式分类和识别的特点是通过计算机用数学技术方法来研究模式的自动处理和判读，其中环境与客体统称为“模式” （3）特征提取模式识别的方法在很多实际问题中，往往不容易找到那些最重要的特征，或者因为外界干扰不能提取出自己想要的特征信息。因此在测量时，我们总希望能够获取更多的信息来加以判断。除此之外，我们还能够用数据，比值，梯形图等等的一些展现方法来突出自己想要的特征信息。为了设计出效果好的分类器，通常需要对原始的测量值集合进行分析，经过选择和变换处理，组成有效的识别特征。 4.未来的发展趋势以及存在的问题人体识别技术的发展趋势是：面对着全球化、信息化越来越提倡的社会背景下，识别技术会越来越广，深度也会不断加深。其虽然有着自己独特的优势，但并不是没有缺点，举个例子，在指纹识别上，面临指纹膜冒充指纹蒙混过关的问题；人面识别也许简单易个容就能过关；虹膜识别技术对黑眼睛存在识别难得问题；对于盲人和眼睛有疾病的患者实在是无能为力；声音、笔记也并不难以模仿；静脉识别也存在着易受温度干扰影响识别率的问题。

基于单目视频运动跟踪的三维人体动画

第29卷第5期2008年5月微　计　算　机　应　用 M I CROCOMP UTER APP L I CATI O NS Vol129No15 M ay12008基于单目视频运动跟踪的三维人体动画3 吴　玥田兴彦 (湖南大学软件学院　长沙　410082) 摘要:针对传统人体动画制作成本高、人体运动受捕获设备限制等缺陷,提出了一种基于单目视频运动跟踪的三维人体动画方法。首先给出了系统实现框架,然后采用比例正交投影模型及人体骨架模型来恢复关节的三维坐标,关节的旋转欧拉角由逆运动学计算得到,最后采用H-ani m标准对人体建模,由关节欧拉角驱动虚拟人产生三维人体动画。实验结果表明,该系统能够对人体运动进行准确的跟踪和三维重建,可应用于人体动画制作领域。关键词:运动跟踪　单目视频　三维人体动画　逆运动学 3D Hu man An ima ti on Ba sed on Hu man M oti on Track i n g from M onocul ar V i deo Sequences WU Yue,TI A N Xing yan (Soft w are School,Hunan University,Changsha,410082,China) Abstract:The traditi onal app r oaches t o make hu man ani m ati on are suffering fr om the p r oble m s of expensive cost and hu man body mo2 ti on li m ited by moti on cap ture equi pment,t o overcome these shortcom ings,a ne w3D hu man ani m ati on based on hu man moti on tracking fr om monocular video sequences is p r oposed1Firstly,the fra me work of our syste m is p resented,then,the3D coordinates of j oints are esti m ated by hu man skelet on constrain under scaled orthographic p r ojecti on,the r otati on Euler angles of j oints are calculated by inverse kine matics1Finally,we use H-ani m t o rep resent virtual hu man,the moti on of virtual hu man is dr ove by the Euler angles of j oints1 The experi m ents show that tracking and reconstructing results made by our syste m are accurate and effective1This syste m can be ap2 p lied t o hu man ani m ati on1 Keywords:Moti on tracking,Monocular video sequences,3D hu man ani m ati on,I nverse kine matics 1　引言如何方便地生成高逼真度的三维人体动画已成为当前计算机动画的一个重要研究方向。按照运动建模方式不同三维人体动画可以分为以下四类:关键帧方法、基于运动学和逆运动学、基于动力学和逆动力学、运动捕获方法。基于运动捕获方法的人体动画具有逼真度高、数据可重用等特点,在动画技术中得到广泛应用。在商业产品中一般使用硬件设备(如V icon)来捕获人体运动,要求运动员身穿紧身衣并在关节位置粘贴反光小球或反光片,这样限制了运动员的运动;另一方面硬件设备比较昂贵,制作成本较高。在前人研究的基础上,本文提出了一种基于单目视频运动跟踪的三维人体动画方法,具有使用方便、制作成本低廉、动画效果较好等特点。基于视频运动跟踪的运动捕获方法按照其采用摄像机数目多少可以分为两类:①基于单目视频的方　本文于2008-1-15收到。　3本文受基于重建人脸面部器官三维模型的关键技术研究基金项目(06JJ2065)资助。

运动人体图像识别

学习报告一．意义和背景随着信息技术的快速发展壮大和应用的普及,利用计算机视觉的技术在图像处理方面和模式识别领域中研究,并对视频图像进行人体运动特征提取与有效识别已成为人们关注的热点问题。计算机视觉技术对人体运动的视频或者图像进行识别是基于对其视频或者图像的序列进行分析处理；对检测出的人体运动目标进行运动特征提取和分类识别,从而达到理解和描述其行为的目的。基于视频图像的人体运动特征分析在智能视频监控、智能接口、虚拟现实等领域有着相当广阔的应用前景。人体运动特征的提取与识别需要结合生物识别技术来识别和判断运动中人的行为、区别个体身份。所谓生物识别技术,其具体操作就是利用人体与生俱来的生物特征进行个体身份认证,最显著的特点是具有不变性和唯一性。人体运动特征包括：肢体摆动特征,步态特征,人体轮廓投影特征,人体对称特征等,其中从视觉监控的角度来看,步态特征是远距离场景条件下最具有代表性最典型的人体运动特征,近年来备受关注,同时也涌现出大量富有意义的步态识别算法。二．人体运动特征识别研究运动特征识别在当今的科研领域中涉及面广泛，主要涉及到图像处理，多传感器技术，虚拟现实，模式识别，计算机视觉和图形学，

计算机辅助设计，可视化技术，智能机器人等一系列研究领域。针对人体运动图像序列进行分析处理的运动人体视觉分析技术，一般情况下可分为以下几个过程，运动目标检测，运动目标特征提取以及识别复杂背景下的运动目标身份。图1 典型的运动特征识别系统运动特征识别的主要研究方法目前运动特征识别中的运动特征包含了两种分量：结构化分量和动态分量。其中结构化分量也就是静态分量，它负责记录运动人体的身高，步幅等身体形状信息；而动态分量则形象地表征出了在运动过程中人体的胳膊摆动，肢体倾斜度，迈腿方式等运动特征，依据上述两种类型分量，现有的运动特征识别算法大致分为两类：基于统计的方法和基于模型的方法。

行人检测和行人跟踪

行人检测和行人跟踪行人检测方法 1概述基于计算机视觉的行人检测由于其在车辆辅助驾驶系统中的重要应用价值成为当前计算机视觉和智能车辆领域最为活跃的研究课题之一. 其核心是利用安装在运动车辆上的摄像机检测行人，从而估计出潜在的危险以便采取策略保护行人。基于视觉的行人检测系统一般包括两个模块：感兴趣区分割和目标识别[1] 行人检测除了具有一般人体检测具有的服饰变化、姿态变化等难点外，由于其特定的应用领域还具有以下难点：摄像机是运动的，这样广泛应用于智能监控领域中检测动态目标的方法便不能直接使用；行人检测面临的是一个开放的环境，要考虑不同的路况、天气和光线变化，对算法的鲁棒性提出了很高的要求；实时性是系统必须满足的要求，这就要求采用的图像处理算法不能太复杂. 根据分割所用的信息，可将ROIs 分割的方法分为基于运动、基于距离、基于图像特征和基于摄像机参数四种方法。基于运动的方法通过检测场景中的运动区域来得到ROIs。基于距离的方法通过测量目标到汽车的距离来得到ROIs . 可以用来测距的传感器主要包括雷达和立体视觉。基于图像特征的方法指通过检测与行人相关的图像特征从而得到ROIs 。对于可见光图像来说，常用的特征包括竖直边缘、局部区域的熵和纹理等. 对于红外图像来说，主要根据人体尤其是人脸的温度比周围环境温度较高这一特征，通过检测一些“热点”(Hot spot) 来得到ROIs。摄像机的安装位置和摄像机参数也是一个很重要的考虑因素. 它对行人在图像上出现的位置和每个位置上目标的大小给出了很多限制, 合理利用这些限制可以大大地缩小搜索空间。根据利用的信息的不同，目标识别可以分为基于运动的识别和基于形状的识别两种方法。基于运动的识别方法指通过分析人运动时的步态(Gait) 特征来识别行人. 人体的步态具有特定的周期性，通过分析图像序列的周期性, 然后与行人步态的周期性的模式相比较, 就可以识别出行人。基于形状的识别方法指通过分析目标的灰度、边缘和纹理信息来对目标进行识别。基于形状的方法包括：基于明确人体模型的方法，基于模板匹配的方法，基

人体检测与追踪

Human Motion Detection,Tracking and Analysis For Automated Surveillance Eyup Gedikli,Murat Ekinci Computer Vision Lab. Department of Computer Engineering Karadeniz Technical University Trabzon61080,Turkiye, gediklie,ekinci@https://www.doczj.com/doc/b114758015.html,.tr Abstract.This paper presents a silhouette based human motion de- tection and analysis for real-time visual surveillance system.In order to detect foreground objects,?rst,background scene model is statistically learned even the background is not completely stationary.A background maintenance model is also proposed for preventing some kind of falsies. Then,the candidate foreground regions are detected using thresholding, noise cleaning and their boundaries extracted using morphological?lters. For human motion detection,object detection and classi?cation approach for distinguishing a person,a group of person from detected foreground objects(e.g.,cars)using silhouette shape and periodic motion cues is performed.Finally,the trajectory of the people in motion and several motion parameters produced from the cyclic motion of silhouette of the object under tracking are implemented for analyzing people activities such as walking and running,in the video sequences.Experimental re- sults on the di?erent test image sequences demonstrate that the proposed algorithm has an encouraging real-time background modeling based hu- man motion detection and analysis performance with relatively robust and low computational cost. 1Introduction Visual analysis of human motion has recently persuaded more studies in the computer vision area.It attempts to detect,track,and identify people,and more generally,to understand human behaviors from image sequences involving humans[1][2],[12].Some companies such as IBM,Microsoft,and Mitsubishi are also investing on research on human motion analysis[5],[4],[6].One of the important research areas is automatically human behavior understanding[7]. Biometrics is also a technology that makes use of the physiological or behavioral characteristics to authenticate the identi?es of people[8]. Moving human detection is the?rst step processes for nearly every system of vision-based human analysis[17].The aim on moving human detection is to segment the regions corresponding to people from the rest of an image sequence. It is known to be a signi?cant and di?cult research problem[13].Background

一种运动人体行为识别的改进方法

一种运动人体行为识别的改进方法摘要：隐马尔可夫模型主要用于根据系统外部观测量来预测该事件的未知序列，本文将它引入运动人体行为识别中。在获得了运动目标的整体轮廓以后，论述了怎样对规则行为建模，并结合隐马尔可夫原理识别出运动人体行为。针对人体轮廓单通道图像对比的局限，提出了一种轮廓对比和质心跟踪相结合的方法改进了算法，通过实验证明该改进的算法具有较好的性能。关键词：行为模型隐马尔可夫模型质心轮廓中图法分类号:TP3文献分类标识码:C4 An Improved Method for Human Motion Recognition Abstract:The Hidden Markov Models is widely used in forecasting the unknown sequence based on observation on outside system. In this paper, it is applied in human motion recognition. With the human’s silhouettes, the paper deals with how to get the models of regular actions and combine with HMM to recognize the motions of mobile human. As for the localization on gray images of silhouettes, silhouettes contrasting and center of mass tracking algorithm is put forward to solve this problem. The results show that the new algorithm has better performance. Key words:Motion models, HMM, Centroid, Silhouette 1引言目前，“人的观察”（looking at people）向“理解人”（understanding people）转变是计算机视觉领域中最活跃的研究主题之一，其核心是利用计算机视觉技术从图像序列中检测、跟踪、识别人并对其行为进行理解与描述，其重要目标是摆脱传统的人机交互方式（如:键盘、鼠标等设备信息输入），让计算机系统具备自动分析，获取外部信息的能力，并通过分析做出相应的响应，让计算机系统更加智能化和人性化。动态现场视觉监控是计算机视觉领域一个新兴的应用方向。视觉监控区别于传统意义上的监控系统在于其智能性[1]。简单而言，不仅用摄像机代替人眼，而且用计算机代替人、协助人，来完成监视或控制任务，从而减轻人的负担.视觉监控具有广泛的应用前景和潜在的经济价值。运动目标行为识别作为整个视觉监控的重点和难点，国内外提出了许多方法，比如区域分割[2]，关节点法等，但实现起来相当复杂。针对一些常见的非规则行为（比如跳，爬行，打斗，偷窥等）本文采用了一种隐马尔可夫模型和质心理论相结合的方法，简单的运用自然语言理解，能够比较高效的识别出这些非规则行为，并且具有比较大的可扩展空间。

人体行为识别技术讲解学习

人体行为识别技术

人体行为识别技术在计算机视觉领域中，人体运动行为识别是一个被广泛关注的热点问题，在智能监控、机器人、人机交互、虚拟现实，智能家居，智能安防，运动员辅助训练等方面有巨大应用价值。行为识别问题一般遵从如下基本过程：数据图像预处理，运动人体检测、运动特征提取、特征训练与分类、行为识别。着重从这几方面逐一回顾了近年来人体行为识别的发展现状和常有方法。并对当前该研究方向上待解决的问题和未来趋势做了分析。行为理解可以简单地认为是时变数据的分类问题，即将测试序列与预先标定的代表典型行为的参考序列进行匹配。通过对大量行为理解研究文献的整理发现：人行为理解研究一般遵从特征提取与运动表征、行为识别、高层行为与场景理解等几个基本过程。特征提取与运动表征是在对目标检测、分类和跟踪等底层和中层处理的基础上，从目标的运动信息中提取目标图像特征并用来表征目标运动状态；行为识别则是将输入序列中提取的运动特征与参考序列进行匹配，判断当前的动作处于哪种行为模型；高层行为与场景理解是结合行为发生的场景信息和相关领域知识，识别复杂行为，实现对事件和场景的理解。【2】 1、行为识别的应用从应用领域的分类来讲，可以将人体运动分析的应用分成如下几个领域： ①智能监控这里所指的“智能”包含两个方面的含义。一种“智能”是指系统能够在一定的场景中检测是否有人的出现(如通过检测人脸的方法)防止只是简单的通过运动目标检测所造成的错误报警(例如因为动物活动或者刮风摇动树枝等等而造成误报)。另外一种“智能”是指系统能够监视一定场所中人的活动，并对其行为进行分析和识别，跟踪可疑行为(如经常在重要地点徘徊等等行为)从而采取相应的报警措施。通常把报警系统设置于银行、机场、车站、码头、超市、办公大楼、住宅小区等地，以实现对这些场所的智能监控。 ②虚拟现实跟踪现实世界人的姿态，从而创建一个虚拟的仿真场景，实现人与这个虚拟世界的交互。该领域的具体应用涉及视频游戏、虚拟摄影棚、计算机动画等方面。 ③高级用户接口指可以通过对用户手势的识别来代替传统的鼠标和键盘输入，从而实现人与计算机之间的智能交互。此外，通过对手势语言的理解，还可以进行聋人与计算机之间的手语交流。 ④运动分析人体运动分析可以运用于基于内容的视频检索领域。例如可以检索在运动会上单杠比赛中运动员的杠上动作。这样可以节省用户大量的查询视频资料的