CS Technical Report #395, April 8, 2003
Institute of Scientific Computing, Computer Science Department, ETH Zurich, Switzerland

Volumetric Collision Detection for Deformable Objects

Bruno Heidelberger, Matthias Teschner, Markus Gross
Computer Graphics Laboratory, ETH Zurich
e-mail: {heidelberger, teschner, grossm}@inf.ethz.ch
http://graphics.ethz.ch

Figure 1: This sequence shows two deforming objects with 150k and 5k faces. An explicit representation of the red intersection volume can be computed at an interactive frame rate.

Abstract

We present a new algorithm for the efficient detection of collisions of geometrically complex objects. Our approach requires neither expensive setup nor sophisticated spatial data structures and is hence specifically suitable for handling deformable objects with arbitrarily shaped, closed surfaces. The algorithm is based on a Layered Depth Image (LDI) decomposition of the intersection volume.
Currently, we have implemented two types of collision queries. The first one comprises an explicit representation of the intersection volume. The second one computes vertex-in-volume tests. All queries are processed on the LDI-based volume representation. Our algorithm is very easy to implement and can be accelerated in graphics hardware by using OpenGL.

Keywords: Collision Detection, Computational Geometry, Layered Depth Image, Shape Approximation, Deformable Modeling, Physically-based Modeling, Graphics Hardware

1 Introduction

The detection of collisions of geometric objects constitutes a fundamental problem in computer graphics, computational geometry, modeling, robotics, and many other areas. Most research is devoted to the design of algorithms for the efficient computation of queries, such as intersecting surfaces or penetration depth. For complex objects, such queries are computationally expensive, because they require the testing of many individual graphics primitives. Hence, efficient collision detection algorithms are accelerated by spatial data structures including bounding-box hierarchies, distance fields, or other ways of spatial partitioning. Such object representations are built in a pre-processing stage and, once they are created, perform very well for rigid objects.

Modern physics-based simulation, as for instance used in game engines or computational surgery, demands the detection of collisions of objects deforming over time. In the case of deformable objects, the aforementioned acceleration structures have to be updated to adapt to changes of the geometry. While some of these algorithms have been successfully employed in such cases, the acceleration structures constitute an additional overhead in memory consumption and computational efficiency.

We propose a simple and efficient algorithm based on Layered Depth Images (LDI) [Shade et al. 1998] as the fundamental representation to accelerate collision detection of rigid and deformable objects. While most existing approaches
consider object surfaces, our method is inherently volumetric. Rather than explicitly detecting surface intersections, the LDI provides a discrete representation of the intersection volume and allows for volume-based collision queries as shown in Fig. 1. The generation of LDIs is extremely efficient and can be accelerated by graphics hardware. Various optimizations minimize buffer read-backs.

Our algorithm proceeds in three stages. First, we calculate the intersection of axis-aligned bounding boxes of pairs of objects. If an intersection is non-empty, a second stage computes an LDI representation of each object within the bounding-box intersection. Finally, we obtain the overall intersection volume by simple Boolean operations on the LDI representation.

The simplicity of the algorithm, the absence of expensive setup, and the ease of implementation in OpenGL make it especially attractive for real-time simulation and computer games.

1.1 Contributions

The contributions of this work can be summarized as follows:

• We present a new algorithm for the computation of volumetric penetrations of geometrically complex, deformable objects in real time. Our method handles arbitrarily shaped, closed surfaces of manifold geometry.

• We currently process two different types of collision queries: The first one includes the explicit computation of the intersection volume of two objects. The second one performs vertex-in-volume tests. Our approach is also robust in cases where one object lies completely inside another object.

• Unlike most existing approaches, our method does not require an involved setup or the implementation of a sophisticated spatial data structure. It is memory efficient, computationally economical, and performs on deformable models of up to hundreds of thousands of primitives.

1.2 Background and Related Work

Due to the importance of the topic, the interference detection of geometrically complex objects has been investigated extensively in various areas.

Collision algorithms. In graphics, efficient
collision detection is an essential component in physically-based simulation [Baraff 2001], [Debunne et al. 2001], including cloth [Bridson et al. 2002]. Further applications can be found in computer animation, medical simulations, and games.

Collision detection algorithms based on bounding-volume (BV) hierarchies have proven to be very efficient, and many BVs have been investigated. Among the acceleration structures we find spheres [Hubbard 1993], [Quinlan 1994], axis-aligned bounding boxes [Hughes et al. 1996], [van den Bergen 1997], oriented bounding boxes [Gottschalk et al. 1996], and discrete-oriented polytopes [Klosowski et al. 1996]. Initially, BV approaches were designed for rigid objects. In this context, the hierarchy is computed in a pre-processing stage. In the case of deforming objects, however, this hierarchy must be updated at run time. While effort has been spent to optimize BV updates for deformable objects [Larsson and Akenine-Moeller 2001], they still pose a substantial computational burden and storage overhead for complex objects. As an additional limitation for many applications, BV approaches typically detect intersecting surfaces, whereas the computation of the intersection volume requires an additional processing step. As an alternative to object partitioning, other approaches employ discretized distance fields as volumetric object representations for collision detection. The presented results suggest, however, that this data structure is less suitable for real-time handling of geometrically complex objects [Hirota et al. 2000].

Recently, various approaches have been introduced that employ graphics hardware for collision detection. In [Baciu et al. 1999], a multi-pass rendering method is proposed for collision detection. However, this algorithm is restricted to convex objects. In [Lombardo et al. 1999], the interaction of a cylindrical tool with deformable tissue is accelerated by graphics hardware. The method is restricted to collisions of simplified surgical tools with object surfaces. [Hoff III et al. 2001] proposes a multi-pass rendering approach for collision detection of 2-D objects, while [Kim et al. 2002] and [Kim et al. 2003] perform closest-point queries using bounding-volume hierarchies along with a multipass-rendering
approach. The aforementioned approaches decompose the objects into convex polytopes.

Layered Depth Images (LDI). LDIs have been introduced as an efficient image-based rendering technique. The LDI data structure essentially stores multiple depth values per pixel. Thus, an LDI can be used to approximate the volume of an object. A method for generating LDIs is presented, for instance, in the depth-peeling approach to order-independent transparency [Everitt 2001]. This implementation, however, demands a complex setup including two active depth buffers and texture shaders. In contrast, our approach to LDI generation is much simpler, employing a more efficient rendering setup based on plain OpenGL 1.4. Hardware-based approaches to interactive CSG rendering, e.g. [Goldfeather et al. 1986] and [Rappoport and Spitz 1997], could also be employed for LDI generation. However, since such CSG rendering implementations have to handle more complex modeling operations, they feature more involved rendering setups. Furthermore, these CSG rendering approaches do not explicitly compute the resulting solid.

2 Method

This section presents an overview of our algorithm followed by a detailed description of the three stages.

Input. Our approach takes 3-D closed objects of manifold geometry as an input. Although the approach is not confined to triangular meshes.

Figure 2: Algorithm overview in 2-D and 3-D. (a) Stage 1: AABB intersection. (b) Stage 2: LDI generation within the VoI. (c) Stage 3a: Computation of the intersection volume.

Algorithm overview. Our approach proceeds in three stages. Stage 1 computes the AABB intersection for a pair of objects. If the intersection is empty, the two objects do not overlap. If it is not, stages 2 and 3 are applied to the AABB intersection volume. We will refer to it as the Volume-of-Interest (VoI).

Stage 2 computes two LDIs, one for each object. LDI generation is restricted to the VoI. The depth values of the LDI can be interpreted as the intersections of parallel rays or 3-D scan lines entering or
leaving the object. Thus, an LDI classifies the VoI into inside and outside regions with respect to an object. Likewise, we classify the corresponding intersections into entry points and leaving points. The concept is similar in spirit to well-known scan-conversion algorithms to fill concave polygons [Foley et al. 1990], where intersections of a scan line with a polygon represent transitions between interior and exterior.

Stage 3 performs the actual collision detection. Currently, we distinguish two different types of queries: a) Both LDIs are combined using Boolean intersection. If the intersection of all inside regions is not empty, a collision is detected. This operation also provides an explicit representation of the intersection volume. b) Individual vertices within the VoI of one object are transformed into the LDI of the second object. If such a vertex lies in an inside region, a collision is detected. This query can also be used to check atomic or point-sampled objects for intersection.

Fig. 2 illustrates the three stages of our algorithm. For a detailed description of all stages, refer to Sections 2.1-2.3.

2.1 AABB Intersection

AABB intersections are computed for pairs of objects. If two AABBs do not overlap, the corresponding objects cannot interfere.
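The stage-1 test reduces to per-axis interval overlap. A minimal sketch (the function name and list-based box representation are ours, not from the paper):

```python
def aabb_intersection(a_min, a_max, b_min, b_max):
    """Intersect two axis-aligned bounding boxes given as min/max corners.

    Returns the VoI as a (lo, hi) pair of corners, or None if the boxes
    do not overlap, i.e. the objects cannot interfere.
    """
    lo = [max(a, b) for a, b in zip(a_min, b_min)]
    hi = [min(a, b) for a, b in zip(a_max, b_max)]
    if any(l > h for l, h in zip(lo, hi)):
        return None  # empty intersection: no further processing needed
    # The intersection of two AABBs is again an AABB, which keeps
    # the subsequent LDI stages simple.
    return lo, hi

# aabb_intersection([0,0,0], [2,2,2], [1,1,1], [3,3,3])
# → ([1, 1, 1], [2, 2, 2])
```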
Otherwise, the intersection volume provides a VoI, which is considered for further processing. Although AABBs do not provide an optimal, i.e. smallest, bounding volume, they can be computed very efficiently. Furthermore, the intersection of two AABBs is again an AABB, which keeps subsequent stages of the algorithm simple. Alternative BVs, such as object-oriented bounding boxes [Gottschalk et al. 1996] or discrete-oriented polytopes [Klosowski et al. 1996], provide tighter-fitting object approximations. However, the computation of these representations is more involved, and the resulting intersection volume can have a more complex shape, making it much more difficult for further processing. In addition, such structures are impractical for deforming objects, which require a dynamic adaptation of the bounding box.

In order to simplify and accelerate further computations, the VoI has to meet the following requirements: Firstly, to guarantee proper boundary conditions, we have to render the LDI for each object from the outside. We select one of the six faces of the VoI, the so-called outside face, as the near rendering plane for each object (see Fig. 3). If a sorted LDI is generated with respect to this face, entry points and leaving points alternate. For closed objects, the first layer is assumed to contain entry points, the second layer leaving points, and so on. Secondly, outside faces for both objects should correspond to opposite sides of the VoI (see Fig. 3(a)). This simplifies the combination of both LDIs for the computation of the intersection volume.

(a) VoI for A and B. (b) Extended VoI to obtain an outside face for A.
Figure 3: The intersection of the AABBs of two objects A and B provides the VoI.

The intersection volume of two AABBs, i.e. the VoI, is bounded by parts of their faces. Fig. 3(a) shows a VoI with corresponding outside faces. In some cases, for instance when one box is entirely within another box, appropriate outside faces for both objects cannot be found. This problem is solved by extending the VoI. If the outside face of one object is fixed, the opposite face of the VoI can be scaled to touch the bounding box of the other object (see Fig. 3(b)). If several eligible pairs of outside faces exist, we choose the ones with the smallest distance between them. This criterion is a fast heuristic to optimize depth complexity for LDI generation, assuming that depth complexity scales with the distance between the two outside faces.

2.2 LDI Generation

The outside faces of the VoI serve as the near rendering planes, and their opposite faces define the far planes. The remaining faces define top, bottom, left, and right of the rendering volume. Orthographic projection is used for rendering. The resolution of the LDI, which defines the accuracy of the object approximation, is specified by the user. Objects are rendered n_max times for LDI generation, where n_max denotes the depth complexity of the relevant part of the object within the VoI. Thus, the computational complexity of the LDI generation is O(n_max). For details regarding the OpenGL implementation, refer to Sec. 3.

The first rendering pass generates a single LDI layer and computes n_max. Depth testing and face culling are disabled. Only the stencil test is employed to discard fragments. The stencil test configuration allows the first fragment per pixel to pass, while the corresponding stencil value is incremented. Subsequent fragments are discarded by the stencil test, but still increment the stencil buffer.
Hence, after the first rendering pass the depth buffer contains the first object layer per pixel and the stencil buffer contains a map representing the depth complexity per pixel. Thus, n_max is found by searching for the maximum stencil value. Note that the max() computation constitutes a very small computational burden, given the typical LDI resolution.

If n_max > 1, additional rendering passes 2 to n_max generate the remaining layers. The rendering setup is similar to the first pass. However, during the n-th rendering pass, the first n fragments per pixel pass the stencil test and, as a consequence, the resulting depth buffer contains the n-th LDI layer. During these passes, the stencil buffer is not incremented if the stencil test fails.

Since fragments are rendered in arbitrary order, the algorithm generates an unsorted LDI. For further processing, the LDI is sorted per pixel. Only the first n_p layers per pixel are considered, where n_p is the depth complexity of this pixel. n_p is taken from the stencil buffer as computed in the first rendering pass. If n_p is smaller than n_max, layers n_p+1 to n_max do not contain valid LDI values for this pixel and are discarded (see Fig. 5). Although the order in which the fragments of an object are rendered is arbitrary, our algorithm relies on consistency across the individual passes.

Our LDIs store normalized values. For advanced collision queries, additional information, such as viewing direction and viewing volume, is stored as well. The accuracy of the method can be adjusted. In (x,y)-direction, accuracy corresponds with the user-defined LDI resolution. The accuracy in z-direction is given by the depth-buffer resolution.

Figure 5: LDI for two pixels, unsorted and sorted with respect to depth values.

Compared to alternative data structures, such as bounding-volume hierarchies or distance fields, the LDI representation is very memory efficient. Both the memory consumption and the performance of the LDI generation depend on the geometric complexity and the depth
complexity of the objects. However, our results presented in Sec. 4 suggest that the number of LDI layers does not exceed reasonable values even for complex objects. Further, collision queries performed on the generated LDIs are very efficient even in the case of a large number of LDI layers.

2.3 Collision Detection

The computed LDIs can be utilized to process a variety of collision queries in an efficient way. Our current implementation comprises two different types of collision queries:

Intersection volume. Two LDIs can be combined to compute an intersection volume. The LDIs for two colliding objects are discretized with the same resolution on corresponding sampling grids, but with opposite viewing directions. Hence, pixels in both LDIs represent corresponding volume spans. Therefore, intersection volumes can be computed by a pixelwise intersection of the inside regions of both LDIs. If the resulting intersection is non-empty, a collision is detected. The sum of all pixelwise intersection regions constitutes a discrete representation of the intersection volume (see Fig. 6(a)).

Figure 6: Two types of collision queries. Pixelwise intersection and testing of the LDIs. (a) Intersection volume. (b) Vertex-in-volume.

Vertex-in-volume. Individual vertices can also be tested against an LDI. To this end, the vertex is transformed into the local coordinate system of the LDI. If the transformed vertex intersects with an inside region, a collision is detected (see Fig. 6(b)). As demonstrated in Fig. 11 and Fig. 12, this query can be used to compute collisions with atomic objects, such as particles.

3 Implementation

The implementation of VoI computation (stage 1) and collision queries (stage 3) can be easily accomplished by following the descriptions in Sec. 2.1 and Sec. 2.3, respectively. This section describes some details regarding the efficient implementation of the LDI generation (see Sec. 2.2) using core OpenGL 1.4 functionality.

The LDI generation is performed in a multi-pass rendering process following the
rendering setup as described in Sec. 2.2. To improve performance, the color buffer is disabled. Each LDI layer is generated in one rendering pass. Except for the first pass, which computes the depth complexities per pixel, all passes are identical. Our actual OpenGL implementation is very similar to the following pseudo code.

    rendering setup;
    // pass 1 computes n_max and the first LDI layer
    glClear(GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT);
    glStencilFunc(GL_GREATER, 1, 0xff);
    glStencilOp(GL_INCR, GL_INCR, GL_INCR);
    render object;
    depth_complexities ← stencil buffer;
    n_max ← max(depth_complexities);
    layer[1] ← depth buffer;
    // passes 2 to n_max compute the remaining LDI layers
    n ← 2;
    while n ≤ n_max begin
        glClear(GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT);
        glStencilFunc(GL_GREATER, n, 0xff);
        glStencilOp(GL_KEEP, GL_INCR, GL_INCR);
        render object;
        layer[n] ← depth buffer;
        n ← n + 1;
    end;

Each rendering pass actually requires a buffer read-back. Reading back buffers from the GPU, however, can be expensive depending on the actual architecture. We propose two methods for optimizing the data transfer using OpenGL extensions. Firstly, depth and stencil values of a pixel are usually stored in the same 32-bit word in the framebuffer. This allows reading back both buffers in a single pass using an OpenGL extension, such as GL_NV_packed_depth_stencil. Secondly, all rendering passes can be performed independently from each other, except for the first pass. We exploit this fact to render individual layers into different regions of the framebuffer. Once finished, the whole framebuffer is read back in a single pass. This optimization reduces the number of read-backs to a maximum of two, assuming that the framebuffer memory is sufficiently large. It reduces stalls in the rendering pipeline and substantially improves the performance of the algorithm.

4 Results

All of the experiments described in this section have been performed on a PC with a 2.8 GHz Intel Pentium IV, 2 GB main memory, and an NVidia GeForce 4 Ti 4600 graphics card with 128 MB
on-board memory. Results for the two types of collision queries are presented using five different test objects (see Fig. 7).

Intersection volume. In Tab. 1, computing times for six cases are given. Since the performance of our algorithm depends on the geometrical complexity of the object and on the depth complexities of both objects within the VoI, we have carried out experiments as shown in Fig. 8 and Fig. 9. In the worst case, when both AABBs coincide, the VoI and the depth complexity are maximal. Hence, both LDIs are computed for the entire object. The performance of these cases can be derived from Tab. 2, where timings for the LDI generation for entire objects are given.

    Objects          Faces      Depth Complexity  Stage 1+2 [ms]  Stage 3a [ms]  Fig.
    Knot/Rabbit      12k/50k    3/1               19              1              8
    Knot/Rabbit      12k/50k    4/6               56              1              9
    Rabbit/Reptile   50k/110k   7/5               36              1              8
    Rabbit/Reptile   50k/110k   4/2               48              1              9
    Reptile/Santa    110k/150k  1/3               25              1              8
    Reptile/Santa    110k/150k  8/6               114             1              9

Table 1: Computation of the intersection volume using various object pairs as shown in Fig. 8 and Fig. 9. The resolution of the LDI is 128x128. The depth complexities are given for both objects within the VoI. Stages 1 and 2 compute the AABBs, the VoI, and both LDIs.
Stage 3a generates the intersection volume. In the case of rigid objects, stages 1 and 2 can be pre-computed.

An additional scenario with a very complex intersection volume is illustrated in Fig. 10. The Knot and the Dragon have 12k faces and 520k faces, respectively. Depth complexities within the VoI are 7 and 8. With an LDI resolution of 128x128, the intersection volume can be computed in 256 ms.

Vertex-in-volume. Computing times for the vertex-in-volume test are given in Tab. 2. For each object, 100,000 arbitrarily positioned vertices within the AABB of an object are tested. Fig. 11 illustrates the experiment for one of the test objects.

    Object    Faces  Depth Complexity  Stage 1+2 [ms]  Stage 3b [ms]
    Knot      12k    8                 11              26
    Rabbit    50k    8                 29              27
    Reptile   110k   12                83              28
    Santa     150k   8                 79              26
    Dragon    520k   10                307             27

Table 2: Vertex-in-volume test for various objects. The VoI computed in stage 1 coincides with the AABB of the object. Stage 2 computes the LDI representation for the entire object with a resolution of 128x128. In stage 3, 100,000 vertices with arbitrary positions within the AABB of the object are tested. In the case of rigid objects, stages 1 and 2 can be pre-computed. Fig. 11 illustrates the test for one object.

An application of the vertex-in-volume test is shown in Fig. 12. The simulation environment includes 20,000 particles and the deforming Santa with 10k faces. The object deformation, the particle dynamics, the collision detection, and the response can be computed and rendered in real time at 29 fps.

5 Conclusions

5.1 Discussion

We have presented a fast, hardware-accelerated method for volumetric collision detection. As a central acceleration structure, our algorithm computes an LDI that approximates the intersection volume. Unlike many existing collision detection schemes, our method does not require expensive setup or pre-processing and is, hence, specifically suited for deformable objects and dynamic simulation.
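Both queries of Sec. 2.3 reduce to one-dimensional interval operations on the sorted depth layers of a pixel, since entry and leaving points alternate for closed objects. A minimal sketch (function names are ours, and we assume both span sets are already expressed in a common depth coordinate; the actual implementation operates on depth/stencil buffers):

```python
def inside_spans(depths):
    """Pair sorted per-pixel depth layers into inside intervals.

    For a closed object, layers alternate entry/leaving points, so
    layers (0,1), (2,3), ... bound the inside regions of the pixel.
    """
    return [(depths[i], depths[i + 1]) for i in range(0, len(depths) - 1, 2)]

def intersect_spans(a, b):
    """Query 3a: pixelwise Boolean intersection of two sets of inside
    spans; a non-empty result signals a collision."""
    out = []
    for lo1, hi1 in a:
        for lo2, hi2 in b:
            lo, hi = max(lo1, lo2), min(hi1, hi2)
            if lo < hi:
                out.append((lo, hi))
    return out

def vertex_in_volume(z, spans):
    """Query 3b: test the depth of a transformed vertex against the
    inside spans of the pixel it falls into."""
    return any(lo <= z <= hi for lo, hi in spans)
```

For example, a pixel with sorted layers [0.2, 0.5, 0.7, 0.9] yields the inside spans [(0.2, 0.5), (0.7, 0.9)]; intersecting them with [(0.4, 0.8)] from the second LDI gives [(0.4, 0.5), (0.7, 0.8)], i.e. a collision with an explicit volume span per pixel.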
Currently, we have implemented two different types of collision queries and demonstrated that the method performs at interactive rates up to very complex geometry. As a limitation, our collision detection approach is restricted to objects with closed, watertight surfaces, since it relies on the notion of inside and outside.

5.2 Future Work

Currently, collision queries are computed on the CPU. The performance of the algorithm could be improved by using the GPU. Employing pixel shaders with multiple render targets for LDI generation might support GPU-based queries.

While we compute an explicit representation of the intersection volume, additional queries, such as penetration depth or closest point, might be needed for advanced collision response. This is also subject to future research.

At this point, self-collisions are not detected. Actually, our algorithm could be extended towards self-collisions by treating the front and back faces separately during the rendering process. This would allow us to efficiently detect an invalid ordering of front and back faces.

6 Acknowledgements

This project is funded by the National Center of Competence in Research "Computational Medicine" (NCCR Co-Me).
References

Baciu, G., Wong, W. S.-K., and Sun, H. 1999. RECODE: an image-based collision detection algorithm. The Journal of Visualization and Computer Animation 10, 181-192.

Baraff, D. 2001. Collision and contact. SIGGRAPH 2001 Course Notes.

Bridson, R., Fedkiw, R., and Anderson, J. 2002. Robust treatment of collisions, contact and friction for cloth animation. In Proc. of the 29th annual conference on Computer graphics and interactive techniques, ACM Press, 594-603.

Debunne, G., Desbrun, M., Cani, M.-P., and Barr, A. H. 2001. Dynamic real-time deformations using space & time adaptive sampling. In Proc. of the 28th annual conf. on Computer graphics and interactive techniques, ACM Press, 31-36.

Everitt, C. 2001. Interactive order-independent transparency. Technical report, NVIDIA Corporation.

Foley, J. D., van Dam, A., Feiner, S. K., and Hughes, J. F. 1990. Computer Graphics: Principles and Practice. Addison-Wesley Publishing Company.

Goldfeather, J., Hultquist, J. P. M., and Fuchs, H. 1986. Fast constructive-solid geometry display in the pixel-powers graphics system. In Proc. of the 13th annual conference on Computer graphics and interactive techniques, ACM Press, 107-116.

Gottschalk, S., Lin, M. C., and Manocha, D. 1996. OBBTree: a hierarchical structure for rapid interference detection. In Proc. of the 23rd annual conference on Computer graphics and interactive techniques, ACM Press, 171-180.

Hirota, G., Fisher, S., and Lin, M. C. 2000. Simulation of non-penetrating elastic bodies using distance fields. Technical Report TR00-018, University of North Carolina at Chapel Hill.

Hoff III, K. E., Zaferakis, A., Lin, M. C., and Manocha, D. 2001. Fast and simple 2d geometric proximity queries using graphics hardware. In Proc. of 2001 Symposium on Interactive 3D Graphics, 145-148.

Hubbard, P. M. 1993. Interactive collision detection. In Proc. IEEE Symposium on Research Frontiers in Virtual Reality, 24-31.

Hughes, M., DiMattia, C., Lin, M. C., and Manocha, D. 1996. Efficient and accurate interference detection for polynomial deformation and soft object animation. In Proc. of Computer
Animation,155–166.K IM ,Y.J.,O TADUY ,M.A.,L IN ,M.C.,AND M ANOCHA ,D.2002.Fast pen-etration depth computation for physically-based animation.In Proc.of the ACM SIGGRAPH symposium on Computer animation ,ACM Press,23–31.K IM ,Y.J.,H OFF III,K.E.,L IN ,M.C.,AND M ANOCHA ,D.2003.Closest point query among the union of convex polytopes using rasterization hardware.Journal of Graphics Tools ,to appear.K LOSOWSKI ,J.T.,H ELD ,M.,M ITCHELL ,J.S.B.,S OWIZRAL ,H.,AND Z IKAN ,K.1996.Efficient collision detection using bounding volume hierarchies of k–DOPs.In Proc.of the 23rd annual conference on Computer graphics and interac-tive techniques ,ACM Press,171–180.L ARSSON ,T.,AND A KENINE -M OELLER ,T.2001.Collision detection for continu-ously deforming bodies.In Proc.of Eurographics 2001,325–333.L OMBARDO ,J.C.,C ANI ,M.-P.,AND N EYRET ,F.1999.Real–time collision detec-tion for virtual surgery.In Proc.of Computer Animation ’99,33–39.Q UINLAN ,S.1994.Efficient distance computation between non–convex objects.In Proc.IEEE Int.Conference on Robotics and Automation ,IEEE,3324–3329.R APPOPORT ,A.,AND S PITZ ,S.1997.Interactive boolean operations for conceptual design of 3–D solids.In Proc.of the 24th annual conference on Computer graphics and interactive techniques ,ACM Press/Addison-Wesley Publishing Co.,269–278.S HADE ,J.,G ORTLER ,S.,WEI H E ,L.,AND S ZELISKI ,yered depth im-ages.In Proc.of the 25th annual conference on Computer graphics and interactive techniques ,ACM Press,231–242.VAN DEN B ERGEN ,G.1997.Efficient collision detection of complex deformable models using aabb trees.Journal of Graphics Tools 2,4,1–13.Figure 7:Test objects:Dragon,Santa,Rabbit,Reptile,Knot.Figure 8:Computation of the intersection volume.AABBs,V oI,and intersection volume are shown for three examples with simpleintersections.Figure 9:Computation of the intersection volume.AABBs,V oI,and intersection volume are shown for three examples with complex intersections.Figure 10:Intersection volume for Knot 
andDragon.Figure 11:Vertex-in-volume test.Left:The LDI representation is computed for the entire object.V oI and AABB coincide.Right:The vertex-in-volume test is processed for 100,000arbitrarily posi-tionedvertices.Figure 12:Deforming Santa and particles.Red color on the right–hand side indicates actual or recent contact.。
KDD Cup 2009

Keywords: Customer Relationship Management (CRM), marketing databases, French Telecom company Orange, customer propensity

Data format: TEXT

Data description:

Customer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offers the opportunity to work on large marketing databases from the French Telecom company Orange to predict the propensity of customers to switch providers (churn), buy new products or services (appetency), or buy upgrades or add-ons proposed to them to make the sale more profitable (up-selling).

The most practical way, in a CRM system, to build knowledge about customers is to produce scores. A score (the output of a model) is an evaluation, over all instances, of a target variable to explain (i.e., churn, appetency, or up-selling). Tools that produce scores make it possible to project quantifiable information onto a given population. The score is computed from input variables that describe the instances. Scores are then used by the information system (IS), for example, to personalize the customer relationship.

An industrial customer analysis platform able to build prediction models with a very large number of input variables has been developed by Orange Labs. This platform implements several processing methods for instance and variable selection, prediction, and indexation, based on an efficient model combined with variable-selection regularization and a model averaging method. The main characteristic of this platform is its ability to scale to very large datasets with hundreds of thousands of instances and thousands of variables. The rapid and robust detection of the variables that have contributed most to the output prediction can be a key factor in a marketing application.

The challenge is to beat the in-house system developed by Orange Labs.
It is an opportunity to prove that you can deal with a very large database, including heterogeneous noisy data (numerical and categorical variables) and unbalanced class distributions. Time efficiency is often a crucial point. Therefore, part of the competition will be time-constrained to test the ability of the participants to deliver solutions quickly.
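The scoring idea described above (a model mapping a customer's input variables to a propensity score) can be sketched with a toy logistic model. The feature names, weights, and bias below are hypothetical illustrations, not learned from the actual Orange data:

```python
import math

def propensity_score(features, weights, bias):
    """Logistic score in (0, 1): higher means higher churn propensity.
    In a real system, weights and bias come from a trained model."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical 3-variable customer profile: [tenure_years, monthly_spend, support_calls]
weights = [-0.8, -0.01, 0.6]   # illustrative values, not real coefficients
bias = 0.5

loyal = propensity_score([6.0, 40.0, 0.0], weights, bias)    # long tenure, no complaints
at_risk = propensity_score([0.5, 20.0, 4.0], weights, bias)  # new customer, many support calls
# at_risk ends up well above loyal, so the CRM system would target that customer
```

Ranking customers by such a score is what lets the IS act on the top of the list (e.g., retention offers for the highest churn scores).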
Hilti: A Leader in Construction Technology and Solutions

Introduction

In the realm of construction and engineering, quality tools and equipment are vital for both efficiency and safety. Hilti is a well-known global brand that specializes in providing innovative solutions and technologies for the construction industry. Established in 1941, Hilti has become synonymous with quality, reliability, and excellence. With a strong focus on customer satisfaction and advanced engineering, Hilti continues to be a leader in the construction field.

History and Background

Hilti was founded in the small town of Schaan, Liechtenstein, by brothers Eugen and Martin Hilti. Initially, the company operated as a workshop, producing tools and machines primarily for the local construction industry. However, with a keen eye for innovation, the Hilti brothers quickly realized the potential of their products and expanded their operations.

In the 1950s, Hilti introduced the world's first electric hammer drill, revolutionizing the construction industry. This breakthrough technology made drilling faster and more efficient than ever before. As word spread about the superior performance and durability of Hilti tools, the company's reputation grew exponentially.

Product Range and Solutions

Hilti offers a wide range of products and solutions tailored to the various needs of the construction industry. Their product portfolio includes power tools, anchoring systems, measuring and detection tools, cutting and grinding equipment, and much more.

One of Hilti's flagship products is the Hilti TE 1000-AVR Breaker. This powerful and robust tool is highly valued by professionals for its exceptional performance in concrete demolition and renovation tasks.
With its vibration reduction system and innovative design, the TE 1000-AVR Breaker ensures both operator comfort and efficiency.

Hilti also provides specialized anchoring systems that ensure secure and reliable fastening in concrete, masonry, and steel structures. Their patented Hilti SafeSet™ technology offers a unique combination of speed, precision, and safety in anchor installation. These systems have been used in landmark projects around the world, enabling safe and efficient construction.

Another notable product from Hilti is their advanced line of cordless tools, including drills, impact drivers, and saws. These tools combine high performance with long battery life, allowing professionals to work without the limitations of cords and power outlets. With their exceptional durability and ergonomic design, Hilti cordless tools are trusted by construction professionals worldwide.

Innovation and Research

Continuous innovation is at the heart of Hilti's success. The company invests heavily in research and development to provide its customers with cutting-edge solutions. Hilti's engineers and scientists work closely with construction professionals to understand their challenges and develop products that can overcome them.

One notable example of Hilti's dedication to innovation is the ON!Track asset management system. This cloud-based software allows construction companies to track and manage their tools and equipment, leading to improved productivity and cost efficiency. With real-time information on tool locations, maintenance schedules, and usage history, construction managers can optimize their workflows and reduce downtime.

Sustainability and Corporate Social Responsibility

Hilti recognizes the importance of sustainability and the need to minimize the environmental impact of construction activities. The company designs its tools and equipment with energy efficiency and long-lasting durability in mind.
Hilti also actively promotes recycling initiatives, ensuring that their products can be disposed of responsibly at the end of their lifecycle.

In addition, Hilti is committed to supporting the communities in which it operates through various corporate social responsibility initiatives. These include vocational training programs, charitable donations, and volunteering efforts. By empowering individuals and investing in education, Hilti aims to make a positive impact on society.

Conclusion

Hilti's long-standing commitment to innovation, quality, and customer satisfaction has solidified its position as a global leader in the construction industry. With a diverse range of products and solutions, Hilti continues to drive progress and efficiency in construction projects around the world. As the demand for sustainable and innovative technologies increases, Hilti is well-positioned to meet and exceed the expectations of its customers in the years to come.
Robust Control and Estimation

Robust control and estimation are critical aspects of engineering and technology, playing a pivotal role in ensuring the stability and performance of complex systems. These techniques are employed in a wide range of applications, including aerospace, automotive, robotics, and industrial automation. The primary goal of robust control is to design systems that can effectively operate in the presence of uncertainties and variations, while robust estimation focuses on accurately inferring the state of a system despite disturbances and measurement errors.

From a technical perspective, robust control involves the design of control systems that can accommodate variations and uncertainties in the system dynamics and external disturbances. This is particularly important in real-world applications where the exact mathematical model of the system may not be known with certainty. Robust control techniques, such as H-infinity control and mu-synthesis, aim to guarantee system stability and performance under a wide range of operating conditions. These methods often involve the use of advanced mathematical tools and optimization algorithms to synthesize controllers that are resilient to uncertainties.

In the realm of robust estimation, the focus is on accurately determining the state of a system based on noisy and imperfect measurements. This is crucial for feedback control systems, where the state information is used to compute control actions. Robust estimation techniques, such as Kalman filtering and robust observers, are employed to mitigate the effects of measurement noise and disturbances, providing an accurate estimate of the system state. These methods are essential for applications such as autonomous navigation, sensor fusion, and fault detection, where reliable state estimation is critical for decision-making.

Beyond the technical aspects, the significance of robust control and estimation extends to the broader societal and economic impact.
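As a concrete illustration of the estimation side, here is a minimal one-dimensional Kalman filter for a constant state observed through noisy measurements. The process and measurement noise values are illustrative assumptions, not tied to any particular system:

```python
def kalman_1d(measurements, q=1e-4, r=0.25, x0=0.0, p0=1.0):
    """Minimal scalar Kalman filter: constant-state model with
    process noise q and measurement noise r (illustrative values)."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q                 # predict: state assumed constant, uncertainty grows
        k = p / (p + r)           # Kalman gain weighs prediction vs. measurement
        x = x + k * (z - x)       # update with the measurement residual
        p = (1.0 - k) * p         # reduce uncertainty after the update
        estimates.append(x)
    return estimates

# Noisy readings of a true value of 1.0
zs = [1.2, 0.8, 1.1, 0.9, 1.05, 0.95, 1.02, 0.98]
est = kalman_1d(zs)
```

After a few updates the estimate settles near the true value, smoothing out the measurement noise; this is the same predict/update structure used in higher-dimensional state estimators.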
In the field of transportation, robust control plays a crucial role in ensuring the safety and efficiency of vehicles, ranging from airplanes to autonomous cars. By enabling systems to operate reliably under varying conditions, robust control contributes to reducing the risk of accidents and improving overall transportation infrastructure. Similarly, in industrial automation, robust estimation techniques enhance the reliability and performance of manufacturing processes, leading to higher product quality and increased productivity.

Moreover, the development of robust control and estimation techniques requires a multidisciplinary approach, encompassing aspects of mathematics, control theory, signal processing, and computer science. This fosters collaboration and knowledge exchange among experts from diverse backgrounds, driving innovation and advancements in the field. Furthermore, the application of these techniques often involves the integration of cutting-edge technologies, such as artificial intelligence, machine learning, and data-driven modeling, leading to the development of more sophisticated and adaptive control systems.

From a personal standpoint, the pursuit of robust control and estimation represents a fascinating and intellectually stimulating endeavor. The challenges inherent in dealing with uncertainties and disturbances in real-world systems demand creative and innovative solutions, pushing the boundaries of knowledge and technology. The ability to design control systems that exhibit resilience and adaptability to unforeseen circumstances is not only intellectually rewarding but also holds the potential to make a tangible impact on society by enhancing the safety, reliability, and efficiency of critical systems.

In conclusion, robust control and estimation stand as indispensable pillars in the realm of engineering and technology, addressing the complexities and uncertainties inherent in modern systems.
From a technical standpoint, these techniques enable the design of control systems that can operate effectively under varying conditions and provide accurate state estimation despite disturbances. Moreover, their impact extends to diverse domains, including transportation, manufacturing, and robotics, where they contribute to safety, reliability, and performance. The multidisciplinary nature of robust control and estimation fosters collaboration and innovation, while the personal and societal implications underscore their significance in shaping the future of technology and engineering.
Efficient and Robust Detection of Duplicate Videos in a Large Database

Anindya Sarkar, Member, IEEE, Vishwakarma Singh, Pratim Ghosh, Student Member, IEEE, Bangalore S. Manjunath, Fellow, IEEE, and Ambuj Singh, Senior Member, IEEE

Abstract—We present an efficient and accurate method for duplicate video detection in a large database using video fingerprints. We have empirically chosen the color layout descriptor, a compact and robust frame-based descriptor, to create fingerprints, which are further encoded by vector quantization (VQ). We propose a new nonmetric distance measure to find the similarity between the query and a database video fingerprint, and experimentally show its superior performance over other distance measures for accurate duplicate detection. Efficient search cannot be performed for high-dimensional data using a nonmetric distance measure with existing indexing techniques. Therefore, we develop novel search algorithms based on precomputed distances and new dataset pruning techniques, yielding practical retrieval times. We perform experiments with a database of 38 000 videos, worth 1600 h of content. For individual queries with an average duration of 60 s (about 50% of the average database video length), the duplicate video is retrieved in 0.032 s, on an Intel Xeon with a 2.33 GHz CPU, with a very high accuracy of 97.5%.

Index Terms—Color layout descriptor (CLD), duplicate detection, nonmetric distance, vector quantization (VQ), video fingerprinting.

I. Introduction

COPYRIGHT infringements and data piracy have recently become serious concerns for the ever growing online video repositories. Videos on commercial sites are mainly textually tagged. These tags are of little help in monitoring the content and preventing copyright infringements. Approaches based on content-based copy detection (CBCD) and watermarking have been used to detect such infringements [20], [27]. The watermarking approach tests for the presence of a certain watermark in a video to decide if it is copyrighted. The other
approach, CBCD, finds the duplicate by comparing the fingerprint of the query video with the fingerprints of the copyrighted videos.

Manuscript received March 16, 2009; revised June 10, 2009, September 25, 2009, and December 25, 2009. Date of publication March 18, 2010; date of current version June 3, 2010. The work of A. Sarkar was supported by the Office of Naval Research, under Grants #N00014-05-1-0816 and #N00014-10-1-0141. The work of V. Singh and P. Ghosh was supported by the National Science Foundation, under Grants ITR #0331697 and NSF III #0808772, respectively. This paper was recommended by Associate Editor L. Guan. A. Sarkar, P. Ghosh, and B. S. Manjunath are with the Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93106 USA. V. Singh and A. Singh are with the Department of Computer Science, University of California, Santa Barbara, CA 93106 USA. Color versions of one or more of the figures in this paper are available online. Digital Object Identifier 10.1109/TCSVT.2010.2046056.

A fingerprint is a compact signature of a video which is robust to modifications of the individual frames and discriminative enough to distinguish between videos. The noise robustness of watermarking schemes is not ensured in general [27], whereas the features used for fingerprinting generally ensure that the best match in the signature space remains mostly unchanged even after various noise attacks. Hence, the fingerprinting approach has been more successful.

We define a "duplicate" video as one consisting entirely of a subset of the frames in the original video; the individual frames may be further modified and their temporal order varied. The assumption that a duplicate video contains frames only from a single video has been used in various copy detection works, e.g., [32], [39], [42]. In [42], it is shown that for a set of 24 queries searched in YouTube, Google Video, and Yahoo Video, 27% of the returned relevant videos are duplicates. In [7], each web
video in the database is reported to have an average of five similar copies; that database consisted of 45 000 clips worth 18 h of content. Also, for some popular queries to the Yahoo video search engine, there were two or three duplicates among the top ten retrievals [32].

The block diagram of our duplicate video detection system is shown in Fig. 1. The relevant symbols are explained in Table I. The database videos are referred to as "model" videos in this paper. Given a model video V_i, the decoded frames are sub-sampled by a factor of 5 to obtain T_i frames, and a p-dimensional feature is extracted per frame. Thus, a model video V_i is transformed into a T_i × p matrix Z_i. We empirically observed in Section III that the color layout descriptor (CLD) [24] achieved higher detection accuracy than other features. To summarize Z_i, we perform k-means based clustering and store the cluster centroids {X_i^j}, j = 1, ..., F_i, as its fingerprint. The number of clusters F_i is fixed at a certain fraction of T_i; e.g., a fingerprint size of 5% means that F_i = (5/100) T_i. Therefore, the fingerprint size varies with the video length.
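The fingerprint-creation step described above (temporal sub-sampling followed by k-means, with the centroids kept as {X_i^j}) can be sketched as follows. The tiny k-means routine and the toy two-scene "video" below are illustrative stand-ins, not the authors' implementation:

```python
import random

def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means (squared-L2 assignment) returning k centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k),
                          key=lambda i: sum((a - b) ** 2
                                            for a, b in zip(v, centroids[i])))
            buckets[nearest].append(v)
        for i, b in enumerate(buckets):
            if b:  # keep the old centroid if its cluster went empty
                centroids[i] = [sum(xs) / len(b) for xs in zip(*b)]
    return centroids

def make_fingerprint(frame_features, fraction=0.05, subsample=5):
    """Sub-sample frames temporally, then keep k-means centroids as the
    fingerprint; k is `fraction` of the retained frames (the 5% setting)."""
    kept = frame_features[::subsample]
    k = max(1, round(fraction * len(kept)))
    return kmeans(kept, k)

# Toy 'video': 300 frames of 4-D features around two scene prototypes (0 and 5)
rng = random.Random(1)
frames = [[(0.0 if i < 150 else 5.0) + rng.uniform(-0.1, 0.1) for _ in range(4)]
          for i in range(300)]
fp = make_fingerprint(frames)  # 60 kept frames -> 3 centroid 'keyframes'
```

In the paper the per-frame features are 18-dimensional CLD vectors rather than the 4-D toy vectors used here, but the shape of the output (F_i centroids of dimension p) is the same.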
k-means based clustering produces compact video signatures which are comparable to those generated by sophisticated summarization techniques, as discussed in [36]. In [2], we have compared different methods of keyframe selection for creating the compact video signatures.

The duplicate detection task is to retrieve the best matching model video fingerprint for a given query fingerprint. The model-to-query distance is computed using a new nonmetric distance measure between the fingerprints, as discussed in Section IV.

Fig. 1. Block diagram of the proposed duplicate detection framework. Symbols used are explained in Table I.

TABLE I
Glossary of Notations

N — Number of database videos
V_i — i-th model video in the dataset
V_i* — Best matched model video for a given query
p — Dimension of the feature vector computed per video frame
Z_i ∈ R^{T_i × p}, T_i — Z_i is the feature vector matrix of V_i, where V_i has T_i frames after temporal sub-sampling
X_i ∈ R^{F_i × p}, F_i — X_i is the k-means based signature of V_i, which has F_i keyframes
X_i^j — j-th vector of video fingerprint X_i
U — Size of the vector quantizer (VQ) codebook used to encode the model and query video signatures
Q_orig ∈ R^{T_Q × p} — Query signature created after sub-sampling, where T_Q is the number of sub-sampled query frames
Q ∈ R^{M × p} — Keyframe-based signature of the query video (query fingerprint), where M is the number of query keyframes
C_i — i-th VQ codevector
x⃗_i, q⃗ — x⃗_i is the VQ-based signature of V_i, while q⃗ is the VQ-based query signature
S_{X_i^j} — VQ symbol index to which X_i^j is mapped
A — Set of N "model signature to query signature" distances
D ∈ R^{U × U} — Inter VQ-codevector distance matrix, for the L1 distance between VQ codevectors
D* ∈ R^{N × U} — Lookup matrix of the shortest distance values from each model to each VQ codevector
C(i) — Cluster containing the video indices whose VQ-based signatures have the i-th dimension as nonzero
||x⃗_1 − x⃗_2||_p — L_p norm of the vector (x⃗_1 − x⃗_2)
|E| — Cardinality of the set E
Fractional query
length = (number of query frames) / (number of frames of the actual source video)

We also empirically show that our distance measure results in significantly higher detection accuracy than traditional distance measures (L1, partial Hausdorff distance [18], [19], Jaccard and cosine distances). We design access methods for fast and accurate retrieval of duplicate videos. The challenge in developing such an access method is two-fold. Firstly, indexing under such distances has not been well studied to date; the recently proposed distance-based hashing [3] performs dataset pruning for arbitrary distances. Secondly, video fingerprints are generally of high dimension and varying length. Current indexing structures (M-tree [10], R-tree [17], kd-tree [4]) are not efficient for high-dimensional data.

To perform efficient search, we propose a two-phase procedure. The first phase is a coarse search that returns the top-K nearest neighbors (NN), which is the focus of this paper. We perform vector quantization (VQ) on the individual vectors in the model (or query) fingerprint X_i (or Q) using a codebook of size U (= 8192) to generate a sparse histogram-based signature x⃗_i (or q⃗). This is discussed in Section V-B.
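The VQ encoding of a fingerprint into a sparse histogram-based signature can be sketched as below. The toy 2-D codebook stands in for the paper's U = 8192 codebook of 18-dimensional codevectors:

```python
def vq_encode(fingerprint, codebook):
    """Map each keyframe vector to its nearest codevector (L1 distance)
    and count occurrences: a sparse histogram over codebook indices."""
    hist = {}
    for vec in fingerprint:
        idx = min(range(len(codebook)),
                  key=lambda i: sum(abs(a - b) for a, b in zip(vec, codebook[i])))
        hist[idx] = hist.get(idx, 0) + 1
    return hist

# Toy codebook of four 2-D codevectors
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fp = [[0.1, 0.1], [0.9, 0.1], [0.95, 0.05], [0.1, 0.9]]
sig = vq_encode(fp, codebook)  # sparse: only 3 of 4 codebook bins are nonzero
```

Storing only the nonzero bins is what makes the coarse-search signatures sparse: with U = 8192 and fingerprints of a few dozen keyframes, most bins are empty.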
Coarse search is performed in the VQ-based signature space. The techniques proposed to improve the search are the use of precomputed information between VQ symbols, partial distance-based pruning, and dataset pruning, as discussed in Section V. The second phase uses the unquantized features (X_i) of the top-K NN videos to find the best matching video V_i*. The final module (Section VII) decides whether the query is indeed a duplicate derived from V_i*.

The computational cost for the retrieval of the best matching model video has two parts (Fig. 1).

1) Offline Cost: Consists of the unquantized model fingerprint generation, VQ design and encoding of the model signatures, and the computation and storage of the appropriate distance matrices.

2) Online Cost: The query video is decoded and sub-sampled, keyframes are identified, and features are computed per keyframe; these constitute the query preprocessing cost. The reported query time comprises the time needed to obtain the k-means based compact signature, perform VQ-based encoding on it to obtain the sparse histogram-based representation, compute the relevant lookup tables, and then perform the two-stage search to return the best matched model video.

This paper is organized as follows. Section II presents a literature survey. Feature selection for fingerprint creation is discussed in Section III. Section IV introduces our proposed distance measure. The various search algorithms, along with the different pruning methods, are presented in Section V. The dataset creation is explained in Section VI-A. Section VI-B contains the experimental results, while Section VII describes the final decision module which makes the "duplicate/nonduplicate" decision.

The main contributions of this paper are as follows.

1) We propose a new nonmetric distance function for duplicate video detection when the query is a noisy subset of a single model video. It performs better than other conventional distance measures.

2) For the VQ-based model signatures retained after dataset pruning, we reduce the
search time for the top-K candidates by using suitable precomputed distance tables and by discarding many noncandidates using just the partially computed distance from these model video signatures to the query.

3) We present a dataset pruning approach, based on our distance measure in the space of VQ-encoded signatures, which returns the top-K NN even after pruning. We obtain significantly higher pruning than that provided by distance-based hashing [3] methods trained on our distance function.

In this paper, the terms "signature" and "fingerprint" have been used interchangeably. "Fractional query length" (in Table I) refers to the fraction of the model video frames that constitute the query. Also, for a VQ of codebook size U, the 1-NN of a certain codevector is the codevector itself.

II. Literature Survey

A survey of video copy detection methods can be found in [27]. Many schemes use global features for a fast initial search for prospective duplicates [42]. Then, keyframe-based features are used for a more refined search.

A. Keypoint-Based Features

In an early duplicate detection work by Joly et al. [22], the keyframes correspond to extrema in the global intensity of motion. Local interest points are identified per keyframe using the Harris corner detector, and local differential descriptors are computed around each interest point. These descriptors have been subsequently used in other duplicate detection works [20], [21], [26], [27]. In [42], principal component analysis (PCA)–scale invariant feature transform (SIFT) features [25] are computed per keyframe on a host of local keypoints obtained using the Hessian-Affine detector [34].
Similar local descriptors are also used in [45], where near-duplicate keyframe (NDK) identification is performed based on matching, filtering, and learning of local interest points. A recent system for efficient large-scale video copy detection, the Eff2 Videntifier [11], uses Eff2 descriptors [29] from the SIFT family [33]. In [44], a novel measure called scale-rotation invariant pattern entropy is used to identify similar patterns formed by keypoint matching of near-duplicate image pairs. A combination of visual similarity (global histograms for coarser search and local points for finer search) and temporal alignment is used for duplicate detection in [40]. VQ-based techniques are used in [8] to build a SIFT histogram-based signature for duplicate detection.

B. Global Image Features

Some approaches for duplicate detection consider the similarity between sets of sequential video keyframes. A combination of MPEG-7 features such as the scalable color descriptor, CLD [24], and the edge histogram descriptor (EHD) has been used for video-clip matching [5]. For image duplicate detection, the compact Fourier–Mellin transform (CFMT) [13] has also been shown to be very effective in [15], and the compactness of the signature makes it suitable for fingerprinting.

C. Video-Based Features

"Ordinal" features [6] have been used as compact signatures for video sequence matching [35]. Li et al. [30] used a binary signature, merging a color histogram with ordinal signatures, for video clip matching. Yuan et al. [43] also used a similar combination of features for robust similarity search and copy detection. UQLIPS, a video clip detection system [39], uses RGB and HSV color histograms as the video features. A localized color histogram (LCH)-based global signature is proposed in [32].

D. Indexing Methods

Each keyframe is represented by a set of interest point-based descriptors. Matching involves the comparison of a large number of interest point pairs, which is computationally intensive.
Several indexing techniques have been proposed for efficient and faster search. Joly et al. [22] use an indexing method based on the Hilbert space-filling curve principle. In [20], the authors propose an improved index structure based on statistical similarity search (S3). A new approximate similarity search technique was used in [21], [23], [26], where the probabilistic selection of regions in the feature space is based on the distribution of the feature distortion. In [45], an index structure, LIP-IS, is proposed for fast filtering of keypoints under one-to-one symmetric matching.

E. Hash-Based Index

The above-mentioned indexing methods are generally compared with locality sensitive hashing (LSH) [12], [16], a popular approximate search method for L2 distances. Since our proposed distance function is nonmetric, LSH cannot be used in our setup, as the locality sensitive property holds only for metric distances. Instead, we have compared against distance-based hashing (DBH) [3], which can be used for arbitrary distances.

F. Duplicate Confirmation

Given the top retrieved candidate, the duplicate detection system has to validate whether the query has been derived from it. The duplicate video keyframes can be matched with the corresponding frames in the original video using spatio-temporal registration methods. In [21], [27], the approximate NN results are postprocessed to compute the most globally similar candidate based on a registration and vote strategy. In [26], Law-To et al. use the interest points proposed in [22] for trajectory building along the video sequence. A robust voting algorithm utilizes the trajectory information, spatio-temporal registration, and the labels computed during the offline indexing to make the final retrieval decision. In our duplicate detection system, we have a "distance threshold based" framework (Section VII-A) and a registration-based framework (Section VII-B) to determine if the query is a duplicate derived from the best-matched model video.

The advantages of our method over other state-of-the-art
methods are summarized as follows.

1) In current duplicate detection methods, the query is assumed to contain a large fraction of the original model video frames. Hence, the query signature, computed over the entire video, is assumed to be similar to the model video signature. This assumption, often used as an initial search strategy to discard outliers, does not hold true when the query is only a small fraction (e.g., 5%) of the original video. For such cases, the query frames have to be individually compared with the best matching model frames, as is done by our distance measure. As shown later in Figs. 3 and 7, we observe that our proposed distance measure performs much better than other distances for duplicate detection with shorter queries.

2) We develop a set of efficient querying techniques for the proposed distance measure which achieve much better dataset pruning than DBH. DBH is the state-of-the-art method for querying in nonmetric spaces.

3) In [21], [27], the registration step is performed between query frames and other model frames to confirm whether the query is a duplicate. Our distance computation already gives us the inter-vector correspondence between the given query keyframes and the best matching model keyframes (Section VII-B), facilitating the registration step.

Fig. 2. Comparison of the duplicate video detection error for (a) keyframe-based features and (b) entire video-based features: the query length is varied from 2.5% to 50% of the actual model video length. The error is averaged over all the query videos generated using the noise addition operations discussed in Section VI-A. The model fingerprint size used in (a) is 5%.

III. Feature Extraction

A. Candidate Features

We performed duplicate video detection with various frame-based features: CLD, CFMT [13], LCH [32], and EHD [41].
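These frame-based features are compared under the paper's nonmetric distance (defined as Eq. (1) in Section IV). A minimal sketch of that distance, together with the partial Hausdorff alternative of Eq. (3), on toy 2-D "keyframes":

```python
def best_match(q_vec, model):
    """min over j of ||X_i^j - Q_k||_1: best-matching model keyframe distance."""
    return min(sum(abs(a - b) for a, b in zip(x, q_vec)) for x in model)

def d_proposed(model, query):
    """Eq. (1): sum over query keyframes of the best-match distance."""
    return sum(best_match(q, model) for q in query)

def hausdorff_partial(model, query, P=1):
    """Eq. (3): P-th largest best-match distance (P = 1 gives Eq. (2))."""
    dists = sorted((best_match(q, model) for q in query), reverse=True)
    return dists[P - 1]

model = [[0.0, 0.0], [10.0, 10.0], [5.0, 5.0]]   # toy 2-D model 'keyframes'
query = [[0.2, 0.0], [9.8, 10.0]]                 # noisy subset of the model
other = [[3.0, 3.0], [7.0, 7.0]]                  # an unrelated model

assert d_proposed(model, query) < d_proposed(other, query)
```

Because the query is a noisy subset of `model`, its distance to `model` is small while its distance to the unrelated `other` is large; summing all best-match terms (rather than taking a single extremal one, as the Hausdorff variants do) is what makes the measure robust to one badly matched query keyframe.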
The LCH feature divides the image into a certain number of blocks, and a 3-D color histogram is computed per block. For example, if each color channel is quantized into four levels, the 3-D histogram per image block has 4^3 = 64 bins. If the image is divided into two partitions along each direction, the total LCH feature dimension is 4^3 × 2^2 = 256. To study the variation of detection accuracy with signature size, we have considered the LCH feature for dimensions 256 and 32 (2^2 × 2^3 = 32). For the frame-based features, we use our proposed distance measure, which is explained in Section IV.

We also considered global features computed over the entire video rather than per keyframe. One such global feature is the m-dimensional histogram obtained by mapping each of the 256-dimensional LCH vectors to one of m codebook vectors (this signature creation is proposed in [32]), obtained after k-means clustering of the LCH vectors. We have experimented with m = 20, 60, 256, and 8192, and the L2 distance is used. The other global feature is based on a combination of ordinal and color histograms [43]. Both the ordinal and color histograms are 72-dimensional (24 dimensions along each of the Y, Cb, and Cr channels), and the distance measure used is a linear combination of the average of the distances between the ordinal histograms and the minimum of the distances between the color histograms, over the three channels.

B. Experimental Setup and Performance Comparison

We describe the duplicate detection experiments for feature comparison. We use a database of 1200 video fingerprints, and a detection error occurs when the best matched video is not the actual model from which the query was derived. The various manipulations used for query video creation are discussed in Section VI-A. The query length is gradually reduced from 50% to 2.5% of the model video length, and the detection error (averaged over all noisy queries) is plotted against the fractional query length (Fig. 2). For our dataset, a fractional query length of 0.05 corresponds, on an
average, to 6 s of video ≈ 30 frames, assuming 25 frames/s and a sub-sampling factor of 5. Fig. 2(a) compares frame-based features, while Fig. 2(b) compares video-based features.

In [38], we have shown that the CFMT features perform better as fingerprints than SIFT features for duplicate detection. Fig. 2(a) shows that for duplicate detection, the 18-dimensional CLD performs slightly better than the 80-dimensional EHD, which does better than the 36-dimensional CFMT and the 256-dimensional LCH. Due to the lower signature size and superior detection performance, we choose the 18-dimensional CLD feature per keyframe for fingerprint creation. It is seen that for a short query clip, the original model video histogram is often not representative enough, leading to higher detection error, as in Fig. 2(b). Hence, using a global signature with the L2 distance works well only for longer queries.

We briefly describe the CLD feature and also provide some intuition as to why it is highly suited for video fingerprinting. The CLD signature [24] is obtained by converting the image to an 8 × 8 image, by averaging, along each (Y/Cb/Cr) channel. The discrete cosine transform (DCT) is computed for each image. The dc and the first five (in zigzag scan order) ac DCT coefficients of each channel constitute the 18-dimensional CLD feature. The CLD feature captures the frequency content of a highly coarse representation of the image. As our experimental results suggest, different videos can be distinguished even at this coarse level of representation of the individual frames. Also, due to this coarse representation, most global image processing attacks do not alter the CLD significantly enough to cause detection errors. Significant cropping or gamma variation can, however, distort the CLD sufficiently to cause errors; a detailed comparison of its robustness to various attacks is presented later in Table VII. Depending on the amount of cropping, the 8 × 8 image considered for CLD computation can change significantly, thus severely perturbing the CLD feature. Also, severe gamma variation can change the frequency
content, even for an 8×8 image representation, so as to cause detection errors.

Storage-wise, our system consumes much less memory than methods that store keypoint-based descriptors [11], [21]. The most compact keypoint-based descriptor is the 20-dimensional vector proposed in [21], where each dimension is represented by 8 bits and 17 feature vectors are computed per second. The corresponding storage is 10 times that of our system (assuming 18-dimensional CLD features per frame, with each dimension stored as a double, 25 frames/s, temporal sub-sampling by 5, and 5% of the sub-sampled frames being used to create the model signature).

IV. Proposed Distance Measure

Our proposed distance measure to compare a model fingerprint X_i with the query signature Q is denoted by d(X_i, Q)¹ (1). This distance is the sum of the best-matching distance of each vector in Q with all the vectors in X_i. In (1), ||X_i^j − Q_k||_1 refers to the L1 distance between X_i^j, the j-th feature vector of X_i, and Q_k, the k-th feature vector of Q. Note that d(·,·) is a quasi-distance:

d(X_i, Q) = Σ_{k=1}^{M} min_{1≤j≤F_i} ||X_i^j − Q_k||_1.    (1)

¹For ease of understanding, the quasi-distance measure d(·,·) is referred to as a distance function in subsequent discussions.

What is the motivation behind this distance function? We assume that each query frame in a duplicate video is a tampered/processed version of a frame in the original model video. Therefore, the summation of the best-matching distance of each vector in Q with all the vectors in the signature of the original video (X_i) will yield a small distance. Hence, the model-to-query distance is small when the query is a (noisy) subset of the original model video. Also, this definition accounts for those cases where the duplicate consists of a reordering of scenes from the original video.

A comparison of distance measures for video copy detection is presented in [18]. Our distance measure is similar to the Hausdorff distance [18], [19]. For our problem, the Hausdorff distance h(X_i, Q) and the partial Hausdorff distance h_P(X_i, Q)
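Equation (1) transcribes directly into code (a minimal sketch with our own names; the fingerprints are lists of equal-length feature vectors):

```python
def l1(a, b):
    """L1 distance between two feature vectors."""
    return sum(abs(u - v) for u, v in zip(a, b))

def d_model_query(X_i, Q):
    """Eq. (1): sum, over the query vectors Q_k, of the best-matching
    L1 distance to any vector X_i^j of the model fingerprint X_i."""
    return sum(min(l1(x, q) for x in X_i) for q in Q)
```

Note the asymmetry of the quasi-distance: a query that is a (frame-wise) subset of the model gives d(X_i, Q) = 0, even under scene reordering, while swapping the arguments generally does not.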
are defined as

h(X_i, Q) = max_{1≤k≤M} min_{1≤j≤F_i} ||X_i^j − Q_k||_1    (2)

h_P(X_i, Q) = P-th largest_{1≤k≤M} min_{1≤j≤F_i} ||X_i^j − Q_k||_1.    (3)

For image copy detection, the partial Hausdorff distance (3) has been shown in [18] to be more robust than the Hausdorff distance (2). We compare the performance of h_P(X_i, Q) (3) for varying P with that of d(X_i, Q), as shown in Fig. 3, using the same experimental setup as in Section III. It is seen that the results using d(X_i, Q) are better; the improvement is more evident for shorter queries.

Intuitively, why does our distance measure perform better than the Hausdorff distance? In (2) [or (3)], we first find the "minimum query frame-to-model video" distance for every query frame and then find the maximum (or P-th largest) among these distances. Thus, both h(X_i, Q) and h_P(X_i, Q) effectively depend on a single query frame and model video frame, and errors occur when this query (or model) frame is not representative of the query (or model) video. In our distance function (1), d(X_i, Q) is computed considering all the "minimum query frame-to-model video" terms; hence, the effect of one (or more) mismatched query feature vectors is compensated.

Dynamic time warping (DTW) [37] is commonly used to compare two sequences of arbitrary lengths. The proposed distance function has been compared to DTW in [2], where it is shown that DTW works well only when the query is a continuous portion of the model video and not a collection of disjoint parts. This is because DTW enforces temporal constraints and must match every data point in both sequences. Hence, DTW takes any inter-sequence mismatch into account (thus increasing the effective distance), while such a mismatch is safely ignored in our distance formulation.

Fig. 3. Comparison of the duplicate video detection error for the proposed distance measure d(·,·) (1) and the Hausdorff distances: here, (h_P: P = k) refers to the partial Hausdorff distance (3) where the k-th maximum is considered.

V. Search Algorithms

We propose a two-phase approach for fast duplicate retrieval. The proposed
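For contrast, (2) and (3) can be sketched in the same style (our own minimal transcription): they replace the sum in (1) by the maximum or the P-th largest of the per-query-frame best-match distances, which is why a single unrepresentative frame can dominate them.

```python
def l1(a, b):
    return sum(abs(u - v) for u, v in zip(a, b))

def hausdorff(X_i, Q):
    """Eq. (2): max over query vectors of the best-match L1 distance."""
    return max(min(l1(x, q) for x in X_i) for q in Q)

def partial_hausdorff(X_i, Q, P):
    """Eq. (3): P-th largest of the per-query best-match distances."""
    best = sorted((min(l1(x, q) for x in X_i) for q in Q), reverse=True)
    return best[P - 1]
```

A single outlier query frame drives (2) to a large value, whereas (3) with P > 1 discards the worst matches; the sum in (1) instead dilutes the outlier across all terms.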
distance measure (1) is used in our search algorithms for duplicate detection. First, we discuss a naive linear search (NLS) algorithm in Section V-A. Search techniques based on the vector-quantized representation of the fingerprints, which achieve speedup through suitable lookup tables, are discussed in Section V-B. Algorithms for further speedup based on dataset pruning are presented in Section V-C.

We sub-sample the query video along time to get a signature Q_orig having T_Q vectors (see Fig. 1 and Table I). The initial coarse search (first pass) uses a smaller query signature Q, having M (M < T_Q) vectors. Q consists of the cluster centroids obtained after k-means clustering of Q_orig. When M = (5/100)T_Q, we refer to the query fingerprint size as 5×. The first pass returns the top-K nearest neighbors (NN) from all the N model videos. The larger query signature (Q_orig) is used in the second pass to obtain the best-matched video from these K candidates using a naive linear scan. As the query length decreases, the query keyframes may differ significantly from the actual model video keyframes; hence, the first pass needs to return more candidates to include the actual model video.

A naive search method is to compute all the N model-to-query distances and then find the best match. This set of N distances is denoted by A (4). We speed up the coarse search by removing various computation steps involved in A. To explain the speedup obtained by the various algorithms, we provide the time complexity breakup in Table II:

A = {d(X_i, Q)}_{i=1}^{N} = { Σ_{k=1}^{M} min_{1≤j≤F_i} ||X_i^j − Q_k||_1 }_{i=1}^{N}.    (4)

A. NLS

The NLS algorithm implements the two-pass method without any pruning. In the first pass, it retrieves the top-K candidates based on the smaller query signature Q by performing a full dataset scan using an ascending priority queue L of length K. The priority queue is also used by the other coarse search algorithms in this section to keep track of the top-K NN candidates. The k-th entry in L holds the model video index (L_{k,1}) and its distance from the query (L_{k,2}). A model
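The two-pass NLS procedure can be sketched as follows (an illustrative pure-Python version; a bounded heap stands in for the ascending priority queue L, and the names are our own):

```python
import heapq

def l1(a, b):
    return sum(abs(u - v) for u, v in zip(a, b))

def d(X_i, Q):  # eq. (1)
    return sum(min(l1(x, q) for x in X_i) for q in Q)

def nls(models, Q, Q_orig, K):
    """Two-pass naive linear search.
    First pass: scan all models, keeping the top-K under the coarse
    query signature Q.  Second pass: re-rank the K candidates with the
    full signature Q_orig and return the best model index."""
    heap = []  # max-heap of size K via negated distances (heapq is a min-heap)
    for i, X in enumerate(models):
        dist = d(X, Q)
        if len(heap) < K:
            heapq.heappush(heap, (-dist, i))
        elif -heap[0][0] > dist:          # current worst kept distance > dist
            heapq.heapreplace(heap, (-dist, i))
    candidates = [i for _, i in heap]
    return min(candidates, key=lambda i: d(models[i], Q_orig))
```

Negating the distances turns Python's min-heap into the required "ascending priority queue": the root always holds the largest retained distance, so the insert/replace test of the first pass is a single comparison.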
signature is inserted into L if the size of L is less than K or if its distance from the query is smaller than the largest distance in the queue. In the second pass, NLS computes the distance of the K candidates from the larger query signature Q_orig so as to find the best-matched candidate. The storage needed for all the model signatures is O(N·F̄·p), where F̄ denotes the average number of vectors in a model fingerprint.

B. VQ and Acceleration Techniques

From Table II, it is observed that the time T11 can be saved by precomputing the inter-vector distances. When the feature vectors are vector quantized, an inter-vector distance reduces to an inter-symbol distance, which is fixed once the VQ codevectors are fixed. Hence, we vector quantize the feature vectors and represent the signatures as histograms whose bins are the VQ symbol indices. For a given VQ, we precompute and store the inter-symbol distance matrix in memory.

We now describe the VQ-based signature creation. Using the CLD features extracted from the database video frames, a VQ of size U is constructed using the Linde–Buzo–Gray algorithm [31]. The distance d(·,·) (1) reduces to d_VQM(·,·) (5) for the VQ-based framework, where D is the inter-VQ-codevector distance matrix (6). C_{S_{X_i^j}} refers to the S_{X_i^j}-th codevector, i.e., the codevector to which the VQ maps X_i^j:

d_VQM(X_i, Q) = Σ_{k=1}^{M} min_{1≤j≤F_i} ||C_{S_{X_i^j}} − C_{S_{Q_k}}||_1 = Σ_{k=1}^{M} min_{1≤j≤F_i} D(S_{X_i^j}, S_{Q_k})    (5)

where D(k_1, k_2) = ||C_{k_1} − C_{k_2}||_1, 1 ≤ k_1, k_2 ≤ U.    (6)

Let q⃗ = [q_1, q_2, ..., q_U] denote the normalized histogram-based query signature (7) and x⃗_i = [x_{i,1}, x_{i,2}, ..., x_{i,U}] denote the corresponding normalized model signature (8) for video V_i:

q_k = |{j : S_{Q_j} = k, 1 ≤ j ≤ M}| / M    (7)

x_{i,k} = |{j : S_{X_i^j} = k, 1 ≤ j ≤ F_i}| / F_i.    (8)

Generally, consecutive video frames are similar; hence, many of them will get mapped to the same VQ codevector, while many VQ codevectors may have no representatives (for a large enough U). Let {t_1, t_2, ..., t_{N_q}} and {n_{i,1}, n_{i,2}, ..., n_{i,N_{x_i}}} denote the nonzero dimensions of q⃗ and x⃗_i, respectively, where N_q and N_{x_i} denote the number of nonzero dimensions
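The VQ-based signature creation of (6)–(8) can be sketched as follows (a minimal illustration with our own names; a hand-fixed codebook stands in for the Linde–Buzo–Gray output):

```python
def l1(a, b):
    return sum(abs(u - v) for u, v in zip(a, b))

def quantize(v, codebook):
    """Index of the nearest codevector: the VQ symbol for v."""
    return min(range(len(codebook)), key=lambda k: l1(codebook[k], v))

def symbol_distance_matrix(codebook):
    """Eq. (6): precomputed inter-symbol L1 distances D[k1][k2]."""
    U = len(codebook)
    return [[l1(codebook[k1], codebook[k2]) for k2 in range(U)]
            for k1 in range(U)]

def vq_histogram(vectors, codebook):
    """Eqs. (7)/(8): normalized histogram over the VQ symbol indices."""
    hist = [0.0] * len(codebook)
    for v in vectors:
        hist[quantize(v, codebook)] += 1.0
    n = float(len(vectors))
    return [h / n for h in hist]
```

Once `symbol_distance_matrix` is computed and cached, every inter-vector L1 distance during search reduces to a table lookup, which is the speedup claimed above.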
in q⃗ and x⃗_i, respectively. The distance between the VQ-based signatures x⃗_i and q⃗ can be expressed as

d_VQ(x⃗_i, q⃗) = Σ_{k=1}^{N_q} q_{t_k} · min_{1≤j≤N_{x_i}} D(t_k, n_{i,j}).    (9)

It can be shown that the distances in (5) and (9) are identical apart from a constant factor:

d_VQM(X_i, Q) = M · d_VQ(x⃗_i, q⃗).    (10)

The model-to-query distance (9) is the same for different model videos if their VQ-based signatures have the same nonzero dimensions. For our database of 38 000 videos, the percentage of video pairs (among the (38000 choose 2) pairs) that have the same nonzero
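The histogram-domain distance (9) and the symbol-domain distance (5) can be sketched together (an illustrative transcription with our own names); on inputs consistent with (6)–(8), the two agree up to the factor M, as (10) states:

```python
def d_vq(x_hist, q_hist, D):
    """Eq. (9): sum over nonzero query bins t_k of q_{t_k} times the
    minimum inter-symbol distance D(t_k, n) to any nonzero model bin n."""
    model_bins = [k for k, v in enumerate(x_hist) if v > 0]
    return sum(q * min(D[t][n] for n in model_bins)
               for t, q in enumerate(q_hist) if q > 0)

def d_vqm(model_syms, query_syms, D):
    """Eq. (5): sum over query symbols of the best inter-symbol
    distance to any model symbol."""
    return sum(min(D[s][k] for s in model_syms) for k in query_syms)
```

Iterating only over the nonzero bins is what makes (9) cheap: for a large codebook U, most bins of a signature histogram are empty.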