Two-tier image annotation model based on a multi-label classifier and fuzzy-knowledge representation scheme

Marina Ivasic-Kos (a), Miran Pobar (a), Slobodan Ribaric (b)
(a) Department of Informatics, University of Rijeka, Rijeka, Croatia
(b) Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia

Article history: Received 31 October 2014; revised 31 August 2015; accepted 22 October 2015; available online 6 November 2015.

Keywords: Image annotation; Knowledge representation; Inference algorithms; Fuzzy Petri Net; Multi-label image classification

Abstract: Automatic image annotation involves automatically assigning useful keywords to an unlabelled image. The major goal is to bridge the so-called semantic gap between the available image features and the keywords that people might use to annotate images. Although different people will most likely use different words to annotate the same image, most people can use object or scene labels when searching for images. We propose a two-tier annotation model where the first tier corresponds to object-level and the second tier to scene-level annotation. In the first tier, images are annotated with labels of the objects present in them, using multi-label classification methods on low-level features extracted from the images. Scene-level annotation is performed in the second tier, using the originally developed inference-based algorithms for annotation refinement and for scene recognition. These algorithms use a fuzzy knowledge representation scheme based on Fuzzy Petri Net, KRFPNs, that is defined to enable reasoning with concepts useful for image annotation. To define the elements of the KRFPNs scheme, novel data-driven algorithms for the acquisition of fuzzy knowledge are proposed. The proposed image annotation model is evaluated separately on the first and on the second tier using a dataset of outdoor images. The results outperform the published results obtained on the same image collection, both on object-level and on scene-level annotation. Different subsets of features composed of dominant colours, image moments, and GIST descriptors, as well as different classification methods (RAKEL, ML-kNN and Naive Bayes), were tested in the first tier. The results of scene-level annotation in the second tier are also compared with a common classification method (Naive Bayes) and have shown superior performance. The proposed model enables the expansion of image annotation with new concepts regardless of their level of abstraction.

1. Introduction

Image retrieval, search and organisation have become a problem on account of the huge number of images produced daily. In order to simplify these tasks, different approaches for image retrieval have been proposed that can be roughly divided into those that compare visual content (content-based image retrieval) and those that use text descriptions of images (text-based image retrieval) [27,7].

Image retrieval based on text has appeared to be easier, more natural and more suitable for people in most everyday cases. This is because it is much simpler to write a keyword-based query than to provide image examples, and it is likely that the user does not have an example image of the query. Besides, images corresponding to the same keywords can be very diverse. For example, if a person has an image of a town but wants a different view, content-based retrieval would not be the best choice because most of the results would be too similar to the image he already has. On the other hand, very diverse images can be retrieved with a keyword query; e.g. for the keyword Dubrovnik, different views of the town can be retrieved (Fig. 1).
To be able to retrieve images using text, they must be labelled or described in the surrounding text. However, most images are not. Since manually providing image annotation is a tedious and expensive task, especially when dealing with a large number of images, automatic image annotation has appeared as a solution.

Automatic annotation methods deal with visual features that can be extracted from the raw image data, such as colour, texture and structure, and can automatically assign metadata in the form of keywords from a controlled vocabulary to an unlabelled image. The major goal is to bridge the so-called semantic gap [12] between the available features and the keywords that could be useful to people. In [15], an image representation model is given that reflects the semantic levels of the words used in image annotation. This problem is challenging because different people will most likely annotate the same image with different words that reflect their knowledge of the context of the image, their experience and cultural background. However, most people searching for images use two types of labels: object and scene labels. Object labels correspond to objects that can be recognised in an image, like sky, trees, tracks and train for the image in Fig. 2a. Scene labels represent the context of the whole image, like SceneTrain or the more general Transportation for Fig. 2a. Scene labels can either be directly obtained as a result of the global classification of image features [23] or be inferred from object labels, as proposed in our approach. Since object and scene labels are most commonly used, in this paper we focus on automatic image annotation at scene and object levels.

We propose a two-tier annotation model for automatic image annotation, where the first tier corresponds to object annotation and the second tier to scene-level annotation. An overview of the proposed model is given after the sections with the related work, in Section 3. The first assumption is that there can be many objects in any image, but an image can be classified into one scene. The second assumption is that there are typical objects of which scenes are composed. Since many object labels can be assigned to an image, the object-level annotation was treated as a multi-label problem, and the appropriate multi-label classification methods RAKEL and ML-kNN were used. The details of object-level annotation are given in Section 4. The scene-level annotation relies on originally developed inference-based algorithms defined for a fuzzy knowledge representation scheme based on Fuzzy Petri Net, KRFPNs. The KRFPNs scheme and data-driven algorithms for knowledge acquisition about domain images are defined in Section 5. The inference approach for annotation refinement at the object level is given in Section 6. The conclusions about scenes are drawn using inference-based algorithms on facts about the image domain and the object labels obtained in the first tier, as detailed in Section 7.

The major contributions of this paper can be summarised as follows:
1. The adaptive two-tier annotation model in which each tier can be independently used and modified.
2. The first tier uses multi-label classification to suit image annotation at the object level.
3. The second tier uses inference-based algorithms to handle the recognition of scenes and higher-level concepts that do not have specific low-level features.
4. The definition of the knowledge representation scheme based on Fuzzy Petri Nets, to represent knowledge about domain images that is often incomplete, uncertain and ambiguous.
5. Novel data-driven algorithms for the acquisition of fuzzy knowledge about relations between objects and about scenes.
6. A novel inference-based algorithm for annotation refinement that reduces error propagation through the hierarchical structure of concepts during the inference of scenes.
7. A novel inference-based algorithm for automatic scene recognition.
8. A comparison of inference-based scene classification with an ordinary classification approach.

The performance of the proposed two-tier automatic annotation system was evaluated on outdoor images with regard to different feature subsets (dominant colours, moments, GIST descriptors [23]) and compared to the published results obtained on the same image database, as detailed in Section 8. The inference-based scene classification is also compared with an ordinary classification approach and has shown better performance. The paper ends with a conclusion and directions for future work, Section 8.

Fig. 1. Part of the image search results for the keyword "Dubrovnik".

Fig. 2. Examples of images and their annotation at object and scene levels: (a) object labels tracks, train, cloud, sky, trees and scene labels SceneTrain, Transportation; (b) object labels snow, polar bear and scene labels ScenePolarbear, WildLife, Arctic.

2. Related work

Automatic image annotation (AIA) has been an active research topic in recent years due to its potential impact on image retrieval, search, image interpretation and description. The AIA approaches proposed so far can be divided in various ways, e.g. according to the theory they are most related to [9] or according to the semantic level of the concepts used for annotation [30]. AIA approaches are most closely related to statistical theory and machine learning, or to logical reasoning and artificial intelligence, while semantic levels can be organised into flat or structured vocabularies.

Classical AIA approaches belonging to the field of machine learning look for a mapping between image features and concepts at the object or scene level. Classification and probabilistic modelling have been extensively used for this purpose. Methods based on classification, such as the one described in [17], treat each of the semantic keywords or concepts as an independent class and assign each of them to one classifier. Methods based on the translation model [10] and methods which use latent semantic analysis [22] fall into the category of probabilistic methods that aim to learn a relevance model representing the correlations between images and keywords. A recent survey of research in this field can be found in [35,7].

Lately, graph-based methods have been intensively investigated to apply logical reasoning to images, and many graph-based image analysis algorithms have proven successful. Conceptual graphs which encode the interpretation of the image are used for image annotation. Pan et al. [24] have proposed a graph-based method for image annotation in which images, annotations and regions are considered as three types of nodes of a mixed media graph. In [19], automatic image annotation is performed using two graph-based learning processes: in the first graph, the nodes are images and the edges are relations between the images, and in the second graph the nodes are words and the edges are relations between the words.
In [8], the authors aim to illustrate each of the concepts from the WordNet ontology with 500-1000 images in order to create a public image ontology called ImageNet. Within the aceMedia project, in [21], an ontology is combined with fuzzy logic to generate concepts from the beach domain. In [1], the same group of authors has used a combination of different classifiers for learning concepts and fuzzy spatial relationships. In [13], a framework based on Fuzzy Petri Nets is proposed for image annotation at the object level. In the fuzzy-knowledge base, the nodes are features and objects. Co-occurrence relations are defined between objects, and attribute relations are defined between objects and features.

The idea of introducing a fuzzy knowledge representation scheme into an image annotation system, proposed in our previous research [14,16], is further developed in this paper. New inference-based algorithms for scene classification and annotation refinement, as well as algorithms for data-driven knowledge acquisition, are formally defined and enhanced. As opposed to the previous work, where object-level classification was performed on segmented images, object-level annotation is treated here as a multi-label classification problem on the whole image.

Binder et al. [3] have proposed a method called Output Kernel Multi-Task Learning (MTL) to improve ranking performance. Zhang et al. [36] have defined a graph-based representation for loosely annotated images where each node is defined as a collection of discriminative image patches annotated with object category labels. The edge linking two nodes models the co-occurrence relationship among different objects in the same image.

3. Overview of the two-tier image annotation model

Images of outdoor scenes commonly contain one or more objects of interest, for instance person, boat, dog, bridge, and different kinds of background objects such as sky, grass, water, etc. However, people often think about these images as a whole, interpreting them as scenes, for example tennis match instead of person, court, racket, net, and ball. To make image annotation more useful for organising and retrieving images, it should contain both object and scene labels. Object labels correspond to classes whose instances can be recognised in an image. Scene labels are used to represent the context or semantics of the whole image, according to common sense and expert knowledge.

Fig. 3. Framework of a two-tier automatic image annotation system.

To deal with different levels of annotation with object and scene labels, a two-tier automatic image annotation model using a multi-label classifier and a fuzzy-knowledge representation scheme is proposed. In the first tier, multi-label classification is proposed since it can provide reliable recognition of specific objects without the need to include knowledge. However, higher-level concepts usually do not correspond to specific low-level features, so inference based on knowledge about the domain is more appropriate for them. The relationships between objects in a scene can be easily identified and are generally unchanging, so they can be used to infer scenes and more abstract concepts. Still, knowledge about objects and scenes is usually imprecise, somewhat ambiguous and incomplete. Therefore, fuzzy-knowledge representation is used in the second tier.
The architecture of the proposed system is depicted in Fig. 3. The input to the system is an unlabelled image, and the results of automatic annotation are object and scene labels. First, low-level features that represent the geometric and photometric properties of the image are extracted from each image. Each image is then represented by the m-component feature vector x = (x1, x2, ..., xm)^T. The obtained features are used for object classification. The assumption is that more than one object can be relevant for image annotation, so multi-label classification methods are used. The result of the image classification in the first tier is object labels from the set C of all object labels.

In the second tier, the object labels are used for scene-level classification. Scene-level classification is supported by the image-domain knowledge base, in which each scene is defined as a composition of typical object classes. Other concepts at different levels of abstraction can be inferred using the defined hierarchical relations among concepts. Relations among objects and scenes can also be used to check the consistency of the object labels and to discard those that do not fit the likely context. In this way, the propagation of errors through the hierarchical structure of concepts is reduced during the inference of scenes or more abstract concepts.

To represent fuzzy knowledge about the domain of interest, a knowledge-representation scheme based on the Fuzzy Petri Net formalism [26] is defined. The proposed scheme has the ability to cope with uncertain, imprecise and fuzzy knowledge about concepts and relations, and enables reasoning about them. The knowledge-representation scheme related to the image domain and the inference-based algorithms for annotation refinement and scene recognition are implemented as Java classes. The elements of the scheme are generated using the proposed data-driven algorithms for the acquisition of fuzzy knowledge, which are implemented in Matlab.
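The description above amounts to a two-stage pipeline. The following is a minimal Python sketch of that control flow only; all function names are illustrative placeholders, not the authors' implementation (which is split across Java and Matlab):

```python
# Minimal sketch of the two-tier annotation pipeline described above.
# All callables are hypothetical stand-ins for the components in Fig. 3.
from typing import Callable, Set, Tuple

def annotate_image(image,
                   extract_features: Callable,   # image -> m-component vector x
                   classify_objects: Callable,   # x -> {(object_label, confidence)}
                   refine_labels: Callable,      # discard labels that do not fit context
                   infer_scenes: Callable        # labels + knowledge base -> scene labels
                   ) -> Tuple[Set, Set]:
    x = extract_features(image)        # low-level features x = (x1, ..., xm)^T
    phi = classify_objects(x)          # first tier: multi-label classification, Phi(e)
    phi = refine_labels(phi)           # annotation refinement against the KB
    return phi, infer_scenes(phi)      # second tier: inference over the KRFPNs scheme
```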
4. Multi-label classification in the first tier

In the general case, an image can contain multiple objects in both the foreground and the background. All objects appearing in the image are potentially useful and hold important information for automatic image interpretation. To preserve as much information about the image as possible, many object labels are usually assigned to an image. Therefore, image annotation at the object level is an inherently multi-label classification problem. Here we have used different feature sets with appropriate multi-label classification methods.

4.1. Feature sets

The variety of perceptual and semantic information about objects in an outdoor image could be contained in global low-level features such as dominant colour, spatial structure, colour histogram and texture. For classification in the first tier, we used a feature set made up of dominant colours of the whole image (global dominant colours) and of parts of the image (region-based dominant colours), colour moments and the GIST descriptor. The feature set is constructed to represent the varied aspects of image representation, including colour and structure information.

The dominant colours were selected from the colour histogram of the image. The colour histogram was calculated for each of the RGB colour channels of the whole image. Next, the histogram bins with the highest values for each channel were selected as dominant colours. After experimenting with different numbers of colours (3-36) per channel, we chose to use 12 of them in each image as features (referred to as DC) for our classification tasks.

The information about the colour layout of an image was preserved using 5 local RGB histograms, from which dominant colours were extracted in the same manner as for the whole image. The local dominant colours are referred to as DC1 to DC5. To calculate the DC1, DC2 and DC3 local features, a histogram was computed for each cell of a 3x1 grid applied to each image. The DC4 feature was computed in the central part of the image, presumably containing the main image object, and the DC5 feature in the surrounding part that would probably contain the background, as shown in Fig. 4. The size of the central part was 1/4 of the diagonal size of the whole image, with the same proportions.

Fig. 4. Original images and the arrangement of a 3x1 image grid and the central and background regions from which the dominant-colour features were computed.

There are 12 dominant colours with 3 channels, giving a 36-dimensional feature vector of global DC. The size of the local dominant colours vector (DC1 to DC5) is 180 (12 dominant colours x 3 channels x 5 image parts). Additionally, we computed the colour moments (CM) for each RGB channel: mean, standard deviation, skewness and kurtosis. The size of the CM feature vector is 12 (4 colour moments for 3 channels).

The GIST image descriptor [23], which was proven to be efficient for scene recognition, was also used as a region-based feature. It is a structure-based image descriptor that refers to the dominant spatial structure of the image, characterised by the properties of its boundaries (e.g. the size, degree of openness, perspective) and its content (e.g. naturalness, roughness). The spatial properties are estimated using global features computed as a weighted combination of Gabor-like multi-scale oriented filters. In our case, we used 8x8 encoding samples in the GIST descriptor within 8 orientations per 8 scales of image components, so the GIST feature vector has 512 components.

We performed the classification tasks using all the extracted features (DC, DC1-DC5, CM, GIST). The size of the feature vector used for classification was 740. Since the size of the feature vector is large in proportion to the number of images, we also tested the classification performance using five subsets.
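As a rough illustration of the DC and CM features just described, the sketch below computes per-channel dominant colours as the highest histogram bins and the four colour moments. The bin count and the use of bin edges as colour values are assumptions; the local DC1-DC5 variants would apply the same dominant-colour routine to the grid cells and the central/background regions:

```python
import numpy as np

def dominant_colours(img, k=12, bins=256):
    """Top-k histogram bins per RGB channel -> 3*k dominant-colour values (global DC)."""
    feats = []
    for ch in range(3):
        hist, edges = np.histogram(img[..., ch], bins=bins, range=(0, 256))
        top = np.argsort(hist)[-k:]          # bins with the highest counts
        feats.extend(edges[top])             # left bin edges as representative colours
    return np.asarray(feats)                 # 36-D for k=12

def colour_moments(img):
    """Mean, standard deviation, skewness and kurtosis per channel -> 12-D CM vector."""
    feats = []
    for ch in range(3):
        x = img[..., ch].astype(float).ravel()
        mu, sd = x.mean(), x.std() + 1e-12   # eps guards a flat channel
        feats += [mu, sd,
                  np.mean((x - mu) ** 3) / sd ** 3,   # skewness
                  np.mean((x - mu) ** 4) / sd ** 4]   # kurtosis
    return np.asarray(feats)

img = (np.random.rand(120, 160, 3) * 255).astype(np.uint8)   # toy image
print(dominant_colours(img).shape, colour_moments(img).shape)  # (36,) (12,)
```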
4.2. Image annotation at object level

We attempt to label images with both foreground and background object labels, assuming that they are all useful for image annotation. Since an image may be associated with multiple object labels, the annotation at the object level in the first tier is treated as a multi-label classification problem. Multi-label image classification can be formally expressed as phi: E -> P(C), where E is a set of image examples, P(C) is the power set of the set of class labels C, and there is at least one example e_j in E that is mapped into two or more classes, i.e. there exists e_j in E such that |phi(e_j)| >= 2.

The methods most commonly used to tackle a multi-label classification problem can be divided into two different approaches [29]. In the first, the multi-label classification problem is transformed into several single-label classification problems [20], known as the data transformation approach. The aim is to transform the data so that any classification method designed for single-label classification can be applied. However, data transformation approaches usually do not exploit the patterns of labels that naturally occur in a multi-label setting. For example, dolphin usually appears together with sea, while observing the labels cheetah and fish together is much less common. On the other hand, algorithm adaptation methods extend specific learning algorithms in order to handle multi-label data directly, and can sometimes utilise the inherent co-occurrence of labels. Another issue in multi-label classification is the fact that some labels typically occur rarely in the data, and it may be difficult to train individual classifiers for those labels.

For the multi-label classification task, we used the Multi-label k-Nearest Neighbour (ML-kNN) algorithm [34], a lazy learning algorithm derived from the traditional kNN algorithm, and RAKEL (RAndom k-labELsets) [28], an example of a data adaptation method. The kNN classifier is a nonlinear and adaptive algorithm that is rather well suited for rarely occurring labels. The RAKEL algorithm considers a small random subset of labels and uses a single-label classifier for the prediction of each element in the power set of this subset. In this way, the algorithm aims to take label correlations into account using single-label classifiers. In our case, the kNN classifier and the C4.5 classification tree are used as base single-label classifiers for RAKEL. The base classifiers are applied to subtasks with a manageable number of labels. It was experimentally shown that for our task the best results are obtained using RAKEL with C4.5 as the base classifier.

We also used the Naive Bayes (NB) classifier along with data transformation. The data were transformed so that each instance with multiple labels was replaced with elements of the binary relation rho, a subset of E x C, between a set of samples E and a set of class labels C. An ordered pair (e, c) in E x C can be interpreted as "e is classified into c" and is often written as e rho c. For example, if an image example e15 in E was annotated with the labels phi(e15) = {lion, sky, grass}, with lion, sky, grass in C, it was transformed into three single-label instances: e15 rho lion, e15 rho sky, e15 rho grass.

With NB, a model is learned for each label separately, and the result is the union of all labels obtained from binary classification in a one-vs-all fashion; with RAKEL, models are learned for small subsets of labels. Unlike NB, with RAKEL and ML-kNN a model is learned for small subsets of labels, and therefore only such subsets can be obtained as classification results. In both cases, incorrect labels can also be assigned to an image along with correct ones: if they are in the assigned subset, in the RAKEL and kNN case, or if they are the result of a binary misclassification, in the NB case. In the one-vs-all case the inherent correlations among labels are not used, but an annotation refinement post-processing step is proposed here that incorporates label correlations and excludes from the results labels that do not normally occur together with the other labels in images.

The object labels and the corresponding confidences obtained from the multi-label classification in the first tier of the image annotation system are used as features for the recognition of scenes in the second tier. When the Naive Bayes classifier is used, the confidence of each object is set according to the posterior probability of that object class. Otherwise, when the classifier used in the first tier cannot provide information on the confidence of the classification, e.g. when the ML-kNN classifier is used, the confidence value is set to 1. The object labels from the annotation set phi(e), a subset of C, and the corresponding confidence values make up the set Phi(e), a subset of phi(e) x [0, 1], that is the input to the inference-based algorithm for scene recognition in the second tier.
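The binary-relation transformation and the one-vs-all Naive Bayes step described above can be reproduced with standard tooling. A hedged sketch using scikit-learn follows (this is not the authors' code; the toy data and the 0.5 decision threshold are assumptions):

```python
# One-vs-all Naive Bayes over the binary-relation transformation:
# MultiLabelBinarizer turns each label set phi(e) into binary targets (e rho c),
# and OneVsRestClassifier fits one NB model per label.
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import MultiLabelBinarizer

X_train = np.random.rand(100, 740)                     # toy 740-D feature vectors
Y_sets = [{"lion", "sky", "grass"}] * 50 + [{"sea", "dolphin"}] * 50

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(Y_sets)                          # e rho c -> indicator matrix

clf = OneVsRestClassifier(GaussianNB()).fit(X_train, Y)
probs = clf.predict_proba(X_train[:1])[0]              # per-label posteriors = confidences
labels = mlb.classes_[probs > 0.5]                     # Phi(e): labels kept with confidences
print(list(zip(mlb.classes_, probs.round(2))))
```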
5. Knowledge representation in the second tier

The assumption is that there are typical objects of which scenes are composed, so each scene is treated as a composition of objects selected as typical on the basis of the training dataset used. To enable making inferences about scenes, the relations between objects and scenes and the relations between objects in an image are modelled and stored in a knowledge base. Facts about scenes and objects are collected automatically from the training dataset and organised in a knowledge representation scheme. This is a faster and more effective way of collecting specific knowledge than manually building the knowledge base, since this is a broad, uncertain domain of knowledge in which expertise is not clearly defined. An expert only defines facts related to general knowledge, like hierarchical relationships between concepts, so the need for manual definition of facts is reduced drastically. However, an expert can still review the generated facts and modify them if necessary.

Due to the inductive nature of fact generation, the acquired knowledge is often incomplete and unreliable, so the ability to draw conclusions about scenes from imprecise, vague knowledge becomes necessary. Additionally, the inference of an unknown scene relies on facts about objects that are obtained as classification results in the first tier and that are subject to errors. Apart from using a knowledge representation scheme that allows the representation of fuzzy knowledge and can handle uncertain reasoning, an algorithm for object annotation refinement is proposed to detect and discard those objects that do not fit the context, and so to prevent error propagation.

For the purpose of knowledge representation, a scheme based on the KRFPN formalism [26], which is in turn based on the Fuzzy Petri Net, is defined to represent knowledge about outdoor images and to support the subsequent reasoning about scenes in the second tier of the system. The definition of the fuzzy knowledge representation scheme adapted for automatic image annotation, named KRFPNs, and the novel data-driven algorithms for the automatic acquisition of knowledge about objects, scenes, and the relations between them, are detailed below.

5.1. Definition of the KRFPNs scheme adapted for image annotation

The elements of the knowledge base used for the annotation of outdoor images, particularly fuzzy knowledge about scenes and relationships among objects, are presented using the KRFPNs scheme. The KRFPNs scheme concatenates elements of the Fuzzy Petri Net (FPN) with a semantic interpretation and is defined as:

KRFPNs = (FPN, alpha^, beta^, D, Sigma).   (1)

FPN = (P, T, I, O, M, Omega, f, c) is a Fuzzy Petri Net, where P = {p1, p2, ..., pn}, n in N, is a finite set of places; T = {t1, t2, ..., tm}, m in N, is a finite set of transitions; I: T -> P(P)\{0} is the input function and O: T -> P(P)\{0} is the output function, where P(P) is the power set of places. The sets of places and transitions are disjoint, i.e. the intersection of P and T is empty, and the link between places and transitions is given by the input and output functions. The elements P, T, I, O are parts of an ordinary Petri Net (PN). M = {m1, m2, ..., mr}, r >= 0, is a set of tokens used to define the state of a PN in discrete steps. The state of the net in step w is defined by the distribution of tokens Omega_w: P -> P(M), where P(M) is the power set of M. The initial distribution of tokens is denoted Omega_0, and in our case a place in the initial distribution can hold at most one token, i.e. |Omega_0(p)| is 0 or 1.
A place p is marked in step w if Omega_w(p) is non-empty. Marked places are important for the firing of transitions and the execution of the net. Firing of the transitions changes the distribution of tokens and causes the change of the net state.

Fuzziness in the scheme is implemented with the functions c: M -> [0, 1] and f: T -> [0, 1], which associate tokens and transitions with truth values in the range [0, 1], where zero means "not true" and one "always true".

Semantic interpretation of places and transitions is implemented in the scheme with the functions alpha^ and beta^, in order to link elements of the scheme with concepts related to image annotation. The function alpha^: P -> D x [0, 1] maps a place p_i in P to a concept and its corresponding confidence value (d_j, c(d_j)), where d_j in D and c(d_j) = max c(m_l) over m_l in Omega(p_i). If there are no tokens in the place, the token value is zero. An alternative function that differs in the codomain is defined, alpha: P -> D. The function alpha is bijective, so its inverse alpha^(-1): D -> P exists. These functions are useful for mapping a place p_i in P in the scheme to a concept d_j and vice versa, when no information about the confidence of a concept is needed.

The set of concepts D contains the object and scene labels used for the annotation of outdoor images, D = C union SC. The set C contains object labels, such as {Airplane, Train, Shuttle, Ground, Cloud, Sky, Coral, Dolphin, Bird, Lion, Mountain} as a subset of C, and the set SC contains scene labels at different levels of abstraction, SC = {SceneAirplane, SceneBear, ..., Seaside, Space, Undersea}.

The function beta^: T -> Sigma x [0, 1], a subset of D x D x [0, 1], maps a transition t_j in T to a pair (d_pk sigma_i d_pl, c_j), where sigma_i in Sigma; d_pk, d_pl in D; alpha^(-1)(d_pk) = p_k; alpha^(-1)(d_pl) = p_l; p_k in I(t_j); p_l in O(t_j); and c_j = f(t_j). The first element is the relation between the concepts linked with the input and output places of the transition t_j, and the second is its confidence value. The function beta: T -> Sigma, which differs in the codomain, is also defined and is useful for checking whether there is an association between a transition and a relation.

The set Sigma contains the relations and is defined as Sigma = {occurs_with, is_part_of, is_a}. The relation sigma_1 = occurs_with is a pseudo-spatial relation defined between objects; it models the joint occurrence of objects in images. The compositional relation sigma_2 = is_part_of is defined between scenes and objects, reflecting the assumption that objects are parts of a scene. The hierarchical relation sigma_3 = is_a is defined between scenes, to model a taxonomic hierarchy of scenes. The confidence values of the occurs_with and is_part_of relations are defined according to the training dataset used, while in the case of sigma_3 = is_a they are defined by an expert.
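For concreteness, a minimal data-structure reading of the scheme just defined is sketched below. The authors implemented the scheme as Java classes, so this Python version is illustrative only, and the example confidences are invented:

```python
# Minimal sketch of KRFPNs elements: places carry concepts (alpha), transitions
# carry relations with confidences (beta^, c_j = f(t_j)), tokens carry truth values.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Place:                 # p_i, mapped by alpha to a concept d_j in D = C u SC
    concept: str

@dataclass(frozen=True)
class Transition:            # t_j, mapped by beta^ to (d_pk sigma_i d_pl, c_j)
    src: Place               # input place p_k in I(t_j)
    relation: str            # sigma_i in {"occurs_with", "is_part_of", "is_a"}
    dst: Place               # output place p_l in O(t_j)
    confidence: float        # c_j = f(t_j) in [0, 1]

@dataclass
class KRFPNs:
    places: set = field(default_factory=set)
    transitions: list = field(default_factory=list)
    tokens: dict = field(default_factory=dict)   # Place -> token truth value c(m)

kb = KRFPNs()
sky, scene_train = Place("Sky"), Place("SceneTrain")
kb.places |= {sky, scene_train}
kb.transitions.append(Transition(sky, "is_part_of", scene_train, 0.8))  # assumed value
kb.tokens[sky] = 1.0   # initial marking Omega_0: at most one token per place
```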
5.2. Data-driven knowledge acquisition and automatic definition of the KRFPNs scheme

The process of knowledge acquisition related to annotation refinement and scene recognition is supported by algorithms that enable the automatic extraction of knowledge from the existing data in the training set and the arrangement of the acquired knowledge in the knowledge representation scheme. Automated knowledge acquisition is a fast and economic way of collecting knowledge about the domain and can be used to estimate the confidence values of the acquired facts, especially when the domain knowledge is broad by nature, as is the case with image annotation. In that case, experts are not able to provide a comprehensive overview of the possible relationships between objects in different scenes or to objectively evaluate their reliability. Automatically accumulated knowledge is very dependent on the examples in the dataset, so to make it more general the estimated confidences of facts are further adjusted. In addition, to improve the consistency, completeness and accuracy of the knowledge base, and to enhance the inference process, an expert can review or modify the existing facts and rules or add new ones.
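The excerpt ends before the acquisition algorithms themselves are given. As a hedged illustration of the kind of data-driven estimation described, the sketch below derives occurs_with confidences from label co-occurrence counts in a training set; the normalisation used here is a plausible choice, not the paper's formula:

```python
# Estimate occurs_with confidences from per-image label sets: the confidence of
# "a occurs_with b" is the co-occurrence count of (a, b) normalised by the
# number of images containing a (an assumed normalisation).
from collections import Counter
from itertools import combinations

def occurs_with_confidences(annotations):
    """annotations: iterable of per-image object-label sets from training data."""
    single, joint = Counter(), Counter()
    for labels in annotations:
        single.update(labels)
        joint.update(frozenset(p) for p in combinations(sorted(labels), 2))
    conf = {}
    for pair, n_ab in joint.items():
        a, b = tuple(pair)
        conf[(a, b)] = n_ab / single[a]
        conf[(b, a)] = n_ab / single[b]
    return conf

conf = occurs_with_confidences([{"train", "tracks", "sky"},
                                {"train", "sky"},
                                {"lion", "grass"}])
print(conf[("tracks", "train")])   # 1.0: tracks always co-occur with train here
```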
Fine-to-Coarse Reconstruction Algorithm: Overview and Explanation

1. Introduction

1.1 Overview: In the field of computer vision, image reconstruction is an important task whose goal is to generate a high-quality, high-resolution image from a low-resolution input image. The Fine-to-Coarse Reconstruction algorithm is a commonly used image reconstruction algorithm: it rebuilds the image from coarse to fine by progressively increasing its resolution level, in order to obtain a sharper image with richer detail. The algorithm is widely applied in image processing and computer vision and can effectively improve image quality and the degree to which detail is recovered. This article introduces the principle, applications and advantages of the Fine-to-Coarse Reconstruction algorithm in detail, in the hope of giving readers guidance for understanding and applying it; a brief code sketch of the multilevel idea follows.
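The overview above fixes no concrete implementation, so the following is only a minimal sketch of the multilevel idea: upsample level by level with a per-level refinement hook. The 2x factor, the cubic interpolation and the refinement placeholder are all assumptions:

```python
# Minimal multilevel (coarse-to-fine) reconstruction sketch: each level doubles
# the resolution and then applies a refinement step (identity placeholder here).
import numpy as np
from scipy.ndimage import zoom

def multilevel_reconstruct(low_res, levels=3, refine=lambda img: img):
    """Upsample by 2x per level, refining detail at each resolution level."""
    img = low_res.astype(float)
    for _ in range(levels):
        img = zoom(img, 2, order=3)   # cubic-spline upsampling to the next level
        img = refine(img)             # hypothetical per-level detail refinement
    return np.clip(img, 0, 255)

hr = multilevel_reconstruct(np.random.rand(32, 32) * 255, levels=2)
print(hr.shape)                       # (128, 128)
```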
1.2 Structure of the article: This article consists of three parts: an introduction, a main body and a conclusion. In the introduction we give an overview of the Fine-to-Coarse Reconstruction algorithm and describe the structure and purpose of the article. In the main body we present the principle of the Fine-to-Coarse Reconstruction algorithm in detail, together with its behaviour in practical applications; we focus on its uses in image processing, computer vision and related fields, and discuss its strengths and limitations. Finally, in the conclusion we summarise the whole article, look ahead to future directions for the development of the Fine-to-Coarse Reconstruction algorithm, and close with some reflections and concluding remarks. The structure of the article is clear and well layered, and should help readers gain a comprehensive understanding of the importance and value of the Fine-to-Coarse Reconstruction algorithm.
1.3 Purpose: The purpose of the Fine-to-Coarse Reconstruction algorithm is to reconstruct images or models efficiently through a stepwise process that builds from detail up to the whole. Through stepwise iteration, the algorithm can preserve detail while improving the speed and accuracy of the reconstruction. This article aims to explore the principle, applications and advantages of the Fine-to-Coarse Reconstruction algorithm in depth, in the hope of offering inspiration and help for related research and applications.
(19) State Intellectual Property Office of the People's Republic of China
(12) Invention Patent Application
(10) Publication No.: CN 110009568 A
(43) Publication Date: 2019.07.12
(21) Application No.: 201910286781.8
(22) Filing Date: 2019.04.10
(71) Applicant: 大连民族大学, No. 18 Liaohe West Road, Economic and Technological Development Zone, Dalian, Liaoning 116600
(72) Inventors: 吴宝春, 郑蕊蕊, 毕佳晶, 贺建军, 辛守宇
(74) Patent Agency: 大连智高专利事务所 (特殊普通合伙), 21235; Agent: 刘斌
(51) Int. Cl.: G06T 3/40 (2006.01)
(54) Title: Generator construction method for super-resolution reconstruction of Manchu-script images
(57) Abstract: This invention belongs to the field of computer image processing. To solve the problem of constructing a generator for super-resolution reconstruction of low-resolution Manchu-script images, the generator is built from 5 identically structured residual blocks and 2 sub-pixel convolution layers. It can learn the mapping between low- and high-resolution Manchu images, and can therefore perform super-resolution reconstruction of low-resolution Manchu images.
Claims: 1 page; Description: 7 pages; Figures: 1 page

Claims (CN 110009568 A, page 1/1)

1. A generator construction method for super-resolution reconstruction of Manchu-script images, characterised in that the generator is built from 5 identically structured residual blocks and 2 sub-pixel convolution layers.
2. The generator construction method of claim 1, characterised in that the generator structure is:
Operation 1 is the Input layer; its input is a low-resolution RGB three-channel image from the training data.
Operation 2 is the G-Conv-1 convolution layer, with a 9 pixel x 9 pixel kernel, stride 1 pixel, and 64 filters.
Operation 3 is a PReLU layer, which applies a nonlinear transformation to the signal coming from G-Conv-1.
Operations 4 to 8 are residual blocks: five identically structured Residual blocks used to extract feature information from the low-resolution image.
Operation 9 comprises the G-Conv-2 convolution layer, a BN operation and a Sum operation, where G-Conv-2 has a 3 pixel x 3 pixel kernel, stride 1 pixel and 64 filters, BN denotes batch normalisation, and Sum denotes summing the outputs.
Operation 10 comprises the G-Conv-3 convolution layer, Sub-Pixel CN sub-pixel convolution layers and a PReLU layer, where G-Conv-3 has a 3 pixel x 3 pixel kernel, stride 1 pixel and 256 filters; the Sub-Pixel CN has 2 layers and reassembles the extracted low-resolution image features to generate a high-resolution image, and the PReLU layer applies a nonlinear transformation to the signal from the previous layer.
Operation 11 is the G-Conv-4 convolution layer, with a 9 pixel x 9 pixel kernel, stride 1 pixel and 3 filters.
Operation 12 is the Output layer.
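Read literally, claims 1-2 describe an SRGAN-style generator. Below is a hedged PyTorch sketch of that structure. The internals of the residual blocks are not spelled out in the claims quoted here, so a conv-BN-PReLU-conv-BN block with a skip connection is assumed, as is the placement of the two PixelShuffle steps:

```python
# Hedged PyTorch sketch of the generator in claims 1-2 (not the patent's code).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):          # assumed internals (SRGAN-style)
    def __init__(self, ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, 1, 1), nn.BatchNorm2d(ch), nn.PReLU(),
            nn.Conv2d(ch, ch, 3, 1, 1), nn.BatchNorm2d(ch))
    def forward(self, x):
        return x + self.body(x)

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(3, 64, 9, 1, 4), nn.PReLU())   # ops 1-3
        self.blocks = nn.Sequential(*[ResidualBlock() for _ in range(5)])  # ops 4-8
        self.trunk = nn.Sequential(nn.Conv2d(64, 64, 3, 1, 1),
                                   nn.BatchNorm2d(64))                     # op 9 (Sum below)
        self.up = nn.Sequential(nn.Conv2d(64, 256, 3, 1, 1),               # op 10
                                nn.PixelShuffle(2), nn.PReLU(),
                                nn.PixelShuffle(2), nn.PReLU())
        self.tail = nn.Conv2d(16, 3, 9, 1, 4)                              # ops 11-12
    def forward(self, x):
        h = self.head(x)
        h = h + self.trunk(self.blocks(h))   # "Sum": skip over the residual stack
        return self.tail(self.up(h))

print(Generator()(torch.randn(1, 3, 24, 24)).shape)   # 4x upsampling: (1, 3, 96, 96)
```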
Reflection Detection in Image Sequences

Mohamed Abdelaziz Ahmed, Francois Pitie, Anil Kokaram
Sigmedia, Electronic and Electrical Engineering Department, Trinity College Dublin

Abstract: Reflections in image sequences consist of several layers superimposed over each other. This phenomenon causes many image processing techniques to fail, as they assume the presence of only one layer at each examined site, e.g. motion estimation and object recognition. This work presents an automated technique for detecting reflections in image sequences by analyzing the motion trajectories of feature points. It models reflection as regions containing two different layers moving over each other. We present a strong detector based on combining a set of weak detectors. We use novel priors, generate sparse and dense detection maps, and our results show a high detection rate with rejection of pathological motion and occlusion.

1. Introduction

Reflections are often the result of superimposing different layers over each other (see Fig. 1, 2, 4, 5). They mainly occur when photographing objects situated behind a semi-reflective medium (e.g. a glass window). As a result, the captured image is a mixture of the reflecting surface (background layer) and the reflected image (foreground). When viewed from a moving camera, two different layers moving over each other in different directions are observed. This phenomenon violates many of the existing models for video sequences and hence causes many consumer video applications to fail, e.g. slow-motion effects, motion-based sports summarization and so on. This calls for an automated technique that detects reflections and assigns a different treatment to them.

Detecting reflections requires analyzing data for specific reflection characteristics. However, as reflections can arise by mixing any two images, they come in many shapes and colors (Fig. 1, 2, 4, 5). This makes extracting characteristics specific to reflections a non-trivial task. Furthermore, one should be careful when using the motion information of reflections, as there is a high probability of motion estimation failure. For these reasons the problem of reflection detection is hard, and it had not been examined before.

Reflection can be detected by examining the possibility of decomposing an image into two different layers. A lot of work exists on separating mixtures of semi-transparent layers [17,11,12,7,4,1,13,3,2]. Nevertheless, most of the still-image techniques [11,4,1,3,2] require two mixtures of the same layers under two different mixing conditions, while the video techniques [17,12,13] assume a simple rigid motion for the background [17,13] or a repetitive one [12]. These assumptions are hardly valid for reflections in moving image sequences.

This paper presents an automated technique for detecting reflections in image sequences. It is based on analyzing the spatio-temporal profiles of feature point trajectories. This work focuses on analyzing three main features of reflections: 1) the ability to decompose an image into two independent layers; 2) image sharpness; 3) the temporal behavior of image patches. Several weak detectors based on analyzing these features through different measures are proposed. A final strong detector is generated by combining the weak detectors. The problem is formulated within a Bayesian framework, and priors are defined in a way that rejects false alarms. Several sequences are processed, and the results show a high detection rate with rejection of complicated motion patterns, e.g. blur, occlusion, fast motion.

Aspects of novelty in this paper include:
1) A technique for decomposing a color still image containing reflection into two images containing the structures of the source layers. We do not claim that this technique could be used to fully remove reflections from videos. What we claim is that the extracted layers can be useful for reflection detection, since on a block basis reflection is reduced. This technique cannot compete with state-of-the-art separation techniques. However, we use it because it works on single frames and thus does not require motion, which is not the case with any existing separation technique. 2) Diagnostic tools for reflection detection based on analyzing feature point trajectories. 3) A scheme for combining weak detectors into one strong reflection detector using Adaboost. 4) Incorporation of priors which reject spatially and temporally impulsive detections. 5) The generation of dense detection maps from sparse detections, using thresholding by hysteresis to avoid selecting particular thresholds for the system parameters. 6) Use of the generated maps to perform better frame rate conversion in regions of reflection. Frame rate conversion is a computer vision application that is widely used in the post-production industry.

Fig. 1. Examples of different reflections (shown in green). Reflection is the result of superimposing different layers over each other; as a result, reflections have a wide range of colors and shapes.

In the next section we present a review of the relevant techniques for layer separation. In Section 3 we propose our layer separation technique. We then propose our Bayesian framework, followed by the results section.

2. Review of Layer Separation Techniques

A mixed image M is modeled as a linear combination of the source layers L1 and L2 according to the mixing parameters (a, b) as follows:

M = a*L1 + b*L2.   (1)

Layer separation techniques attempt to decompose the reflection M into two independent layers. They do so by exchanging information between the source layers (L1 and L2) until their mutual independence is maximized. This, however, requires the presence of two mixtures of the same layers under two different mixing proportions [11,4,1,3,2]. Different separation techniques express the mutual layer independence in different forms. Current forms include minimizing the number of corners in the separated layers [7] and minimizing the grayscale correlation between the layers [11].

Other techniques [17,12,13] avoid the requirement of two mixtures of the same layers by using temporal information. However, they often require a static background throughout the whole image sequence [17], constrain both layers to be of non-varying content through time [13], or require the presence of repetitive dynamic motion in one of the layers [12]. Yair Weiss [17] developed a technique which estimates the intrinsic image (static background) of an image sequence. Gradients of the intrinsic layer are calculated by temporally filtering the gradient field of the sequence. Filtering is performed in the horizontal and vertical directions, and the generated gradients are used to reconstruct the rest of the background image.

3. Layer Separation Using Color Independence

The source layers of a reflection M are usually color independent. We noticed that the red and blue channels of M are the two most uncorrelated RGB channels. Each of these channels is usually dominated by one layer. Hence the source layers (L1, L2) can be estimated by exchanging information between the red and blue channels until the mutual independence between both channels is maximized.
Information exchange for layer separation was first introduced by Sarel et al. [12] and is reformulated for our problem as follows:

L1 = M_R - alpha*M_B
L2 = M_B - beta*M_R.   (2)

Here (M_R, M_B) are the red and blue channels of the mixture M, while (alpha, beta) are separation parameters to be calculated. An exhaustive search for (alpha, beta) is performed. Motivated by the work of Levin et al. on layer separation [7], the best separated layer is selected as the one with the lowest cornerness value. The Harris cornerness operator is used here. A minimum texture is imposed on the separated layers by discarding layers with a variance less than T_x. For an 8-bit image, T_x is set to 2. Removing this constraint can generate empty, meaningless layers. The novelty in this layer separation technique is that, unlike previous techniques [11,4,1,3,2], it only requires one image.

Fig. 2 shows separation results generated by the proposed technique for different images. The results show that our technique reduces reflections and shadows. The results are displayed only to illustrate a preprocessing step that is used for one of our reflection measures, not to illustrate full reflection removal. Blocky artifacts are due to processing images in 50x50 blocks. These artifacts are irrelevant to reflection detection.

Fig. 2. Reducing reflections/shadows using the proposed layer separation technique (panels (a)-(f)). The color images are the originals with reflections/shadows (shown in green). The uncolored images represent one source layer (calculated by our technique) with reflections/shadows reduced. In (e) the reflection remains apparent, but the person in the car is fully removed.
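The search described above is straightforward to sketch. The following hedged Python/OpenCV version scans a grid of (alpha, beta), applies the minimum-texture constraint, and keeps the pair whose layers have the lowest total Harris cornerness. The grid range/step and the summing of both layers' cornerness are assumptions:

```python
# Exhaustive (alpha, beta) search for Eq. (2), scored by Harris cornerness.
import numpy as np
import cv2

def separate_layers(M, steps=21, t_x=2.0):
    """M: HxWx3 uint8 mixture (RGB order assumed). Returns the best (L1, L2)."""
    MR = M[..., 0].astype(np.float32)   # red channel
    MB = M[..., 2].astype(np.float32)   # blue channel
    best, best_corners = None, np.inf
    for a in np.linspace(0, 1, steps):          # assumed search grid
        for b in np.linspace(0, 1, steps):
            L1, L2 = MR - a * MB, MB - b * MR   # Eq. (2)
            if L1.var() < t_x or L2.var() < t_x:
                continue                         # minimum-texture constraint, T_x = 2
            corners = sum(cv2.cornerHarris(L.astype(np.float32), 2, 3, 0.04)
                          .clip(min=0).sum() for L in (L1, L2))
            if corners < best_corners:           # fewest corners = cleanest separation
                best, best_corners = (L1, L2), corners
    return best
```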
reflection are defined as ones with higher temporal discontinuity before separation than after separation.Temporal discontinuity is measured using struc-ture similarity index SSIM[16]as followsD1i n=max(SS(G i n,G i n−1),SS(G i n,G i n+1))−max(SS(L i n,L i n−1),SS(L i n,L i n+1))SS(L i n,L i n−1)=max(SS(L1i n,L1i n−1),SS(L2i n,L2i n−1))) SS(L i n,L i n+1)=max(SS(L1i n,L1i n+1),SS(L2i n,L2i n+1)) Here G=0.1F R+0.7F G+0.2F B where(F R,F G,F B) are the red,green and blue components of F respectively. SS(G i n,G i n−1)denotes the structure similarity between the two images F i n and F i n−1.We only compare the structures of(G i n,G i n−1)by turning off the luminance component of SSIM[16].SS(.,.)returns an a value between0−1where 1denotes identical similarity.Reflection is detected if D1i n is less than T1.Intrinsic Layer Extraction D2:Let INTR i denote the intrinsic(reflectance)image extracted by processing the 50×50i th track using Yair technique[17].In case of re-flection the structure similarity between the observed mix-ture F i n and INTR i should be low.Therefore,F i n isflagged as containing reflection if SS(F i n,INTR i)is less than T2.Color Channels Independence D3:This approach measures the Generalized Normalized Cross Correlation (GNGC)[11]between the red and blue channels of the ex-amined patch F i n to infer whether the patch is a mixture between two different layers or not.GNGC takes values between0and1where1denotes perfect match between the red and blue channels(M R and M B respectively).This analysis is applied to every image patch F i n and reflection is detected if GNGC(M R,M B)<T3.4.3.Image Sharpness Likelihood:D4,D5Two approaches for analyzing image sharpness are used. Thefirst,D4,estimates thefirst order derivatives for the examined patch F i n andflags it as containing reflection if the mean of the gradient magnitude within the examined patch is smaller than a threshold T4.The second approach, D5,uses the sharpness metric of Ferzil et.al.[5]andflagsa patch as reflection if its sharpness value is less than T5.4.4.Temporal Discontinuity LikelihoodSIFT Temporal Profile D6:This detectorflags the ex-amined patch F i n as reflection if its SIFT features[8]are undergoing high temporal mismatch.A vector p=[x s g]is assigned to every interest point in F i n.The vector contains the position of the point x=(x,y),scale and dominate ori-entation from the SIFT descriptor,s=(δ,o),and the128 point SIFT descriptor g.Interest points are matched with neighboring frames using[8].F i n isflagged as reflection if the average distance between the matched vectors p is larger than T6.Color Temporal Profile D7:This detectorflags the im-age patch F i n as reflection if its grayscale profile does not change smoothly through time.The temporal change in color is defined as followsD7i n=min( C i n−C i n−1 , C i n−C i n+1 )(4) Here C i n is the mean value for G i n,the grayscale representa-tion of F i n.F i n isflagged as reflection if D7i n>T7.AutoCorrelation Temporal Profile D8:This detector flags the image patch F i n as reflection if its autocorrelation is undergoing large temporal change.The temporal change in the autocorrelation is defined as followsD8i n=min(1NA i n−A i n−1 2,1NA i n−A i n+1 2)(5)A i n is a vector containing the autocorrelation of G i n while N is the number of pels in A i n.F i n isflagged as reflection if D8i n is bigger than T8.Motion Field Divergence D9:D9for the examined patch F i n is defined as followsD9i n=DFD( div(d(n)) + div(d(n+1)) )/2(6) DFD and div(d(n))are the 
Displaced Frame Difference and Motion Field Divergence for F i n.d(n)is the2D motion vector calculated using block matching.DFD is set to the minimum of the forward and backward DFDs.div(d(n)) is set to the minimum of the forward and backward di-vergence.The divergence is averaged over blocks of two frames to reduce the effect of possible motion blur gener-ated by unsteady camera motion.F i n isflagged as reflection if D9>T9.4.5.Solving for l in4.5.1Maximum Likelihood(ML)SolutionThe likelihood is factorized as followsP(F|l)=P(l|D1)P(l|D2−8)P(l|D9)(7)Thefirst and last terms are solved using D1<T1and D9>T9respectively.D2−8are used to form one strong detector D s and P(l|D2−8)is solved by D s>T s.We found that not including(D1,D9)in D s generates better de-tection results than when included.Feature analysis of each detector are averaged over a block of three frames to gen-erate temporally consistent detections.T9isfixed to10in all experiments.In Sec.4.5.2we avoid selecting particular thresholds for(T1,T s)by imposing spatial and temporal priors on the generated maps.Calculating D s:The strong detector D s is expressed as a linear combination of weak detectors operating at different thresholds T as followsP(l|D2−8)=Mk=1W(V(k),T)P(D V(k)|T)(8)False Alarm RateC o r r e c tD e t e c t i o n R a t eFigure 3.ROC for D 1−9and D s .The Adaboost detector D s out-performs all other techniques and D 1is the second best in the range of false alarms <0.1.Here M is the number of weak detectors (fixed to 20)used in forming D s and V (k )is a function which returns a value between 2-8to indicate which detectors from D 2−8are used.k indexes the weak detectors in order of their impor-tance as defined by the weights W .W and T are learned through Adaboost [15](see Tab.1).Our training set consist of 89393images of size 50×50pels.Reflection is modeled in 35966images each being a synthetic mixture between two different images.Fig.3shows the the Receiver Operating Characteristic (ROC)of applying D 1−9and D s on the training samples.D s outperforms all the other detectors due to its higher cor-rect detection rate and lower false alarms.D 6D 8D 5D 3D 2D 4D 7W 1.310.960.480.520.330.320.26T0.296.76e −60.040.950.6172.17Table 1.Weights W and operating thresholds T for the best seven detectors selected by Adaboost.4.5.2Successive Refinement for Maximum A-Posteriori (MAP)The prior P (l |l N )of Eq.3imposes spatial and temporal smoothness on detection masks.We create a MAP estimate by refining the sparse maps from the previous ML steps.We first refine the labeling of all the existing feature points P in each image and then use the overlapping 50×50patches around the refined labeled points as a dense pixel map.ML Refinement:First we reject false detections from ML which are spatially inconsistent.Every feature point l =1is considered and the sum of the geodesic distance from that site to the two closest neighbors which are labeledl =1is measured.When that distance is more than 0.005then that decision is rejected i.e.we set l =0.Geodesic distances allow the nature of the image material between point to be taken in to account more effectively and have been in use for some time now [10].To reduce the compu-tational load of this step,we downsample the image mas-sively by 50in both directions.This retains gross image topology only.Spatio-Temporal Dilation:Labels are extended in space and time to other feature points along their trajecto-ries.If l in =1,all feature points lying along the track i are set to l =1.In addition,l is 
extended to all image patches (F n )overlapping spatially with the examined patch.This generates a denser representation of the detection masks.We call this step ML-Denser.Hysteresis:We can avoid selecting particular thresholds [T 1,T s ]for BIRD by applying Hysteresis using a set of dif-ferent thresholds.Let T H =[−0.4,5]and T L =[0,3]de-note a high and low configuration for [T 1,T s ].Detection starts by examining ML-Denser at high thresholds.High thresholds generate detected points P h with high confi-dence.Points within a small geodesic distance (<D geo )and small euclidean distance (<D euc )to each other are grouped together.Here we use (D geo ,D euc )=(0.0025,4)and resize the examined frames as mentioned previously.The centroids of each group is then calculated.Thresholds are lowered and a new detection point is added to an exist-ing group if it is within D geo and D euc to the centroid of this group.This is the hysteresis idea.If however the examined point has a large euclidean distance (>D euc )but a small geodesic distance (<D geo )to the centroid of all existing groups,a new group is formed.Points at which distances >D geo and >D euc are regarded as outliers and discarded.Group centroids are updated and the whole process is re-peated iteratively till the examined threshold reaches T L .The detection map generated at T L is made more dense by performing Spatio-Temporal Dilation above.Spatio-Temporal ‘Opening’:False alarms of the previ-ous step are removed by propagating the patches detected in the first frame to the rest of the sequence along the fea-ture point trajectories.A detection sample at fame n is kept if it agrees with the propagated detections from the previous frame.Correct detections missed from this step are recovered by running Spatio-Temporal Dilation on the ‘temporally eroded’solution.This does mean that trajecto-ries which do not start in the first frame are not likely to be considered,however this does not affect the performance in our real examples shown here.The selection of an optimal frame from which to perform this opening operation is the subject of future work.=Figure 4.From Top:ML (calculated at (T 1,T s )=(−0.13,3.15)),Hysteresis and Spatio-Temporal ‘Opening’for three consecutive frames from the SelimH sequence.Reflection is shown in red and detected reflection using our technique is shown in green.Spatio-Temporal ‘Opening’rejects false alarms generated by ML and by Hysteresis (shown in yellow and blue respectively).5.Results5.1.Reflection Detection15sequences containing 932frames of size 576×720are processed with BIRD.Full sequences with reflection de-tection can be found in /Misc/CVPR2011.Fig.4compares the ML,Hysteresis and Spatio-Temporal ‘Opening’for three consecutive frames from the SelimH se-quence.This sequence contains occlusion,motion blur and strong edges in the reflection (shown in red).The ML so-lution (first line)generates good sparse reflection detection (shown in green),however it generates some errors (shown in yellow).Hysteresis rejects these errors and generates dense masks with some false alarm (shown in blue).These false alarms are rejected by Spatio-Temporal ‘Opening’.Fig.5shows the result of processing four sequences us-ing BIRD.In the first two sequences,BIRD detected regions of reflections correctly and discarded regions of occlusion (shown in purple)and motion blur (shown in blue).In Girl-Ref most of the sequence is correctly classified as reflection.In SelimK1the portrait on the right is correctly classified as containing 
reflection even in the presence of motion blur (shown in blue).Nevertheless,BIRD failed in detecting the reflection on the left portrait as it does not contain strong distinctive feature points.Fig.6shows the ROC plot for 50frames from SelimH .Here we compare our technique BIRD against DFD and Im-age Sharpness[5].DFD,flags a region as reflection if it has high displaced frame difference.Image Sharpness flags a region as reflection if it has low sharpness.Frames are pro-cessed on 50×50blocks.Ground truth reflection masks are generated manually and detection rates are calculated on pel basis.The ROC shows that BIRD outperforms the other techniques by achieving a very high correct detection rate of 0.9for a false detection rate of 0.1.This is a major improvement over a correct detection rate of 0.2and 0.1for DFD and Sharpness respectively.5.2.Frame Rate Conversion:An applicationOne application for reflection detection is improving frame rate conversion in regions of reflection.Frame rate conversion is the process of creating new frames from ex-isting ones.This is done by using motion vectors to inter-polate objects in the new frames.This process usually fails in regions of reflection due to motion estimation failure.Fig.7illustrates the generation of a slow motion effect for the person’s leg in GirlRef (see Fig.5,third line).This is done by doubling the frame rate using the Foundry’s Kro-nos plugin [6].Kronos has an input which defines the den-sity of the motion vector field.The larger the density theFigure 5.Detection results of BIRD (shown in green)on,From top:BuilOnWind [10,35,49],PHouse 9-11,GirlRef [45,55,65],SelimK132-35.Reflections are shown in red.Good detections are generated despite occlusion (shown in purple)and motion blur (shown in blue).For GirlRef we replace Hysteresis and Spatio-Temporal ‘Opening’with a manual parameter configuration of (T 1,T s )=(−0.01,3.15)followed by a Spatio-Temporal Dilation step.This setting generates good detections for all examined sequences with static backgrounds.more detailed the vector and hence the better the interpo-lation.However,using highly detailed vectors generate ar-tifacts in regions of reflections as shown in Fig.7(second line).We reduce these artifacts by lowering the motion vec-tor density in regions of reflection indicated by BIRD (see Fig.7,third line).Image sequence results and more exam-ples are available in /Misc/CVPR2011.6.ConclusionThis paper has presented a technique for detecting reflec-tions in image sequences.This problem was not addressed before.Our technique performs several analysis on feature point trajectories and generates a strong detector by com-bining these analysis.Results show major improvement over techniques which measure image sharpness and tem-poral discontinuity.Our technique generates high correct detection rate with rejection to regions containing compli-cated motion eg.motion blur,occlusion.The technique was fully automated in generating most results.As an ap-plication,we showed how the generated detections can be used to improve frame rate conversion.A limiting factor of our technique is that it requires source layers with strong distinctive feature points.This could lead to incomplete de-tections.Acknowledgment:This work is funded by the Irish Re-serach Council for Science,Engineering and TechnologyFigure 7.Slow motion effect for the person’s leg of GirlRef (see Fig:5third line).Top:Original frames 59-61;Middle:generated frames using the Foundry’s plugin Kronos [6]with one motion vector calculated for 
Figure 6. ROC plots for our technique BIRD, DFD and Sharpness for SelimH (axes: false alarm rate vs. correct detection rate). Our technique BIRD outperforms DFD and Sharpness with a massive increase in the correct detection rate.

References

[1] A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, and Y. Y. Zeevi. Sparse ICA for blind separation of transmitted and reflected images. International Journal of Imaging Systems and Technology, 15(1):84–91, 2005.
[2] N. Chen and P. De Leon. Blind image separation through kurtosis maximization. In Asilomar Conference on Signals, Systems and Computers, volume 1, pages 318–322, 2001.
[3] K. Diamantaras and T. Papadimitriou. Blind separation of reflections using the image mixtures ratio. In ICIP, pages 1034–1037, 2005.
[4] H. Farid and E. Adelson. Separating reflections from images by use of independent components analysis. Journal of the Optical Society of America, 16(9):2136–2145, 1999.
[5] R. Ferzli and L. J. Karam. A no-reference objective image sharpness metric based on the notion of just noticeable blur (JNB). IEEE Trans. on Img. Proc. (TIPS), 18(4):717–728, 2009.
[6] The Foundry. Nuke, Furnace.
[7] A. Levin, A. Zomet, and Y. Weiss. Separating reflections from a single image using local features. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 306–313, 2004.
[8] D. G. Lowe. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision, 60(2):91–110, 2004.
[9] B. D. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In DARPA Image Understanding Workshop, pages 121–130, 1981.
[10] D. Ring and F. Pitie. Feature-assisted sparse to dense motion estimation using geodesic distances. In International Machine Vision and Image Processing Conference, pages 7–12, 2009.
[11] B. Sarel and M. Irani. Separating transparent layers through layer information exchange. In European Conference on Computer Vision (ECCV), pages 328–341, 2004.
[12] B. Sarel and M. Irani. Separating transparent layers of repetitive dynamic behaviors. In ICCV, pages 26–32, 2005.
[13] R. Szeliski, S. Avidan, and P. Anandan. Layer extraction from multiple images containing reflections and transparency. In CVPR, volume 1, pages 246–253, 2000.
[14] C. Tomasi and T. Kanade. Detection and tracking of point features. Carnegie Mellon University Technical Report CMU-CS-91-132, 1991.
[15] P. Viola and M. Jones. Robust real-time object detection. International Journal of Computer Vision, 2001.
[16] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli. Image quality assessment: from error visibility to structural similarity. TIPS, 13(4):600–612, April 2004.
[17] Y. Weiss. Deriving intrinsic images from image sequences. In ICCV, pages 68–75, 2001.
Translation vocabulary (daily accumulation)

Chinese-English vocabulary
计划单列市 city specifically designated in the state plan
农机具补贴 subsidy for agricultural machinery and tools
附加税 VAT (value-added tax) / extra duties / surtax / super tax / tax surcharge
三资企业 three kinds of foreign-invested enterprises or ventures
中外合资企业 Sino-foreign joint ventures
外商独资企业 exclusively foreign-owned company in China
中外合作企业 cooperative business
经济萧条 economic slump / economic recession / economic depression / business depression
发展才是硬道理 development as a task of overriding importance; driven by consumption, investment, import and export
统筹城乡发展 coordinate development in rural and urban regions
拓宽农村增收渠道 seek new ways to increase farmers' incomes
区域协调互动发展机制 a mechanism for promoting balanced and interactive development among regions
粗放型增长方式 the extensive/inefficient mode of growth
密集的增长模式 the intensive mode of growth
国民经济性支柱产业 pillar industry of the national economy

English-Chinese vocabulary
homecoming 同学会
agent 代理
crowdsourcing 众包
lion's share 大部分
ecological civilization 生态文明
surrealism 超现实主义
Central Ballet Troupe 中央芭蕾舞歌舞团
myth of China's peaceful rise 中国的和平崛起的神话
currency manipulator 货币操纵国
sample survey 抽样检查
core competitiveness 核心竞争力
intellectual property infringement 知识产权侵权
multi-polar world 多极世界
unabridged dictionary 足本词典
chick lit 年轻女性小说(文学)

Abbreviations
交货前付款 CBD (Cash Before Delivery)
泛太平洋伙伴关系 TPP (Trans-Pacific Partnership)
万国邮政联盟 UPU (Universal Postal Union)
订金 CBD (Cash Before Delivery)
有线电视 CATV (Cable Television)

Chinese-English vocabulary
转变经济增长方式 the transformation of the economic growth mode
弥补外需缺口 compensate for weak external demand
重点产业调整振兴规划 the plan for restructuring and reinvigorating key industries
县级基本财力保障机制 a mechanism for ensuring basic funding for county-level governments
对外工程承包和劳务合同营业额 the total turnover of contracted overseas construction and labor contracts
激发经济内在活力 stimulate the internal vitality of the economy
消除经济不利影响 overcome the adverse effects of imported and structural inflation
进出口许可证管理 the import and export licensing administration
货物实际进出境口岸 the port of actual entry and exit of the goods
进口环节增值税和消费税 import value-added tax and consumption tax
劳动报酬在初次分配中的比重 the ratio of workers' incomes in the primary distribution of national income
区域发展总体战略 the master strategy of regional development
登月舱 lunar module
服务舱 service module
指令舱 command module

English-Chinese vocabulary
reentry module 返回舱
propelling module 推进舱
orbital module 轨道舱
geosynchronous satellite 同步轨道卫星
satellite in sun-synchronous orbit 太阳同步卫星
weather satellite 气象卫星
recoverable satellite 返回卫星
communication satellite 通讯卫星
remote sensing satellite 遥感卫星
the Long March II rocket launcher 长征2号运载火箭
low earth orbit 近地轨道
orbit the earth 绕地球飞行
fine-tune orbit 调整轨道
personnel type 人才类型
talent competitiveness 人才竞争力

Abbreviations
特别行政区 SAR (Special Administrative Region)
联合国旅游组织 WTO (World Tourism Organization)
军事委员会 CMA (Committee of Military Affairs)
仅供参考 FYI (for your information)
仅供参考 FYR (for your reference)

Chinese-English vocabulary
人才流失 brain drain
海外人才 overseas talents
人才库 talent pool
人才管理 talent management
人才政策 personnel policy
人才发展 talent development
尊重人才 value talent / talented people
党管人才 human resources under the Party leadership
党政人才 Party and government officials
青年英才 young talents
高素质教育人才 high-quality educators
海外高层人才 high-quality overseas professionals
全民健康卫生人才 national health professionals
高技能人才 the highly skilled / highly skilled workers
企业经营管理人才 enterprise management talents

English-Chinese vocabulary
military talents 高素质军事人才
professional medical personnel 专业医药人才
innovative skilled sci-tech workers 创新性科技人才
急需紧缺专门人才 professionals in short supply
building of talent team 人才队伍建设
talent management 人才工作管理体制
major projects for talent development 重大人才工程
HR (human resources) 人力资源
overseas Chinese students' home 全球留学人才之家
cramming method of teaching 填鸭式教学
be released from regular work for study 脱产学习
locally-granted student loan 生源地助学贷款
pseudo-science 伪科学
academic credit system 学分制
talent pool 人才贮备

Chinese-English vocabulary
保障性住房 indemnificatory housing
包工头 labor contractor
首期按揭 down payment
筒子楼 tube-shaped apartment
危房 dilapidated building
违章建筑 unapproved construction project
样板房 model unit
活动板房 movable plank house / movable house
违章借贷 abusive lending
包装风险房贷 packaging of risky mortgages
住房公积金贷款 housing accumulation fund / housing provident fund
保障性住房 low-income housing / government-subsidized housing
小产权房 house/apartment with limited/incomplete property rights
棚改房 housing in run-down areas that will undergo renovation
一定让人民安居乐业 we must ensure peace and security for the people
出让、划拨和转让 grant, allocation and transfer of the right to use land

English-Chinese vocabulary
relief fund 救济金
under-the-counter deals 开后门
pet phrase 口头禅
flash mob 快闪族
Malthusian Theory of Population 马尔萨斯人口论
the Matthew Effect 马太效应
rule of thumb 拇指规则
door-to-door service 上门服务
probationary period 试用期
human trafficking 人口贩卖
sugar-coated bullet 糖衣炮弹
job-hopping 跳槽
ostrich policy / ostrichism 鸵鸟政策
die-hard fans 铁杆粉丝
sister cities 友好城市

Abbreviations
首次公开募股 IPOs (initial public offerings)
中国民航 CAAC (Civil Aviation Authority of China)
域名服务器 DNS (Domain Name Server)
亚洲石油公司 APC (Asian Petroleum Company)
最惠国待遇原则 MFN (Most-Favored-Nation Treatment)

Chinese-English vocabulary
邮政储蓄 postal savings
回扣 sales commission / kickback
潘多拉魔盒 Pandora's box
霹雳舞 break dancing
诺亚方舟 Noah's Ark
软肋 soft spot / Achilles' heel
软着陆 soft landing
汇演 joint performance
花边新闻 tidbit
西气东输 pipeline for transmitting natural gas from the west to the east
政府保障力 the government's capacity for safeguarding people's livelihood
开辟了中国特色社会主义道路 blaze a trail of socialism with Chinese characteristics
生产安全事故 workplace accident; work-related accident / work accident / accident at work / accident due to lack of work safety
行政事业性收费 administrative charges/fees / public service charges/fees
一次性补偿 lump-sum compensation / once-and-for-all compensation / one-off compensation / flat compensation

English-Chinese vocabulary
property tax 物业税
vacant property 闲置地产
complete apartment 现房
first/second stage 一期/二期
monthly installment payment 月供
added-value fees 增值地价
depreciation allowances 折旧费
policy-related house 政策性住房
key zones for development 重点开发区
residential property 住宅地产
owner-occupied homes/houses 自住型住房消费
run-down areas 棚户区
commercial housing with price ceilings 限价商品房
change hands 房屋转手(易主)
jerry-built projects 豆腐渣工程

Abbreviations
中国银监会 CBRC (China Banking Regulatory Commission)
多边贸易谈判 MTN (multilateral trade negotiations)
国内生产总值 GDP (Gross Domestic Product)
石油输出国家组织 OPEC (Organization of Petroleum Exporting Countries)
东南亚国家联盟(东盟) ASEAN (Association of Southeast Asian Nations)

Chinese-English vocabulary
观望态度 wait-and-watch attitude
旧区改造 reconstruction of old areas
楼层建筑面积 floor space
普通购房者 private homebuyer
期房 forward delivery housing
容积率 capacity rate
商品房 commercial/residential building
商业地产 commercial property
商住综合楼 commercial/residential complex
社区 community
首付 down payment
售后回租(租回已租出财产) leaseback
塔楼 tower building
停车位 parking space
停车场 parking lot

English-Chinese vocabulary
land reserves 囤地
second-hand house 二手房
property ownership certificate 房产证
real estate agent 房产中介
property bubble 房地产泡沫
real estate / property market 房地产市场
overheated property sector 房地产市场过热
property price / housing price 房价
payment by installment 分期付款
tenement 分租合住的经济公寓
property deed tax 购房契税
housing bubble 房地产泡沫
land use certificate 土地使用证
allowances for repairs and maintenance 维修费
speculative property transaction 投机性房产交易

Abbreviations
科学技术开发应用咨询委员会 ACASTD (Advisory Committee on the Application of Science and Technology Development)
联合国可持续发展委员会 CSD (Commission on Sustainable Development)
粮食及农业组织 FAO (Food and Agriculture Organization)
联合国大会 GA (General Assembly)
国际法院 ICJ (International Court of Justice)

Chinese-English vocabulary
花钱炫富 spend money thoughtlessly and flaunt one's wealth
土豪 newly rich and powerful people
南水北调工程 South-to-North Water Diversion Project
政绩考核 assess the performance of local authorities
生态脆弱 fragile eco-system
妥善处理敏感问题和分歧 properly handle sensitive questions and disputes
安居工程 housing project for low-income urban residents
住房空置率 housing vacancy rate
地王 property king / land king / top bidder
廉租房 low-rent housing
经济实用型住房 affordable housing
限价房 price-capped housing
公租房 public rental housing
按揭购房 buy a house on mortgage
板楼 slab-type apartment building

English-Chinese vocabulary
civil service exam 公务员考试
interdisciplinary talent 复合型人才
college/university entrance exam 高考
divergent thinking 发散思维
top student 高材生
teaching to the test / exam-oriented education 应试教育
quality-oriented education 素质教育
senior cadre / high-ranking official 高干
enrollment expansion 扩招
real estate speculator 炒房者
unpaid mortgage balance 抵押贷款欠额
location classification 地段等级
nail household / nail house residents 钉子户
planned enrollment 计划内招生
correspondence university 函授大学

Abbreviations
独联体 CIS (Commonwealth of Independent States)
东盟自由贸易区 AFTA (ASEAN Free Trade Area)
联合国教科文组织 UNESCO (United Nations Educational, Scientific and Cultural Organization)
全国人民代表大会 NPC (National People's Congress)
中华人民共和国 PRC (People's Republic of China)

Chinese-English vocabulary
各类型、多层次人才队伍 a skilled, diversified and multilevel workforce
高级人才 high-end professionals / top talents
国家中长期人才发展规划纲要 national program for medium- and long-term talent development
中国国际人才交流协会 Conference on International Exchange of Professionals
科教兴国战略和人才强国战略 the strategy of reinvigorating China through science, education and human resources
文化渗透 cultural infiltration
玛雅文化 Mayan civilization
国家非物质文化遗产 national intangible cultural heritage
以人为本的城镇化 human-centered urbanization
打破城乡二元经济结构 break up the city-country dualistic economic structure
释放巨大的内需潜力 unleash huge potential in domestic demand
公共文化服务体系 public service system of culture
国家文化软实力 the country's soft power in cultural fields
全民族文明素质 the Chinese people's civil education
对外文化宣传 international cultural publicity

English-Chinese vocabulary
Trojan Horse 特洛伊木马
cultural industry 文化产业
cultural undertaking 文化事业
hard-disk drive 硬盘驱动器
organic nanomaterial 有机(生化)纳米材料
lunar rover 月球车
live broadcast 直播
chief editor 主编
candid camera 抓拍镜头
special dispatch 专电
sulfur dioxide emissions 二氧化硫排放
windbreak 防风林
charm of oriental culture 东方神韵
the Oriental Hawaii 东方夏威夷
holiday resort 度假村

Abbreviations
转基因动物 GMA (Genetically Modified Animal)
转基因食品 GMF (Genetically Modified Food)
国家环境保护总局 SEPA (State Environmental Protection Administration)
联合国气候变化框架公约 UNFCCC (United Nations Framework Convention on Climate Change)
中国人民政治协商会议 CPPCC (Chinese People's Political Consultative Conference)

Chinese-English vocabulary
两岸包机 cross-Straits charter flight
历史遗留问题 problems left over by history
领土管辖权 territorial jurisdiction
落实共识 implement the consensus
民间外交 people-to-people diplomacy
白色污染 white pollution
悬浮颗粒物 suspended particles
烟尘排放 soot emission
野生动植物 wild animals and plants; wild fauna and flora
有毒废料 toxic waste
加强文化建设 strengthen cultural development efforts
高雅艺术 high art
培养青少年思想道德建设 cultivate ideals and ethics among young people
满足人民群众不断增长的精神文化需求 satisfy the people's ever-growing demand for cultural products
加强公民道德建设工程 carry out the program for improving civic morality

English-Chinese vocabulary
joint venture 合资企业
gold reserve 黄金储备
reciprocal trade agreement 互惠贸易协定
exchange risks 汇率风险
demand deposit 活期存款
judge by default 缺席判决
rights of the person 人身权力
determine facts 认定事实
petition for appeal 上诉状
cases involving foreign interests 涉外案件
boarding school 寄宿学校
education in the home 家庭教育
education in home economics 家政教育
contingent of teachers 教师队伍
innovation in pedagogy 教学方法创新

Abbreviations
合格境内机构投资者 QDII (Qualified Domestic Institutional Investor)
合格境外机构投资者 QFII (Qualified Foreign Institutional Investors)
计算机辅助教学 CAI (computer-aided instruction or computer-assisted instruction)
计算机辅助设计 CAD (Computer-Aided Design)
计算机辅助翻译 CAT (Computer-Aided Translation)

Chinese-English vocabulary
"爱国者"导弹 Patriot missile
AA制 Dutch treatment; go Dutch
百闻不如一见 Seeing is believing.
百慕大三角 Bermuda Triangle
闭关政策 closed-door policy
闭卷 closed-book exam
冰雕 ice sculpture
本垒打 circuit clout, four-master, round trip
本票 cashier's cheque
本本主义 bookishness
吃皇粮 receive salaries, subsidies, or other support from the government
充值卡 rechargeable card
出入平安 Safe trip wherever you go
敞蓬轿车 open-topped limousine
畅通工程 Smooth Traffic Project

English-Chinese vocabulary
dead account 呆帐
packing credit (loan) 打包贷款
sun-top 吊带衫
revoke license 吊销执照
first inning 第一发球权
countdown 倒计时
global village 地球村
24-hour (service) 全天候
all-weather aircraft 全天候飞机
performance of enterprises 企业效益
listing of a company 企业上市
brain drain 人才流失
competition for talented people 人才战
administrative program 施政纲领
double-edged sword 双刃剑

Abbreviations
有线电视 CATV (Cable Television)
首席信息主管 CIO (Chief Information Officer)
注册会计师 CPA (Certified Public Accountant)
情商 EQ (Emotional Quotient)
超文本传输协议 HTTP (Hyper Text Transfer Protocol)

Chinese-English vocabulary
女汉子 tough girl, manly woman
殡葬改革 funeral and interment reform
生态安葬 an ecological way of burial
多功能生态保护实验区 multifunction ecological experimentation zone
谨言慎行 be cautious in words and deeds
包二奶 have a concubine (originally a Cantonese expression)
变相涨价 disguised inflation
保证金 margins, collateral
保证金帐户 margin account
背投屏幕 rear projection screen
背黑锅 to become a scapegoat
不夜城 sleepless city, ever-bright city
不败记录 clean record, spotless record
存款准备金制度 reserve against deposit system
粗放式管理 extensive management

English-Chinese vocabulary
target hitting activities 达标活动
grand slam 大满贯
paid holiday 带薪假期
penalty kick 点球
loan loss provision, provisions of risk 风险准备金
quota-free products 非配额产品
burglarproof door; antitheft door 防盗门
real estate evaluator 房产估价师
real estate management 房管
Time and tide wait for no man 时不我待
break a butterfly on the wheel 杀鸡用牛刀
be disengaged from work; divorce oneself from one's work 脱产
talk show 脱口秀
inter-bank borrowing 同业拆借
geostationary satellite 同步卫星

Abbreviations
强迫性神经官能症 OCD (obsessive-compulsive disorder)
艾滋病病毒 HIV (human immunodeficiency virus)
疯牛病 BSE (bovine spongiform encephalopathy)
非处方药 OTC (over the counter)
联合国儿童基金会 UNICEF (United Nations International Children's Emergency Fund)
*** University Graduation Project (English Translation)
Source title: Automatic Panoramic Image Stitching using Invariant Features
Translated title: 使用不变特征的全景图像自动拼接
School: School of Electronic and Information Engineering
Major: ********
Name: ******
Student ID: **********

Automatic Panoramic Image Stitching using Invariant Features
Matthew Brown and David Lowe
{mbrown|lowe}@cs.ubc.ca
Department of Computer Science, University of British Columbia, Vancouver, Canada

Abstract
This paper studies the problem of fully automatic panoramic image stitching. Although the one-dimensional problem (a single axis of rotation) is well studied, two-dimensional or multi-row stitching is more difficult. Previous approaches have relied on human input or restrictions on the image sequence to establish matching images. In this paper, we formulate stitching as a multi-image matching problem and use invariant local features to find matches between all of the images. Because of this, the method is insensitive to the ordering, orientation, scale and illumination of the input images; it is also insensitive to noise images that are not part of a panorama, and it can recognise multiple panoramas in an unordered image dataset. In addition, to provide more detail, this paper extends our earlier work in the area by introducing gain compensation and automatic straightening steps.

1. Introduction
Panoramic image stitching has an extensive research literature and a number of commercial applications. The basic geometry of the problem is well understood and consists, for each image, of an estimated 3×3 camera matrix or homography.
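For reference, the standard relation for a camera rotating about its optical centre can be written explicitly. The formula below follows the original paper's formulation; the notation is supplied here as a reminder and is not part of the translated text.

```latex
% Homogeneous pixel coordinates in overlapping images i and j are
% related by a 3x3 homography:
\tilde{u}_i \sim H_{ij}\,\tilde{u}_j,
\qquad H_{ij} = K_i R_i R_j^{\top} K_j^{-1}
% K_i, K_j: camera intrinsic matrices; R_i, R_j: camera rotations.
% Valid for a camera rotating about its optical centre.
```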
The estimation process is typically initialised by user input that approximately aligns the images, or by a fixed image ordering. For example, the image stitching software bundled with Canon digital cameras requires a horizontal or vertical sweep, or a square matrix of images. Version 4 of the REALVIZ stitching software provides a user interface for roughly positioning the images with the mouse before automatic registration proceeds; our work is novel in that no such initialisation needs to be supplied.

In the research literature, methods for automatic image alignment and stitching fall broadly into two categories: direct and feature-based. Direct methods have the advantage that they use all of the available image data and can therefore provide very accurate registration, but they require an initialisation that is already close to the true alignment. Feature-based registration does not require initialisation, but traditional feature matching methods (e.g., correlation of image patches around Harris corners) lack the invariance needed to reliably match arbitrary panoramic image sequences.
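To make the feature-based approach concrete, here is a minimal sketch of pairwise matching with SIFT features and a RANSAC-fitted homography using OpenCV. It covers only the two-image case; the full pipeline described in the paper adds multi-image matching, bundle adjustment, gain compensation and blending. Function and parameter choices here are illustrative, not the paper's implementation.

```python
import cv2
import numpy as np

def pairwise_homography(img1, img2, ratio=0.75):
    """Estimate the 3x3 homography mapping img2 into img1
    from SIFT matches, robustly fitted with RANSAC."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    # Lowe's ratio test keeps only distinctive matches
    matcher = cv2.BFMatcher()
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    if len(good) < 4:
        return None  # a homography needs at least 4 correspondences

    src = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H
```

The ratio test and RANSAC are what give this approach its robustness to the unordered, noisy image sets discussed above: outlier matches are simply voted out.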
Algebraic operation (代数运算): an image processing operation involving the pixel-by-pixel sum, difference, product, or quotient of two images.
Aliasing (走样/混叠): an artifact produced when the pixel spacing of an image is too coarse relative to the image detail.
Arc (弧): part of a graph; a connected set of pixels representing a segment of a curve.
Binary image (二值图像): a digital image with only two grey levels (usually 0 and 1, black and white).
Blur (模糊): a loss of image sharpness caused by defocus, low-pass filtering, camera motion, and similar factors.
Border (边框): the first and last rows or columns of an image.
Boundary chain code (边界链码): a sequence of directions that defines the boundary of an object.
Boundary pixel (边界像素): an interior pixel that is adjacent to at least one background pixel (compare: exterior pixel, interior pixel).
Boundary tracking (边界跟踪): an image segmentation technique in which an arc is detected by sequentially probing from one pixel to the next along the arc.
Brightness (亮度): the value associated with a point in an image, representing the amount of light emitted or reflected by the object at that point.
Change detection (变化检测): a technique that compares the pixels of two registered images, for example by subtraction, to detect differences in the objects they contain.
Class (类): see pattern class.
Closed curve (封闭曲线): a curve whose start and end points are at the same position.
Cluster (聚类、集群): a set of points located close together in a space (e.g., in feature space).
Cluster analysis (聚类分析): the detection, measurement, and description of clusters in a space.
Concave (凹的): an object is concave if there exist at least two interior points whose connecting line segment is not entirely contained within the object (antonym: convex).
Connected (连通的).
Contour encoding (轮廓编码): an image compression technique in which only the boundaries of regions of uniform grey level are encoded.
Contrast (对比度): the degree of difference between the average brightness (or grey level) of an object and that of its surrounding background.
Contrast stretch (对比度扩展): a linear grey-level transformation.
Convex (凸的): an object is convex if the straight line segment connecting any two interior points lies entirely within the object.
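As a small illustration of the contrast stretch entry above, here is the standard linear grey-level transformation in NumPy. This is a generic formulation of the definition, not code from any particular text.

```python
import numpy as np

def contrast_stretch(image, out_min=0, out_max=255):
    """Linearly map the image's grey-level range onto
    [out_min, out_max]: a basic contrast stretch."""
    img = image.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:  # flat image: nothing to stretch
        return np.full_like(image, out_min)
    stretched = (img - lo) / (hi - lo) * (out_max - out_min) + out_min
    return stretched.astype(image.dtype)
```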