当前位置:文档之家› Content based feature indexing and retrieval for image databases

Content based feature indexing and retrieval for image databases

Content based feature indexing and retrieval for image databases
Content based feature indexing and retrieval for image databases

Constraint Based Feature Indexing and Retrieval for Image Databases

Peter Eggleston

Amerinex Artificial Intelligence, Inc.

409 Main Street, Amherst, MA 01002

phone: 413-256-8941

fax: 413-253-4203

email: aaimail@https://www.doczj.com/doc/b29477018.html,

ABSTRACT

This paper presents a prototype system which uses constraints (mathematical feature mapping functions) to index and retrieve images. Information is automatically extracted from images such as the shape, texture, and position of objects within the image. Once extracted, this feature information is stored in an associatively accessible database. The database allows users to locate images containing objects of interest, or locate objects of interest within images. The system presented here also provides a method for automatic indexing of the database through the learning and application of object types or classes. Query of the database is accomplished by way of: 1) sketched example, 2) selected prototype object from an image or atlas, 3) graphically specified single or multidimensional feature ranges, or 4) class type. The use of pre-derived features and mapping functions allow this method to be efficiently implemented in real-time systems.

1. INTRODUCTION

Wi th the emergence of i nexpensi ve effi ci ent storage devi ces, di gi ti zati on methods, hi gh quali ty di splays, and network hardware and software, a vast number of imaging workstations/servers are being developed which span many industries. For example, many Realtors now have pictorial databases of homes for viewing in their office on a computer, served from a central office, or a network of Realtors. Hospitals have digital storage and display devices for NMR, CAT, X-Ray and Nuclear imaging, with vast amount of patient data being stored in digital form. Many institutions are digitizing publications or working documents for storage and retrieval on optical discs.

Images, by nature, contain tremendous amounts of information. A collection of images can be given descriptive labels or keywords which act as database indexes to allow later retrieval. Retrieval can be performed by querying the database to retrieve images which have indexes matching user specified search keywords. Indexing images based upon their content can be laborious, expensive, and error prone. Manual indexing could also fail to take into account some unknown feature or context which may be relevant given the scope of some future requested search. For example, a vacation picture could be labeled or indexed with the keywords people, car, vacation, summer, and Washington D.C.. However, such a picture might contain a structure such as the Washington monument in the background. The manual keyword index method may fail to note all the content of the image, and therefor the image may not get retrieved if a user may be searching for picture containing a monument. In another example, a patient may have a fracture, and his X-Ray might be labeled and stored with the vital statistics, type of fracture, location of fracture, etc.. However, if a doctor is later attempting to compile a collection of patients with fractures of a certain nature in orientation, shape, or the like, the keyword indexing method would fail to incorporate all the information necessary to retrieve images which match a particular fracture, defect, or abnormality of interest.

In each of the examples above, what is needed is a method of querying the database by example, perhaps by drawing a sketch or submitting an image with a desired property, or by features of interest, such as shape, context, etc.. The remainder of this paper discusses a prototype index generation and database query/retrieval system which implements this method of search.

2. BUILDING AND AUTOMATICALLY INDEXING A DATABASE

For the work presented in this paper, an image database of art objects was created and utilized. However, the methods employed here could be applied to document, medical, industrial, or other types of imagery. The collection of art objects consisted of three vases, three wicker baskets, five percussion instruments, and two wooden fish. These art objects were first laid out in various combinations and orientations on a medium gray cloth. The gray color, rather than white or black, was used to introduce ambiguities in the borders of the objects, since it was desired to model more real-world data rather than laboratory derived data. A video camera in conjunction with a frame grabber was then used to digitize [1] the images. That is, the i mages were photographed i n a di gi tal form and stored on di sk. Once stored on di sk i n thi s form, computer vi si on techniques were used to automate the process of object identification, referred to as segmentation, and characterization, referred to as feature extraction. Indexes were then automatically generated from features of the segmented objects. Figure 1 shows the images used in developing this prototype system. The images are viewed in a database browser, a special tool for rapidly scanning the database in pictorial form.

2.1 Image Segmentation

Digitized data consists of two dimensional arrays of intensity values whose elements are called pixels. Various algorithms can be applied to this pixel data to extract image events which have some meaning in a vision application such as the one presented here. These abstractions usually represent an area of more than one pixel and contain descriptive information about the pixel data in the group. In other words, software acting on image data at the pixel level can convert intensity information into symbolic descriptors which represent components of images [2].

There are numerous methods of extracting events or components, called tokens, from image data, and there exists in general use several robust and proven software processes [3]. The segmentation method chosen for the art image database was a thresholdi ng techni que based on valleys i n the hi stogram on the i mage pi xel values. Thi s method was chosen as i t i s computationally inexpensive, and allows the segmentation process to be more adaptive. In addition, this method could be used to select multiple threshold points if multiple object classes are desired. However , histogram valley segmentation method will work only if the objects truly have uniform pixel distributions within the classes, which may or may not be a valid assumption for classes of imagery, including the examples shown here. The result of the segmentation step was a database of locations and boundary descriptions for each art object in every image. Each segmented art object was stored as a unique record in an associatively accessible database known as the Intermediate Symbolic Representation (ISR) [4]. At this point, the records contained the following fields: 1) Token Index, a unique integer key, 2) Bitplane, a bitmap representation of the extracted object, 3) Extents, a bounding rectangle for the object, and 4) Image File Name from which the object was extracted. Figure 2 shows the extracted object boundaries for the images shown in Figure 1. Note the imperfections in the boundaries of many of the objects, caused by the ambiguities of the objects with the background. In other words, the objects were not perfectly segmented.

2.2 Feature Extraction

Once images are segmented into discrete components or objects, attributes, also called features, can be calculated for each object. For example, average density (gray scale or intensity) can be calculated for each token (record) in the ISR database. Since one field of the record is a bitmap representation, this location feature can be used as a mask to extract member pixels from the original image data to derive this density feature. For each new feature type calculated and stored, a new field is automatically added to each record in the database. These features can be then used to identify image components or to assess relational or contextual information such as nearness, subpart, etc..

Once the art objects had been extracted from the image database, the following features were measured: 1) Intensity Mean, 2) Intensity Standard Deviation, 3) Boundary/Perimeter, 4) Elongation (major axis/minor axis), 5) Compactness, 6) Height, 7) Width, 8) Centroid, 9) % of Bounding Rectangle Fill, 10) Pixel Count, 11) Number of Holes, and 12) Perimeter.

2.3 Index Generation

Once features have been measured and stored in a database, fields containing these features can be used as indexes into the database. Due to the associative nature (retrieval by value rather than by location) of the ISR database, a user can search for

token records whose values in certain fields fall within some particular specified range. Once retrieved, the field containing the image file name from which the token was extracted can be retrieved, and the image can be displayed if desired. Since any single feature alone is not a good discriminate for any one object, or even class of objects, it is necessary to query the database on several features at once to index a unique object or class of objects. This assumes that the features measured and stored in the database are sufficient to uniquely separate the classes of objects. Such was the case for the art data used here, with an exception as discussed in the Conclusion Section.

To automate the indexing process, a method was derived to use the measured features to automatically assign memberships to each of four classes: Basket, Fish, Percussion, and Vase. (i.e., a classification scheme was devised). A set of objects for each class was first identified graphically through use of an interactive examination interface. These tags formed tokensubsets, or collection of token indexes (record pointers), also refereed to as training sets. Classification schemes employ non-statistical methods such as 1) Euclidean, Bayesian, K-nearest neighbor, minimum distance, or 2) statistical techniques such as Bayes' minimal-risk, memory networks, etc.. [5,6] to label or classify image components. In imaging applications such as the one presented here, it is desirable to use non-statistical classification schemes, since traditional statistical based schemes are based on probability of occurrence, and are often sensitive to noise and partial occlusion. In addition, cognitive neural approaches whi ch are qui te common i n i magi ng appli cati ons are often qui te sensi ti ve to the order and frequency of trai ni ng set presentation, tend to not adapt well to erroneous assumptions in the training data, and can be computationally expensive to implement.

For the indexing and retrieval system developed here, constraint functions [7] were used to map feature values into class membership scores. Constraints provide a robust and adaptable method for mapping feature attributes into hypothesis spaces. This is done through the use of arithmetic mapping functions. One such function is a piece-wise linear mapping function called a primitive constraint. Primitive constraints implement a form of fuzzy logic. Fuzzy logic differs from Boolean logic, or tree based (if-then-else) decision functions, in that it is more adaptive to the data, and allows the use of cooperative or redundant knowledge sources. Primitive constraints allow any processing method or piece of data to be inherently unreliable, yet not effect the overall outcome of the interpretation. Look-up table approximations can also be derived for primitive constraints, lending themselves well to rapid implementation in hardware for real-time systems.

Constraint functions also have the advantage of being modular. Modularity, the ability to add, modify, or delete individual data structures and knowledge independently of the remainder of the database, affects the degree to which the system is understandable and incrementally constructable. Non-modular systems have their knowledge represented implicitly in a form called procedural representation. The problem with this method is that the meaning of data structures in the database depends on the context in which the knowledge is being used. In the constraint based method used here, explicit knowledge is directly accessible for manipulation by the programmer and system. Because this allows facts to be globally represented, the same facts can be used for multiple purposes, thereby allowing algorithms and rules developed for a particular class of parts or components to be easily extended to redesigned or differing components.

Once training sets were identified for each class of art object, an auto constraint function was applied to the records in the training set. This function, based on a minimal distance classifier, was used to automatically generate mapping functions which were then applied to the entire database. Results of the automatic indexing by object class is shown in Figure 3. This figure illustrates the real-time access of classifications for objects within images as accessed by the user. Notice the fuzzy memberships to each class type, as shown bellow each image in which a user has selected an art object with the mouse (white and black crosshairs). For example, in the upper left, the user has selected the rectangular vase on the right. As seen in the text below the image, the auto indexing system has classified the object to be 50% plausible it is a basket, 0% a fish, 58.33% a percussion instrument, and 75% plausible that it is a vase. Note the use of the term plausible, not probable in this context. The system has no notion of the probability of occurrence of an object, just the plausibility of existence given its unique set of features derived from the segmentation and feature extraction steps. The fuzzy membership to the classes will allow the system to respond to queries such as 'looks like,' or to order the likelihood of a match or fit of a retrieved image or object within an image to a query.

3. IMAGE DATABASE QUERY AND RETRIEVAL

Once images in a database have been segmented into characteristic components, the features derived and measured from those components, and indexes generated from those features, queries can be performed on the database to allow users to

locate images containing objects of interest, or locate objects of interest within images. In the system presented here, methods were derived to allow query by way of: 1) selected prototype object from existing database image or atlas, 2) sketched example, 3) graphically specified single or multidimensional feature ranges, or 4) class type.

3.1 Existing Prototype

An interactive data examiner interface allows users to select objects within an image already existing in the database by way of mouse as shown in Figure 3. Once selected, the token record associated with the image object in the database is retrieved. Gaussian mapping functions are created for each feature and centered about the feature value of the retrieved record. The standard deviation of the distribution is derived so that the entire feature range of the database (for the feature being mapped) falls within a normal distribution of three standard deviations in spread. These mapping functions are then applied to every other record in the database. The summation of these mappings is a fit score. Images which contain objects whose score exceeds some plausi bi li ty threshold are then presented to the user i n a browser i nterface such as shown i n Fi gure 1. Alternati vely, i mages can be presented rank ordered by best fi t, or i mages can be shown wi th gray scale i ntensi ti es representing fits to the user selected example such as shown in Figure 4. This method could also be used in conjunction with atlases of images, such as a medical text with body organs, some standard images, parts glossaries, or the like.

3.2 User Supplied Example

Users can also query the database by way of a sketched example. Figure 5 shows a binary image editor called Bitmap, similar to a PC or Mac based drawing program. The user sketches an object of interest to be retrieved from the database. This image is then segmented and features derived just as in the case of any other image being added to the database. As in the procedure described in the Section 3.1, Gaussian mapping functions are derived for this new object, and used to retrieve images from the database. Fi gure 6 i llustrates the i mported object (LR), one such feature mappi ng functi on created (UL), an i mage retrieved from the database (LL), and the degree of fit of all objects in the image to the supplied prototype (UR) (white being best fit). Note that the upper left quadrant of Figure 6 is a histogram (frequency distribution) of a single feature value. Each spike is an object in the image below the histogram. The height of the spike corresponds to the size of the object. (Not shown in the black and white photographs are colors used to designate histogram contributions which come from the training sets.) The curve is a Gaussian mapping function centered about the feature derived from the user sketched object shown in the lower right. This method of query also works with a user supplied image in which the user identifies a component of interest.

3.3 By Feature

Query by feature involves accessing an image database by selecting a range or group of ranges for a single or multiple features (record fields). With the interface shown in Figure 7, users can query the database in real-time by selecting portions of feature histograms with a mouse. Objects whose feature fall in the selected ranges are then displayed from the database, or image lists are generated from the database for display in the image browser tool. In the upper left window of Figure 7, a user draws a box around an object of interest. Four features of the object, as pre-selected by the user are displayed on two two-dimensional feature histograms shown at the bottom of the figure. The black dots indicate the selected vase has a boundary to peri meter feature of 1.2, a compactness feature of .55, a boundi ng rectangle fi ll rati o of .75, and an i ntensi ty standard deviation of 23. The histograms in the middle of Figure 7 are cumulative feature histograms for the entire art database. The figure in the upper right has objects mapped to intensity for another image, so that black is a poor fit to the object class selected in the upper left window, and white is a good fit. Note that the vase is a good fit to the user selected features, as guided by the histograms in the lower windows. This mode of access provides for queries based on qualitative attributes such as long, wide, circular, irregular, etc.. In other words, instead of querying the database for images which contain only vases, one can query the database for things which 'look' like vases.

3.4 By Class/Index

Figure 8 illustrates a method of query by class type. The user selects a class from the list of classes as shown in the menu on the lower right, such as the basket class (highlighted on the menu). Images containing objects of this class are shown in the top image browser window. At any time, the user can view the selection criteria, or the constraints used in the query. The window on the lower left of Figure 8 is a constraint system interface, in which counter clockwise from the lower left, an image containing a vase and a basket is shown, its object extraction is displayed, the basket constraint mapping result is presented, and one of the constraint mapping functions along with spikes representing the two objects are shown.

4. CONCLUSION

Although prototyped with a small number of images and class types, the image database indexing and query system presented here worked very well. Some difficulty was encountered in automatically defining the percussion instrument class, and was attributed to a lack of discriminating features. The addition of other features such as texture, color, and more sophisticated shape features was tried and resulted in automatic indexing. However, for the results shown here, another approach was tried with success: the percussion class was defined by a constraint which subtracted all other class scores from 1, and then added the result. (i.e., if it is not anything else, then it is a percussion instrument). This approach worked well.

Future versions of this system could easily employ context or inter object relationships [8] as a result of the use of constraints. Geometric models could also be used as query prototypes. Additional features (record fields) can be easily added at any time on part or all of the database. This is facilitated transparently to the user, as the ISR database is dynamically reconfigurable. Thi s constrai nt based method i s also very effi ci ent, si nce i nformati on i s extracted from the i mages upon entry i nto the database, and is extracted only once. This reduced form of information allows rapid scanning and processing during database queries. Image pointers are passed back as results of the queries, allowing images to be stored off line or on slower access devices.

5. ACKNOWLEDGMENTS

The software tools used to generate the results shown here were provided by the KBVision? System [9] running on a SUN Sparcstation computer. No code or algorithms were written or used other than the routines provided with the KBVision?System. Images were acquired courtesy of Sunrise Imaging Corp. which provided an IMS-1000 video digitizer board and a VHS-C Camera.

6. REFERENCES

1.M. Fairhurst, Computer Vision for Robotic Systems: An Introduction, Prentice Hall, New York, 1988.

2. A. Hanson and E. Riseman, "From Image Measurements to Object Hypotheses," COINS Technical Report 87-129,

University of Massachusetts at Amherst.

3.P. Eggleston, "Object Segmentation Techniques for use in Laboratory Visual Automation Systems," Proceeding of the

Advances in Intelligent Robotic Systems XI, 1992.

4.J. Brolio, B. Draper, J.R. Beveridge, and A. Hanson, "The ISR: A Database for Symbolic Processing in Computer

Vision," COINS Technical Report 89-111, University of Massachusetts at Amherst.

5.R. Duda and P. Hart, Pattern Recognition and Scene Analysis, John Wiley, New York, 1973.

6. D. Tuesdell, "Content Based Retrieval: Imaging in Multimedia," Advanced Imaging, pp. 26-28, 80, May 1992.

7.P. Eggleston, "Using Constraints to Incorporate Domain Knowledge," Proceeding of the Advances in Intelligent Robotic

Systems X, 1991.

8.H. Tagare, G. Gindi, J. Duncan, and C. Jaffe, "A Geometric Indexing Scheme for an Image Library," Technical Report,

Department of Diagnostic Radiology, School of Medicine, Yale University.

9.P. Eggleston, "General Support Tools for Algori thm Development and Sci enti fi c Research i n Machi ne Vi si on,"

Proceedings of the Electronic Imaging International, pp. 157-162, 1992

Figure 1. Image database of art objects used in prototyping the indexing and retrieval system. Images are shown by way of

the Image Browser , a pictorial directory tool.

Figure 2. Extracted object boundaries of images shown in Figure 1.

Figure 3. Class membership examination of image objects selected by mouse.

Figure 4. Image objects scored (mapped to gray scale) as a fit to user selected prototype object.

Figure 5. Binary image editor used to generate example objects for query of image database. User sketches object of interest

to be retrieved from database.

Figure 6. Object sketched in Figure 5 is analyzed and used to generate query constraints which are then used to retrieve

images having similar properties from the database.

Figure 7. Query of image database by way of multi-dimensional feature space selection.

Figure 8. Query of image database by class type. Images containing class(es) of objects selected from menu are retrieved from database and displayed in the browser tool. Users can examine and modify query constraints used at anytime.

相关主题
相关文档 最新文档