
International Journal of Geographical Information Science

ISSN: 1365-8816 (Print) 1362-3087 (Online)

Digital map conflation: a review of the process and a proposal for classification

Juan J. Ruiz, F. Javier Ariza, Manuel A. Ureña & Elidia B. Blázquez

To cite this article: Juan J. Ruiz, F. Javier Ariza, Manuel A. Ureña & Elidia B. Blázquez (2011) Digital map conflation: a review of the process and a proposal for classification, International Journal of Geographical Information Science, 25:9, 1439-1466, DOI: 10.1080/13658816.2010.519707

To link to this article: https://doi.org/10.1080/13658816.2010.519707

Published online: 06 Sep 2011.


REVIEW ARTICLE

Digital map conflation: a review of the process and a proposal for classification

Juan J. Ruiz(a), F. Javier Ariza(a), Manuel A. Ureña(a)* and Elidia B. Blázquez(b)

(a) Department of Cartographic Engineering, University of Jaen, Jaén, Spain; (b) Department of Graphic Engineering, Design and Projects, University of Malaga, Malaga, Spain

(Received 21 January 2010; final version received 18 August 2010)

This article is centred on analysing the state of the art of the conflation processes applied to geospatial databases (GDBs) from heterogeneous sources. The term conflation is used to describe the procedure for the integration of these different data, and conflation methods play an important role in systems for updating GDBs, derivation of new cartographic products, densification of digital elevation models, automatic feature extraction and so on. In this article we define each conflation process extensively, together with its evaluation measures and its main application problems, and present a classification of all conflation processes. Finally, we introduce a bibliography which the reader may find useful to further explore the field; it is intended to serve as a starting point and to direct the reader to characteristic research in this area.

Keywords: conflation; data fusion; data integration; interoperability; accuracy

1. Introduction

The domain of Geographical Information System (GIS) research is experiencing a rapid growth of both computational power and quantity of information, making large spatial data archives available over the Internet. Moreover, there is an increasing need to share this information between different users. For this reason GIS agencies have adopted a spatial data infrastructure (SDI) model (Bernard et al. 2005, Masser 2005). Maintaining an SDI implies the development of initiatives and associations to formalize global, international, national and regional infrastructures that create effective frameworks for data interchange, including INSPIRE (INfrastructure for SPatial InfoRmation in Europe) (Directive 2007/2/CE; EU 2007), SEIS (Shared Environmental Information System) (SEIS 2008), the OGC (Open Geospatial Consortium) and the Technical Committee 211 of ISO, among others, with special attention to the ISO 19100 family of standards.

The previous situation allows us to develop geospatial databases (GDBs) from heterogeneous sources, which cover the same geographical zone, describe the same information in different forms and vary in density and accuracy (Beller et al. 1997). In this context the general term conflation is used to describe the same procedure that other authors (Thakkar and Knoblock 2003, 2004, Michalowski et al. 2004, Olteanu et al. 2006, Butenuth et al. 2007) have defined as data integration of these heterogeneous sources, arising from the need to combine geographical information of several scales and precisions (Kyriakidis et al.


*Corresponding author. Email: maurena@ujaen.es

ISSN 1365-8816 print/ISSN 1362-3087 online © 2011 Taylor & Francis


1999), transferring attributes from one dataset to another or adding missing features. More vague are the definitions of Cobb et al. (1998, 2000) or Edwards and Simpson (2002), which refer to the conflation process as the action of unifying or integrating two different GDBs to obtain an enriched product 'better' than the previous two. This definition agrees with the traditional definition of data fusion that is commonly used in the computer science and remote sensing fields (Csathó and Schenk 1998, Lee and Shan 2003, Bartels et al. 2006, Chen et al. 2008, Elaksher 2008), and mainly for urban areas (Cornet et al. 2001, Fanelli et al. 2001, Wald and Ranchin 2001). According to Stankutė and Asche (2009), the fundamental concept of data fusion is the extraction of the best-fit geometry data as well as the most suitable semantic data from existing datasets, so that the extracted data features are subsequently amalgamated into a newly created dataset. These authors recognize the approaches of White (1981) and Saalfeld (1985) as the first ones in the domain of data fusion or data integration. Finally, the definition of conflation by Casado (2006) is more explicit and introduces the basic concept of the conflation process, which is to identify the homologous elements between both GDBs and to perform a suitable transformation which brings one map onto the other.

According to Brovelli and Zambroni (2004), although the term map conflation was coined in the early 1980s by Saalfeld, we cannot consider it a reality until the middle of that decade, when it appeared in the works of Lynch and Saalfeld (1985), Rosen and Saalfeld (1985), Saalfeld (1985, 1988), Fagan and Soehngen (1987) and Lupien and Moreland (1987). In these works the conflation process is considered as the main consequence of three factors: (i) the need to compile a great number of digital maps at a lower time cost, (ii) a level of technological development sufficient to support interactive and real-time management of a great quantity of images and maps and (iii) the rapid development and implementation of mathematical algorithms in computational geometry environments (Preparata and Shamos 1985). This allowed the development of the software needed by conflation systems (Saalfeld 1988), which were able to employ new triangulation routines (Gillman 1985, Saalfeld 1985), topologic transformations (White 1981, Griffin and White 1985, Saalfeld 1985) and pattern recognition techniques (Pavlidis 1982, Saalfeld 1987).

GDB conflation can be divided into two phases: the identification of possible correspondences between elements (matching) and the alignment of these matchings (Gillman 1985, Gabay and Doytsher 1994). Traditionally, these phases have been executed in an interactive way, as indicated by Lupien and Moreland (1987) and Saalfeld (1988). Although the identification problem has been resolved relatively easily, ensuring the correctness of the matching has been more complex, as mentioned by Saalfeld (1985), Walter and Fritsch (1999) and Uitermark (2001). To overcome the matching problem, Cobb et al. (1998) and Chen et al. (2004) considered that the conflation procedure can be redefined as three phases: feature matching between spatial data; ensuring that there is no inappropriate matching and that the differences between matched objects are only apparent; and correcting spatial data or creating new integrated data so that apparent differences are eliminated. Following Cobb et al. (1998), feature matching can be considered as a type of classification problem that can be handled through theories of evidential reasoning or uncertainty, such as fuzzy logic. In this sense, we note the works of Foley and Petry (2000) and Rahimi et al. (2006). The three previously mentioned phases are completed by Yuan and Tao (1999) with a preliminary stage of data pre-processing. This is used to standardize the input data, thus assuring confrontability. Veregin and Giordano (1994) defined confrontability as the level at which it is possible to fuse spatial datasets that occupy the same geographical region. With this idea in mind, the factors used to assure confrontability are the resolution level and generalization, the cartographic scale, the data format and the projection.


Finally, we note that the conflation processes can be used to solve several practical problems, such as spatial discrepancy deletion (Yuan and Tao 1999), spatial feature (or attribute) transfer in the updating processes of GDBs (Tomaselli 1994, Dallal 1998) or the development of new products that result from integrating GDBs from different sources (Cobb et al. 1998).

In this article we define each conflation process, its evaluation measures and its main application problems, with the aim of serving as a starting point and directing the reader to characteristic research in this area. Following this order, we have organized this article as follows: in Sections 3–6 we describe the conflation processes according to the proposed classification; in Sections 7 and 8 we catalogue the main applications of the processes and analyse the main conflation software systems. At the end of the article there is a bibliography, which the reader may find useful to further explore the field.

2. Classification of conflation processes

The classification of conflation processes is a very complex issue because of the need to structure the several approaches from the existing literature and classify the solutions proposed by these approaches. To achieve a significant classification, we must take into account several aspects of the processes (Table 1): not only the matching criteria used (Casado 2006) or the categorization problem (Yuan and Tao 1999) but also the representation model or the automatization factor used. Finally, we note that a specific kind of application or conflation process does not necessarily have to be included in a unique category or class; it can be associated with several of them.

3. Classification of conflation processes according to the matching criteria used

Casado (2006) classified the conflation problems of GDBs using as the main criteria the properties used to compare and match both GDBs. We distinguish, then, between the processes of geometric, semantic and topological conflation, according to the criteria used to match the objects. These conflation processes are complementary.

The proposed conceptual model corresponds to a general conflation process, shown in Figure 1. In this process, after testing that two GDBs have the same format, scale, cartographic projection and reference system, we begin to determine the homologous elements of both datasets, using semantic filters and ontologies as debug operators that separate the relevant elements from those useless to the process. Once we have obtained the homologous elements, it is possible to evaluate the differences. Both intermediate stages can be included as useful outputs of the conflation process (without the need to go further in the process), as

Table 1. Classification of conflation processes.

According to the matching criteria used:
- Geometric conflation
- Semantic conflation
- Topological conflation

According to the representation model used:
- Conflation between two vector GDBs
- Conflation between a vector GDB and an image
- Conflation between two images
- Conflation between an image and a DEM
- Conflation between two DEMs

According to the categorization problem:
- Vertical conflation
- Horizontal conflation
- Temporal conflation

According to the automatization level applied:
- Automatic conflation
- Semiautomatic conflation
- Manual conflation


mentioned by Kucera and Clarke (2005). However, once the differences have been evaluated, we can also establish the most appropriate geometric adjustments for the transformation of both GDBs. These transformations include the topological modifications as constraints. After erasing the possible differences, both GDBs can be matched.

3.1. Geometric conflation

3.1.1. Conceptualization

In the case of geometric conflation, when we are working with two GDBs, the problem is defined as how to transform the features of one map onto another (target map), minimizing the geometric differences between them (Casado 2006). These differences are presented as a loss of positional interoperability created by displacements between the elements of one GDB and the elements of the second one.

3.1.2. Evaluation measures

We propose grouping the sets of geometric evaluation measures for testing positional interoperability into two classes: absolute and relative (or probabilistic) measures.

(1) Absolute measures are expressed in absolute terms. Matching between elements is achieved when the selected parameter is lower than a predefined threshold. Distance is the main absolute geometric parameter used to establish the differences between the two GDBs. The distance employed is a function of the kind of element that we are trying to match. Thus, it is convenient to use the Euclidean distance to match points, whereas the average distance (McMaster 1986), the Hausdorff distance (Hausdorff 1919, Mustière 1995, Yuan and Tao 1999, Deng et al. 2005) or the discrete Fréchet distance (Fréchet 1906, Alt and Godau 1995, Devogele 2002) are

Figure 1. Conceptual framework for the conflation process between two GDBs. (The flowchart shows GDB1 and GDB2 passing through pre-processing of resolution, generalization, scale, format and cartographic projection; determination of homologous elements via semantic filtering and ontologies; evaluation of differences for points, lines and polygons; and geometric transformations with topological constraints, yielding GDB1′ and GDB2′.)


generally used to match lines. However, in the case of linear features, not all the mentioned distances are appropriate. The Hausdorff distance only takes into account the sets of points on both curves and does not reflect the course of the lines (Alt and Buchin 2005), so it can happen that two lines with a small Hausdorff distance do not resemble each other at all. On the other hand, the average distance depends on the points selected on the two lines, so that a different selection of points can change the distance between them. To overcome this problem, it is more appropriate to use the Fréchet distance because of its greater robustness with regard to noise in the data. Kundu (2006) proposed an alternative measure of distance between lines that optimizes matching using a previous transformation of both orientation and position. However, there are examples where distances are not suited to handle conflation problems, and there are other measurement parameters that can be used to match lines, such as the angles or orientation, the geopositional relationships or the measures of similitude (McMaster 1986, Mustière 1995) described in units of length or curvature. Finally, the motivation for multiple polyline-to-polygon matching is twofold: first, the matching of shapes has been performed mostly by comparing them as a whole (Arkin et al. 1991, Rosin 1993, Mokhtarian et al. 1996, Pentland et al. 1996, Siddiqi et al. 1999, Latecki and Lakämper 2000, Veltkamp and Hagedoorn 2001, Samal et al. 2004), which fails when a significant part of one shape is distorted by noise; second, partial matching helps to identify similarities even when a significant portion of one shape boundary is occluded or distorted (Tanase 2005).

(2) Relative or probabilistic measures delimit the conflation zone, as described by Savary and Zeitouni (2005), and are expressed in probabilistic or relative units. One technique is the buffer method, where a distance d is defined and associated with a geometric object x belonging to GDB1. Each object (belonging to GDB2) whose distance to the object x is less than d has a great probability of being matched with x. The matching with the highest degree of probability is used to resolve the conflation process. This model, developed by Walter and Fritsch (1999), has been improved later by Mantel and Lipeck (2004), Stigmar (2005) and Zhang et al. (2005). The second technique is the epsilon band method, where a tolerance zone is associated with the points and segments composing the polylines. In this method, a circle of tolerance is associated with each point, whose radius changes according to the nature of the represented point. Then the circles associated with each end of a segment are linked by their common tangents to build the tolerance band (Gabay and Doytsher 1994).
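The contrast between the absolute distance measures discussed above can be sketched in a few lines of Python (an illustrative implementation, not taken from the works cited; the discrete Fréchet distance follows the classic Eiter-Mannila dynamic programme):

```python
from math import hypot

def euclid(p, q):
    """Euclidean distance between two 2-D points."""
    return hypot(p[0] - q[0], p[1] - q[1])

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two vertex sequences.

    Only the point *sets* matter; the order of the vertices is ignored.
    """
    d_ab = max(min(euclid(a, b) for b in B) for a in A)
    d_ba = max(min(euclid(b, a) for a in A) for b in B)
    return max(d_ab, d_ba)

def discrete_frechet(A, B):
    """Discrete Frechet distance: the coupling must walk both
    polylines monotonically, so the course of the lines counts."""
    n, m = len(A), len(B)
    c = [[0.0] * m for _ in range(n)]
    c[0][0] = euclid(A[0], B[0])
    for i in range(1, n):
        c[i][0] = max(c[i - 1][0], euclid(A[i], B[0]))
    for j in range(1, m):
        c[0][j] = max(c[0][j - 1], euclid(A[0], B[j]))
    for i in range(1, n):
        for j in range(1, m):
            c[i][j] = max(euclid(A[i], B[j]),
                          min(c[i - 1][j], c[i][j - 1], c[i - 1][j - 1]))
    return c[n - 1][m - 1]

# Two polylines through the *same* vertex set, visited in a different order:
A = [(0, 0), (1, 0), (2, 0), (3, 0)]
B = [(0, 0), (2, 0), (1, 0), (3, 0)]
print(hausdorff(A, B))         # 0.0 -- identical point sets
print(discrete_frechet(A, B))  # 1.0 -- the monotone coupling is penalized
```

On these two polylines the Hausdorff distance is zero while the discrete Fréchet distance is not, which illustrates why the former does not reflect the course of the lines.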

3.1.3. Applications

We note that most studies of conflation refer to the geometric aspect of the process, perhaps because its interest and evidence grow every time two different GDBs are combined. Geometric adjustment operations have always been based on dimensional transformations, among which we note the Helmert transformation (Watson 2006), the affine transformation (Tobler 1994), the rubber-sheeting method (Gillman 1985, Griffin and White 1985, Saalfeld 1993, Doytsher 2000, Petry and Somodevilla 2000, Doytsher et al. 2001, Kang 2002, Shimizu and Fuse 2003, Haunert 2005), the conformal transformations based on analytical functions (Ward-Brown and Churchill 2004) and the special transformation functions called 'multiresolution splines' (Brovelli and Zambroni 2004).
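As an illustration of the simplest of these dimensional transformations, the following Python sketch estimates a 4-parameter (2-D similarity) Helmert transformation from homologous control points by least squares; the control point coordinates are invented for the example and the formulas are the standard centred normal equations, not code from any of the cited works:

```python
def fit_helmert(src, dst):
    """Least squares estimate of the 4-parameter Helmert transform
    x' = a*x - b*y + tx,  y' = b*x + a*y + ty
    mapping control points `src` onto `dst` (at least two distinct points)."""
    n = len(src)
    mx = sum(x for x, _ in src) / n; my = sum(y for _, y in src) / n
    mX = sum(X for X, _ in dst) / n; mY = sum(Y for _, Y in dst) / n
    num_a = num_b = den = 0.0
    for (x, y), (X, Y) in zip(src, dst):
        cx, cy, cX, cY = x - mx, y - my, X - mX, Y - mY
        num_a += cx * cX + cy * cY      # normal equation for a
        num_b += cx * cY - cy * cX      # normal equation for b
        den += cx * cx + cy * cy
    a, b = num_a / den, num_b / den
    tx = mX - (a * mx - b * my)
    ty = mY - (b * mx + a * my)
    return a, b, tx, ty

def apply_helmert(params, p):
    a, b, tx, ty = params
    return (a * p[0] - b * p[1] + tx, b * p[0] + a * p[1] + ty)

# Control points related by scale 2, rotation 90 degrees, shift (1, 1):
src = [(0, 0), (1, 0), (0, 1), (1, 1)]
dst = [(1, 1), (1, 3), (-1, 1), (-1, 3)]
params = fit_helmert(src, dst)   # recovers a = 0, b = 2, tx = 1, ty = 1
```

With redundant, noisy control points the same formulas yield the least squares best fit rather than an exact transform.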


3.2. Semantic conflation

3.2.1. Conceptualization

The geometric transformation of a GDB to minimize the positional differences with respect to a second GDB does not guarantee the correct matching of the homologous elements of those GDBs (Casado 2006). This error in matching is mainly due to semantic heterogeneity (differences in the intended meaning of terms in specific contexts) (Stuckenschmidt 2003). The semantic heterogeneity of GDBs has been addressed in many works (Savary and Zeitouni 2003, Worboys and Duckham 2005, Maué 2008, Shvaiko and Euzenat 2008), and the corresponding evaluation measures are based on the analysis of the semantic relationships established between their elements (Chen et al. 2005). In this context, semantic conflation tries to minimize the semantic differences between two GDBs caused by semantic heterogeneity.

3.2.2. Evaluation measures

One of the preliminary tasks of the semantic evaluation of conflation processes is to apply semantic filters (Savary and Zeitouni 2005). A semantic filter removes all those entities that are irrelevant to the execution of the process. Once the filtering is completed, the semantic relationships among the entities have to be established. To obtain the evaluation, two sets of measures are proposed: those based on ontologies and those using artificial intelligence.

(1) Measures based on ontologies. An ontology is a logical theory that describes a domain of interest and a specification of the meaning of the terms used in the vocabulary (Vaccari et al. 2009). Based on the precision of this specification, the notion of ontology includes various data and conceptual models (Euzenat and Shvaiko 2007). Ontologies provide new solutions to the semantic heterogeneity problem in many applications, including the integration of GDBs (Morocho et al. 2003, Giunchiglia et al. 2008) and the retrieval of geographical information (Lutz and Klien 2006, Klien 2007). There are different research approaches for semantic integration measures based on ontologies. Fonseca et al. (2002) took a top-down approach by starting from ontologies and using the concept of role to handle different conceptual views of geospatial information.

Rodríguez and Egenhofer (2003) based the measures on a similarity analysis of concepts described in independent ontologies. Kovalerchuk et al. (2005b) provided a framework for an imagery virtual expert system that supports imagery registration and conflation tasks based on iconized ontologies. This approach generates ontological iconic annotations of images in order to compare and conflate images at the conceptual ontological level. Vaccari et al. (2009) adopted a particular type of ontology matching, namely, structure preserving semantic matching (SPSM). This matching operation takes two graph-like structures and produces a set of correspondences between those nodes of the graphs that correspond semantically to one another (Giunchiglia et al. 2008).

(2) Measures based on artificial intelligence use agent technology (Brodie 1992, Robertson 2004) or mediators (Wiederhold 1994) as the main method for resolving problems of semantic interoperation. According to Wiederhold (1994), mediation is an integrating concept, combining a number of current technologies to find and transform data. A mediator is interchange software that allows the localization, transformation and integration of geospatial data from different sources using semantic interpretation. Moreover, the use of mediators eases the access to a great variety of sources (Gravano et al. 1994), the selection of the most relevant information and the evaluation of the incorrect matching level achieved (Wiederhold 1994).
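As a toy illustration only, far simpler than the ontology-based and mediator-based measures cited above, the following Python sketch scores candidate correspondences between the feature classes of two GDBs by the Jaccard similarity of the term sets attached to them; all class names and vocabularies are invented for the example:

```python
def jaccard(a, b):
    """Jaccard similarity between two term sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def match_classes(classes1, classes2, threshold=0.4):
    """Pair each feature class of GDB1 with the most similar class of
    GDB2, keeping only the pairs whose score reaches `threshold`."""
    pairs = []
    for name1, terms1 in classes1.items():
        name2, terms2 = max(classes2.items(),
                            key=lambda kv: jaccard(terms1, kv[1]))
        score = jaccard(terms1, terms2)
        if score >= threshold:
            pairs.append((name1, name2, score))
    return pairs

# Hypothetical vocabularies attached to the feature classes of two GDBs:
gdb1 = {"road":  {"carriageway", "paved", "transport", "line"},
        "river": {"watercourse", "hydrography", "line"}}
gdb2 = {"highway": {"carriageway", "transport", "line", "asphalt"},
        "stream":  {"watercourse", "hydrography", "line", "natural"}}
print(match_classes(gdb1, gdb2))
```

Here "road" pairs with "highway" (score 0.6) and "river" with "stream" (score 0.75); real similarity measures, such as that of Rodríguez and Egenhofer (2003), operate on full ontologies rather than flat term sets.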


3.2.3. Applications

The most widely used application of the semantic conflation processes is the homogenization of the existing feature classes on a map. The existing information is analysed and a new feature classification, consistent with the data and the scale, is generated. This can result in a class grouping with fewer feature classes (Casado 2006). The simplest case that can be resolved, where the correspondence is obvious, is where the two GDBs have the same set of attributes, relations between them and categories (Yuan and Tao 1999).

3.3. Topological conflation

3.3.1. Conceptualization

Even after removing the positional differences and assuring the semantic correspondences, there is no guarantee of a correct matching between two GDBs. It is necessary that, simultaneously, the topological relationships are preserved (Egenhofer and Franzosa 1991, Li et al. 2002). In this sense, it can be stated that topological conflation is a consequence of, and hence the basis for, the two previously described conflation processes (geometric and semantic). This is because each positional change of the entities creates the need to generate a new topology of the GDBs, and topological conflation is the complement used to optimize geometric and semantic adjustments.

3.3.2. Evaluation measures

In this case, we cannot define evaluation measures as before. Topological conflation takes a different approach, which uses the topological information for a global adjustment. The evaluation measures are grouped into active and passive measures.

(1) Active measures are based on the active participation of the information obtained from the topological relations between the two GDBs. The active measures pursue the integration of the topological information in a global adjustment, and so improve the quality of the matching between the entities of the GDBs using the topological relationships. To obtain this improvement, the relationships must be actively applied in global adjustment procedures, which help to preserve the topology. In the case presented by Hope et al. (2006) and Hope and Kealy (2008), the topological relationships are introduced as constraints in a geometric least squares adjustment to optimize the global adjustment.

(2) Passive measures use the topology as a test element of the geometric conflation process between GDBs without actively intervening in it. These are the cases presented by Filin and Doytsher (2000), who developed a matching validation procedure, named round-trip walk, which tests the correctness of the matching based on topological relationships from all the possible candidates; Mustière and Devogele (2008), who propose a matching process, named NetMatcher, based on the comparison of geometric, attributive and topological properties of objects; and Tong et al. (2009), who propose a probability-based feature matching method integrating multiple measures: geometric, semantic and topological.
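A minimal Python sketch of a passive topological test in this spirit (an illustration, not the round-trip walk or NetMatcher algorithms themselves) checks whether a candidate node matching between two networks preserves edge connectivity; the networks and the matching are hypothetical:

```python
def topology_preserved(edges1, edges2, matching):
    """Passive topological test: every edge between matched nodes of one
    network must have a counterpart between the matched nodes of the
    other. Returns the offending edges (empty list = topology preserved)."""
    inv = {v: k for k, v in matching.items()}
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    conflicts = []
    for u, v in edges1:
        if u in matching and v in matching:
            if frozenset((matching[u], matching[v])) not in e2:
                conflicts.append((u, v))
    for u, v in edges2:
        if u in inv and v in inv:
            if frozenset((inv[u], inv[v])) not in e1:
                conflicts.append((u, v))
    return conflicts

# Two small road networks and a candidate node matching (hypothetical):
edges_a = [("a1", "a2"), ("a2", "a3")]
edges_b = [("b1", "b2"), ("b1", "b3")]
matching = {"a1": "b1", "a2": "b2", "a3": "b3"}
print(topology_preserved(edges_a, edges_b, matching))
```

Here the edge a2-a3 has no counterpart b2-b3, and b1-b3 has no counterpart a1-a3, so the candidate matching would be rejected by this passive test even if the positional fit were good.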


3.3.3. Applications

Topological conflation can be used to reduce the matching search in the geometric conflation processes, or even to test the results obtained from such processes (Yuan and Tao 1999). However, as described by Hope and Kealy (2008), the passive use considerably limits the real possibilities of topology as a part of the conflation processes. Another application of topology as an integrator between GDBs is to extend the homologous element search, and its corresponding matching, to all types of networks. This requires a high topological similitude between the GDBs, as described in the method called 'Topological Transfer' (Tomaselli 1994).

4. Classification of conflation processes according to the representation model used

Before we can classify the conflation processes based on the model used, it is necessary to briefly define the models. Geographical information can be stored in vector or raster format (Burrough 1986). The raster model, and thus the raster format, divides the space into cells of homogeneous size, each having one value. As the cell size increases, the resolution decreases, and so does the precision of the representation of the geographical information. This model is mainly used in studies that require continuous layers to represent non-discrete phenomena. In this sense, a digital image and a digital elevation model (DEM) can be considered as particular cases of raster products. In the vector model, on the other hand, positional precision is the most important attribute of each element, without obliging the phenomena to have a discrete representation. This second model uses three kinds of geometric elements to represent real-world entities: points, lines and polygons.

4.1. Conflation between vector GDBs (V vs. V)

4.1.1. Conceptualization

The origin of the conflation processes between vector GDBs lies in the need to optimize products. In the case of comparing two different data sources, one having greater precision in attributes and the other in position, we would desire a fusion of both sources to obtain one product with the best characteristics of both. Saalfeld (1985, 1987) and Lynch and Saalfeld (1985) were the first to propose a matching methodology between two vector GDBs.

4.1.2. Difficulties, entities used and applications

Difficulties arise because of matching complexity, mainly owing to the heterogeneity of the characteristics (form, position or scale) of the products compared. To match two vector GDBs, we can use points (Lynch and Saalfeld 1985, Saalfeld 1985, 1987, VITAL 1997, Kang 2001, 2002, Volz 2006, Schuurman et al. 2006), lines (Saalfeld 1988, Walter and Fritsch 1999, Doytsher 2000, Gabay and Doytsher 2000, Brovelli and Zambroni 2004, Mantel and Lipeck 2004, Kampshoff 2005, Stigmar 2005, Zhang et al. 2005) or polygons (Arkin et al. 1991, Gombosi et al. 2003, Masuyama 2006).
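A minimal Python sketch of point-based matching between two vector GDBs, reusing the buffer idea of Section 3.1.2 with invented coordinates (an illustration only, not any of the cited methods):

```python
from math import hypot

def buffer_match(points1, points2, d):
    """Buffer-method sketch: a point of GDB2 is a candidate match for a
    point x of GDB1 when it falls inside the buffer of radius `d`
    around x; among the candidates, the closest one wins."""
    matches = {}
    for i, p in enumerate(points1):
        candidates = [(hypot(p[0] - q[0], p[1] - q[1]), j)
                      for j, q in enumerate(points2)
                      if hypot(p[0] - q[0], p[1] - q[1]) <= d]
        if candidates:
            matches[i] = min(candidates)[1]   # nearest candidate
    return matches

# Hypothetical homologous point sets from two vector GDBs (map units):
pts1 = [(0.0, 0.0), (10.0, 10.0), (50.0, 50.0)]
pts2 = [(0.4, -0.2), (10.3, 9.8), (30.0, 30.0)]
print(buffer_match(pts1, pts2, d=1.0))   # {0: 0, 1: 1} -- no match for point 2
```

The third point of GDB1 finds no candidate within the buffer and is left unmatched, exactly the situation in which a feature is missing or grossly displaced in the second dataset.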

Examples of this process are the combination of two sets of topographical data, the detection of temporal changes in GDBs and the conflation of digital gazetteer (DG) data (Hastings 2008). The first case is described by Casado (2006), who analysed the most appropriate geometric transformations to reduce the differences between the GDBs. Figure 2 shows two kinds of geometric differences described by Casado (2006): in (a) a crossing is translated without preserving the angles or orientations of the lines, whereas in (b) there is a simple translation of the junction between the lines. The detection of temporal changes in


GDBs is described by several authors, such as Schuurman et al. (2006) in the Canadian census case, or Kang (2001) in the US Census Bureau case.

4.2. Conflation between a vector GDB and an image (V vs. I)

4.2.1. Conceptualization

This conflation obtains products by integrating the structuring and modelling of the vector GDBs and the semantic component of the images. The conflation processes between a vector GDB and an image were developed as a result of the identification and extraction of features (roads or buildings) from imagery (Fischler et al. 1981, Gugan and Dowman 1988, Fortier et al. 1999, Price 1999, Nevatia and Price 2002, Song et al. 2006, Doucette et al. 2007, Kovalerchuk et al. 2008). However, Hild and Fritsch (1998), who drew inspiration from the method for the vectorization of land maps of Musavi et al. (1988), and mainly Chen et al. (2003, 2004), developed a global alignment methodology between vectors and images. Wu et al. (2007) proposed breaking the global alignment problem into a set of local domains where the displacement between imagery and vectors is approximated by a translation.

4.2.2. Difficulties, entities used and applications

In this case the main issue is the complexity inherent in determining the minimum precision of the homologous elements of the image. The main entities used are points. Chen et al. (2003) proposed the use of road junctions as points to accomplish matching. Several authors (Chen et al. 2003, Chiang et al. 2005, 2009) utilized techniques based on observations of pixel tone to extract pixels from images. They implement a method that analyses the shape of the greyscale histogram and classifies the histogram clusters based on their sizes. Thus, the goal of these techniques is the partition of the original image into a set of regions that are visually distinct and uniform with respect to certain statistical properties. However, the majority of real image regions do not display this statistical uniformity because of the remaining noise coming from small objects. To solve these problems, Ruiz et al. (2011) developed an algorithm for identifying and extracting pixels that belong to road intersections based on a non-parametric approach to texture analysis. They measure the distributions of simple texture by means of local binary patterns (LBP) (Ojala et al. 1996, Ojala and Pietikäinen 1999).
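The basic LBP operator can be sketched as follows (a minimal Python illustration of the 8-neighbour pattern, not the implementation of Ruiz et al. or of Ojala et al.; the 3 x 3 grey-level patch is invented):

```python
def lbp_code(image, r, c):
    """8-neighbour local binary pattern of pixel (r, c): each neighbour
    that is >= the centre contributes one bit, read clockwise from the
    top-left neighbour."""
    centre = image[r][c]
    # neighbour offsets, clockwise from the top-left corner
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offs):
        if image[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code

patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_code(patch, 1, 1))   # 241 -- binary 11110001, clockwise
```

The histogram of such codes over a window gives the texture distribution on which the non-parametric classification of, for example, road-intersection pixels can be based.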

A highly representative example of overlapping between a vector GDB and an image on the web is the representation of the cadastral parcel mapping over orthophotos at 1 m resolution by the Spanish virtual cadastral office (Figure 3). Other important applications of this conflation process are (i) the updating of GDBs making use of high-resolution

Figure 2. Examples of geometric differences between two vector GDBs.


imagery (Song et al. 2006, 2009), (ii) the possibility of integrating information from spatial information systems such as GIS and searchable databases of geo-referenced imagery (Gupta et al. 1999, Dare 2000) and (iii) the possibility of detecting inconsistencies in vector data using oblique images (Mishra 2008).

4.3. Conflation between two images (I vs. I)

4.3.1. Conceptualization

The combination or integration of images enriches the resulting product and gives us more possibilities for its applications such as remote sensing or photogrammetric processing.4.3.2.Difficulties,entities used and applications

The problem of imagery conflation requires sophisticated and robust methods to produce better image fusion, target recognition and tracking (Kovalerchuk et al. 2005a). From the initial techniques using points (Brookshire et al. 1990, Besl and McKay 1992, Feldmar and Ayache 1994, Zhang 1994, Afek and Brand 1998, Dare 2000, Wang et al. 2001, Brown 2002, Terzopoulos et al. 2003, Shah and Xiao 2005), the field has evolved to (i) the use of defined areas with a determined number of image pixels, using the radiometric level as an attribute for correlation techniques (Kovalerchuk and Schwing 2002, Kovalerchuk and Sumner 2003, Kovalerchuk et al. 2004), (ii) the use of new algebraic structural invariant approaches (invariant to geometric distortions and changes of image resolution) to identify corresponding linear and area features in two images (Kovalerchuk et al. 2005a, 2006, Kovalerchuk 2007) and (iii) the use of geostatistical approaches for quantifying the spatial autocorrelation inherent in regionalized variables (Zhang et al. 2009).
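As an illustration of the area-based correlation techniques mentioned in (i), the sketch below (a generic example, not code from the cited works) locates a template patch in an image by exhaustive normalised cross-correlation; operational systems add image pyramids, subpixel refinement and robustness to radiometric change:

```python
import numpy as np

def ncc(a, b):
    """Normalised cross-correlation between two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom else 0.0

def match_template(image, template):
    """Exhaustively search for the template position that maximises NCC."""
    th, tw = template.shape
    best_score, best_pos = -2.0, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            score = ncc(image[r:r + th, c:c + tw].astype(float),
                        template.astype(float))
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score
```

The maximising position gives the homologous location of the patch in the second image; a score close to 1 indicates a reliable radiometric match.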

Conflation between two images is a common process both in multispectral remote sensing studies (Zhou 1994) and in the alignment of aerial images (Wang et al. 2003). In the first case, band combination is a commonly used procedure, because even in multitemporal studies it is necessary to use scenes of the same geographical zone obtained at different dates. The second case includes digital photogrammetry, where the focus is on automatic orientation processes that determine homologous points. This is achieved by matching the radiometric patterns of the two images (Schenk 1994) (Figure 4). Another important application is cartographic motion representation, which increases our understanding of complex spatial evolutionary processes (Acevedo and Masuoka 1997, Sidiropoulos et al. 2005).

Figure 3. Examples of superposition of a vector GDB on an image. In both cases we can appreciate the lack of coherence between the vector elements (border of the path, parcel boundary line or buildings) and the digital orthophoto. Source: Virtual Cadastral Office. https://ovc.catastro.meh.es.

1448 J. J. Ruiz et al.

4.4. Conflation between images and DEMs (I vs. E)

4.4.1. Conceptualization

This conflation is carried out to obtain products that integrate the modelling capabilities of a DEM or a surface elevation model (SEM) with the semantic and interpretative component of an image.

4.4.2. Difficulties, entities used and applications

The main difficulties arise when comparing products having height information, DEM or SEM, with others that only have radiometric data. For this reason, it can be said that this conflation process has a high estimative component, because visual evaluation tasks are of great importance. Even if the image has no height information, it can be derived from the resulting integrated information. Using this information, the comparison can be carried out by matching these new entities with the same ones from the DEM or SEM.

The main processes of conflation between images and DEMs are the 3D reconstruction of urban zones using an SEM and the 3D reconstruction of buildings (Figure 5). In these cases the height data are obtained from a laser sensor, for example a laser scanner, without a clear definition of the break lines of the 3D model (Haala 1994, Kraus and Pfeifer 1998, Ackermann 1999). These break lines of the SEM must be defined using other methodologies. McIntosh and Krupnik (2002) reconstructed the boundaries of buildings by overlaying the original SEM with aerial imagery. Another application is the development of flight simulations which, by allowing us to see buildings in 3D, change the way we read maps (Sidiropoulos et al. 2005).

Figure 4. Matching with radiometric patterns. In both images we have marked the search zone.


4.5. Conflation between DEMs (E vs. E)

4.5.1. Conceptualization

The combination of DEMs is carried out to remove gaps or discontinuities in height data and to obtain a complete and continuous representation of the terrain, with continuity expressed in terms of both continuous height representation and continuous topological representation (Katzil and Doytsher 2003). We must note that the concept of DEM is used here in a wide sense, including several height data models (raster, disperse points, triangular irregular networks, etc.) obtained from different sources that can use different formats and densities.

4.5.2. Difficulties, entities used and applications

The methodologies used to combine overlapping DEMs offer only a partial solution with regard to the completeness and continuity requirements, as they address only the height representation of the terrain, but not its characteristics (Laurini 1998). The entities used are height points included in both DEMs.

Following Katzil and Doytsher (2003), two types of methodologies are used to merge overlapping terrain databases: (i) cut and paste: the less accurate (usually lower density) database is replaced with the more accurate (usually higher density) database in the overlapping zones; and (ii) height smoothing: heights of the merged database in the edge zones between the two databases are calculated using a weighted average of heights from both databases, with the weighting defined as a function of the different database accuracy levels. However, in both methodologies the nature of the terrain is not preserved. To solve this problem, Katzil and Doytsher (2006) proposed the use of overlapping adjacent DEMs, assuming the existence of a set of homologous point pairs between the two DTMs (extracted using the scale-invariant feature transform described in Lowe 2004) and integrating both datasets with continuity expressed in terms of continuous height representation and continuous topological representation (morphological structures) (Figure 6). Kyriakidis et al. (1999) employed a geostatistical approach for integrating elevation estimates derived from DEMs with elevation measurements of higher accuracy. Both sets of data are employed for modelling the unknown elevation surface in a way that properly reflects the relative reliability of the two sources of information. Stochastic conditional simulation is performed to generate alternative representations of the unknown surface. From this set of representations, the probability that the unknown value is greater than that reported at each node in the DEM is determined.

Figure 5. 3D reconstruction of urban zones and buildings using an SEM.
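The "height smoothing" strategy can be illustrated with a one-dimensional sketch. Here the weights simply ramp linearly across the overlap, whereas in the method of Katzil and Doytsher (2003) they are defined as a function of the accuracy of each database:

```python
import numpy as np

def merge_profiles(left, right, overlap):
    """Merge two height profiles that share `overlap` samples.

    'Cut and paste' would simply take `right` over the shared zone; here
    the shared samples are blended with weights ramping from the left
    database (weight 1) to the right one (weight 0)."""
    w = np.linspace(1.0, 0.0, overlap)  # weight given to the left profile
    blend = w * left[-overlap:] + (1.0 - w) * right[:overlap]
    return np.concatenate([left[:-overlap], blend, right[overlap:]])
```

For two flat profiles at 10 and 20 m with a three-sample overlap, the merged profile passes through 10, 15 and 20 m across the shared zone, avoiding the height jump that cut and paste would produce.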

5. Classification of conflation processes according to the categorization problem

Once the conflation processes have been grouped based on the matching criteria and the model of the products used, we proceed to categorize the problem, paying special attention to the description of where the processes are used. Yuan and Tao (1999), as a first approach, classified the conflation problems of GDBs, based on their categorization, in two ways: vertical conflation and horizontal conflation. We have added a third type of conflation, related to the temporal aspect of the process, which we call temporal conflation. The time factor is of great interest because it is the source of the changes in GDBs. In this section we define each process, specifying some uses and citing representative examples.

Figure 6. Merging two adjacent DEMs: (a) Left DEM. (b) Right DEM. (c) Merged DEM using the cut-and-paste method. (d) Merged DEM using the Katzil methodology (Katzil and Doytsher 2003).


5.1. Vertical conflation

5.1.1. Conceptualization

Vertical conflation is concerned with detecting and erasing the differences between spatial datasets that occupy the same geographical region.

5.1.2. Processes where used and difficulties

It is mainly applied to interoperation tasks between GDBs, generally vector GDBs. Among the main problems we note the lack of interoperability between GDBs corresponding to geographical information that is modelled as networks with different levels of detail: roads (Safra et al. 2006), specifically those existing in ITS (Intelligent Transport Systems); railways; electric lines; and rivers (Mustière and Devogele 2008). In the case of roads, the lack of interoperability of GDBs does not allow us, for example, to communicate the localization of an object or an event (moving vehicle, accident, available hotel, highway closure, etc.) unambiguously and in real time to suitably equipped recipients (Noronha et al. 1999).

5.1.3. Solutions and applications

The solutions to these difficulties are based on (i) geometric criteria, mainly related to the development of simulation processes, which allow us to generate positional distortion models using point elements in vector GDBs; and (ii) the comparison of geometric, semantic and topological properties of objects. In the first case, distortion models, or error fields (Figure 7), can be densified using geostatistical techniques to minimize the errors of the methods and processes involved in map production. In the second case, the main advantage is the possibility of incorporating the topological organization to achieve an efficient matching. There are numerous studies related to the interoperability problems between GDBs used in ITS. Among these we consider the work of Volz (2005, 2006), Hunter and Goodchild (1996), Funk et al. (1998), Church et al. (1998) and Noronha et al. (1999), all a consequence of the VITAL project (Vehicle Intelligence & Transportation Analysis Laboratory 1997) of the University of California, USA. With respect to electric lines and railways, we consider the works of Mustière and Devogele (2008).

Figure 7. Superposition of two GDBs and calculation of the error surface generated by the magnitude of the distortion vectors.
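The densification of such an error field can be sketched with a deliberately simple interpolator. Inverse distance weighting is used below purely for illustration; the studies cited rely on geostatistical (kriging-type) techniques:

```python
import numpy as np

def idw_displacement(points, vectors, query, power=2.0):
    """Estimate the distortion vector at `query` from the displacement
    vectors observed at homologous point pairs, by inverse distance
    weighting (a simple stand-in for geostatistical densification)."""
    points = np.asarray(points, dtype=float)
    vectors = np.asarray(vectors, dtype=float)
    d = np.linalg.norm(points - np.asarray(query, dtype=float), axis=1)
    if np.any(d == 0):  # the query coincides with a sampled point
        return vectors[np.argmin(d)]
    w = 1.0 / d ** power
    return (w[:, None] * vectors).sum(axis=0) / w.sum()
```

Evaluating the interpolator over a grid yields a dense error surface such as the one pictured in Figure 7, which can then be applied as a rubber-sheeting correction.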

5.2. Horizontal conflation

5.2.1. Conceptualization

Horizontal conflation is related to detecting and erasing the differences between the common boundaries of adjacent datasets (Gregory and Ell 2006) or between adjacent tiles of an image mosaic (Chen 2008).

5.2.2. Processes where used and difficulties

The development of these processes is usual in cadastral services and in merging adjacent DEMs (Katzil and Doytsher 2003), as previously shown in Figure 6. The main problems with horizontal conflation are: (i) detecting and erasing the spatial differences between common limits (boundaries) of two adjacent GDBs, where these datasets are generated and managed independently by different administrative entities; and (ii) the relative misalignment of image tiles in satellite image mosaics with regard to ground features, such as roads, without considering the issue of absolute accuracy (Chen 2008).

5.2.3. Solutions and applications

The cadastral organization and its information systems must evolve from internal procedures and corporate systems towards real interoperability with public administrations and private partners interested in territorial management (Conejo and Velasco 2007). To achieve this goal, proposed solutions focus on matching the common boundaries of both GDBs using linear entities as the element of comparison. Beard and Chrisman (1988) were the first to define the problem of horizontal conflation, whereas Coren and Doytsher (1998) developed the first algorithm used to find the optimum matching between GDBs. Gregory (2002) and Seung-Hyun et al. (2005) attempted to erase the positional differences between adjacent city boundaries. Other authors, like Chen (2008), adopt a statistical approach to derive critical parameters from sampled features for describing their geometrical deviations across adjacent image tiles.
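A crude sketch of the boundary-matching idea: each vertex of one rendering of the common boundary is snapped to the nearest vertex of the other whenever the two lie within a tolerance. This is an illustrative simplification, not the algorithm of Beard and Chrisman (1988) or Coren and Doytsher (1998):

```python
def snap_boundary(subject, reference, tolerance):
    """Snap each vertex of `subject` to the nearest vertex of `reference`
    whenever the two lie within `tolerance` of each other; vertices
    without a close counterpart are left in place."""
    snapped = []
    for x, y in subject:
        nearest = min(reference,
                      key=lambda p: (p[0] - x) ** 2 + (p[1] - y) ** 2)
        d2 = (nearest[0] - x) ** 2 + (nearest[1] - y) ** 2
        snapped.append(nearest if d2 <= tolerance ** 2 else (x, y))
    return snapped
```

Production edge-matching also adjusts intermediate vertices, preserves topology with neighbouring features and reconciles attributes across the seam; this sketch shows only the core snapping decision.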

5.3. Temporal conflation

5.3.1. Conceptualization

Temporal conflation is related to detecting and eliminating differences between spatial datasets that occupy the same geographical zone at two different points in time.


5.3.2. Processes where used and difficulties

This kind of process is normally used in updating GDBs. These GDBs are repeated and updated from time to time, so, although questions and definitions change (Norris and Mounsey 1983), a wealth of similar information is available over a long period of time (Gregory 2002). These updating processes usually refer to temporal conflation between GDBs that use a vector model. The difficulties of incorporating time in the basic GIS data model are well known; time has long been a thorn in the side of all GIS developers (Morris et al. 2000). The main problem is that of detecting differences between two GDBs of the same location as a consequence of the difference between their respective acquisition times.

5.3.3. Solutions and applications

Generally, the solutions proposed for this problem are based on the comparison of each polygon (e.g. cadastral parcels) included in the first GDB with all the polygons in the second GDB. However, as described by Preparata and Shamos (1985), this solution is the worst possible, because the execution time required by the algorithm is extremely high. Gregory (2002) surveyed a variety of European projects and described how they have attempted to add a temporal dimension to the vector GIS data model to create fully spatio-temporal databases for routinely collected socio-economic statistics. In the literature we consider the works of Shmutter and Doytsher (1992), Doytsher and Gelbman (1995), Coren and Doytsher (1998), Gombosi et al. (2003), Gregory (2002), Gregory and Ell (2006), Masuyama (2006) and Kovalerchuk and Kovalerchuk (2007). These authors develop different methodologies to address the lack of efficiency in time, ranging from finding repetitive geometric structures (between the two vector GDBs) using spatial indexing to matching the polygons of both GDBs using their representative points. Finally, Doucette et al. (2009) presented a quantitative methodology for evaluating the spatial accuracy of automated vector data-updating methods, based on both a timed comparison between manual and automated extraction and measures of spatial accuracy.
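The cost of the all-pairs comparison can be avoided with a spatial index, as the cited methodologies do in more elaborate ways. The sketch below is generic, with hypothetical `cell` and `tol` parameters (it assumes tol ≤ cell so that a 3 × 3 cell neighbourhood suffices), and matches polygons across two GDBs by their representative points:

```python
from collections import defaultdict

def match_by_representative_points(polys_a, polys_b, cell=1.0, tol=0.5):
    """Match polygons of two GDBs through their representative points
    (e.g. centroids), using a grid index instead of all-pairs search.
    `polys_a` and `polys_b` map polygon ids to (x, y) points."""
    grid = defaultdict(list)
    for pid, (x, y) in polys_b.items():
        grid[(int(x // cell), int(y // cell))].append((pid, x, y))
    matches = {}
    for pid, (x, y) in polys_a.items():
        ci, cj = int(x // cell), int(y // cell)
        best, best_d2 = None, tol * tol
        for di in (-1, 0, 1):  # only the 3 x 3 cell neighbourhood is visited
            for dj in (-1, 0, 1):
                for qid, qx, qy in grid[(ci + di, cj + dj)]:
                    d2 = (qx - x) ** 2 + (qy - y) ** 2
                    if d2 <= best_d2:
                        best, best_d2 = qid, d2
        if best is not None:
            matches[pid] = best
    return matches
```

Each polygon now inspects only the candidates in nearby cells rather than every polygon of the second GDB, which is what makes temporal comparison of large cadastral layers tractable.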

6. Level of automatization of conflation processes and conflation post-processing

Until now, implementation limits have not allowed the development of a 'sufficiently intelligent' conflation system that can identify and recognize, on its own, the homologous elements in two different GDBs, thus completing the matching process at a high quality level. For example, in the case of two vector GDBs, achieving an efficient matching requires knowing the geometric, semantic and topological relationships between both datasets, which allow a user to delete or add correspondences that are not correctly matched by the automated algorithm (Xiong 2000). Therefore, in these cases a post-processing task is necessary. In this way, the quality of conflation processes and their success can be defined and evaluated on the basis of (i) the time needed for manual inspection of the results obtained from the automated matching algorithm (editing the results and checking them) (Xiong and Sperling 2004), or the comparison of times between manual and automated conflation (Doucette et al. 2009); and (ii) measures of spatial accuracy obtained from the comparison of the results with ground truth. Both a timed comparison between manual and automated extraction and measures of spatial accuracy are needed. In this sense, Doucette et al. (2009) developed a quantitative and meaningful methodology for evaluating the spatial accuracy of automated vector GDB updating methods.
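The second kind of measure can be made concrete with a minimal sketch: the root-mean-square positional error between conflated features and ground-truth coordinates, paired feature by feature:

```python
import math

def rmse(conflated, truth):
    """Root-mean-square positional error between paired (x, y) tuples:
    conflated feature positions versus surveyed ground truth."""
    sq = [(cx - tx) ** 2 + (cy - ty) ** 2
          for (cx, cy), (tx, ty) in zip(conflated, truth)]
    return math.sqrt(sum(sq) / len(sq))
```

Tracked across successive runs, such a figure (together with the editing time of point (i)) gives a simple, reproducible yardstick for comparing automated and manual conflation.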


We note that the degree of accuracy of the automatic matching classifies the methodology and gives us the chance to use semiautomatic or even completely manual methods. In this sense, Lemarie and Raynal (1996) were the first to establish a classification based on the degree of automatism. Following these authors, conflation processes can be classified as automatic, semiautomatic or completely manual.

In any case, current developments in artificial intelligence, specifically agent theory, mean that both the complexity and the number of tasks that can be completely automated are growing, and the accuracy of matching processes is being improved. Following Arunachalam et al. (2003), an agent can be defined as an autonomous software entity that can solve problems and has the adaptive and learning capacity to adjust its responses based on previous experience. The conflation processes follow this trend. The algorithms have been improved, helping to increase the degree of automatization of these processes. This improvement has been achieved with three main objectives: increasing the number of elements used in the process, reducing the significance of the sampling and reducing the final costs.

7. Catalogue of the main applications of the conflation process

Following the proposed classification, we have selected a set of the most relevant tasks in which conflation processes between GDBs are very important. Table 2 shows the distribution of these tasks based on the products used and the process catalogue. As can be seen in Table 2, most of the vertical conflation processes are focused on obtaining new products and assuring interoperability between them, whereas the temporal conflation process is mainly concerned with updating GDBs.

8. Conflation software systems and solved tasks

With the development of conflation techniques, many private organizations and public entities dedicated to GIS development have implemented their own conflation software systems and commercial conflation tools. Some of these tools are described below.

ConfleX is a conflation tool that uses artificial intelligence (AI) to automatically match GIS features from multiple GDB sources and allows for the transfer of attributes. Once the automated conflation is completed, the system provides extensive tools for review and quality control of the assigned features.

JCS Conflation Suite performs various kinds of geospatial conflation processes. The system supports operations for detecting and visualizing errors, and both automatic and manual cleaning functions to adjust geometry. JCS provides manual editing tools to perform human-assisted conflation for cases that automated methods cannot solve.

MapMerger (developed by ESEA) is conflation software that provides the capability to manage points, lines and polygons between two overlapping GDBs. This system is also capable of both conflating large datasets efficiently and solving the GDB update process quickly.

TotalFit is a process that exactly aligns multiple spatial GDBs and provides an affordable choice between the accuracy of expensive survey-quality data and the inconsistency of non-aligned spatial data. This system is not static: once alignment relations are established, ongoing updates of the spatial data are carried out.


Table 2. Distribution of the more usual applications of the conflation processes.

The table cross-tabulates the frequent applications against the process categorization (Vert., Horiz., Temp.), the models of the GDBs involved (V vs. V, V vs. I, I vs. I, I vs. E, E vs. E) and the entities used (points, lines, polygons). The applications listed are: updating of GDBs and cadastral applications; quality control of cartographic products; densification of DEM and SEM; change detection in attribute and position; erasing the differences between common boundaries; automatic object extraction; mosaic generation; interoperability between GDBs; derivation of new cartographic products; photogrammetric orientation; 3D building reconstruction; flight simulations; adjacent overlapping of GDBs; and attribute transfer. (*Using image pixels. **Using ground elements.)

9. Conclusion and future directions of conflation processes

In this article, an overview of the theoretical and practical development of conflation has been presented. We have analysed the different processes related to conflation between GDBs, with reference to previous studies and research in this area. Moreover, we distinguish between three classification criteria: (i) process categorization, (ii) the models of the GDBs and (iii) the entities used. This classification provides a global and rapid view of the issue, giving a clear idea of what constitutes a conflation of GDBs. Future directions of conflation research include uncertainty propagation during the processes, the development of geostatistical prediction and simulation methods and the development of tools to increase the degree of automatization of conflation processes.

Acknowledgements

This work has been partially funded by the Ministry of Science and Technology of Spain under Grant No. BIA2003-02234 and by the Regional Ministry of Innovation, Science and Enterprise of Andalusia (Spain) under Grant No. P08-TIC-4199. The authors also acknowledge the Regional Government of Andalusia (Spain) for their constant financial support since 1997 to their research group (Ingeniería Cartográfica, Code TEP-164).

References

Acevedo, W. and Masuoka, P., 1997. Time series animation techniques for visualizing urban growth. Computers & Geosciences, 23 (4), 423–435.

Ackermann, F., 1999. Airborne laser scanning - present status and future expectations. ISPRS Journal of Photogrammetry and Remote Sensing, 54 (1), 64–67.

Afek, Y. and Brand, A., 1998. Mosaicking of orthorectified aerial images. Photogrammetric Engineering & Remote Sensing, 64 (2), 115–125.

Alt, H. and Buchin, M., 2005. Semi-computability of the Fréchet distance between surfaces. In: Proceedings of the 21st European Workshop on Computational Geometry, 9–11 March 2005, Eindhoven, Netherlands.

Alt, H. and Godau, M., 1995. Computing the Fréchet distance between two polygonal curves. International Journal of Computational Geometry and Applications, 5 (1/2), 75–91.

Arkin, E., et al., 1991. An efficiently computable metric for comparing polygonal shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13 (3), 209–216.

Arunachalam, R., et al., 2003. The TAC supply chain management game. CMU-CS-03-184. Technical report.

Bartels, M., Wei, H., and Ferryman, J., 2006. Analysis of LIDAR data fused with co-registered bands. In: Proceedings of the IEEE international conference on advanced video and signal-based surveillance, 22–24 November 2006, Sydney, Australia.

Beard, M. and Chrisman, N., 1988. Zipper: a localized approach to edgematching. The American Cartographer, 15 (2), 163–172.

Beller, A., Doytsher, Y., and Shimbersky, E., 1997. Practical linear conflation in an innovative software environment. In: Proceedings of 1997 ACSM/ASPRS, 7–10 April 1997, Seattle, USA.

Bernard, L., et al., 2005. Towards an SDI research agenda. In: Proceedings of the 11th EC-GI & GIS workshop, 29 June–1 July 2005, Alghero, Italy. Available from: http://www.geoinfo.uji.es/pubs/2005-SDIResearchAgenda.pdf [Accessed 23 June 2010].

Besl, P. and McKay, N., 1992. A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14 (2), 239–256.

Brodie, M., 1992. The promise of distributed computing and the challenges of legacy information systems. In: Proceedings of IFIP DS-5 semantics of interoperable database systems, 16–20
