Robustness for Free in Unconditional Multi-Party Computation

Martin Hirt and Ueli Maurer
ETH Zurich, Switzerland
Manuscript, February 12, 2001

Abstract. We present a very efficient multi-party computation protocol unconditionally secure against an active adversary. The security is maximal, i.e., active corruption of up to t < n/3 of the n players is tolerated. The communication complexity for securely evaluating a circuit with m multiplication gates over a finite field is O(mn^2) field elements, including the communication required for simulating broadcast. This corresponds to the complexity of the best known protocols for the passive model, where the corrupted players are guaranteed not to deviate from the protocol. Even in this model, it seems to be unavoidable that for every multiplication gate every player must send a value to every other player, and hence the complexity of our protocol may well be optimal. The constant overhead factor for robustness is small and the protocol is practical.

1 Introduction

1.1 Secure multi-party computation

Secure multi-party computation (MPC), as introduced by Yao [Yao82], allows a set of n players to compute an arbitrary agreed function of their private inputs, even if an adversary may corrupt up to t arbitrary players. Almost any distributed cryptographic protocol can be seen as a multi-party computation, and can be realized with a general MPC protocol. Multi-party computation protocols are an important building block for reducing the required trust and building secure distributed systems. While currently special-purpose protocols (e.g., for collective signing) are considered practical, this paper suggests that general-purpose protocols may also be practical for realistic applications.

Two different notions of corruption are usually considered. A passive (or curious) adversary may only read the information stored by the corrupted players, without controlling the players' behavior. Hence only privacy of the inputs is an issue to consider, but not the correctness of the
result. In contrast, an active adversary can take full control of the corrupted players. Assuring not only the privacy of the inputs, but also the correctness of the outputs (robustness) appears to entail a substantial overhead. For instance, all known protocols make (usually heavy) use of a broadcast sub-protocol, for which the optimal known complexity is O(n^2).

We briefly review the classical results on secure MPC. Goldreich, Micali, and Wigderson [GMW87] presented a protocol, based on cryptographic intractability assumptions, which allows n players to securely compute an arbitrary function even if an active adversary corrupts any t < n/2 of the players. In the secure-channels model, where bilateral secure channels between every pair of players are assumed, Ben-Or, Goldwasser, and Wigderson [BGW88] and independently Chaum, Crépeau, and Damgård [CCD88] proved that unconditional security is possible if at most t < n/3 of the players are corrupted. In a model where additionally physical broadcast channels are available, unconditional security is achievable if at most t < n/2 players are corrupted [RB89, Bea91b, CDD+99].

1.2 Previous work on efficiency

In the past, both the round complexity and the communication complexity of secure multi-party protocols were subject to many investigations: Protocols with low round complexity [BB89, BFKR90, FKN94, IK00] suffer either from an unacceptably high communication complexity (even quadratic in the number of multiplication gates), or tolerate only a very small number of cheaters. First steps towards better communication complexity were taken by Franklin and Yung [FY92] and Gennaro, Rabin, and Rabin [GRR98], where first a private but non-resilient computation is performed (for the whole protocol in [FY92], and for a segment of the protocol in [GRR98]), and only in case of faults is the computation repeated with a slow but resilient protocol. Although this approach can improve the best-case complexity of the protocol (when no adversary is present), it cannot speed up the protocol in the
presence of a malicious adversary: a single corrupted player can persistently enforce the robust but slow execution, annihilating any efficiency gain.

Recently, Hirt, Maurer, and Przydatek [HMP00] proposed a new protocol for perfectly secure multi-party computation with considerably better communication complexity than previous protocols: A set of n players can compute any function (over a finite field F) which is specified as a circuit with m multiplication gates (and any number of linear gates) by communicating O(mn^3) field elements, contrasting the previously best complexity of O(mn^6). Subsequently, the same complexity was achieved by Cramer, Damgård, and Nielsen [CDN01] in the cryptographic model (where more cheaters can be tolerated).

1.3 Contributions

The main open question in this line of research was whether security against active cheaters can be achieved with the same communication complexity as security against passive cheaters, namely with O(mn^2). We answer this question in the affirmative: The only (and unavoidable) price to pay for active security is a reduction in the number of tolerable cheaters (t < n/3 instead of t < n/2). The computation complexity of the new protocol is on the order of the communication complexity and hence not relevant.

The achieved communication complexity of O(mn^2) appears to be optimal. Even in the passive case, it appears unavoidable that every player sends a value to every other player for each multiplication gate. The new protocol uses Beaver's circuit-randomization technique [Bea91a] and the player-elimination framework from [HMP00].

2 Model

We consider the well-known secure-channels model as introduced in [BGW88]: The set P = {P_1, ..., P_n} of n players is connected by bilateral synchronous reliable secure channels. Broadcast channels are not assumed to be available. The goal of the protocol is to compute an agreed function, specified as an arithmetic circuit over a finite field F with |F| > n. The number of multiplication gates in the circuit is denoted by m. To each player P_i a unique
public value α_i ∈ F \ {0} is assigned, where for computational efficiency we assume that α_i = ω^i for some n-th root of unity ω (see Appendix B for details). The computation of the function is secure with respect to a computationally unbounded active adversary that is allowed to corrupt up to t of the players, where t is a given threshold with t < n/3. Once a player is corrupted, the adversary can read all his information and can make the player misbehave arbitrarily. The security of our protocol is unconditional with an arbitrarily small probability of error. Formal definitions of security can be found in [Can00] and in [MR98], and our protocols are secure for any of these definitions.

3 Protocol Overview

The protocol proceeds in two phases: In a preparation phase, which could actually be performed as a pre-computation independent of the circuit (except that an upper bound on the number m of multiplication gates must be known), m random triples (a^(i), b^(i), c^(i)) (for i = 1, ..., m) with c^(i) = a^(i) b^(i) are shared among the players. In the computation phase, the circuit is evaluated gate by gate, where for each multiplication gate one shared triple from the preparation phase is used [Bea91a]. In the preparation phase, some of the players in P might be eliminated, and the sharings are only among the set P' ⊆ P of remaining players. However, it will be guaranteed that the number of corrupted players in P' is smaller than (|P'| - t)/2, which is sufficient for evaluating the circuit.

As the underlying secret-sharing scheme we use the scheme of Shamir [Sha79], as in most threshold protocols: A value s is t-shared among the players means that every player P_i holds a share s_i, and there exists a polynomial f(x) of degree at most t such that f(0) = s and f(α_i) = s_i for i = 1, ..., n.

4 Preparation Phase

The goal of this phase is to generate m t-shared random triples (a^(i), b^(i), c^(i)) with c^(i) = a^(i) b^(i) in such a way that the adversary obtains no information about a^(i), b^(i), and c^(i) (except that c^(i) is the product of a^(i) and b^(i)). The generation of these
triples makes extensive use of the player-elimination framework of [HMP00]: The triples are generated in blocks of ℓ = ⌈m/n⌉ triples. The triples of a block are generated (in parallel) in a non-robust manner; only at the end of the block, consistency is checked jointly for all triples of the block in a single verification procedure (fault detection). In case of an inconsistency, a set D ⊆ P' of two players, at least one of whom is corrupted, is identified (fault localization) and excluded from further computations (player elimination). The triples of the failed block are discarded. Player elimination ensures that at most t blocks fail, and hence in total at most (n + t) blocks must be processed.

More precisely, the consistency verification takes place in two steps. In the first verification procedure (fault detection I), the degree of all involved sharings is verified. In other words, the players jointly verify that all sharings produced for generating the triples are of appropriate degree. The second verification step (fault detection II) is performed only if the first verification step is successful. Here, the players jointly verify that for every triple (a^(i), b^(i), c^(i)), every player shared the correct values such that c^(i) = a^(i) b^(i). If a fault is detected (in either fault-detection procedure), then all triples in the actual block are discarded. Furthermore, a set D ⊆ P' of two players, one of whom is corrupted, is found (fault localization I, resp. fault localization II) and eliminated from further computations. Note that in the fault-localization procedure, the privacy of the triples is not maintained. Due to the circuit-randomization technique [Bea91a], the triples contain completely random values unrelated to all values of the actual computation.

Both verification steps use n' "blinding triples", and the privacy of these triples is annihilated in the verification procedure. Therefore, in each block, ℓ + 2n' triples are generated. The first verification step verifies the degree of all sharings of the first ℓ + n'
triples, using (and destroying) the remaining n' triples for blinding. The second verification step verifies the first ℓ triples, using the remaining n' triples for blinding. Note that the second verification step requires that the sharings of all ℓ + n' involved triples are verified to be correct.

During the generation of the blocks, players can be eliminated. We denote the actual set of players with P', the actual number of players with n' = |P'|, and the maximum number of cheaters in P' with t'. Without loss of generality, we assume that P' = {P_1, ..., P_n'}. During the computation, the inequality 2t' < n' - t will hold as an invariant. In the beginning, P' = P, n' = n, and t' = t, and trivially 2t' < n' - t is satisfied. In player elimination, n' will be decreased by 2, and t' by 1. Clearly, this preserves the invariant.

0. Set P' = P, n' = n, and t' = t.
1. Repeat until n blocks (i.e., nℓ ≥ m triples) succeeded:
   1.1 Generate ℓ + 2n' triples (in parallel) in a non-robust manner (Sect. 4.1).
   1.2 Verify the consistency of all sharings involved in the first ℓ + n' triples (fault detection I, Sect. 4.2). If a fault is detected, identify a set D ⊆ P' of two players such that at least one player in D is a cheater, and set P' to P' \ D, n' to n' - 2 and t' to t' - 1 (fault localization I).
   1.3 If no fault was detected in Step 1.2, then verify that in the first ℓ triples, every player shared the correct values (fault detection II, Sect. 4.3). If a fault is detected, identify a set D ⊆ P' of two players, at least one of whom is corrupted, and set P' to P' \ D, n' to n' - 2 and t' to t' - 1 (fault localization II).
   1.4 If both verification steps were successful, then the generation of the block was successful, and the first ℓ triples can be used. If either verification procedure failed, then all triples of the actual block are discarded.

4.1 Generate one t-shared triple (a, b, c)

The purpose of this protocol is to generate one t-shared triple (a, b, c), where c = ab. The generation of this triple is non-robust: verification will take place only at the end of the block.
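The non-robust sharing used throughout this phase is plain Shamir sharing as defined in Section 3: the dealer evaluates a random degree-t polynomial at the players' points and sends each player his share. A minimal sketch, assuming a small prime modulus (97) and evaluation points α_i = i for readability (the paper uses α_i = ω^i for an n-th root of unity ω; all function names here are illustrative):

```python
import random

P = 97  # illustrative small prime field modulus (the paper only needs |F| > n)

def share(secret, t, n):
    """Shamir t-sharing: pick a random degree-t polynomial f with f(0) = secret;
    the share of player i is f(alpha_i), here with alpha_i = i."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t)]
    return [sum(c * pow(i, k, P) for k, c in enumerate(coeffs)) % P
            for i in range(1, n + 1)]

def reconstruct(shares, idxs):
    """Lagrange interpolation at x = 0 from t+1 shares (no error correction)."""
    s = 0
    for yj, xj in zip(shares, idxs):
        w = 1
        for xk in idxs:
            if xk != xj:
                # basis polynomial at 0: product of x_k / (x_k - x_j)
                w = w * xk * pow(xk - xj, -1, P) % P
        s = (s + yj * w) % P
    return s

shares = share(42, t=2, n=7)
assert reconstruct(shares[:3], [1, 2, 3]) == 42   # any t+1 shares suffice
```

Any t + 1 shares determine the secret, while t or fewer reveal nothing, which is why the protocols below can hand out shares freely and delay all consistency checks to the end of the block.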
In particular, in order to share a value, the dealer simply computes the shares and sends them to the players; the consistency verification of the sent shares is delayed.

The generation of the triple is straightforward: First, the players jointly generate t'-sharings of two random values a and b. This is achieved by having every player share two random values, one for a and one for b, which are then summed up. Then, a t'-sharing of c = ab is computed along the lines of [BGW88, GRR98] (passive model): Every player computes the product of his share of a and his share of b. These product shares define a 2t'-sharing of c, and c can be computed with Lagrange interpolation. This interpolation is a linear function of the product shares. Hence, a t'-sharing of c can be computed as a linear combination of t'-sharings of the product shares. Finally, the degrees of the sharings of a, b, and c must be increased from t' to t. In order to do so, the players jointly generate three random sharings of 0, each with degree t, and add one of them to the t'-sharings of a, b, and c, respectively.

Note that the protocol for computing a sharing of c = ab relies on the fact that the degree of the sharings of a and b is less than one third of the number of actual players, and it would not work if a and b were shared with degree t for 3t' ≥ n'. On the other hand, it is important that finally the sharings of all blocks have the same degree (otherwise the multiplication protocol of Section 5 would leak information about the factors), and t' can decrease from block to block. Therefore, first the triple is generated with degree t', and then this degree is increased to t.

Protocol "Generate"

We give the exact protocol for generating one t-shared triple (a, b, c):

1. The players jointly generate t'-sharings of random values a and b:
   1.1 Every player P_i ∈ P' selects two random degree-t' polynomials f_i(x) and g_i(x), and hands the shares a_ij = f_i(α_j) and b_ij = g_i(α_j) to player P_j for j = 1, ..., n'.
   1.2 The polynomial for sharing a is f(x) =
Σ_{i=1}^{n'} f_i(x) (thus a = f(0)), and the polynomial for sharing b is g(x) = Σ_{i=1}^{n'} g_i(x) (thus b = g(0)), and every player P_j ∈ P' computes his shares of a and b as

      a_j = Σ_{i=1}^{n'} a_ij,   and   b_j = Σ_{i=1}^{n'} b_ij.

2. The players jointly compute a t'-sharing of c = ab:
   2.1 Every player P_i ∈ P' computes his product share e_i = a_i b_i, and shares it among the players with a random degree-t' polynomial h_i(x) (with h_i(0) = e_i), i.e., sends the share e_ij = h_i(α_j) to player P_j for j = 1, ..., n'.
   2.2 Every player P_j computes his share c_j of c as

      c_j = Σ_{i=1}^{n'} w_i e_ij,   where   w_i = Π_{j=1, j≠i}^{n'} α_j / (α_j - α_i).

All sharings are of degree t', and hence define a valid sharing. Furthermore, if at least one player P_i ∈ P' honestly selected random polynomials f_i(x) and g_i(x), then a and b are random and unknown to the adversary. In Step 2, we need the observation that c can be computed by Lagrange interpolation [GRR98]:

      c = Σ_{i=1}^{n'} w_i e_i,   where   w_i = Π_{j=1, j≠i}^{n'} α_j / (α_j - α_i).

4.2 Verification of the degree of all sharings

Protocol "Fault-detection I"

The following steps for verifying the degree of all sharings in one block are performed in parallel, once for every verifier P_v ∈ P':

1. The verifier P_v selects a random vector [r_1, ..., r_{ℓ+n'}] with elements in F and sends it to each player P_j ∈ P'.
2. Every player P_j computes and sends to P_v the following corresponding linear combinations (plus the share of the blinding polynomial) for every i = 1, ..., n':

      a^(Σ)_ij = Σ_{k=1}^{ℓ+n'} r_k a^(k)_ij + a^(ℓ+n'+v)_ij
      b^(Σ)_ij = Σ_{k=1}^{ℓ+n'} r_k b^(k)_ij + b^(ℓ+n'+v)_ij
      c^(Σ)_ij = Σ_{k=1}^{ℓ+n'} r_k c^(k)_ij + c^(ℓ+n'+v)_ij
      ā^(Σ)_ij = Σ_{k=1}^{ℓ+n'} r_k ā^(k)_ij + ā^(ℓ+n'+v)_ij
      b̄^(Σ)_ij = Σ_{k=1}^{ℓ+n'} r_k b̄^(k)_ij + b̄^(ℓ+n'+v)_ij
      c̄^(Σ)_ij = Σ_{k=1}^{ℓ+n'} r_k c̄^(k)_ij + c̄^(ℓ+n'+v)_ij

3. P_v verifies whether for each i = 1, ..., n', the shares a^(Σ)_i1, ..., a^(Σ)_in' lie on a polynomial of degree at most t'. The same verification is performed for the shares b^(Σ)_i1, ..., b^(Σ)_in' and for the shares c^(Σ)_i1, ..., c^(Σ)_in', for i = 1, ..., n'. Furthermore, P_v verifies whether for each i = 1, ..., n', the shares ā^(Σ)_i1, ..., ā^(Σ)_in' lie on a polynomial of degree at most t - 1. The same verification is performed for the shares b̄^(Σ)_i1, ..., b̄^(Σ)_in' and for the
shares c̄^(Σ)_i1, ..., c̄^(Σ)_in' for i = 1, ..., n'.
4. Finally, P_v broadcasts (using an appropriate sub-protocol) one bit according to whether all the 6n' verified polynomials have degree at most t', respectively t - 1 (confirmation), or at least one polynomial has too high a degree (complaint).

Protocol "Fault-localization I"

This protocol is performed if and only if at least one verifier has broadcast a complaint in Step 4 of the above fault-detection protocol. We denote with P_v the verifier who has reported a fault. If there are several such verifiers, the one with the smallest index v is selected.

5. The verifier P_v selects one of the polynomials of too high degree and broadcasts the location of the fault, consisting of the index i and the "name" of the sharing (a, b, c, ā, b̄, or c̄). Without loss of generality, we assume that the fault was observed in the sharing a^(Σ)_i1, ..., a^(Σ)_in'.
6. The owner P_i of this sharing (i.e., the player who acted as dealer for this sharing) sends to the verifier P_v the correct linearly combined polynomial f^(Σ)_i(x) = Σ_{k=1}^{ℓ+n'} r_k f^(k)_i(x) + f^(ℓ+n'+v)_i(x).
7. P_v finds the (smallest) index j such that a^(Σ)_ij (received from P_j in Step 2) does not lie on the polynomial f^(Σ)_i(x) (received from the owner P_i in Step 6), and broadcasts j among the players in P'.
8. Both P_i and P_j send the list a^(1)_ij, ..., a^(ℓ+n')_ij, a^(ℓ+n'+v)_ij to P_v.
9. P_v verifies that the linear combination [r_1, ..., r_{ℓ+n'}] applied to the values received from P_i is equal to f^(Σ)_i(α_j). Otherwise, P_v broadcasts the index i, and the set of players to be eliminated is D = {P_i, P_v}. Analogously, P_v verifies the values received from P_j to be consistent with a^(Σ)_ij received in Step 2, and in case of failure broadcasts the index j, and D = {P_j, P_v}.
10. P_v finds the (smallest) index k such that the values a^(k)_ij received from P_i and P_j differ, and broadcasts k and both values a^(k)_ij from P_i and a^(k)_ij from P_j.
11. Both P_i and P_j broadcast their value of a^(k)_ij.
12. If the values broadcast by P_i and P_j differ, then the localized set is D = {P_i, P_j}. If
the value broadcast by P_i differs from the value that P_v broadcast (and claimed to be the value received from P_i), then D = {P_i, P_v}. Else, D = {P_j, P_v}.

Security analysis

It follows from simple algebra that if all players are honest, then the above fault-detection protocol will always pass. On the other hand, if at least one of the involved sharings (in any of the ℓ + n' triples) has too high a degree, then every honest verifier will detect this fault with probability at least 1 - 1/|F|. With at least n' - t' ≥ n - 2t honest players, this gives an overall error probability of at most |F|^-(n-2t). The correctness of the fault-localization protocol can be verified by inspection. There is no privacy issue in this protocol; the generated triples are discarded.

Complexity analysis

The fault-detection protocol requires n'(n'(ℓ + n') + 6n'^2) = n'^2 ℓ + 7n'^3 field elements to be sent and n' bits to be broadcast. For fault localization, up to n' + 2(ℓ + n' + 1) = 2ℓ + 3n' + 2 field elements must be sent and 2 log n' + log 6 + log(ℓ + n' + 1) + 4 log |F| bits must be broadcast.

4.3 Verification that all players share the correct product shares

It remains to verify that in each triple k = 1, ..., ℓ, every player P_i shared the correct product share e^(k)_i = a^(k)_i b^(k)_i (Step 2.1 of protocol Generate). Since it is already verified that the sharings of all factor shares are of degree t', it is sufficient to verify that the shares e^(k)_1, ..., e^(k)_n' lie on a polynomial of degree at most 2t'. Note that the at least n' - t' > 2t' shares of the honest players uniquely define this polynomial.

The key idea of this verification protocol is the same as in the previous verification protocol: Every verifier P_v distributes a random challenge vector, and the corresponding linear combination of the polynomials (plus one blinding polynomial) is opened towards P_v. If a fault is detected, then a set D of two players (one of whom is corrupted) can be found with the fault-localization protocol.

Protocol "Fault-detection II"

The following steps are performed for each verifier P_v ∈ P' in parallel.

1. The verifier P_v selects a
random vector [r_1, ..., r_ℓ] with elements in F and sends it to each player P_j ∈ P'.
2. Every player P_j computes and sends to P_v the following linear combinations (with blinding) for every i = 1, ..., n':

      e^(Σ)_ij = Σ_{k=1}^{ℓ} r_k e^(k)_ij + e^(ℓ+v)_ij.

3. P_v verifies whether for each i = 1, ..., n' the shares e^(Σ)_i1, ..., e^(Σ)_in' lie on a polynomial of degree at most t', and if so, whether the secrets e^(Σ)_1, ..., e^(Σ)_n' of the above sharings (computed by interpolating the corresponding share-shares) lie on a polynomial of degree at most 2t'. P_v broadcasts one bit according to whether all polynomials have appropriate degree (confirmation), or at least one polynomial has too high a degree (complaint).

Protocol "Fault-localization II"

We denote with P_v the verifier who has reported a fault in Step 3 of the above fault-detection protocol. If there are several such verifiers, the one with the smallest index v is selected.

4. If in Step 3, the degree of one of the second-level sharings e^(Σ)_i1, ..., e^(Σ)_in' was too high, then P_v applies error-correction to find the smallest index j such that e^(Σ)_ij must be corrected (cf. Appendix B). Since all sharings have been verified to have correct degree, P_v can conclude that P_j has sent a wrong value e^(Σ)_ij. P_v broadcasts the index j, and the set of players to be eliminated is D = {P_j, P_v} (and the following steps need not be performed).
5. Every player P_i sends to P_v all his factor shares a^(1)_i, ..., a^(ℓ)_i, a^(ℓ+v)_i and b^(1)_i, ..., b^(ℓ)_i, b^(ℓ+v)_i.
6. P_v verifies for every k = 1, ..., ℓ, ℓ+v whether the shares a^(k)_1, ..., a^(k)_n' lie on a polynomial of degree t'. If not, then P_v applies error-correction and finds and broadcasts the (smallest) index j such that a^(k)_j must be corrected. The set of players to be eliminated is D = {P_j, P_v}.
The same verification is performed for the shares b^(k)_1, ..., b^(k)_n' for k = 1, ..., ℓ, ℓ+v.
7. P_v verifies for every i = 1, ..., n' whether the value e^(Σ)_i computed in Step 3 is correct, i.e., whether

      e^(Σ)_i ?= Σ_{k=1}^{ℓ} r_k a^(k)_i b^(k)_i + a^(ℓ+v)_i b^(ℓ+v)_i.

This test will fail for at least one i, and P_v broadcasts this index i. The players in D = {P_i, P_v} are eliminated.

Security analysis

It follows from simple algebra that if all players are honest, then the above fault-detection protocol will always pass. On the other hand, if the degree of at least one of the involved sharings is higher than 2t', then every honest verifier will detect this fault with probability at least 1 - 1/|F|. With at least n' - t' ≥ n - 2t honest players, this makes an overall error probability of at most |F|^-(n-2t). The correctness of the fault-localization protocol can be verified by inspection. There is no privacy issue in this protocol; the generated triples are discarded.

Complexity analysis

The fault-detection protocol requires n'(n'ℓ + n'^2) = n'^2 ℓ + n'^3 field elements to be sent, and n' bits to be broadcast. The fault-localization protocol requires 2n'(ℓ + 1) field elements to be sent and log n' bits to be broadcast.

5 Computation Phase

The evaluation of the circuit follows the lines of the protocol of [Bea91a]. Slight modifications are needed because the degree t of the sharings and the upper bound t' on the number of cheaters need not be equal. Furthermore, special focus is given to the fact that in our protocol, also eliminated players must be able to give input to and receive output from the computation.

From the preparation phase, we have m random triples (a^(i), b^(i), c^(i)) with c^(i) = a^(i) b^(i), where the sharings are of degree t among the set P' of players. The number of corrupted players in P' is at most t' with 2t' < n' - t, where n' = |P'|. This is sufficient for efficient computation of the circuit.

5.1 Input sharing

First, every player who has input secret-shares it (with degree t) among the set P' of players.
We use the verifiable secret-sharing protocol of [BGW88] (with perfect security), with a slight modification to support t ≠ t'. The dealer is denoted by P, and the secret to be shared by s. We do not assume that P ∈ P' (nor even that P ∈ P).

1. The dealer P selects at random a polynomial f(x, y) of degree t in both variables, with f(0, 0) = s, and sends the polynomials f_i(x) = f(α_i, x) and g_i(x) = f(x, α_i) to player P_i for i = 1, ..., n'.
2. Every player P_i ∈ P' sends to P_j for j = i + 1, ..., n' the values f_i(α_j) and g_i(α_j).
3. Every player P_j broadcasts one bit according to whether all received values are consistent with the polynomials f_j(x) and g_j(x) (confirmation) or not (complaint).
4. If no player has broadcast a complaint, then the secret-sharing is finished, and the share of player P_j is f_j(0). Otherwise, every player P_j who has complained broadcasts a bit vector of length n', where a 1-bit in position i means that one of the values received from P_i was not consistent with f_j(x) or g_j(x). The dealer P must answer all complaints by broadcasting the correct values f(α_i, α_j) and f(α_j, α_i).
5. Every player P_i checks whether the values broadcast by the dealer in Step 4 are consistent with his polynomials f_i(x) and g_i(x), and broadcasts either a confirmation or an accusation. The dealer P answers every accusation by broadcasting both polynomials f_i(x) and g_i(x) of the accusing player P_i, and P_i replaces his polynomials by the broadcast ones.
6. Every player P_i checks whether the polynomials broadcast by the dealer in Step 5 are consistent with his polynomials f_i(x) and g_i(x), and broadcasts either a confirmation or an accusation.
7. If in Steps 5 and 6 there are in total at most t' accusations, then every player P_i takes f_i(0) as his share of s. Otherwise, clearly the dealer is faulty, and the players take a default sharing (e.g., the constant sharing of 0).

It is clear that an honest player never accuses an honest dealer. On the other hand, if there are at most t' accusations, then the polynomials of at least n' - 2t' > t honest players are
consistent, and these polynomials uniquely define a polynomial f(x, y) of degree t. Hence, the polynomials of all honest players are consistent, and their shares f_1(0), ..., f_n'(0) lie on a polynomial of degree t.

This protocol communicates 3n'^2 field elements, and it broadcasts n' bits (in the best case), respectively n'^2 + 3n' + 2t'^2 log |F| bits (in the worst case).

5.2 Evaluation of the circuit

The circuit is evaluated gate by gate. Linear gates can be evaluated without any communication due to the linearity of the used sharing. Multiplication gates are evaluated according to [Bea91a]: Assume that the factors x and y are t-shared among the players. Furthermore, a t-shared triple (a, b, c) with c = ab is used. The product xy can be written as follows:

      xy = ((x - a) + a)((y - b) + b) = (x - a)(y - b) + (x - a)b + (y - b)a + c.

The players in P' reconstruct the differences d_x = x - a and d_y = y - b. This reconstruction is possible because 2t' < n' - t (cf. Appendix B). Note that reconstructing these values does not give any information about x or y, because a and b are random. Then, the following equation holds:

      xy = d_x d_y + d_x b + d_y a + c.

This equation is linear in a, b, and c, and linear combinations of shared values can be computed without communication. This means that the players can compute the above linear combination on their respective shares of a, b, and c, and they obtain a t-sharing of the product xy. More details can be found in [Bea91a]. This multiplication protocol requires two secret-reconstructions per multiplication gate.
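Beaver's multiplication step can be sketched end-to-end on Shamir shares. The following toy version assumes a small prime modulus (97), evaluation points α_i = i, an honest execution (reconstruction is plain interpolation, no error correction), and illustrative function names:

```python
import random

P = 97  # illustrative small prime field modulus

def share(secret, t, n):
    """Shamir t-sharing with evaluation points alpha_i = i (simplification)."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t)]
    return [sum(c * pow(i, k, P) for k, c in enumerate(coeffs)) % P
            for i in range(1, n + 1)]

def open_shares(shares):
    """Reconstruct f(0) by Lagrange interpolation from all shares (honest case)."""
    n = len(shares)
    s = 0
    for j in range(1, n + 1):
        w = 1
        for k in range(1, n + 1):
            if k != j:
                w = w * k * pow(k - j, -1, P) % P
        s = (s + shares[j - 1] * w) % P
    return s

def beaver_multiply(xs, ys, triple):
    """Beaver's trick: only the blinded differences d_x, d_y are opened."""
    as_, bs, cs = triple
    dx = open_shares([(x - a) % P for x, a in zip(xs, as_)])  # d_x = x - a
    dy = open_shares([(y - b) % P for y, b in zip(ys, bs)])   # d_y = y - b
    # [xy] = d_x*d_y + d_x*[b] + d_y*[a] + [c]; the public constant d_x*d_y
    # is added by every player to his share (valid for Shamir sharings).
    return [(dx * dy + dx * b + dy * a + c) % P
            for a, b, c in zip(as_, bs, cs)]

t, n = 2, 7
a, b = random.randrange(P), random.randrange(P)
triple = (share(a, t, n), share(b, t, n), share(a * b % P, t, n))
product = beaver_multiply(share(5, t, n), share(9, t, n), triple)
assert open_shares(product) == 45
```

Since d_x and d_y are blinded by the random a and b, opening them leaks nothing about x and y, and the per-gate communication is exactly the two openings, matching the 2n'^2 cost stated below.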
Secret-reconstruction requires every player in P' to send his share to every other player (who then applies error-correction to the received shares and interpolates the secret). The communication costs per multiplication gate are hence 2n'^2 field elements. Broadcast is not needed.

5.3 Output reconstruction

Any player P can receive output (not only players in P' or in P). In order to reconstruct a shared value x towards player P, every player in P' sends his share of x to P, who then applies error-correction and interpolation to compute the output x. In the error-correction procedure, up to (n' - t - 1)/2 ≥ t' errors can be corrected (see Appendix B). Reconstructing one value requires n' field elements of communication, and no broadcast.

5.4 Probabilistic functions

The presented protocol is for deterministic functions only. In order to capture probabilistic functions, one can generate one (or several) blocks with single values a^(i) only (with simplified verification), and use these values as shared randomness. Alternatively, but somewhat wastefully, one just picks the value a^(i) from a shared triple (a^(i), b^(i), c^(i)) and discards the rest of the triple. Then, m denotes the number of multiplication gates plus the number of "randomness gates".

5.5 On-going computations

In an on-going computation, inputs and outputs can be given and received at any time during the computation, not only at the beginning and at the end. Furthermore, it might not even be specified beforehand which function will be computed. An example of an on-going computation is the simulation of a fair stock market. In contrast to the protocol of [HMP00], the proposed protocol can easily be extended to capture the scenario of on-going computations. First, the players generate triples (a, b, c) with c = ab, and perform the computation until all triples are exhausted. Then, a new block of triples is generated, and so on.
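The error-corrected reconstruction used above can be illustrated with a deliberately naive decoder: try (t+1)-subsets of the received shares until one interpolates to a polynomial consistent with all but at most e shares. This brute-force search is exponential and is only for intuition; the protocol itself would use efficient Reed-Solomon decoding (cf. Appendix B). The modulus (97) and evaluation points α_i = i are illustrative assumptions:

```python
import random
from itertools import combinations

P = 97  # illustrative small prime field modulus

def lagrange_at(pts, x):
    """Evaluate at x the unique polynomial through the given (index, share) points."""
    s = 0
    for xj, yj in pts:
        w = 1
        for xk, _ in pts:
            if xk != xj:
                w = w * (x - xk) * pow(xj - xk, -1, P) % P
        s = (s + yj * w) % P
    return s

def robust_reconstruct(shares, t, e):
    """Brute-force decoder: find a degree-<=t polynomial agreeing with all but
    at most e of the n shares.  The result is unique whenever e <= (n-t-1)/2,
    since two passing polynomials would agree on n - 2e >= t+1 points."""
    pts = list(enumerate(shares, start=1))
    n = len(pts)
    for sub in combinations(pts, t + 1):
        agree = sum(lagrange_at(list(sub), x) == y for x, y in pts)
        if agree >= n - e:
            return lagrange_at(list(sub), 0)
    raise ValueError("more than e corrupted shares")

# degree-2 polynomial with secret 33, 7 players, 2 corrupted shares
t, n, e = 2, 7, 2
coeffs = [33] + [random.randrange(P) for _ in range(t)]
shares = [sum(c * pow(i, k, P) for k, c in enumerate(coeffs)) % P
          for i in range(1, n + 1)]
shares[1] = (shares[1] + 5) % P    # corrupt two shares
shares[5] = (shares[5] + 40) % P
assert robust_reconstruct(shares, t, e) == 33
```

With the paper's invariant 2t' < n' - t, the degree-t sharing leaves enough honest shares that up to (n' - t - 1)/2 ≥ t' corrupted shares can always be corrected, which is exactly what makes reconstruction towards any player safe.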
Digital Image Processing and Edge Detection

Digital Image Processing

Interest in digital image processing methods stems from two principal application areas: improvement of pictorial information for human interpretation, and processing of image data for storage, transmission, and representation for autonomous machine perception.

An image may be defined as a two-dimensional function, f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point. When x, y, and the amplitude values of f are all finite, discrete quantities, we call the image a digital image. The field of digital image processing refers to processing digital images by means of a digital computer. Note that a digital image is composed of a finite number of elements, each of which has a particular location and value. These elements are referred to as picture elements, image elements, pels, and pixels. Pixel is the term most widely used to denote the elements of a digital image.

Vision is the most advanced of our senses, so it is not surprising that images play the single most important role in human perception. However, unlike humans, who are limited to the visual band of the electromagnetic (EM) spectrum, imaging machines cover almost the entire EM spectrum, ranging from gamma to radio waves. They can operate on images generated by sources that humans are not accustomed to associating with images, including ultrasound, electron microscopy, and computer-generated images. Thus, digital image processing encompasses a wide and varied field of applications.

There is no general agreement among authors regarding where image processing stops and other related areas, such as image analysis and computer vision, start. Sometimes a distinction is made by defining image processing as a discipline in which both the input and output of a process are images.
We believe this to be a limiting and somewhat artificial boundary. For example, under this definition, even the trivial task of computing the average intensity of an image (which yields a single number) would not be considered an image processing operation. On the other hand, there are fields such as computer vision whose ultimate goal is to use computers to emulate human vision, including learning and being able to make inferences and take actions based on visual inputs. This area itself is a branch of artificial intelligence (AI), whose objective is to emulate human intelligence. The field of AI is still in its infancy in terms of development, with progress having been much slower than originally anticipated. The area of image analysis (also called image understanding) lies between image processing and computer vision.

There are no clear-cut boundaries in the continuum from image processing at one end to computer vision at the other. However, one useful paradigm is to consider three types of computerized processes in this continuum: low-, mid-, and high-level processes. Low-level processes involve primitive operations such as image preprocessing to reduce noise, contrast enhancement, and image sharpening. A low-level process is characterized by the fact that both its inputs and outputs are images. Mid-level processing of images involves tasks such as segmentation (partitioning an image into regions or objects), description of those objects to reduce them to a form suitable for computer processing, and classification (recognition) of individual objects. A mid-level process is characterized by the fact that its inputs generally are images, but its outputs are attributes extracted from those images (e.g., edges, contours, and the identity of individual objects).
Finally, higher-level processing involves "making sense" of an ensemble of recognized objects, as in image analysis, and, at the far end of the continuum, performing the cognitive functions normally associated with vision.

Based on the preceding comments, we see that a logical place of overlap between image processing and image analysis is the area of recognition of individual regions or objects in an image. Thus, what we call in this book digital image processing encompasses processes whose inputs and outputs are images and, in addition, encompasses processes that extract attributes from images, up to and including the recognition of individual objects. As a simple illustration to clarify these concepts, consider the area of automated analysis of text. The processes of acquiring an image of the area containing the text, preprocessing that image, extracting (segmenting) the individual characters, describing the characters in a form suitable for computer processing, and recognizing those individual characters are in the scope of what we call digital image processing in this book. Making sense of the content of the page may be viewed as being in the domain of image analysis and even computer vision, depending on the level of complexity implied by the statement "making sense." As will become evident shortly, digital image processing, as we have defined it, is used successfully in a broad range of areas of exceptional social and economic value.

The areas of application of digital image processing are so varied that some form of organization is desirable in attempting to capture the breadth of this field. One of the simplest ways to develop a basic understanding of the extent of image processing applications is to categorize images according to their source (e.g., visual, X-ray, and so on). The principal energy source for images in use today is the electromagnetic energy spectrum.
Other important sources of energy include acoustic, ultrasonic, and electronic (in the form of electron beams used in electron microscopy). Synthetic images, used for modeling and visualization, are generated by computer. In this section we discuss briefly how images are generated in these various categories and the areas in which they are applied.

Images based on radiation from the EM spectrum are the most familiar, especially images in the X-ray and visual bands of the spectrum. Electromagnetic waves can be conceptualized as propagating sinusoidal waves of varying wavelengths, or they can be thought of as a stream of massless particles, each traveling in a wavelike pattern and moving at the speed of light. Each massless particle contains a certain amount (or bundle) of energy. Each bundle of energy is called a photon. If spectral bands are grouped according to energy per photon, we obtain the spectrum shown in the figure below, ranging from gamma rays (highest energy) at one end to radio waves (lowest energy) at the other. The bands are shown shaded to convey the fact that bands of the EM spectrum are not distinct but rather transition smoothly from one to the other.

Fig. 1

Image acquisition is the first process. Note that acquisition could be as simple as being given an image that is already in digital form. Generally, the image acquisition stage involves preprocessing, such as scaling.

Image enhancement is among the simplest and most appealing areas of digital image processing. Basically, the idea behind enhancement techniques is to bring out detail that is obscured, or simply to highlight certain features of interest in an image. A familiar example of enhancement is when we increase the contrast of an image because "it looks better." It is important to keep in mind that enhancement is a very subjective area of image processing. Image restoration is an area that also deals with improving the appearance of an image.
However, unlike enhancement, which is subjective, image restoration is objective, in the sense that restoration techniques tend to be based on mathematical or probabilistic models of image degradation. Enhancement, on the other hand, is based on human subjective preferences regarding what constitutes a "good" enhancement result.

Color image processing is an area that has been gaining in importance because of the significant increase in the use of digital images over the Internet. It covers a number of fundamental concepts in color models and basic color processing in a digital domain. Color is used also in later chapters as the basis for extracting features of interest in an image.

Wavelets are the foundation for representing images in various degrees of resolution. In particular, this material is used in this book for image data compression and for pyramidal representation, in which images are subdivided successively into smaller regions.

Fig. 2

Compression, as the name implies, deals with techniques for reducing the storage required to save an image, or the bandwidth required to transmit it. Although storage technology has improved significantly over the past decade, the same cannot be said for transmission capacity. This is true particularly in uses of the Internet, which are characterized by significant pictorial content. Image compression is familiar (perhaps inadvertently) to most users of computers in the form of image file extensions, such as the jpg file extension used in the JPEG (Joint Photographic Experts Group) image compression standard.

Morphological processing deals with tools for extracting image components that are useful in the representation and description of shape. The material in this chapter begins a transition from processes that output images to processes that output image attributes.

Segmentation procedures partition an image into its constituent parts or objects.
In general, autonomous segmentation is one of the most difficult tasks in digital image processing. A rugged segmentation procedure brings the process a long way toward successful solution of imaging problems that require objects to be identified individually. On the other hand, weak or erratic segmentation algorithms almost always guarantee eventual failure. In general, the more accurate the segmentation, the more likely recognition is to succeed.

Representation and description almost always follow the output of a segmentation stage, which usually is raw pixel data, constituting either the boundary of a region (i.e., the set of pixels separating one image region from another) or all the points in the region itself. In either case, converting the data to a form suitable for computer processing is necessary. The first decision that must be made is whether the data should be represented as a boundary or as a complete region. Boundary representation is appropriate when the focus is on external shape characteristics, such as corners and inflections. Regional representation is appropriate when the focus is on internal properties, such as texture or skeletal shape. In some applications, these representations complement each other. Choosing a representation is only part of the solution for transforming raw data into a form suitable for subsequent computer processing. A method must also be specified for describing the data so that features of interest are highlighted. Description, also called feature selection, deals with extracting attributes that result in some quantitative information of interest or are basic for differentiating one class of objects from another.

Recognition is the process that assigns a label (e.g., "vehicle") to an object based on its descriptors.
As detailed before, we conclude our coverage of digital image processing with the development of methods for recognition of individual objects.

So far we have said nothing about the need for prior knowledge or about the interaction between the knowledge base and the processing modules in Fig. 2 above. Knowledge about a problem domain is coded into an image processing system in the form of a knowledge database. This knowledge may be as simple as detailing regions of an image where the information of interest is known to be located, thus limiting the search that has to be conducted in seeking that information. The knowledge base also can be quite complex, such as an interrelated list of all major possible defects in a materials inspection problem, or an image database containing high-resolution satellite images of a region in connection with change-detection applications. In addition to guiding the operation of each processing module, the knowledge base also controls the interaction between modules. This distinction is made in Fig. 2 above by the use of double-headed arrows between the processing modules and the knowledge base, as opposed to single-headed arrows linking the processing modules.

Edge Detection

Edge detection is a term in image processing and computer vision, particularly in the areas of feature detection and feature extraction, referring to algorithms that aim to identify points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities. Although point and line detection certainly are important in any discussion on segmentation, edge detection is by far the most common approach for detecting meaningful discontinuities in gray level. Although certain literature has considered the detection of ideal step edges, the edges obtained from natural images are usually not at all ideal step edges.
Instead, they are normally affected by one or several of the following effects:

1. focal blur caused by a finite depth-of-field and finite point spread function;
2. penumbral blur caused by shadows created by light sources of non-zero radius;
3. shading at a smooth object edge;
4. local specularities or interreflections in the vicinity of object edges.

A typical edge might, for instance, be the border between a block of red color and a block of yellow. In contrast, a line (as can be extracted by a ridge detector) can be a small number of pixels of a different color on an otherwise unchanging background. For a line, there is therefore usually one edge on each side of the line.

To illustrate why edge detection is not a trivial task, let us consider the problem of detecting edges in the following one-dimensional signal. Here, we may intuitively say that there should be an edge between the 4th and 5th pixels.

If the intensity difference were smaller between the 4th and the 5th pixels, and if the intensity differences between the adjacent neighbouring pixels were higher, it would not be as easy to say that there should be an edge in the corresponding region. Moreover, one could argue that this case is one in which there are several edges.

Hence, to firmly state a specific threshold on how large the intensity change between two neighbouring pixels must be for us to say that there should be an edge between these pixels is not always a simple problem. Indeed, this is one of the reasons why edge detection may be a non-trivial problem unless the objects in the scene are particularly simple and the illumination conditions can be well controlled.

There are many methods for edge detection, but most of them can be grouped into two categories: search-based and zero-crossing based.
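The one-dimensional argument above can be sketched in a few lines of Python with NumPy. The signal values below are hypothetical (the original figure is not reproduced here); they follow the pattern described: flat, then a large jump between the 4th and 5th pixels.

```python
import numpy as np

# A hypothetical 1D signal of the kind described in the text:
# near-constant, with a jump between the 4th and 5th pixels.
signal = np.array([5, 7, 6, 4, 152, 148, 149, 150])

# First-order difference as a simple measure of edge strength.
diff = np.diff(signal)
print(diff)  # strongest response (148) at index 3

# The largest intensity change sits between samples 3 and 4
# (the 4th and 5th pixels, counting from one):
print(np.argmax(np.abs(diff)))  # 3
```

If the jump were smaller and the neighbouring differences larger, the maximum of `diff` would no longer stand out, which is exactly the thresholding difficulty the text describes.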
The search-based methods detect edges by first computing a measure of edge strength, usually a first-order derivative expression such as the gradient magnitude, and then searching for local directional maxima of the gradient magnitude using a computed estimate of the local orientation of the edge, usually the gradient direction. The zero-crossing based methods search for zero crossings in a second-order derivative expression computed from the image in order to find edges, usually the zero-crossings of the Laplacian or the zero-crossings of a non-linear differential expression, as will be described in the section on differential edge detection below. As a pre-processing step to edge detection, a smoothing stage, typically Gaussian smoothing, is almost always applied (see also noise reduction).

The edge detection methods that have been published mainly differ in the types of smoothing filters that are applied and the way the measures of edge strength are computed. As many edge detection methods rely on the computation of image gradients, they also differ in the types of filters used for computing gradient estimates in the x- and y-directions.

Once we have computed a measure of edge strength (typically the gradient magnitude), the next stage is to apply a threshold to decide whether edges are present or not at an image point. The lower the threshold, the more edges will be detected, and the result will be increasingly susceptible to noise and to picking out irrelevant features from the image. Conversely, a high threshold may miss subtle edges, or result in fragmented edges.

If the edge thresholding is applied to just the gradient magnitude image, the resulting edges will in general be thick, and some type of edge thinning post-processing is necessary. For edges detected with non-maximum suppression, however, the edge curves are thin by definition, and the edge pixels can be linked into edge polygons by an edge linking (edge tracking) procedure.
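As an illustration of the search-based approach, the sketch below (Python/NumPy; a minimal, unoptimized implementation with no smoothing stage, on a made-up step-edge image) computes a Sobel gradient-magnitude estimate and thresholds it. Note how the detected edge comes out two pixels thick, which is exactly why the thinning / non-maximum suppression step discussed above is needed.

```python
import numpy as np

def sobel_gradient_magnitude(img):
    """Gradient magnitude via the classic 3x3 Sobel filters.

    A minimal sketch: no smoothing, 'valid' cropping at the borders.
    Real detectors usually apply Gaussian smoothing first.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * kx)  # horizontal derivative
            gy[i, j] = np.sum(patch * ky)  # vertical derivative
    return np.hypot(gx, gy)

# A vertical step edge: left half dark, right half bright.
img = np.zeros((5, 6))
img[:, 3:] = 100.0

mag = sobel_gradient_magnitude(img)
edges = mag > 50  # thresholding the edge-strength measure

# Every row comes out as [0 1 1 0]: the single step edge is detected
# as a band two pixels wide, i.e. "thick", as the text warns.
print(edges.astype(int))
```

Lowering the threshold below the noise level (here there is none, but in a real image there always is) would mark spurious pixels; raising it too far would fragment or miss faint edges.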
On a discrete grid, the non-maximum suppression stage can be implemented by estimating the gradient direction using first-order derivatives, then rounding off the gradient direction to multiples of 45 degrees, and finally comparing the values of the gradient magnitude in the estimated gradient direction.

A commonly used approach to the problem of choosing appropriate thresholds is thresholding with hysteresis. This method uses multiple thresholds to find edges. We begin by using the upper threshold to find the start of an edge. Once we have a start point, we then trace the path of the edge through the image pixel by pixel, marking an edge whenever we are above the lower threshold. We stop marking our edge only when the value falls below the lower threshold. This approach assumes that edges are likely to lie along continuous curves, and allows us to follow a faint section of an edge we have previously seen without marking every noisy pixel in the image as an edge. Still, however, we have the problem of choosing appropriate thresholding parameters, and suitable thresholding values may vary over the image.

Some edge-detection operators are instead based upon second-order derivatives of the intensity. This essentially captures the rate of change in the intensity gradient.
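The hysteresis scheme described above can be sketched in one dimension for clarity. The sketch below (Python/NumPy; the edge-strength values and the two thresholds are invented for illustration) marks pixels that either exceed the high threshold or are connected, through neighbours above the low threshold, to one that does:

```python
import numpy as np

def hysteresis_1d(strength, low, high):
    """Double-threshold (hysteresis) edge marking on a 1D
    edge-strength profile -- a minimal sketch of the scheme above.

    A pixel is marked if it exceeds `high`, or if it exceeds `low`
    and is connected through already-marked neighbours to one that
    does.
    """
    strong = strength >= high
    weak = strength >= low
    marked = strong.copy()
    changed = True
    while changed:  # propagate outward from strong pixels
        changed = False
        for i in range(len(strength)):
            if weak[i] and not marked[i]:
                left = i > 0 and marked[i - 1]
                right = i + 1 < len(strength) and marked[i + 1]
                if left or right:
                    marked[i] = True
                    changed = True
    return marked

# A faint segment (the 40s) attached to a strong response (90)
# survives; the isolated weak response (45) does not.
s = np.array([5, 40, 42, 90, 41, 5, 45, 5])
print(hysteresis_1d(s, low=30, high=80).astype(int))
# [0 1 1 1 1 0 0 0]
```

A simple global threshold at 80 would keep only the single strong pixel; a global threshold at 30 would also mark the isolated noisy pixel. Hysteresis gets the continuous curve and rejects the noise, which is exactly the behaviour described in the text.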
Thus, in the ideal continuous case, detection of zero-crossings in the second derivative captures local maxima in the gradient.

We can conclude that, to be classified as a meaningful edge point, the transition in gray level associated with that point has to be significantly stronger than the background at that point. Since we are dealing with local computations, the method of choice to determine whether a value is "significant" or not is to use a threshold. Thus, we define a point in an image as being an edge point if its two-dimensional first-order derivative is greater than a specified threshold; a set of such points that are connected according to a predefined criterion of connectedness is by definition an edge. The term edge segment generally is used if the edge is short in relation to the dimensions of the image. A key problem in segmentation is to assemble edge segments into longer edges. An alternate definition, if we elect to use the second derivative, is simply to define the edge points in an image as the zero crossings of its second derivative. The definition of an edge in this case is the same as above. It is important to note that these definitions do not guarantee success in finding edges in an image. They simply give us a formalism to look for them. First-order derivatives in an image are computed using the gradient. Second-order derivatives are obtained using the Laplacian.
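The second-derivative definition above can be illustrated numerically. The sketch below (Python/NumPy; the ramp-edge signal is hypothetical) locates the edge point of a 1D signal as the zero crossing of a discrete second derivative, and this coincides with the maximum of the first derivative, as the text states:

```python
import numpy as np

# A smooth ramp edge: dark plateau, transition, bright plateau.
signal = np.array([10., 10., 10., 40., 90., 120., 120., 120.])

# Discrete second derivative: f(x+1) - 2 f(x) + f(x-1).
second = np.diff(signal, n=2)
print(second)  # positive on the way up, negative on the way down

# Zero crossings of the second derivative mark the edge location.
crossings = np.where(np.sign(second[:-1]) * np.sign(second[1:]) < 0)[0]
print(crossings)  # [2]

# second[2] > 0 and second[3] < 0, so the crossing falls between
# signal samples 3 and 4 -- exactly where the first derivative
# (np.diff(signal)) peaks.
```

For a 2D image, the analogous operator is the Laplacian, with the same zero-crossing interpretation.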
LOW-FREQUENCY ACTIVE TOWED SONAR (LFATS)

LFATS is a full-feature, long-range, low-frequency variable-depth sonar. Developed for active sonar operation against modern diesel-electric submarines, LFATS has demonstrated consistent detection performance in shallow and deep water. LFATS also provides a passive mode and includes a full set of passive tools and features.

COMPACT SIZE
LFATS is a small, lightweight, air-transportable, ruggedized system designed specifically for easy installation on small vessels.

CONFIGURABLE
LFATS can operate in a stand-alone configuration or be easily integrated into the ship's combat system.

TACTICAL BISTATIC AND MULTISTATIC CAPABILITY
A robust infrastructure permits interoperability with the HELRAS helicopter dipping sonar and all key sonobuoys.

HIGHLY MANEUVERABLE
Own-ship noise reduction processing algorithms, coupled with compact twin-line receivers, enable short-scope towing for efficient maneuvering, fast deployment and unencumbered operation in shallow water.

COMPACT WINCH AND HANDLING SYSTEM
An ultrastable structure assures safe, reliable operation in heavy seas and permits manual or console-controlled deployment, retrieval and depth-keeping.

FULL 360° COVERAGE
A dual parallel array configuration and advanced signal processing achieve instantaneous, unambiguous left/right target discrimination.

SPACE-SAVING TRANSMITTER TOW-BODY CONFIGURATION
Innovative technology achieves omnidirectional, large-aperture acoustic performance in a compact, sleek tow-body assembly.

REVERBERATION SUPPRESSION
The unique transmitter design enables forward, aft, port and starboard directional transmission.
This capability diverts energy concentration away from shorelines and landmasses, minimizing reverberation and optimizing target detection.

SONAR PERFORMANCE PREDICTION
A key ingredient to mission planning, LFATS computes and displays system detection capability based on modeled or measured environmental data.

Key Features
> Wide-area search
> Target detection, localization and classification
> Tracking and attack
> Embedded training

Sonar Processing
> Active processing: State-of-the-art signal processing offers a comprehensive range of single- and multi-pulse, FM and CW processing for detection and tracking.
> Passive processing: LFATS features full 100-to-2,000 Hz continuous wideband coverage. Broadband, DEMON and narrowband analyzers, torpedo alert and extended tracking functions constitute a suite of passive tools to track and analyze targets.
> Playback mode: Playback is seamlessly integrated into passive and active operation, enabling post-analysis of pre-recorded mission data, and is a key component of operator training.
> Built-in test: Power-up, continuous background and operator-initiated test modes combine to boost system availability and accelerate operational readiness.

UNIQUE EXTENSION/RETRACTION MECHANISM TRANSFORMS COMPACT TOW-BODY CONFIGURATION TO A LARGE-APERTURE MULTIDIRECTIONAL TRANSMITTER

DISPLAYS AND OPERATOR INTERFACES
> State-of-the-art workstation-based operator-machine interface: Trackball, point-and-click control, pull-down menu function and parameter selection allow easy access to key information.
> Displays: A strategic balance of multifunction displays, built on a modern OpenGL framework, offers flexible search, classification and geographic formats. Ground-stabilized, high-resolution color monitors capture details in the real-time processed sonar data.
> Built-in operator aids: To simplify operation, LFATS provides recommended mode/parameter settings, automated range-of-day estimation and data history recall.
> COTS hardware: LFATS incorporates a modular, expandable open architecture to accommodate future technology.

© 2022 L3Harris Technologies, Inc. | 09/2022
NON-EXPORT CONTROLLED - These item(s)/data have been reviewed in accordance with the International Traffic in Arms Regulations (ITAR), 22 CFR part 120.33, and the Export Administration Regulations (EAR), 15 CFR 734(3)(b)(3), and may be released without export restrictions.
L3Harris Technologies is an agile global aerospace and defense technology innovator, delivering end-to-end solutions that meet customers' mission-critical needs. The company provides advanced defense and commercial technologies across air, land, sea, space and cyber domains.
t 818 367 0111 | f 818 364 2491 | 1025 W. NASA Boulevard, Melbourne, FL 32919

System elements (diagram): winch and handling system, ship electronics, towed subsystem, sonar operator console, transmit power amplifier.

SPECIFICATIONS
Operating Modes: Active, passive, test, playback, multi-static
Source Level: 219 dB omnidirectional, 222 dB sector-steered
Projector Elements: 16 in 4 staves
Transmission: Omnidirectional or by sector
Operating Depth: 15-to-300 m
Survival Speed: 30 knots
Size:
- Winch & Handling Subsystem: 180 in. x 138 in. x 84 in. (4.5 m x 3.5 m x 2.2 m)
- Sonar Operator Console: 60 in. x 26 in. x 68 in. (1.52 m x 0.66 m x 1.73 m)
- Transmit Power Amplifier: 42 in. x 28 in. x 68 in. (1.07 m x 0.71 m x 1.73 m)
Weight:
- Winch & Handling: 3,954 kg (8,717 lb.)
- Towed Subsystem: 678 kg (1,495 lb.)
- Ship Electronics: 928 kg (2,045 lb.)
Platforms: Frigates, corvettes, small patrol boats
Receive Array:
- Configuration: Twin-line
- Number of channels: 48 per line
- Length: 26.5 m (86.9 ft.)
- Array directivity: >18 dB @ 1,380 Hz

LFATS PROCESSING
Active
Active Band: 1,200-to-1,00 Hz
Processing: CW, FM, wavetrain, multi-pulse matched filtering
Pulse Lengths: Range-dependent, 0.039 to 10 sec. max.
FM Bandwidth: 50, 100 and 300 Hz
Tracking: 20 auto and operator-initiated
Displays: PPI, bearing range, Doppler range, FM A-scan, geographic overlay
Range Scale: 5, 10, 20, 40, and 80 kyd

Passive
Passive Band: Continuous 100-to-2,000 Hz
Processing: Broadband, narrowband, ALI, DEMON and tracking
Displays: BTR, BFI, NALI, DEMON and LOFAR
Tracking: 20 auto and operator-initiated

Common
Own-ship noise reduction, Doppler nullification, directional audio
arXiv:astro-ph preprint. To appear in ApJL

Mid-Infrared Diagnostics of LINERs

E. Sturm,1 D. Rupke,2 A. Contursi,1 D.-C. Kim,2 D. Lutz,1 H. Netzer,3 S. Veilleux,2 R. Genzel,1 M. Lehnert,1 L. J. Tacconi,1 D. Maoz,3 J. Mazzarella,4 S. Lord,4 D. Sanders,5 A. Sternberg3

ABSTRACT

We report results from the first mid-infrared spectroscopic study of a comprehensive sample of 33 LINERs, observed with the Spitzer Space Telescope. We compare the properties of two different LINER populations: infrared-faint LINERs, with LINER emission arising mostly in compact nuclear regions, and infrared-luminous LINERs, which often show spatially extended (non-AGN) LINER emission. We show that these two populations can be easily distinguished by their mid-infrared spectra in three different ways: (i) their mid-IR spectral energy distributions (SEDs), (ii) the emission features of polycyclic aromatic hydrocarbons (PAHs), and (iii) various combinations of IR fine-structure line ratios. IR-luminous LINERs show mid-IR SEDs typical of starburst galaxies, while the mid-IR SEDs of IR-faint LINERs are much bluer. PAH flux ratios are significantly different in the two groups. Fine-structure emission lines from highly excited gas, such as [O IV], are detected in both populations, suggesting the presence of an additional AGN also in a large fraction of IR-bright LINERs, which contributes little to the combined mid-IR light. The two LINER groups occupy different regions of mid-infrared emission-line excitation diagrams. The positions of the various LINER types in our diagnostic diagrams provide important clues regarding the power source of each LINER type. Most of these mid-infrared diagnostics can be applied at low spectral resolution, making AGN- and starburst-excited LINERs distinguishable also at high redshifts.

Subject headings: infrared: galaxies — galaxies: active

1. Introduction

Since their identification as a class of galactic nuclei more than 25 years ago (Heckman 1980), the nature of Low-Ionization Nuclear Emission-Line Regions (LINERs) has remained
controversial. Their optical spectra are characterized by enhanced narrow emission lines of low-ionization species, quite distinct from those of both H II regions and classical active galactic nuclei (AGNs). They are found in one third to one half of nearby galaxies of all types (e.g., Ho, Filippenko, & Sargent 1997). In many LINERs the emission is concentrated near the nucleus (a few 100 pc, e.g. Pogge et al. 2000), but in others it extends over larger regions, up to a few kpc (Veilleux et al. 1995). There is substantial evidence that many LINERs are powered by accretion onto massive black holes, and that these objects, due to low accretion rates, constitute the low-luminosity end of the AGN class (Quataert 2001, Kewley et al. 2006). If many LINERs at low and high redshifts are indeed low-luminosity AGNs, this would have a significant impact on major issues in astronomy, such as the growth history of central black holes and the relation of AGNs to galaxy formation and evolution.

Alternative scenarios for LINER excitation mechanisms have been suggested. Models focusing on photoionization by the central AGN continuum (e.g., Ferland and Netzer 1983; Groves et al. 2004) have been complemented by stellar photoionization modeling (e.g., Barth and Shields 2000) and observations (e.g., Maoz et al. 1998) showing that, under certain conditions, a young starburst can also excite a LINER. In many cases LINER-like emission is also observed on larger spatial scales; these 'extended' LINERs tend to be associated with high infrared luminosity (see §2). In addition to nuclear (black hole or stellar) photoionization, these sources can be ionized by shock heating through cloud collisions induced by accretion, galaxy interactions, mergers, or starburst-driven winds (e.g. Shull & McKee 1979; Veilleux & Osterbrock 1987; Dopita & Sutherland 1995).

Determining which LINERs are photoionized by a hard, nuclear power source and which are excited by other ionization processes is crucial for understanding not only the local LINER population, but also AGNs and
starbursts at high redshifts, if the fraction of LINERs among high-z galaxies is as high as in the local universe. Galaxies with high infrared luminosities and galaxy interactions are much more common at high redshifts than at z ≈ 0 (e.g. Pérez-González et al. 2005). Whether or not these galaxies host dominant AGNs is an open question. X-ray and radio observations help to identify AGNs in the nuclei of these galaxies. However, they may face problems in deriving the relative contributions of AGN and starbursts to the luminosity of those objects, in particular for close to Compton-thick AGNs (e.g. NGC 6240, Lutz et al. 2003).

In order to help resolve these problems we have performed a mid-infrared spectroscopic study which is based on a large sample of infrared-faint and infrared-luminous LINERs. This study is the first of its kind, since only a few LINERs were observed with ISO as part of various projects with differing scientific goals (Satyapal, Sambruna, & Dudik 2004), resulting in a sparse collection of mid-IR LINER line detections of rather low S/N in a heterogeneous sample. Comprehensive mid-infrared spectroscopic studies of LINERs have become possible only now with Spitzer.

2. Sample Selection, Observations and Data Processing

We have based our selection of LINERs on the infrared luminosity of low-redshift objects.
Most of the IR-faint LINERs (L_IR/L_B ≲ 1, with L_IR, the 8-1000 µm luminosity, defined as in Sanders & Mirabel 1996) studied in the literature have compact nuclei (e.g. Pogge et al. 2000), consistent with photoionization from a central source. This central source is very likely an AGN, since most of these IR-faint LINERs have compact nuclear X-ray sources, but no extended X-ray sources (e.g. Carrillo et al. 1999, Satyapal et al. 2004). On the other hand, IR-luminous LINERs (L_IR/L_B ≳ 1) often also show compact hard X-ray cores indicative of an AGN, but in addition they often contain multiple off-nuclear X-ray point sources. There is evidence for extended LINER emission in a number of LINERs in the Revised Bright Galaxy Sample (RBGS, see Veilleux et al. 1995). A famous example is the extended LINER emission in NGC 6240 (Veilleux et al. 2003). Monreal-Ibero et al. (2006) also show IFU [N II]/Hα excitation maps of IR-luminous LINERs with extended LINER emission. Our two sub-samples of different IR luminosity therefore represent statistically different LINER populations (nuclear versus extended LINER emission).

The sample is summarized in Table 1. We selected 16 IR-luminous LINERs from the RBGS, and 17 IR-faint LINERs from Ho et al. (1997). The IR-faint LINERs are further subdivided into 5 Type 1 and 6 Type 2 LINERs, and in addition 6 "transition" objects (see Ho et al. 1997, and §3.1). The infrared-bright LINERs are all of Type 2. With a median log(L_IR/L⊙) of 11.31, the IR-luminous LINERs are much more luminous than the IR-faint LINERs in the IR, and there is no significant difference between the IR-faint sub-classes (L1: 8.92, L2: 9.27, T2: 9.31). For further multi-wavelength information on all our objects we refer to Carrillo et al. (1999). Table 1 also lists L_IR/L_B ratios. The median ratios are 106 and 1.8 for the IR-bright and IR-faint LINERs, respectively.

The data were obtained with the InfraRed Spectrograph (IRS; Houck et al. 2004) on board Spitzer in staring mode.
High-resolution spectra covering 10-37 µm (modules SH and LH) are complemented by low-resolution spectra in the 5-15 µm range (modules SL2 and SL1). Our data reduction process started with the two-dimensional BCD products from the Spitzer pipeline (version S12). We used our own IDL routines to perform de-glitching, and SMART (Higdon et al. 2004) for the final spectrum extraction.

All our targets are nearby (D ≤ 120 Mpc), but on average the IR-faint LINERs have lower redshifts (median z = 0.004) than the IR-bright objects (median z = 0.023). We carefully checked that aperture effects (extended objects at different distances, and fluxes from different aperture sizes in the different IRS modules) do not significantly affect the spectra and our conclusions. For most of the targets the 12 and 25 µm flux densities agree quite well with the IRAS values (the worst deviation is a factor 2 in NGC 5371, a nearby, IR-faint LINER which is dominated by mid-IR emission from the spiral arms); flux density discrepancies in the overlapping wavelength region between sub-spectra obtained with different aperture sizes are of the same order; and the profiles of most of the two-dimensional spectral images do not deviate significantly from stellar (point source) profiles. That is, the observed flux in most of the objects is coming from a nuclear region of 0.3 kpc (median) diameter. We did not apply an extended source correction to the flux calibration, and we used full-width apertures for the spectral extraction. More details about the sample, the observations and the data processing will be described in a forthcoming paper (D. Rupke et al., in prep.).

3. Mid-infrared Properties of LINERs

Our new Spitzer observations provide three ways to distinguish among the various sub-classes of LINERs: the mid-infrared SEDs, the PAH emission features, and various combinations of IR fine-structure line ratios. These are illustrated and discussed below. Figure 1 shows the average spectra of both sub-samples (with the IR-faint LINERs further divided into the three subgroups
Type 1, Type 2, and transition objects, see §2). As described in §2, the IR-faint LINERs, including the transition objects, are much less luminous in the IR than the IR-luminous ones. Before averaging, the individual spectra have been normalized at 19 µm. The average spectra show clear differences among the sub-groups. Because of a low dispersion within the two major groups (IR-luminous, IR-faint), almost all of our individual spectra can be unambiguously assigned to one of these two groups (Figure 2).

3.1. SED Shape

The mid-infrared SEDs of IR-faint LINERs are clearly flatter (i.e., bluer) than their IR-luminous counterparts. No strong absorption features due to silicates (at 9.7 and 18 µm) or ices (at various wavelengths) are seen in either sub-sample. Only one IR-faint LINER, NGC 3998, reveals strong silicate emission (Sturm et al. 2005). In contrast, the mid-infrared SEDs of infrared-bright LINERs generally look very similar to those of starburst galaxies (e.g., Sturm et al. 2000). There are interesting trends in Figure 1. The Type 1 and Type 2 spectra are very similar, but below ∼15 µm the average Type 1 spectrum has an additional warm (dust) component. The average transition LINER lies in between the Type 1/Type 2 objects and the IR-bright LINERs. These transition objects are classified as objects with [O I] strengths intermediate between those of H II nuclei and LINERs, which could be explained either by a dilution of "true" LINER spectra with circumnuclear star formation or by stellar photoionization (see Ho et al. 1997).

The differences in the mid-IR SEDs of the two major LINER types are quantified in Figure 2 (left panel): in a diagram of the continuum flux ratios fν(15 µm)/fν(6 µm) vs.
fν(30 µm)/fν(6 µm), IR-faint and IR-bright galaxies are nicely separated. The warmer spectrum of (some of) the Type 1 LINERs compared to the Type 2 LINERs (see also Fig. 1) could be the signature of a warm AGN continuum (as expected in a comparison of Type 1 and Type 2 AGNs, assuming the AGN unification model holds for LINERs). Maoz et al. (2005) have found a similar trend in the UV continuum of IR-faint LINERs. In addition, in some of the weaker, nearby IR-faint LINERs, photospheric emission from old stellar populations in the center of the host galaxies may contribute significantly to the measured continuum in the range 5 to ∼8 µm. Using the same scaling factor as Lutz et al. (2004) to extrapolate from the K band to the mid-IR, and assuming that most of the K-band flux in our IRS aperture is stellar, we conclude that this might indeed be the case for some of the IR-faint LINERs. Stellar contribution is probably not the sole explanation, however, because it could hardly explain the difference between (the average) Type 1 and Type 2 LINERs (the K-band/15 µm ratios are similar in both types), and because the typical signature of such a component, a decreasing continuum in the 5−8 µm range, is not prominent in most of the individual spectra.

3.2. PAH ratios

Figure 1 also shows significant differences in the PAH features in the spectra of the two LINER groups. In IR-bright LINERs the PAH spectrum is very similar to that of starburst galaxies. IR-faint LINERs have very weak PAH features in the 5−10 µm range, but a strikingly strong 11.2 µm feature. We note that, in principle, a higher degree of dilution by an additional warm dust component in the IR-faint LINERs could cause errors in the flux measurements of PAHs in the 5−10 µm range, if the continuum has structure of similar width as the PAHs. For instance, strong silicate emission or absorption around 10 µm, if not accounted for, could lead to systematically false flux values of broad PAH features like the 7.7/8.6 µm PAH complex. Such an effect, however, should not play a significant role in our measurements of the
relatively narrow, well-defined PAHs at 6.2, 11.2 and 12.7 µm. Based on the flux ratios of these PAH features at 6.2, 11.2, and 12.7 µm, we present in Figure 2 (middle panel) another diagnostic diagram for the distinction between IR-bright and IR-faint LINERs. Hony et al. (2001, their Figure 5) have constructed such a diagram using ISO spectra of galactic H II regions, YSOs and evolved stars/reflection nebulae. They found that the strongest 11.2 µm PAH features (lowest 12.7/11.2 and 6.2/11.2 ratios) are produced in evolved stars, while H II regions have the highest 12.7/11.2 (and 6.2/11.2) ratios. They attributed this to a different degree of ionization (the 11.2 µm PAH carriers are neutral, those of the 6.2 and 12.7 µm features are ionized; see also Joblin et al. 1996), as well as to a processing of PAHs from large (11.2 µm) to smaller (12.7 µm) carriers. This could imply a less ionizing environment in the IR-faint LINERs than in the IR-bright ones, or significant differences in PAH formation and destruction.

3.3. Emission line ratios

The mid-infrared LINER spectra are rich in fine-structure emission lines. We have detected strong [O IV] lines, i.e. highly ionized gas, in both LINER sub-samples, suggestive of the presence of an AGN in ∼90% of the LINERs. The presence of a weak AGN in many of the IR-luminous LINERs is consistent with the compact hard X-ray cores that many IR-bright LINERs show (see Section 2), and is further supported by the detection of [Ne V] in about half of these sources. This detection rate is certainly a lower limit due to the limited S/N in the [Ne V] spectra. In order to explore differences in relative line strengths and physical properties, we have constructed mid-infrared diagnostic diagrams involving various combinations of these fine-structure emission lines. The right panel in Figure 2 is an example of such a diagram, using the line ratios of [Fe II] 26.0 µm/[O IV] 25.9 µm and [O IV] 25.9 µm/[Ne II] 12.8 µm.
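Diagnostic ratios like [Fe II]26.0/[O IV]25.9 and [O IV]25.9/[Ne II]12.8 are simple flux quotients, but placing objects on such diagrams also requires propagating the flux uncertainties to the ratio. The helper below is our own illustration of the standard first-order error propagation; the flux values in the example are made-up numbers, not measurements from this sample.

```python
import math

def line_ratio(flux_a, err_a, flux_b, err_b):
    """Ratio of two line fluxes with first-order error propagation:
    sigma_r = r * sqrt((sigma_a/a)^2 + (sigma_b/b)^2)."""
    r = flux_a / flux_b
    sigma = r * math.sqrt((err_a / flux_a) ** 2 + (err_b / flux_b) ** 2)
    return r, sigma

# Hypothetical fluxes (arbitrary units), for illustration only:
feii_oiv, s1 = line_ratio(2.0, 0.2, 4.0, 0.4)   # [Fe II]26.0/[O IV]25.9
oiv_neii, s2 = line_ratio(4.0, 0.4, 8.0, 0.8)   # [O IV]25.9/[Ne II]12.8
```

With 10% errors on both fluxes, the relative error on each ratio is sqrt(2) × 10% ≈ 14%.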
Lutz et al. (2003) used this diagram to distinguish among excitation by early-type stars, AGNs, and shocks. Starburst galaxies, Seyfert galaxies, and supernova remnants (SNRs) are clearly separated in this line-ratio plane. Interestingly, IR-luminous and IR-faint LINERs are easily distinguishable in this diagram: the IR-luminous LINERs all lie on a linear relation connecting starbursts and Seyferts, which might be explained by a minor AGN contribution in addition to the star-forming regions (see mixing lines in Lutz et al. 2003). The position of the IR-luminous LINERs in this diagram is an indication that the weak AGN in these objects is more Seyfert-like, i.e. not responsible for the IR-faint LINER emission. Surprisingly, the IR-faint LINERs are clearly off this line and lie in a region populated by SNRs. The reason for the differences between the two classes (and between IR-faint LINERs and Seyferts) is so far uncertain. The role of the hardness of the radiation spectrum and of the ionization parameter, as well as the influence of shocks in both LINER classes, has to be further examined. Can, e.g., photoionized spectra for certain hardness and ionization parameters reach the 'SNR' region in Figure 2c?

4. Conclusions and Outlook

Our systematic study of LINERs with Spitzer confirms that IR-luminous sources of this group are very different in their properties from IR-faint LINERs. IR-luminous LINERs have infrared SEDs that are similar to those of starburst galaxies, and they are situated in different regions of several diagnostic diagrams than IR-faint LINERs. While this has been suspected from previous multi-wavelength studies, the new mid-IR data provide the cleanest way to separate these groups. Many of these differences are also visible at low spectral resolution, and thus at high redshift, making our discovery an important cosmological tool for discerning the role of AGN in galaxy formation and evolution. A more detailed investigation of this issue and a quantification of the relative contributions
from AGN and star formation to the LINER spectra requires a careful disentangling of the spectra into the various components (stellar, H II region, AGN, etc.), involving template-fitting methods. Furthermore, for an understanding of the emission-line properties, detailed photoionization modeling is required. This is the subject of a forthcoming paper (D. Rupke et al., in preparation).

This work is based on observations made with the Spitzer Space Telescope, which is operated by the Jet Propulsion Laboratory, California Institute of Technology under a contract with NASA. Support for this work was provided by NASA through an award issued by JPL/Caltech. AS and HN thank the Israel Science Foundation for support grants 221/03 and 232/03. HN acknowledges support by the Humboldt Foundation and thanks the host institution MPE.

References

Barth, A. J., & Shields, J. C. 2000, PASP, 112, 753
Brandl, B., et al. 2004, ApJS, 154, 188
Bressan et al. 1998, A&A, 332, 135
Bressan et al. 2006, ApJ, 639, L55
Carrillo, R., et al. 1999, RMxAA, 35, 187
Dopita, M. A., & Sutherland, R. S. 1995, ApJ, 455, 468
Ferland, G. J., & Netzer, H. 1983, ApJ, 264, 105
Groves et al. 2004, ApJS, 153, 9
Heckman, T. M. 1980, A&A, 87, 152
Ho, L. C., et al. 1997, ApJS, 112, 315
Hony, S., et al. 2001, A&A, 370, 1030
Higdon, S. J. U., et al. 2004, PASP, 116, 975
Houck, J. R., et al. 2004, ApJS, 154, 18
Joblin, C., Tielens, A. G. G. M., Geballe, T. R., & Wooden, D. H. 1996, ApJ, 460, L119
Kewley, L. J., et al. 2006, astro-ph/0605681
Lutz, D., et al. 2003, A&A, 409, 867
Lutz, D., et al. 2004, A&A, 418, 465
Maoz, D., et al. 1998, AJ, 116, 55
Maoz, D., Nagar, N. M., Falcke, H., & Wilson, A. S. 2005, ApJ, 625, 699
Monreal-Ibero, A., Arribas, S., & Colina, L. 2006, ApJ, 637, 138
Monreal-Ibero, A., Arribas, S., & Colina, L. 2006, astro-ph/0509681
Pérez-González, P. G., et al. 2005, ApJ, 630, 82
Pogge, R. W., et al. 2000, ApJ, 532, 323
Quataert, E. 2001, ASP Conf. Proc., 224, p. 71
Sanders, D. B., & Mirabel, I. F. 1996, ARA&A, 34, 749
Satyapal, S., Sambruna, R. M., & Dudik, R. P. 2004, A&A, 414, 825
Shull, J. M., & McKee, C. F. 1979, ApJ, 227, 131
Sturm, E., et al. 2000, A&A, 358, 481
Sturm, E., et al. 2005, A&A, 629, L21
Veilleux, S., et
al. 1995, ApJS, 98, 171
Veilleux, S., & Osterbrock, 1987, ApJS, 63, 295
Verma, A., et al. 2003, A&A, 403, 829

Table 1. The LINER Sample

Object Name       Catalog  Type  L_IR^a  L_IR/L_B^b
NGC 404           Ho97^d   L2    7.83    1.1
NGC 3507          Ho97     L2    -       -
NGC 3884          Ho97     L1    -       -
NGC 3998          Ho97     L1    8.66    0.6
NGC 4192          Ho97     T2    9.72    4.7
NGC 4419          Ho97     T2    9.91    11
NGC 4457          Ho97     L2    9.27    4.3
NGC 5371          Ho97     L2    10.59^e 4.9
NGC 7177          Ho97     T2    -       -
UGC 2238          RBGS     L2    11.35   93
NGC 1204          RBGS     L2    10.96   113
NGC 4666          RBGS     L2    10.44   29
UGC 8387          RBGS     L2    11.65   135
NGC 5218          RBGS     L2    10.64   21
IRAS 15335-0513   RBGS     L2    11.42   -
ESO 602-G025      RBGS     L2    11.33   47
NGC 7591          RBGS     L2    11.12   47

Fig. 1.— Composite spectra (normalized at 19 µm) of the 16 IR-luminous LINERs (red) and the IR-faint LINERs (green: 5 Type 1 LINERs, blue: 6 Type 2 LINERs, yellow: 6 transition type).

Fig. 2.— Mid-infrared diagnostic diagrams to distinguish between IR-luminous and IR-faint LINERs. In panel a, Seyfert data are from Weedman et al. (2005), starbursts from Sturm et al. (2000, M82) and Brandl et al. (2004, NGC 7714); starburst and Seyfert data in panel c are from Sturm et al. (2002) and Verma et al. (2003). Error bars in all three panels are comparable to the symbol sizes.
Image-based Façade Modeling

Jianxiong Xiao    Tian Fang    Ping Tan*    Peng Zhao    Eyal Ofek†    Long Quan
The Hong Kong University of Science and Technology    *National University of Singapore    †Microsoft

Figure 1: A few façade modeling examples from the two sides of a street with 614 captured images: some input images in the bottom row, the recovered model rendered in the middle row, and three zoomed sections of the recovered model rendered in the top row.

Abstract

We propose in this paper a semi-automatic image-based approach to façade modeling that uses images captured along streets and relies on structure from motion to recover camera positions and point clouds automatically as the initial stage for modeling. We start by considering a building façade as a flat rectangular plane or a developable surface with an associated texture image composited from the multiple visible images. A façade is then decomposed and structured into a Directed Acyclic Graph of rectilinear elementary patches. The decomposition is carried out top-down by a recursive subdivision, and followed by a bottom-up merging with the detection of the architectural bilateral symmetry and repetitive patterns. Each subdivided patch of the flat façade is augmented with a depth optimized using the 3D point cloud. Our system also allows for easy user feedback in the 2D image space on the proposed decomposition and augmentation. Finally, our approach is demonstrated on a large number of façades from a variety of street-side images.

CR Categories: I.3.5 [Computer Graphics]: Computational geometry and object modeling—Modeling packages; I.4.5 [Image Processing and Computer Vision]: Reconstruction.

Keywords: image-based modeling, building modeling, façade modeling, city modeling, photography.

1 Introduction

There is a strong demand for the photo-realistic modeling of cities for games, movies and map services such as Google Earth and Microsoft Virtual Earth. However, most work has been done on large-scale aerial photography-based city
modeling. When we zoom to ground level, the viewing experience is often disappointing, with blurry models and few details. On the other hand, many potential applications require a street-level representation of cities, where most of our daily activities take place. In terms of spatial constraints, the coverage of ground-level images is close-range. More data need to be captured and processed. This makes street-side modeling much more technically challenging.

The current state of the art ranges from pure synthetic methods such as artificial synthesis of buildings based on grammar rules [Müller et al. 2006], to 3D scanning of street façades [Früh and Zakhor 2003], to image-based approaches [Debevec et al. 1996]. Müller et al. [2007] required manual assignment of depths to the façade as they have only one image. However, we do have information from the reconstructed 3D points to automatically infer the critical depth of each primitive. Früh and Zakhor [2003] required tedious 3D scanning, while Debevec et al. [1996] proposed a method for a small set of images that cannot be scaled up well for large-scale modeling

ACM Transactions on Graphics (TOG), Proceedings of SIGGRAPH Asia 2008

Figure 2: Overview of the semi-automatic approach to image-based façade modeling.

of buildings. We propose a semi-automatic method to reconstruct 3D façade models of high visual quality from multiple ground-level street-view images. The key innovation of our approach is the introduction of a systematic and automatic decomposition scheme of façades for both analysis and reconstruction. The decomposition is achieved through a recursive subdivision that preserves the architectural structure, obtaining a Directed Acyclic Graph representation of the façade by both top-down subdivision and bottom-up merging with local bilateral symmetries to handle repetitive patterns. This representation naturally encodes the architectural shape prior of a façade and enables the depth of the façade to be optimally computed
on the surface and at the level of the subdivided regions. We also introduce a simple and intuitive user interface that assists the user in providing feedback on the façade partition.

2 Related work

There is a large amount of literature on façade, building and architectural modeling. We classify these studies into rule-based, image-based and vision-based modeling approaches.

Rule-based methods. The procedural modeling of buildings specifies a set of rules along the lines of an L-system. The methods in [Wonka et al. 2003; Müller et al. 2006] are typical examples of procedural modeling. In general, procedural modeling needs expert specification of the rules and may be limited in the realism of the resulting models and their variations. Furthermore, it is very difficult to define the rules needed to generate exact existing buildings.

Image-based methods. Image-based methods use images as guides to generate models of architecture interactively. Façade, developed by Debevec et al. [1996], is a seminal work in this category. However, the required manual selection of features and correspondences in different views is tedious, and cannot be scaled up well. Müller et al. [2007] used the limited domain of regular façades to highlight the importance of the windows in an architectural setting, creating an impressive result of a building façade from one single image, while depth is manually assigned. Although this technique is good for modeling regular buildings, it is limited to simple repetitive façades and cannot be applied to street-view data as in Figure 1. Oh et al. [2001] presented an interactive system to create models from a single image. They also manually assigned the depth based on a painting metaphor. van den Hengel et al. [2007] used a sketching approach in one (or more) images. Although this method is quite general, it is also difficult to scale up for large-scale reconstruction due to the heavy manual interaction. There are also a few manual modeling solutions on the market, such as Adobe
Canoma, RealViz ImageModeler, Eos Systems PhotoModeler and The Pixel Farm PFTrack, which all require tedious manual model parameterization and point correspondences.

Vision-based methods. Vision-based methods automatically reconstruct urban scenes from images. Typical examples are the work in [Snavely et al. 2006; Goesele et al. 2007], [Cornelis et al. 2008] and the dedicated urban modeling work pursued by the University of North Carolina at Chapel Hill and the University of Kentucky (UNC/UK) [Pollefeys et al. 2007] that resulted in meshes from dense stereo reconstruction. Proper modeling with man-made structural constraints from reconstructed point clouds and stereo data has not yet been addressed. Werner and Zisserman [2002] used line segments to reconstruct buildings. Dick et al. [2004] developed 3D architectural modeling from short image sequences. The approach is Bayesian and model based, but it relies on many specific architectural rules and model parameters. Lukas et al. [2006; 2008] developed a complete system of urban scene modeling based on aerial images. The result looks good from the top view, but not from the ground level. Our approach is therefore complementary to their system in that street-level details are added. Früh and Zakhor [2003] also used a combination of aerial imagery, ground color and LIDAR scans to construct models of façades. However, like stereo methods, it suffers from the lack of representation for the styles of man-made architecture. Agarwala et al. [2006] composed panoramas of roughly planar scenes without producing 3D models.

3 Overview

Our approach is schematized in Figure 2.

SFM From the captured sequence of overlapping images, we first automatically compute the structure from motion to obtain a set of semi-dense 3D points and all camera positions. We then register the reconstruction with an existing approximate model of the buildings (often recovered from aerial images) using GPS data if provided, or manually if geo-registration information is not available.

Façade initialization We start a building façade as a flat rectangular plane or a developable surface that is obtained either automatically from the geo-registered approximate building model, or by manually marking up a line segment or a curve on the 3D points projected onto the ground plane. The texture image of the flat façade is computed from the multiple visible images of the façade. The detection of occluding objects in the texture composition is possible thanks to the multiple images with parallax.

Façade decomposition A façade is then systematically decomposed into a partition of rectangular patches based on the horizontal and vertical lines detected in the texture image. The decomposition is carried out top-down by a recursive subdivision and followed by a bottom-up merging, with detection of the architectural bilateral symmetry and repetitive patterns. The partition is finally structured into a Directed Acyclic Graph of rectilinear elementary patches. We also allow the user to edit the partition by simply adding and removing horizontal and vertical lines.

Façade augmentation Each subdivided patch of the flat façade is augmented with the depth obtained from the MAP estimation of a Markov Random Field, with the data cost defined by the 3D points from the structure from motion.

Façade completion The final façade geometry is automatically re-textured from all input images.

Our main technical contribution is the introduction of a systematic decomposition schema of the façade that is structured into a Directed Acyclic Graph and implemented as a top-down recursive subdivision and bottom-up merging. This representation strongly embeds the architectural prior of the façades and buildings into different stages of modeling. The proposed optimization for façade depth is also unique in that it operates on the façade surface and at the super-pixel level of a whole subdivision region.

4 Image Collection

Image capturing We use a camera that usually faces orthogonally to the
building façade and moves laterally along the streets. The camera should preferably be held straight, and two neighboring views should have sufficient overlap to make the feature correspondences computable. The density and the accuracy of the reconstructed points vary, depending on the distance between the camera and the objects, and the distance between the neighboring viewing positions.

Structure from motion We first compute point correspondences and structure from motion for a given sequence of images. There are standard computer vision techniques for structure from motion [Hartley and Zisserman 2004]. We use the approach described in [Lhuillier and Quan 2005] to compute the camera poses and a semi-dense set of 3D point clouds in space. This technique is used because it has been shown to be robust and capable of providing sufficient point clouds for object modeling purposes.

Figure 3: A simple façade can be initialized from a flat rectangle (a), a cylindrical portion (b) or a developable surface (c).

5 Façade Initialization

In this paper, we consider that a façade has a dominant planar structure. Therefore, a façade is a flat plane with a depth field on the plane. We also expect and assume that the depth variation within a simple façade is moderate. A real building façade having complex geometry and topology can therefore be broken down into multiple simple façades. A building is merely a collection of façades, and a street is a collection of buildings. The dominant plane of the majority of façades is flat, but it can sometimes be curved as well. We also consider the dominant surface structure to be any cylinder portion or any developable surface that can be swept by a straight line, as illustrated in Figure 3. To ease the description, but without loss of generality, we use a flat façade in the remainder of the paper. For the developable surface, the same methods as for flat façades are used in all steps, with trivial surface parameterizations.
Some cylindrical façade examples are given in the experiments.

Algorithm 1 Photo Consistency Check For Occlusion Detection
Require: A set of N image patches P = {p_1, p_2, ..., p_N} corresponding to the projections {x_i} of the 3D point X.
Require: η ∈ [0, 1] to indicate when two patches are similar.
 1: for all p_i ∈ P do
 2:   s_i ← 0                          ⊲ Accumulated similarity for p_i
 3:   for all p_j ∈ P do
 4:     s_ij ← NCC(p_i, p_j)
 5:     if s_ij > η then s_i ← s_i + s_ij
 6:     end if
 7:   end for
 8: end for
 9: n̂ ← arg max_i s_i                 ⊲ n̂ is the patch with best support
10: V ← ∅                              ⊲ V is the index set with visible projections
11: O ← ∅                              ⊲ O is the index set with occluded projections
12: for all p_i ∈ P do
13:   if s_in̂ > η then V ← V ∪ {i}
14:   else O ← O ∪ {i}
15:   end if
16: end for
17: return V and O

5.1 Initial Flat Rectangle

The reference system of the 3D reconstruction can be geo-registered using GPS data of the camera, if available, or using an interactive technique. As illustrated in Figure 2, the façade modeling process can begin with an existing approximate model of the buildings, often reconstructed from aerial images, such as those publicly available from Google Earth and Microsoft Virtual Earth. Alternatively, if no such approximate model exists, a simple manual process in the current implementation is used to segment the façades, based on the projections of the 3D points onto the ground floor. We draw a line segment or a curve on the ground to mark up a façade plane as a flat rectangle or a developable surface portion. The plane or surface position is automatically fitted to the 3D points, or manually adjusted if necessary.

5.2 Texture Composition

The geometry of the façade is initialized as a flat rectangle. Usually, a façade is too big to be entirely observable in one input image. We first compose a texture image for the entire rectangle of the façade from the input images. This process is different from image mosaicing, as the images have parallax, which is helpful for removing undesired occluding objects such as pedestrians, cars, trees, telegraph poles and trash cans that lie in front of the target façade.
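As a concrete reading of Algorithm 1, the following Python sketch (our own illustration, not the paper's code) implements the accumulated-similarity voting with a plain global patch NCC. Patch extraction around each projection and the choice of η are left to the caller.

```python
import numpy as np

def ncc(p, q):
    """Normalized cross-correlation of two equally sized image patches."""
    p = (p - p.mean()) / (p.std() + 1e-12)
    q = (q - q.mean()) / (q.std() + 1e-12)
    return float((p * q).mean())

def photo_consistency_check(patches, eta=0.5):
    """Algorithm 1: split patch indices into visible (V) and occluded (O).

    patches: list of equally sized 2D arrays, the projections of one 3D point.
    eta: similarity threshold in [0, 1].
    """
    n = len(patches)
    s = [0.0] * n
    for i in range(n):                      # accumulated similarity for p_i
        for j in range(n):
            s_ij = ncc(patches[i], patches[j])
            if s_ij > eta:
                s[i] += s_ij
    n_hat = max(range(n), key=lambda i: s[i])  # patch with best support
    V, O = [], []
    for i in range(n):                      # classify by similarity to p_n_hat
        if ncc(patches[i], patches[n_hat]) > eta:
            V.append(i)
        else:
            O.append(i)
    return V, O
```

Because NCC is invariant to affine intensity changes, three consistent views of the same façade pixel vote together even under moderate exposure differences, while an occluder (e.g. a pedestrian) falls into O.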
Furthermore, the façade plane position is known, compared with an unknown spatial position in stereo algorithms. Hence, the photo consistency constraint is more efficient and robust for occluding object removal, yielding a better texture image than a pure mosaic.

Figure 4: Interactive texture refinement: (a) strokes drawn on the object to indicate removal; (b) the object is removed; (c) automatic inpainting; (d) some green lines drawn to guide the structure; (e) better result achieved with the guide lines.

Multi-view occlusion removal As in many multiple-view stereo methods, photo consistency is defined as follows. Consider a 3D point X = (x, y, z, 1)′ with color c. If it has a projection, x_i = (u_i, v_i, 1)′ = P_i X, in the i-th camera P_i, then under the Lambertian surface assumption the projection x_i should also have the same color, c. However, if the point is occluded by some other object in this camera, the color of the projection is usually not the same as c. Note that c is unknown. Assuming that point X is visible from multiple cameras, I = {P_i}, and occluded by some objects in the other cameras, I′ = {P_j}, then the color, c_i, of the projections in I should be the same as c, while it may be different from the color, c_j, of projections in I′. Now, given a set of projection colors, {c_k}, the task is to identify the set, O, of the occluded cameras. In most situations, we can assume that point X is visible from most of the cameras. Under this assumption, we have c ≈ median_k{c_k}. Given the estimated color of the 3D point, c, it is now very easy to identify the occluded set, O, according to the distances of the c_k from c. To improve the robustness, instead of a single color, the image patches centered at the projections are used, and patch similarity, normalized cross correlation (NCC), is used as a metric. The details are presented in Algorithm 1. In this way, with the assumption that the façade is almost planar, each pixel of the reference texture corresponds to a point that lies on the flat
façade. Hence, for each pixel, we can identify whether it is occluded in a particular camera. Now, for a given planar façade in space, all visible images are first sorted according to the fronto-parallelism of the images with respect to the given façade. An image is said to be more fronto-parallel if the projected surface of the façade in the image is larger. The reference image is first warped from the most fronto-parallel image, then from the lesser ones according to the visibility of the point.

Inpainting In each step, due to the existence of occluding objects, some regions of the reference texture image may still be left empty. In a later step, if an empty region is not occluded and is visible from the new camera, the region is filled. In this way of multi-view inpainting, the occluded region is filled from each single camera. At the end of the process, if some regions are still empty, a normal image inpainting technique is used to fill them, either automatically [Criminisi et al. 2003] or interactively as described in Section 5.3. Since we have adjusted the cameras according to the image correspondences during the bundle adjustment of structure from motion, this simple mosaic without explicit blending can already produce very visually pleasing results.

5.3 Interactive Refinement

As shown in Figure 4, if the automatic texture composition result is not satisfactory, a two-step interactive user interface is provided for refinement. In the first step, the user can draw strokes to indicate which object or part of the texture is undesirable, as in Figure 4(a). The corresponding region is automatically extracted based on the input strokes, as in Figure 4(b), using the method in [Li et al. 2004]. The removal operation can be interpreted as meaning that the most fronto-parallel and photo-consistent texture selection, from the result of Algorithm 1, is not what the user wants. For each pixel, n̂ from Line 9 of Algorithm 1 and V should be wrong. Hence, P is updated to exclude V: P ← O. Then, if P ≠ ∅, Algorithm 1 is run again. Otherwise, image inpainting [Criminisi et
al. 2003] is used for automatic inpainting, as in Figure 4(c). In the second step, if the automatic texture filling is poor, the user can manually specify important missing structural information by extending a few curves or line segments from the known to the unknown regions, as in Figure 4(d). Then, as in [Sun et al. 2005], image patches are synthesized along these user-specified curves in the unknown region, using patches selected around the curves in the known region, with Loopy Belief Propagation used to find the optimal patches. After completing the structural propagation, the remaining unknown regions are filled using patch-based texture synthesis, as in Figure 4(e).

Figure 5: Structure-preserving subdivision: (a) input, (b) structure, (c) weight, (d) subdivide, (e) merge. The hidden structure of the façade is extracted to form a grid in (b). These hypotheses are evaluated according to the edge support in (c), and the façade is recursively subdivided into several regions in (d). Since there is not enough support between regions A, B, C, D, E, F, G, H, they are all merged into one single region M in (e).

6 Façade Decomposition

By decomposing a façade we try to best describe the façade structure, by segmenting it into a minimal number of elements. The façades that we are considering inherit natural horizontal and vertical directions by construction. In a first approximation, we may take all visible horizontal and vertical lines to construct an irregular partition of the façade plane into rectangles of various sizes. This partition captures the global rectilinear structure of the façades and buildings and also keeps all discontinuities of the façade substructures. This usually gives an over-segmentation of the image into patches. But this over-segmentation has several advantages. The over-segmenting lines can also be regarded as auxiliary lines that regularize the compositional units of the façades and buildings.
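The idea of partitioning the façade plane by horizontal and vertical lines can be prototyped compactly. The sketch below is our own simplification: when line directions are restricted to horizontal and vertical, a Hough-style vote for a full line reduces to counting edge pixels in each row or column of the edge map (the actual pipeline further splits lines into segments at their intersections and weights each segment separately). `min_support` is an illustrative threshold, not a value from the paper.

```python
import numpy as np

def line_hypotheses(edge_map, min_support=0.5):
    """Axis-aligned line hypotheses from a binary edge map.

    Returns the row indices (horizontal lines) and column indices
    (vertical lines) whose edge support exceeds min_support times
    the maximum possible support for that direction.
    """
    edges = np.asarray(edge_map, dtype=bool)
    h, w = edges.shape
    row_votes = edges.sum(axis=1)   # support for horizontal lines
    col_votes = edges.sum(axis=0)   # support for vertical lines
    rows = np.flatnonzero(row_votes >= min_support * w)
    cols = np.flatnonzero(col_votes >= min_support * h)
    return rows, cols
```

The returned rows and columns form the grid from which partition hypotheses are taken; per-segment weights would then be computed by counting edge pixels under each segment between consecutive intersections.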
Some 'hidden' rectilinear structures of the façade from its construction can also be rediscovered by this over-segmentation process.

6.1 Hidden Structure Discovery

To discover the structure inside the façade, the edges of the reference texture image are first detected [Canny 1986]. With such edge maps, the Hough transform [Duda and Hart 1972] is used to recover the lines. To improve the robustness, the direction of the Hough transform is constrained to only horizontal and vertical, which fits most architectural façades. The detected lines now form a grid that partitions the whole reference image, and this grid contains many non-overlapping short line segments, obtained by taking intersections of Hough lines as endpoints, as in Figure 5(b). These line segments are now the hypotheses with which to partition the façade. The Hough transform is good for structure discovery since it can extract the hidden global information from the façade and align line segments to this hidden structure. However, some line segments in the formed grid may not really be a partition boundary between different regions. Hence, a weight, w_e, is defined for each line segment, e, to indicate the likelihood that this line segment is a boundary between two different regions, as shown in Figure 5(c). This weight is computed as the number of edge points from the Canny edge map covered by the line segment.

Remark on over-segmented partition It is true that the current partition schema is subject to segmentation parameters. But it is important to note that a slightly over-segmented partition is usually not harmful for the purpose of modeling. A perfect partition certainly eases the regularization of the façade augmentation by depth, as presented in the next section. Nevertheless, an imperfect, particularly a slightly over-segmented, partition does not affect the modeling results when the 3D points are dense and the optimization works well.

Figure 6: Merging support evaluation: (a) edge weight support, (b) regional statistics support.

6.2 Recursive Subdivision

Given a region, D, in the texture image, it is divided into two rectangular subregions, D_1 and D_2, such that D = D_1 ∪ D_2, by the line segment L with the strongest support from the edge points. After D is subdivided into two separate regions, the subdivision procedure continues on the two regions, D_1 and D_2, recursively. The recursive subdivision procedure is stopped if either the target region, D, is too small to be subdivided, or there is not enough support for a division hypothesis, i.e., region D is very smooth.

For a façade, the bilateral symmetry about a vertical axis may not exist for the whole façade, but it often exists locally and can be used for more robust subdivision. First, for each region, D, the NCC score, s_D, of the two halves, D_1 and D_2, vertically divided at the center of D, is computed. If s_D > η, region D is considered to have bilateral symmetry. Then, the edge maps of D_1 and D_2 are averaged, and subdivision is done recursively on D_1 only. Finally, the subdivision of D_1 is reflected across the axis to become the subdivision of D_2, and the two subdivisions are merged into the subdivision of D.

Recursive subdivision is good for preserving the boundaries of man-made structural styles. However, it may produce some unnecessary fragments for depth computation and rendering, as in Figure 5(d). Hence, as a post-processing step, if two neighboring leaf subdivision regions, A and B, do not have enough support, s_AB, to separate them, they are merged into one region. The support, s_AB, to separate two neighboring regions, A and B, is defined to be the strongest weight of all the line segments on the border between A and B: s_AB = max_e {w_e}. However, the weights of line segments can only offer a local image statistic on the border. To improve the robustness, a dual information region statistic between A and B can be used more globally. As in Figure 6, since regions A and B may not have the same size, this region statistic similarity is defined as follows: first, an axis is defined on the border between A and B, and
region B is mirrored across this axis to give a region B→. The overlapped region A ∩ B→ is defined as the pixels of A whose locations fall inside B→. Similarly, A← ∩ B contains the pixels of B whose locations fall inside A← (the mirror image of A), and it is then mirrored across the same axis to become (A← ∩ B)→. The normalized cross correlation (NCC) between A ∩ B→ and (A← ∩ B)→ defines the regional similarity of A and B. In this way, only the mutually symmetric parts of A and B are used for region comparison, which avoids the influence of far-away parts of the regions; such influence would arise if A and B had dramatically different sizes and a global statistic, such as a color histogram, were used. Weighted by a parameter κ, the support s_AB to separate two neighboring regions A and B is now defined as

s_AB = max_e {w_e} − κ · NCC(A ∩ B→, (A← ∩ B)→).

Note that the representation of the façade is a binary recursive tree before merging and a directed acyclic graph (DAG) after region merging. The DAG representation innately supports level-of-detail rendering. When great detail is demanded, the rendering engine descends the rendering graph to expand all detailed leaves and renders them accordingly. Conversely, an intermediate node is rendered and all its descendants are pruned at rendering time.

Figure 7: A DAG for repetitive pattern representation. (a) Façade. (b) DAG.

6.3 Repetitive Pattern Representation

Repetitive patterns exist locally in many façades, and most of them are windows. [Müller et al. 2007] used a complicated technique to synchronize the subdivisions of different windows. To save storage space and to ease the synchronization task, our method maintains only one subdivision representation for each type of window. Precisely, a window template is first detected by a trained model [Berg et al. 2007] or manually indicated on the texture images. The templates are matched
across the reference texture image using NCC as the matching measure. If good matches exist, they are aligned to the horizontal or vertical direction by hierarchical clustering, and the Canny edge maps of these regions are averaged. During subdivision, each matched region is isolated by shrinking a bounding rectangle on the averaged edge map until it snaps to strong edges, and the region is treated as a whole leaf region. The edges inside these isolated regions should not affect the global structure, so these edge points are not used during the global subdivision procedure. Then, as in Figure 7, all matched leaf regions are linked to the root of a common subdivision DAG for that type of window by introducing 2D translation nodes for the pivot positions. Recursive subdivision is again executed on the averaged edge maps of all matched regions. To preserve photo realism, the textures of these regions are not shared; only the subdivision DAG and the respective depths are shared. Furthermore, to improve the robustness of subdivision, vertical bilateral symmetry is taken as a hard constraint for windows.

6.4 Interactive Subdivision Refinement

In most situations, the automatic subdivision works satisfactorily. If the user wants to further refine the subdivision layout, three line operations and two region operations are provided. The current automatic subdivision operates only in the horizontal and vertical directions, for robustness and simplicity. The fifth 'carve' operator allows the user to manually sketch arbitrarily shaped objects, which appear less frequently, so that they can be included in the façade representation.

Add. In an existing region, the user can sketch a stroke to indicate the partition, as in Figure 8(a). The edge points near the stroke are forced to become salient, so the subdivision engine can recover the line segment and partition the region.

Delete. The user can sketch a zigzag stroke to cross out a line segment, as in Figure 8(b).

Change. The user can first delete the
partition line segments and then add a new line segment. Alternatively, the user can directly sketch a stroke; the line segment crossed by the stroke is then deleted and a new line segment is constructed accordingly, as in Figure 8(c). After the operation, all descendants of the target
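The recursive subdivision of Section 6.2 can be sketched as follows. This is a minimal, hypothetical illustration: a region is a rectangle (r0, c0, r1, c1), and the candidate split lines with their edge-support weights w_e are assumed to be precomputed by the caller rather than by a Hough/edge pipeline.

```python
# Minimal sketch of recursive subdivision (assumed data layout:
# `candidate_splits` maps a region to (weight, kind, position)
# split hypotheses, with kind "h" for a horizontal cut at a row
# and "v" for a vertical cut at a column).

MIN_SIZE = 2          # stop if the region is too small
MIN_SUPPORT = 2       # stop if no split line has enough edge support

def subdivide(region, candidate_splits):
    r0, c0, r1, c1 = region
    if r1 - r0 < MIN_SIZE and c1 - c0 < MIN_SIZE:
        return region                         # too small: leaf
    options = candidate_splits.get(region, [])
    if not options:
        return region                         # region is smooth: leaf
    weight, kind, pos = max(options)          # strongest support wins
    if weight < MIN_SUPPORT:
        return region
    if kind == "h":                           # horizontal cut at row pos
        a, b = (r0, c0, pos, c1), (pos, c0, r1, c1)
    else:                                     # vertical cut at column pos
        a, b = (r0, c0, r1, pos), (r0, pos, r1, c1)
    return (region,
            subdivide(a, candidate_splits),
            subdivide(b, candidate_splits))

# One strong vertical cut through the middle of a 4x8 region.
splits = {(0, 0, 4, 8): [(5, "v", 4)]}
tree = subdivide((0, 0, 4, 8), splits)
```

The post-merging step described in the text would then walk the leaves of this tree and merge neighbors whose separation support s_AB falls below a threshold.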
Chinese Word Segmentation Using Minimal Linguistic Knowledge

Aitao Chen
School of Information Management and Systems
University of California at Berkeley
Berkeley, CA 94720, USA
aitao@

Abstract

This paper presents a primarily data-driven Chinese word segmentation system and its performance on the closed track, using two corpora, at the first international Chinese word segmentation bakeoff. The system consists of a new-word recognizer, a base segmentation algorithm, and procedures for combining single characters, combining suffixes, and checking segmentation consistency.

1 Introduction

At the first Chinese word segmentation bakeoff, we participated in the closed track using the Academia Sinica corpus (AS for short) and the Beijing University corpus (PK for short). We will refer to the segmented texts in the training corpus as the training data, and to both the unsegmented testing texts and the segmented reference texts as the testing data. For details on the word segmentation bakeoff, see (Sproat and Emerson, 2003).

2 Word segmentation

New texts are segmented in four steps, described in this section. New words are automatically extracted from the unsegmented testing texts and added to the base dictionary, which consists of the words in the training data, before the testing texts are segmented line by line.

2.1 Base segmentation algorithm

Given a dictionary and a sentence, our base segmentation algorithm finds all possible segmentations of the sentence with respect to the dictionary, computes the probability of each segmentation, and chooses the segmentation with the highest probability. If a sentence of n characters has a segmentation S of m words, w_1 … w_m, then the probability of the segmentation is estimated as P(S) ≈ Π_{i=1}^{m} p(w_i), where p(w) is estimated from the training corpus as the relative frequency of the word w. The in-word probability of a character c is estimated as n_w(c)/n(c), where n(c) is the number of times character c occurs in the training data, and n_w(c) is the number of times c occurs in a word of two or more characters.
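The base segmentation step above can be sketched with dynamic programming: among all segmentations of a sentence into dictionary words, pick the one maximizing the product of word probabilities (computed in log space). The tiny dictionary below is purely illustrative, and the sketch assumes the sentence is segmentable with the given dictionary.

```python
# Sketch of the max-probability base segmentation algorithm.

import math

def best_segmentation(sentence, word_prob, max_len=4):
    n = len(sentence)
    # best[i] = (best log probability of sentence[:i], backpointer)
    best = [(-math.inf, None)] * (n + 1)
    best[0] = (0.0, None)
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            w = sentence[j:i]
            if w in word_prob and best[j][0] > -math.inf:
                score = best[j][0] + math.log(word_prob[w])
                if score > best[i][0]:
                    best[i] = (score, j)
    words, i = [], n
    while i > 0:                     # follow backpointers
        j = best[i][1]
        words.append(sentence[j:i])
        i = j
    return words[::-1]

# Toy dictionary: "ab"/"c" (prob 0.4*0.1) beats "a"/"b"/"c" (0.3*0.2*0.1).
probs = {"ab": 0.4, "a": 0.3, "b": 0.2, "c": 0.1}
seg = best_segmentation("abc", probs)
```

With a real training corpus, `word_prob` would hold relative word frequencies, and `max_len` bounds the longest dictionary word considered.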
2.2 Combining single characters

We do not want to combine single characters that occur as words on their own more often than not. For both the PK and the AS training data, we divided the training data into two parts: two thirds for training and one third for system development. We found that setting the threshold of the in-word probability to about 0.85 works best on the development data. After the initial segmentation of a sentence, consecutive single characters are combined into one word if their in-word probabilities all exceed the 0.85 threshold. For example, one text fragment contains a new word that is not in the PK training data; after the initial segmentation the fragment is split into five single-character words, and it is subsequently changed into two words after combining three consecutive characters whose in-word probabilities are 0.94, 0.98, and 0.99, respectively.

2.3 Combining suffixes

A small set of characters frequently occurs as the last character of words. We selected 145 such characters from the PK training corpus and 113 from the AS corpus. After combining single characters, we combine a suffix character with the word preceding it if the preceding word is at least two characters long.

2.4 Consistency check

The last step performs consistency checks. A segmented sentence, after combining single characters and suffixes, is checked against the training data to make sure that a text fragment in a testing sentence is segmented in the same way as in the training data whenever it also occurs there. From the PK training corpus, we created a phrase segmentation table consisting of word quad-grams, trigrams, bigrams, and unigrams, together with their segmentations and frequencies. The phrase table created from the AS corpus does not include word quad-grams, to reduce its size. From a segmented training sentence we create entries of the form (text fragment, frequency, segmentation). After a new sentence is processed by the first three steps,
we look up every word quad-gram of the segmented sentence in the phrase segmentation table. When a word quad-gram is found in the table with a different segmentation, we replace the segmentation of the quad-gram in the segmented sentence by the segmentation found in the phrase table. The process then continues with word trigrams, word bigrams, and word unigrams. The idea is that if a text fragment of a new sentence is found in the training data, it should be segmented in the same way as in the training data. As an example, one sentence in the PK testing data is segmented after the first three steps such that two characters that should be combined are left apart, because the in-word probability of one of them, 0.71, is below the predefined threshold of 0.85. A word bigram of that sentence is found in the phrase segmentation table with a different segmentation, so the segmentation is changed accordingly in the final segmented sentence. In essence, when a text fragment has two or more segmentations, its surrounding context, which can be the preceding word, the following word, or both, is utilized to choose the most appropriate segmentation. When a text fragment of a testing sentence never occurred in the same context in the training data, the most frequent segmentation found in the training data is chosen. For instance, one fragment of the testing data is initially segmented into two words by our base algorithm; since it never occurred in either of its present contexts in the training data, the consistency check changes it to the three-word segmentation, because the fragment is segmented that way 515 times in the training data but treated as one word only 105 times.
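The consistency check above can be sketched as a sliding-window substitution: windows of k words (k = 4, 3, 2, 1 in turn) are looked up by their concatenated text, and a differing segmentation from the phrase table replaces the window. The phrase table here is a toy stand-in for the one built from the training data.

```python
# Sketch of the consistency-check step over a segmented sentence.

def consistency_check(words, phrase_table):
    for k in (4, 3, 2, 1):                    # quad-grams down to unigrams
        i = 0
        while i + k <= len(words):
            window = words[i:i + k]
            key = "".join(window)             # surface text of the window
            preferred = phrase_table.get(key)
            if preferred is not None and preferred != window:
                # Substitute the training-data segmentation.
                words = words[:i] + preferred + words[i + k:]
                i += len(preferred)
            else:
                i += 1
    return words

# Toy phrase table: training data always segments "wxyz" as wx / y / z.
table = {"wxyz": ["wx", "y", "z"]}
fixed = consistency_check(["w", "xyz", "q"], table)
```

A fuller version would key the table on context (preceding and following words) before falling back to the most frequent segmentation, as described in the text.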
3 New words recognition

We developed a few procedures to identify new words in the testing data. Our first procedure is designed to recognize numbers, dates, percentages, times, foreign words, and the like. We defined a set of characters consisting of the digits '0' to '9' (in both ASCII and GB), the letters 'a' to 'z' and 'A' to 'Z' (in both ASCII and GB), and similar characters. Any consecutive sequence of characters from this predefined set is extracted and post-processed. A set of rules is implemented in the post-processor. One such rule is that if an extracted text fragment ends with a particular character and contains any digit character, the ending character is removed and the remaining fragment is kept as a word. For example, our recognizer will extract fragments whose characters all belong to the predefined set, and the post-processor will strip off the trailing character and return the remainders as words. For personal names, we developed a program to extract the names preceding certain contexts, a program to detect and extract names in a sequence of names separated by the Chinese enumeration mark, and a program to extract

Table 1: Results for the closed track using the PK corpus.
     dict   P
1    pkd1   0.838  0.050
2    pkd2   0.892  0.347
3    pkd3   0.920  0.507
4    pkd3   0.935  0.610
5    pkd3   0.940  0.655
6    pkd3   0.938  0.647

Table 2: Results for the closed track using the AS corpus.
steps  R      F
1      0.950  0.943  0.970
1      0.950  0.947  0.968
1-2    0.951  0.951  0.964
1-3    0.949  0.951  0.961
1-4    0.966  0.961  0.980
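The first new-word recognition procedure can be sketched as follows. Both the character class and the suffix-stripping rule are simplified stand-ins for the ones described above (the real system uses GB-encoded digits and letters and a richer rule set).

```python
# Sketch of the character-class new-word extractor: pull out maximal
# runs of characters from a predefined class, then post-process with
# a simple rule.

import re

CHAR_CLASS = r"[0-9A-Za-z%.]+"   # assumed predefined character set

def extract_candidates(text):
    """Maximal runs of characters from the predefined set."""
    return re.findall(CHAR_CLASS, text)

def postprocess(fragment):
    """Toy version of one rule: strip a trailing '%'-like suffix
    when the fragment contains a digit."""
    if fragment.endswith("%") and any(ch.isdigit() for ch in fragment):
        return fragment[:-1]
    return fragment

words = [postprocess(f) for f in extract_candidates("涨幅达12.5%左右")]
```

The recognized words are then added to the base dictionary before segmentation, as described in Section 2.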
Table 3: Performances of maximum matching (forward) using the words from the training data.
corpus  R      F
AS      0.912  0.000
PK      0.909  0.867  0.972

4 Results

The pkd1 dictionary consists of the words in the PK training corpus; pkd2 consists of the words in pkd1 plus the words converted from pkd1 by changing the GB encoding of the numeric digits and English letters to ASCII; and pkd3 consists of the words in pkd2 plus the words automatically extracted from the PK testing texts using the procedures described in Section 3. The columns labeled R, P, and F give the recall, precision, and F score, respectively. The columns labeled R_oov and R_iv show the recall on out-of-vocabulary words and the recall on in-vocabulary words, respectively. All evaluation scores reported in this paper are computed using the scoring program written by Richard Sproat. We refer readers to (Sproat and Emerson, 2003) for details on the evaluation measures. For example, row 4 in Table 1 gives the results using the pkd3 dictionary when a sentence is segmented by the base algorithm and the single characters in the initial segmentation are then combined, but suffixes are not attached and the consistency check is not performed. The last row in Table 2 presents our official results for the closed track using the AS corpus. The asd1 dictionary contains only the words from the AS training corpus, while asd2 consists of the words in asd1 plus the new words automatically extracted from the AS testing texts using the new-word recognition described in Section 3. The results show that new-word recognition and joining single characters contributed the most to the increase in precision, while the consistency check contributed the most to the increase in recall. Table 3 gives the results of maximum matching using only the words in the training data. While the difference between the F scores of maximum matching and our base algorithm is small for the PK corpus, the F-score difference for the AS corpus is much larger: our base algorithm performed substantially better than maximum matching on the AS corpus. The performances of
our base algorithm on the testing data, using only the words from the training data, are presented in row 1 of Table 1 for the PK corpus and row 1 of Table 2 for the AS corpus.

5 Discussion

In this section we examine in some detail the problem of segmentation inconsistencies within the training data, within the testing data, and between training data and testing data. Due to space limits, we report only our findings in the PK corpus, though the same kinds of inconsistencies also occur in the AS corpus. We understand that it is difficult, or even impossible, to completely eliminate segmentation inconsistencies. However, by taking a close look at the problem we may learn more about the impact of segmentation inconsistencies on a system's performance.

We wrote a program that takes a segmented corpus as input and prints out the shortest text fragments in the corpus that have two or more segmentations. For each such fragment, the program also prints out how the fragment is segmented and how many times it is segmented in each particular way. While some of the fragments genuinely have two different segmentations, depending on the contexts in which they occur or on their meanings, others are simply segmented inconsistently. We ran this program on the PK testing data and found 21 unique shortest text fragments, occurring 87 times in total, that have two different segmentations. Some of these fragments are inconsistently segmented.
One such fragment occurs twice in the testing data and is segmented into two words in one case but treated as one word in the other. We found 1,500 unique shortest text fragments in the PK training data that have two or more segmentations, and 97 unique shortest text fragments that are segmented differently in the training data and in the testing data. For example, one text fragment is treated as one word in the training data but segmented into four words in the testing data. We found 11,136 unique shortest text fragments with two or more segmentations in the AS training data, 21 in the AS testing data, and 38 unique shortest text fragments that are segmented differently in the AS training data and in the AS testing data. Segmentation inconsistencies thus exist not only within the training and testing data, but also between them. For example, one text fragment occurs 35 times in the PK training data and is consistently segmented into two words, but the same fragment, occurring twice in the testing data, is segmented into three words in both cases. Another fragment occurs 67 times in the training data and is treated as one word in all 67 cases, but the same text, occurring 4 times in the testing data, is segmented into two words in all 4 cases. A third fragment occurs 16 times in the training data and is always treated as one word, but in the testing data it is treated as one word in three cases and segmented into two words in one case. Yet another fragment is segmented into two words in 8 cases but treated as one word in one case within the training data itself. A couple of text fragments in the testing data appear to be incorrectly segmented. Our segmented texts of the PK testing data differ from the reference segmentation for 580 text fragments (427 unique). Of these 580 fragments, 126 are among the shortest text fragments that have one segmentation in the training data but another in the testing data. This implies that up to 21.7% of the
mistakes committed by our system may have been affected by the segmentation inconsistencies between the PK training data and the PK testing data. Since only 38 unique shortest text fragments in the AS corpus are segmented differently in the training data and the testing data, the inconsistency problem probably had less impact on our AS results. Of the same 580 text fragments, 359 (62%) are new words in the PK testing data. For example, one proper name, which is a new word, is incorrectly segmented into two words by our system; another new word, treated as one word in the testing data, is segmented into four parts by our system. Some of the longer incorrectly segmented fragments may also involve new words, so at least 62%, but under 80%, of the incorrectly segmented text fragments are either new words or involve new words.

6 Conclusion

We have presented our word segmentation system and the results for the closed track using the AS corpus and the PK corpus. New-word recognition, combining single characters, and checking consistency contributed the most to the increase in precision and recall over the performance of the base segmentation algorithm, which itself works better than maximum matching. For the closed-track experiment using the PK corpus, we found that 62% of the text fragments incorrectly segmented by our system are actually new words, which clearly shows that a better new-word recognition algorithm is necessary to further improve the performance of our system. Our failure analysis also indicates that up to 21.7% of the mistakes made by our system on the PK closed track may have been affected by the segmentation inconsistencies between the training and testing data.

References

Richard Sproat and Tom Emerson. 2003. The First International Chinese Word Segmentation Bakeoff. In Proceedings of the Second SIGHAN Workshop on Chinese Language Processing, July 11-12, 2003, Sapporo, Japan.
SOURCE SEPARATION IN STRUCTURED NONLINEAR MODELS

Anisse Taleb
Australian Telecommunications Research Institute
School of Electrical and Computer Engineering
Curtin University of Technology
GPO Box U1987, Perth WA 6845, Australia
email: anisse@.au

ABSTRACT

This paper discusses several issues related to blind source separation in nonlinear models. Specifically, separability results show that separation in the general case is impossible; however, for specific nonlinear models the problem does have a solution. A specific set of parametric nonlinear mixtures is considered; this set has a Lie group structure. A group operation and a relative gradient are defined on the parameter set. The latter is applied to design stochastic algorithms for which the equivariance property is shown.

1. INTRODUCTION AND PROBLEM STATEMENT

Blind source separation in instantaneous and convolutive linear mixtures has been intensively studied over the last decade. However, in a general real-world situation the observed signals do not fit this model. Nonlinear models, because of their better approximation capabilities, appear to be powerful tools for modeling such practical situations. Unfortunately, little work has been dedicated to the problem of blind source separation in nonlinear mixtures. Several contributions have been made to the problem [3, 9, 10, 13, 11]; however, these lacked theoretical justification, relying more on analogies with linear models than on new theory for the nonlinear blind source separation problem. In [12], Taleb and Jutten propose the separation of a particular class of nonlinear mixtures, known as post-nonlinear mixtures, in which the observations are component-wise nonlinear functions of a linear instantaneous mixture.

The problem of source separation in general nonlinear instantaneous mixtures consists of retrieving unobserved sources s by only observing a nonlinear mixture x, which writes as

x = F(s),   (1)

where F is an unknown nonlinear one-to-one mapping. This is done by constructing a nonlinear
transform (separation structure) G in order to isolate in each component of the output vector y = G(x) the image of one source:

y_i = h_i(s_{σ(i)}),   (2)

where σ is a permutation on {1, …, n} and h_i is a function representing a possible residual distortion. The problem in this form is ill-posed: without additional assumptions on the source signals there is an infinite number of solutions. By assuming statistical independence of the sources, the problem becomes quasi well-posed in the case of linear source separation. In our case, given this assumption, the role of the separation structure G is to transform the observation vector x into a vector y with independent components. Is the statistical independence assumption sufficient to obtain separation in the sense of equation (2)? The next section is concerned with this question.

2. SEPARABILITY AND INDETERMINACIES

Provided the independence assumption holds, separability consists of determining the form of the transformations that leave the components of y independent.

2.1. Trivial transformations

Trivial transformations are those that map any random vector with independent components into a random vector with independent components, thus conserving the independence property. It can be shown that a one-to-one mapping T is trivial if and only if it can be written

T_i(u) = t_i(u_{σ(i)}),   (3)

where the t_i are arbitrary functions and σ is any permutation over {1, …, n}. From this result, it appears clearly that the objective of source separation is to force the global transformation G ∘ F to be trivial.

2.2. Darmois results

In the general case, if the global transformation is not restricted to belong to a certain set, it has been established by Darmois [8] that any random vector can, using a simple constructive approach, be mapped to a random vector with independent components in an infinite number of ways. As a corollary of this result, there exist an infinite number of transformations, without particular structure, that algebraically mix the variables while conserving their statistical independence. Source separation is then impossible by only using statistical
independence.2.3.Specific modelThe constructive method used by Darmois was mainly based on the free-of-constraints specification of the transformation.In fact,if one restricts the transformation to belong to a specific set,this supplementary constraint associated with the independence as-sumption reduces dramatically the indeterminacies.The charac-terization of the indeterminacies for a specific model follows from solving an independence conservation functional equation:(4)The set of distributions for which the solutions of this equations are not trivial contains those distributions for which separation is impossible.By determining this set,one identifies the sources for which separation is possible.For these sources,they are restored up to a trivial transformation of the model.3.STRUCTURED NONLINEAR MODELS3.1.The set of separating transformationsLet us consider the generic nonlinear mixture of independent sources which writes as(1).As mentioned above,the global trans-formation must not model any nonlinear transfor-mation,in fact this will generate strong indeterminacies which provide solutions without interest.We then suppose that has a particular form,formally is constrained to belong to a specific set.Each element of the set of separating transformations is uniquely determined by a vector of parameters belonging to .We then have:(5) More over,this set is assumed to be structured:The couple forms a group.This group structure imposed on induces a group structure on the parameter set with a certain law satisfying(6) The neutral element of is denoted,and corresponds to the neutral element of:.The inverse of,denoted,is defined by the relations.It is interesting to notice that since,at least in theory,is de-signed to invert the mixing transformation,the latter also be-longs to,hence there exists such that.The global transformation writes then as and obviously belongs to.3.2.A relative gradient inThe previous algebraic properties are the basis for the definition of a 
relative gradient on Θ. The relative gradient for linear mixtures was introduced by Cardoso et al. [6], independently identified by Cichocki et al. [7] as being efficient, and finally became known as the natural gradient through Amari [1]. The definition of the relative gradient has been extended to location-scale models by Cardoso [5]. A comparison of the natural and relative gradients is studied in [4].

The idea behind the relative gradient is to objectively compare the variations of a function at different points. The objectivity depends on the structure of the function's parameter set. In our case, we want a measure of variation with respect to relative deviations from the identity transformation. To put this idea into execution, transformations in the neighborhood of the identity can be parameterized as e + εu, where u is any normed vector and ε is a small scalar. The right translation by θ maps any neighborhood of the identity to a neighborhood of θ, which can be written as (e + εu) ⋆ θ, meaning that we have a relative deviation εu from θ. From the previous discussion, one can define the relative gradient with respect to θ of a scalar-valued function f as the vector function ∇̃f(θ) satisfying, to first order,

f((e + εu) ⋆ θ) = f(θ) + ε ⟨∇̃f(θ), u⟩ + o(ε).   (9)

The relative gradient as defined by (9) naturally provides Θ with a Riemannian structure. The metric tensor is positive definite, since the right translation is one-to-one. This metric is the only one which leaves vector lengths invariant under the group translation operation. For instance, as a parameter t varies, a point θ(t) defines a curve in the manifold, and the tangent vector to the curve at a point is defined accordingly.

4. ESTIMATING EQUATIONS

Statistical independence can be measured in different manners.
Mutual information is one independence measure. It is defined by

I(y) = Σ_i H(y_i) − H(y),   (15)

where H denotes the differential entropy. Mutual information is always nonnegative, and it vanishes if and only if the components of the random vector y are statistically independent. The separation structure can then be estimated by minimizing the output mutual information:

θ̂ = argmin_θ I(G_θ(x)).   (16)

It is clear that, in practice, the optimization of (16) cannot be done directly, since one does not have access to the output distributions; these are either replaced by a guess or by 'plugging' a density estimation procedure in at the output. The relative gradient of the mutual information can be computed by using the first-order expansion of I when θ varies as (e + εu) ⋆ θ; the resulting expression (17) involves the vector of marginal score functions of the outputs, ψ_i(u) = −d log p_{y_i}(u)/du. Setting this relative gradient to zero yields the estimating equations which, when solved, provide a batch estimate of θ.

5. GENERIC ADAPTIVE EQUIVARIANT ALGORITHMS

Most stochastic adaptive algorithms use an additive update rule; for instance, the estimation of a parameter writes as [2]

θ_{k+1} = θ_k + μ_k H(θ_k, ξ_k) + μ_k² ρ_k,   (22)

where ξ_k represents the on-line observations of the system and θ_k is the sequence of vectors to be recursively updated. The function H defines how the parameter is updated as a function of the new observations, while ρ_k defines a small perturbation on the algorithm. The most common form of adaptive algorithm corresponds to ρ_k = 0. Convergence of the adaptive algorithm (22) is governed by the associated ordinary differential equation (ODE), which is defined as

dθ/dt = h(θ),   (23)

where, for stationary on-line observations, the mean vector field is given by h(θ) = E[H(θ, ξ)]. Roughly speaking, the behavior of algorithm (22) is approximated by that of the ODE (23); the reader is invited to consult [2] and the technical details therein.

In our context, due to the use of the relative gradient, the proposed adaptive algorithm uses the serial update rule which was used by Cardoso and Laheld [6] for the adaptive linear source separation case. The serial updating rule relies heavily on the fact that we use a transformation model
belonging to a group of transformations, for instance the general linear group in linear source separation. It consists of plugging a transformation close to the identity at the output of the current estimate to obtain the new estimate (Fig. 1). This suggests the following non-additive update rule:

θ_{k+1} = (e − μ_k H(y_k)) ⋆ θ_k,   (24)

where y_k is the output at time k of the system, y_k = G_{θ_k}(x_k).

Fig. 1. Serial update rule.

Clearly, the proposed algorithm (24) does not, in the general case, follow the general form of an adaptive algorithm (22). This occurs because the binary operation ⋆ is, in general, non-distributive with respect to vector addition. Linear source separation is an exception, since matrix multiplication is distributive with respect to addition. For sufficiently small μ_k, the convergence of (24) is affected only by first-order terms in μ_k. In fact, assuming the group operation is twice continuously differentiable, a second-order expansion of (24) can be written with a bounded remainder term. The mean vector field associated with the algorithm is then the expectation of the classical gradient of the cost function. Here we see that the use of the relative gradient is equivalent to a simple stochastic gradient descent in the Riemannian manifold possessing the corresponding metric tensor.

Let θ(t) be a solution of the ODE with an initial point θ(0), and consider the global parameter φ(t) = θ(t) ⋆ θ_F, where θ_F is the parameter of the mixing system. Then φ(t), the global parameter of G_{θ(t)} ∘ F, is the solution of an ODE whose trajectory depends only on φ(0); hence changing the parameter of the mixing system has the same effect on the algorithm's global trajectory as changing the algorithm's initial condition. The key point is that, as pointed out by Cardoso et al. [6] in the linear case, since the objective is the recovery of the initial source signals, the performance of the adaptive algorithm is essentially determined by the closeness of the global system parameter to the neutral element, and does not depend on the individual values of θ or θ_F. Hence, by the use of the relative gradient, the adaptive
algorithm inherits the equivariance property from batch algorithms.

The curious reader may ask why we use the update rule (24) when we could instead use an additive rule (29) corresponding to the standard stochastic discretization of the ODE. In fact, in the design of a stochastic adaptive algorithm, one starts by specifying the algorithm's ODE and then derives a stochastic Euler-like discretization of it. For infinitely small step sizes, the two rules are equivalent; however, the first update rule (24) respects the uniform performance property in discrete time while the latter does not. Indeed, applying the right translation by θ_F to both sides of (24), and denoting the global parameter by φ_k, leads to a recursion (30) which presents the same equivariance property as the global ODE, but in discrete time. The complete study of the algorithm's global performance follows the classical analysis of the Lyapunov equation [2], the solution of which provides the covariance matrix of the estimate. This issue is problem-dependent and will not be addressed in this paper.

To summarize, the use of the relative gradient, whose definition is consistent with the parameter group operation, allows the design of non-additive adaptive equivariant algorithms. This fact, already known for the linear source separation problem, is here extended to more complex and general source separation models.

6. CONCLUSION

The blind source separation problem in its full generality is a very challenging statistical problem. Nonlinear instantaneous mixtures provide one level of this generality. It has been shown that the generalization is not straightforward: the indeterminacies arising with general nonlinear models are very strong. This makes the problem ill-posed and suggests recasting it for specific models. Structured nonlinear models are one such specific class of nonlinear mixtures. Taking into account the Lie group structure they possess, one can define a relative gradient whose use in an adaptive context leads to adaptive equivariant algorithms. This constitutes a generalization of equivariant linear source separation
algorithms.

7. REFERENCES

[1] S.-I. Amari. Natural gradient works efficiently in learning. Neural Computation, 10:251–276, 1998.
[2] A. Benveniste, M. Métivier, and P. Priouret. Adaptive Algorithms and Stochastic Approximations. Springer-Verlag, 1990.
[3] G. Burel. Blind separation of sources: a nonlinear neural algorithm. Neural Networks, 5:937–947, 1992.
[4] J.-F. Cardoso. Learning in manifolds: the case of source separation. In Proc. SSAP'98, 1998.
[5] J.-F. Cardoso and S.-I. Amari. Maximum likelihood source separation: equivariance and adaptivity. In Proc. of SYSID'97, 11th IFAC Symposium on System Identification, Fukuoka, Japan, pages 1063–1068, 1997.
[6] J.-F. Cardoso and B. Laheld. Equivariant adaptive source separation. IEEE Trans. on Signal Processing, 44(12):3017–3030, December 1996.
[7] A. Cichocki and R. Unbehauen. New neural networks with on-line learning for blind identification and blind separation of sources. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, 43:894–906, November 1996.
[8] G. Darmois. Analyse des liaisons de probabilité. In Proceedings Int. Stat. Conferences 1947, volume III A, page 231, Washington (D.C.), 1951.
[9] P. Pajunen, A. Hyvärinen, and J. Karhunen. Nonlinear source separation by self-organizing maps. In ICONIP'96, volume 2, pages 1207–1210, Hong Kong, September 1996.
[10] P. Pajunen and J. Karhunen. A maximum likelihood approach to nonlinear blind source separation. In ICANN'97, pages 541–546, Lausanne (Switzerland), October 1997.
[11] C. G. Puntonet, M. R. Alvarez, A. Prieto, and B. Prieto. Separation of sources in a class of post-nonlinear mixtures. In ESANN'98, pages 321–326, Bruges (Belgium), April 1998.
[12] A. Taleb and C. Jutten. Source separation in post nonlinear mixtures. IEEE Trans. on Signal Processing, 47(10):2807–2820, 1999.
[13] H. H. Yang, S.-I. Amari, and A. Cichocki. Information-theoretic approach to blind separation of sources in nonlinear mixture. Signal Processing, 64(3):291–300, 1998.