Smoothing Supernova Data to Reconstruct the Expansion History of the Universe and its Age
MNN Model Quantization, Pruning, and Distillation

Quantization, pruning, and distillation are three commonly used model compression techniques for reducing the size and improving the efficiency of deep neural networks. This answer explains each of them in the context of MNN model compression.

Quantization reduces the precision of the weights and activations in a neural network. By quantizing the model, we can represent the weights and activations using fewer bits, which leads to a smaller model size and faster inference. For example, instead of using 32-bit floating-point numbers, we can use 8-bit integers to represent the weights and activations. This reduces the memory footprint and allows for more efficient computation on hardware with limited resources.

Pruning, on the other hand, removes unnecessary connections or neurons from a neural network. The idea behind pruning is that not all connections or neurons contribute equally to the network's performance. By removing the less important ones, we can reduce the model size and improve inference speed without sacrificing much accuracy. Pruning can be based on various criteria, such as weight magnitude or activation importance; for example, we can prune connections with small weights, or neurons with low activation values.

Distillation trains a smaller "student" network to mimic the behavior of a larger "teacher" network. The teacher is usually a larger and more accurate model, while the student is smaller and less accurate. The student is trained to match the output probabilities of the teacher, using a combination of the teacher's soft targets and the ground-truth labels. The idea is that the student can learn from the teacher's knowledge and generalize better than if it were trained from scratch. This compresses the knowledge of the larger model into a smaller one without sacrificing much accuracy.

To illustrate the process, consider compressing a large image classification model. First, we quantize the weights and activations, for instance converting 32-bit floating-point weights to 8-bit integers; this reduces the model size and allows faster inference on resource-limited hardware. Next, we apply pruning to remove unnecessary connections or neurons, for example those with small weights or low activation values, which further reduces the model size and improves inference speed. Finally, we use distillation to train a smaller student network that mimics the behavior of the larger teacher network, matching the teacher's output probabilities with a combination of soft targets and ground-truth labels.
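As a rough illustration of the distillation step described above, here is a minimal sketch of a distillation loss in PyTorch. The temperature T and mixing weight alpha are illustrative assumptions, and MNN itself is not involved; this only sketches the general technique:

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
        # Soft-target term: KL divergence between temperature-softened
        # teacher and student distributions, rescaled by T^2.
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=1),
            F.softmax(teacher_logits / T, dim=1),
            reduction="batchmean",
        ) * (T * T)
        # Hard-target term: ordinary cross-entropy against ground-truth labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1.0 - alpha) * hard

During training the teacher typically runs in evaluation mode under torch.no_grad() to produce teacher_logits, and only the student's parameters are updated.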
Screened Poisson Surface Reconstruction

MICHAEL KAZHDAN, Johns Hopkins University
HUGUES HOPPE, Microsoft Research

Poisson surface reconstruction creates watertight surfaces from oriented point sets. In this work we extend the technique to explicitly incorporate the points as interpolation constraints. The extension can be interpreted as a generalization of the underlying mathematical framework to a screened Poisson equation. In contrast to other image and geometry processing techniques, the screening term is defined over a sparse set of points rather than over the full domain. We show that these sparse constraints can nonetheless be integrated efficiently. Because the modified linear system retains the same finite-element discretization, the sparsity structure is unchanged, and the system can still be solved using a multigrid approach. Moreover, we present several algorithmic improvements that together reduce the time complexity of the solver to linear in the number of points, thereby enabling faster, higher-quality surface reconstructions.

Categories and Subject Descriptors: I.3.5 [Computer Graphics]: Computational Geometry and Object Modeling

Additional Key Words and Phrases: screened Poisson equation, adaptive octree, finite elements, surface fitting

ACM Reference Format: Kazhdan, M., and Hoppe, H. Screened Poisson surface reconstruction. ACM Trans. Graph.

1. INTRODUCTION

Poisson surface reconstruction [Kazhdan et al. 2006] is a well known technique for creating watertight surfaces from oriented point samples acquired with 3D range scanners. The technique is resilient to noisy data and misregistration artifacts. However, as noted by several researchers, it suffers from a tendency to over-smooth the data [Alliez et al. 2007; Manson et al. 2008; Calakli and Taubin 2011; Berger et al. 2011; Digne et al. 2011].

In this work, we explore modifying the Poisson reconstruction algorithm to incorporate positional constraints. This modification is inspired by the recent reconstruction technique of Calakli and Taubin [2011]. It also relates to recent work in image and geometry processing [Nehab et al. 2005; Bhat et al. 2008; Chuang and Kazhdan 2011], in which a data fidelity term is used to "screen" the associated Poisson equation. In our surface reconstruction context, this screening term corresponds to a soft constraint that encourages the reconstructed isosurface to pass through the input points.

The approach we propose differs from the traditional screened Poisson formulation in that the position and gradient constraints are defined over different domain types. Whereas gradients are constrained over the full 3D space, positional constraints are introduced only over the input points, which lie near a 2D manifold.
We show how these two types of constraints can be efficiently integrated, so that we can leverage the original multigrid structure to solve the linear system without incurring a significant overhead in space or time.

To demonstrate the benefits of screening, Figure 1 compares results of the traditional Poisson surface reconstruction and the screened Poisson formulation on a subset of 11.4M points from the scan of Michelangelo's David [Levoy et al. 2000]. Both reconstructions are computed over a spatial octree of depth 10, corresponding to an effective voxel resolution of 1024^3. Screening generates a model that better captures the input data (as visualized by the surface cross-sections overlaid with the projection of nearby samples), even though both reconstructions have similar complexity (6.8M and 6.9M triangles respectively) and required similar processing time (230 and 272 seconds respectively, without parallelization).[1]

[1] The performance of the unscreened solver is measured using our implementation with screening weight set to zero. The implementation of the original Poisson reconstruction runs in 412 seconds.

Fig. 1: Reconstruction of the David head, comparing traditional Poisson surface reconstruction (left) and screened Poisson surface reconstruction which incorporates point constraints (center). The rightmost diagram plots pixel depth (z) values along the colored segments together with the positions of nearby samples. The introduction of point constraints significantly improves fit accuracy, sharpening the reconstruction without amplifying noise.

Another contribution of our work is to modify both the octree structure and the multigrid implementation to reduce the time complexity of solving the Poisson system from log-linear to linear in the number of input points. Moreover, we show that hierarchical point clustering enables screened Poisson reconstruction to attain this same linear complexity.

2. RELATED WORK

Reconstructing surfaces from scanned points is an important and extensively studied problem in computer graphics. The numerous approaches can be broadly categorized as follows.

Combinatorial Algorithms. Many schemes form a triangulation using a subset of the input points [Cazals and Giesen 2006]. Space is often discretized using a tetrahedralization or a voxel grid, and the resulting elements are partitioned into inside and outside regions using an analysis of cells [Amenta et al. 2001; Boissonnat and Oudot 2005; Podolak and Rusinkiewicz 2005], eigenvector computation [Kolluri et al. 2004], or graph cut [Labatut et al. 2009; Hornung and Kobbelt 2006].

Implicit Functions. In the presence of sampling noise, a common approach is to fit the points using the zero set of an implicit function, such as a sum of radial bases [Carr et al. 2001] or piecewise polynomial functions [Ohtake et al. 2005; Nagai et al. 2009]. Many techniques estimate a signed-distance function [Hoppe et al. 1992; Bajaj et al. 1995; Curless and Levoy 1996]. If the input points are unoriented, an important step is to correctly infer the sign of the resulting distance field [Mullen et al. 2010].

Our work extends Poisson surface reconstruction [Kazhdan et al. 2006], in which the implicit function corresponds to the model's indicator function χ. The function χ is often defined to have value 1 inside and value 0 outside the model. To simplify the derivations, in this paper we define χ to be 1/2 inside and −1/2 outside, so that its zero isosurface passes near the points. The function χ is solved using
a Laplacian system discretized over a multiresolution B-spline basis, as reviewed in Section 3.

Alliez et al. [2007] form a Laplacian system over a tetrahedralization, and constrain the solution's biharmonic energy; the desired function is obtained as the solution to an eigenvector problem. Manson et al. [2008] represent the indicator function χ using a wavelet basis, and efficiently compute the basis coefficients using simple local sums over an adapted octree. Calakli and Taubin [2011] optimize a signed-distance function to have value zero at the points, have derivatives that agree with the point normals, and minimize a Hessian smoothness norm. The resulting optimization involves a bilaplacian operator, which requires estimating derivatives of higher order than in the Laplacian. The reconstructed surfaces are shown to have good accuracy, strongly suggesting the importance of explicitly fitting the points within the optimization. This motivated us to explore whether a Laplacian system could be extended in this respect, and also be compatible with a multigrid solver.

Screened Poisson Surface Fitting. The method of Nehab et al. [2005], which simultaneously fits position and normal constraints, may also be viewed as the solution of a screened Poisson equation. The fitting algorithm assumes that a 2D parametric domain (i.e., a plane or triangle mesh) is already established. The position and derivative constraints are both defined over this 2D domain. In contrast, in Poisson surface reconstruction the 2D domain manifold is initially unknown, and therefore the goal is to infer an indicator function χ rather than a parametric function. This leads to a hybrid problem with derivative (Laplacian) constraints defined densely over 3D and position constraints defined sparsely on the set of points sampled near the unknown 2D manifold.

3. REVIEW OF POISSON SURFACE RECONSTRUCTION

The approach of Poisson surface reconstruction is based on the observation that the (inward pointing) normal field of the boundary of a solid can be interpreted as the gradient of the solid's indicator function. Thus, given a set of oriented points sampling the boundary, a watertight mesh can be obtained by (1) transforming the oriented point samples into a continuous vector field in 3D, (2) finding a scalar function whose gradients best match the vector field, and (3) extracting the appropriate isosurface. Because our work focuses primarily on the second step, we review it here in more detail.

Scalar Function Fitting. Given a vector field \vec V : R^3 → R^3, the goal is to solve for the scalar function χ : R^3 → R minimizing:

    E(\chi) = \int \left\| \nabla\chi(p) - \vec V(p) \right\|^2 dp.    (1)

Using the Euler-Lagrange formulation, the minimum is obtained by solving the Poisson equation Δχ = ∇·\vec V.

System Discretization. The Galerkin formulation is used to transform this into a finite-dimensional system [Fletcher 1984]. First, a basis {B_1, ..., B_N} : R^3 → R is chosen, namely a collection of trivariate (usually triquadratic) B-spline functions. With respect to this basis, the discretization becomes:

    \langle \Delta\chi, B_i \rangle_{[0,1]^3} = \langle \nabla\cdot\vec V, B_i \rangle_{[0,1]^3},    1 ≤ i ≤ N,

where \langle \cdot,\cdot \rangle_{[0,1]^3} is the standard inner product on the space of (scalar- and vector-valued) functions defined on the unit cube:

    \langle F, G \rangle_{[0,1]^3} = \int_{[0,1]^3} F(p) \cdot G(p) \, dp,
    \langle \vec U, \vec V \rangle_{[0,1]^3} = \int_{[0,1]^3} \langle \vec U(p), \vec V(p) \rangle \, dp.

Since the solution is itself expressed in terms of the basis functions:

    \chi(p) = \sum_{i=1}^{N} x_i B_i(p),
finding the coefficients {x_i} of the solution reduces to solving the linear system Ax = b where:

    A_{ij} = \langle \nabla B_i, \nabla B_j \rangle_{[0,1]^3}  and  b_i = \langle \vec V, \nabla B_i \rangle_{[0,1]^3}.    (2)

The basis functions {B_1, ..., B_N} are chosen to be compactly supported, so most pairs of functions do not have overlapping support, and thus the matrix A is sparse.

Because the solution is expected to be smooth away from the input samples, the linear system is discretized by first adapting an octree to the input samples and then associating an (appropriately scaled and translated) trivariate B-spline function to each octree node. This provides high-resolution detail in the vicinity of the surface while reducing the overall dimensionality of the system.

System Solution. Given the hierarchy defined by an octree of depth D, a multigrid approach is used to solve the linear system. The basis functions are partitioned according to the depths of their associated nodes and, for each depth d, a linear system A^d x^d = b^d is defined using the corresponding B-splines {B^d_1, ..., B^d_{N_d}}, such that \chi(p) = \sum_{d=0}^{D} \sum_i x^d_i B^d_i(p).

Because the octree-selected B-spline functions do not form a complete grid at each depth, it is generally not possible to prolong the solution x^d at depth d into the solution x^{d+1} at depth d+1. (The B-spline associated with a given node is a sum of B-spline functions associated not only with its own child nodes, but also with child nodes of its neighbors.) Instead, the constraints at depth d+1 are adjusted to account for the part of the solution already realized at coarser depths. Pseudocode for a cascadic solver, where the solution is only relaxed on the up-stroke of the V-cycle, is given in Algorithm 1.

Algorithm 1: Cascadic Poisson Solver
    1  For d ∈ {0, ..., D}                  // iterate from coarse to fine
    2    For d' ∈ {0, ..., d−1}             // remove the constraints
    3      b^d = b^d − A^{dd'} x^{d'}       //   met at coarser depths
    4    Relax A^d x^d = b^d                // adjust the system at depth d

Here, A^{dd'} is the N_d × N_{d'} matrix used to transform solution coefficients at depth d' into constraints at depth d:

    A^{dd'}_{ij} = \langle \nabla B^d_i, \nabla B^{d'}_j \rangle_{[0,1]^3}.

Note that, by definition, A^d = A^{dd}.

Isosurface Extraction. Solving the Poisson equation, one obtains a function χ that approximates the indicator function. Ideally, the function's zero level-set should therefore correspond to the desired surface. In practice however, the function χ can differ from the true indicator function due to several sources of error:

- The point sampling may be noisy, possibly containing outliers.
- The Galerkin discretization is only an approximation of the continuous problem.
- The point sampling density is approximated during octree construction.

To mitigate these errors, in [Kazhdan et al. 2006] the implicit function is adjusted by globally subtracting the average value of the function at the input samples.

4. INCORPORATING POINT CONSTRAINTS

The original Poisson surface reconstruction algorithm adjusts the implicit function using a single global offset such that its average value at all points is zero. However, the presence of errors can cause the implicit function to drift so that no global offset is satisfactory.
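Before turning to the point constraints, here is a minimal sketch of the cascadic sweep in Algorithm 1, assuming the per-depth matrices A[d][d'] and right-hand sides b[d] have already been assembled as scipy sparse matrices and numpy arrays. The container layout and the use of a few conjugate-gradient iterations as the relaxation are illustrative assumptions, not the paper's code:

    import numpy as np
    from scipy.sparse.linalg import cg

    def cascadic_solve(A, b, D):
        """Cascadic multigrid sweep (Algorithm 1): relax from coarse to fine.

        A[d][dp] maps coefficients at depth dp to constraints at depth d;
        A[d][d] is the system matrix at depth d. b[d] is the constraint vector.
        """
        x = [None] * (D + 1)
        for d in range(D + 1):
            rhs = b[d].copy()
            # Remove the part of the constraints already met at coarser depths.
            for dp in range(d):
                rhs -= A[d][dp] @ x[dp]
            # Relax the system at depth d (a bounded number of CG iterations).
            x[d], _ = cg(A[d][d], rhs, maxiter=50)
        return x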
Rather than relying on such a global offset, we seek to explicitly interpolate the points. Given the set of input points P with weights w : P → R≥0, we add to the energy of Equation 1 a term that penalizes the function's deviation from zero at the samples:

    E(\chi) = \int \left\| \vec V(p) - \nabla\chi(p) \right\|^2 dp \;+\; \alpha \cdot \frac{\mathrm{Area}(P)}{\sum_{p \in P} w(p)} \sum_{p \in P} w(p)\,\chi^2(p)    (3)

where α is a weight that trades off the importance of fitting the gradients and fitting the values, and Area(P) is the area of the reconstructed surface, estimated by computing the local sampling density as in [Kazhdan et al. 2006]. In our implementation, we set the per-sample weights w(p) = 1, although one can also use confidence values if these are available.

The energy can be expressed concisely as

    E(\chi) = \langle \vec V - \nabla\chi, \vec V - \nabla\chi \rangle_{[0,1]^3} + \alpha \langle \chi, \chi \rangle_{(w,P)}    (4)

where \langle \cdot,\cdot \rangle_{(w,P)} is the bilinear, symmetric, positive, semi-definite form on the space of functions in the unit cube, obtained by taking the weighted sum of function values:

    \langle F, G \rangle_{(w,P)} = \frac{\mathrm{Area}(P)}{\sum_{p \in P} w(p)} \sum_{p \in P} w(p) \cdot F(p) \cdot G(p).

4.1 Interpretation as a Screened Poisson Equation

The energy in Equation 4 combines a gradient constraint integrated over the spatial domain with a value constraint summed at discrete points. As shown in the appendix, its minimization can be interpreted as a screened Poisson equation (Δ − α Ĩ)χ = ∇·\vec V with an appropriately defined operator Ĩ.

4.2 Discretization

We apply a discretization similar to that in Section 3 to the minimization of the energy in Equation 4. The coefficients of the solution χ with respect to the basis {B_1, ..., B_N} are again obtained by solving a linear system of the form Ax = b. The right-hand side b is unchanged because the constrained value at the sample points is zero. Matrix A now includes the point constraints:

    A_{ij} = \langle \nabla B_i, \nabla B_j \rangle_{[0,1]^3} + \alpha \langle B_i, B_j \rangle_{(w,P)}.    (5)

Note that incorporating the point constraints does not change the sparsity of matrix A because B_i(p)·B_j(p) is nonzero only if the supports of the two functions overlap, in which case the Poisson equation has already introduced a nonzero entry in the matrix.

As in Section 3, we solve this linear system using a cascadic multigrid algorithm: iterating over the octree depths from coarsest to finest, adjusting the constraints, and relaxing the system. Similar to Equation 5, the matrix used to transform a solution at depth d' to a constraint at depth d is expressed as:

    A^{dd'}_{ij} = \langle \nabla B^d_i, \nabla B^{d'}_j \rangle_{[0,1]^3} + \alpha \langle B^d_i, B^{d'}_j \rangle_{(w,P)}.

Fig. 2: Visualizations of the reconstructed implicit function along a planar slice through the cow (shown in blue on the left), for the original Poisson solver, and for the screened Poisson solver without and with scale-independent screening.

This operator adjusts the constraint b^d (line 3 of Algorithm 1) not only by removing the Poisson constraints met at coarser resolutions, but also by modifying the constrained values at points where the coarser solution does not evaluate to zero.

4.3 Scale-Independent Screening

To balance the two energy terms in Equation 3, it is desirable to adjust the screening parameter α such that (1) the reconstructed surface shape is invariant under scaling of the input points with respect to the solver domain, and (2) the prolongation of a solution at a coarse depth is an accurate estimate of the solution at a finer depth in the cascadic multigrid approach. We achieve both these goals by adjusting the relative weighting of position and gradient constraints across the different octree depths. Noting that the magnitude of the gradient constraint scales with resolution, we double the weight of the interpolation constraint with each depth:
    A^{dd'}_{ij} = \langle \nabla B^d_i, \nabla B^{d'}_j \rangle_{[0,1]^3} + 2^d \alpha \langle B^d_i, B^{d'}_j \rangle_{(w,P)}.

The adaptive weight of 2^d is chosen to keep the Laplacian and screening constraints around the surface in balance. To see this, assume that the points are locally planar, and consider the row of the system matrix corresponding to an octree node overlapping the points. The coefficients of the system in that row are the sum of Laplacian and screening terms. If we consider the rows corresponding to the child nodes that overlap the surface, we find that the contribution from the Laplacian constraints scales by a factor of 1/2 while the contribution from the screening term scales by a factor of 1/4.[2] Thus, scaling the screening weights by a factor of two with each resolution keeps the two terms in balance.

[2] For the Laplacian term, the Laplacian scales by a factor of 4 with refinement, and volumetric integrals scale by a factor of 1/8. For the screening term, area integrals scale by a factor of 1/4.

Figure 2 shows the benefit of scale-independent screening in reconstructing a cow model. The leftmost image shows a plane passing through the bounding cube of the cow, and the images to the right show the values of the computed indicator function along that plane, for different implementations of the solver. As the figure shows, the unscreened Poisson solver provides a good approximation of the indicator function, with values inside (resp. outside) the surface approximately 1/2 (resp. −1/2). However, applying the same solver to the screened Poisson equation (second from right) provides a solution that is only correct near the input samples and returns to zero near the faces of the bounding cube, potentially resulting in spurious surface sheets away from the surface. It is only with scale-independent screening (right) that we obtain a high-quality solution to the screened Poisson equation.

Using this resolution-adaptive weighting, our system has the property that the reconstruction obtained by solving at depth D is identical to the reconstruction that would be obtained by scaling the point set by 1/2 and solving at depth D+1.

To see this, we consider the two energies that guide the reconstruction: E_{\vec V}(χ), measuring the extent to which the gradients of the solution match the prescribed vector field, and E_{(w,P)}(χ), measuring the extent to which the solution meets the screening constraint:

    E_{\vec V}(\chi) = \int \left\| \vec V(p) - \nabla\chi(p) \right\|^2 dp,
    E_{(w,P)}(\chi) = \frac{\mathrm{Area}(P)}{\sum_{p \in P} w(p)} \sum_{p \in P} w(p)\,\chi^2(p).

Scaling by 1/2, we obtain a new point set (w̃, P̃) with positions scaled by 1/2, unchanged weights, w̃(p) = w(2p), and scaled area, Area(P̃) = Area(P)/4; a new scalar field, χ̃(p) = χ(2p); and a new vector field, Ṽ(p) = 2V(2p). Computing the corresponding energies, we get:

    E_{\tilde V}(\tilde\chi) = \tfrac{1}{2} E_{\vec V}(\chi)  and  E_{(\tilde w,\tilde P)}(\tilde\chi) = \tfrac{1}{4} E_{(w,P)}(\chi).

Thus, scaling the screening weight by a factor of two with each successive depth ensures that the sum of energies is unchanged (up to multiplication by a constant), so the minimizer remains the same.

4.4 Boundary Conditions

In order to define the linear system, it is necessary to define the behavior of the function space along the boundary of the integration domain. In the original Poisson reconstruction the authors imposed Dirichlet boundary conditions, forcing the implicit function to have a value of −1/2 along the boundary. In the present work we extend the implementation to support Neumann boundary conditions as well, forcing the normal derivative to be zero along the boundary.

In principle these two boundary conditions are equivalent for watertight surfaces, since the indicator function has a constant negative value outside the
model. However, in the presence of missing data we find Neumann constraints to be less restrictive because they only require that the implicit function have zero derivative across the boundary of the integration domain, a property that is compatible with the gradient constraint since the guiding vector field V is set to zero away from the samples. (Note that when the surface does cross the boundary of the domain, the Neumann boundary constraints create a bias to crossing the domain boundary orthogonally.)

Figure 3 shows the practical implications of this choice when reconstructing the Angel model, which was only scanned from the front. The left image shows the original point set, and the reconstructions using Dirichlet and Neumann boundary conditions are shown to the right. As the figure shows, imposing Dirichlet constraints creates a watertight surface that closes off before reaching the boundary, while using Neumann constraints allows the surface to extend out to the boundary of the domain.

Fig. 3: Reconstructions of the Angel point set (left) using Dirichlet (center) and Neumann (right) boundary conditions.

Similar results can be seen at the bases of the models in Figures 1 and 4a, with the original Poisson reconstructions obtained using Dirichlet constraints and the screened reconstructions obtained using Neumann constraints.

5. IMPROVED ALGORITHMIC COMPLEXITY

In this section we discuss the efficiency of our reconstruction algorithm. We begin by analyzing the complexity of the algorithm described above. Then, we present two algorithmic improvements. The first describes how hierarchical clustering can be used to reduce the screening overhead at coarser resolutions. The second applies to both the unscreened and screened solver implementations, showing that the asymptotic time complexity in both cases can be reduced to be linear in the number of input points.

5.1 Efficiency of basic solver

Let us begin by analyzing the computational complexity of the unscreened and screened solvers. We assume that the points P are evenly distributed over a surface, so that the depth of the adapted octree is D = O(log |P|) and the number of octree nodes at depth d is O(4^d). We also note that the number of nonzero entries in matrix A^{dd'} is O(4^d), since the matrix has O(4^d) rows and each row has at most 5^3 nonzero entries. (Since we use second-order B-splines, basis functions are supported within their one-ring neighborhoods and the support of two functions will overlap only if one is within the two-ring neighborhood of the other.)

Assuming that the matrices A^{dd'} have already been computed, the computational complexity for the different steps in Algorithm 1 is:

    Step 3: O(4^d), since A^{dd'} has O(4^d) nonzero entries.
    Step 4: O(4^d), since A^d has O(4^d) nonzero entries and the number of relaxation steps performed is constant.
    Steps 2–3: \sum_{d'=0}^{d-1} O(4^d) = O(4^d \cdot d).
    Steps 2–4: O(4^d \cdot d + 4^d) = O(4^d \cdot d).
    Steps 1–4: \sum_{d=0}^{D} O(4^d \cdot d) = O(4^D \cdot D) = O(|P| \cdot \log |P|).
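As a concrete illustration of how the screening term in Equation 5 can be accumulated, here is a minimal sketch assuming flat arrays of sample positions and weights and a routine basis_values(p) returning the indices and values of the few B-splines that are nonzero at p. Both the routine and the triplet-based assembly are illustrative assumptions, not the paper's implementation:

    import numpy as np
    from scipy.sparse import coo_matrix

    def screening_matrix(points, weights, basis_values, N, area, alpha):
        """Accumulate alpha * <B_i, B_j>_(w,P) from Equation 5 as a sparse matrix.

        Each sample contributes only to the rows/columns of the basis functions
        whose supports contain it, so the result stays sparse.
        """
        rows, cols, vals = [], [], []
        norm = area / weights.sum()  # Area(P) / sum of weights
        for p, w in zip(points, weights):
            idx, B = basis_values(p)          # nonzero basis indices and values at p
            contrib = alpha * norm * w * np.outer(B, B)
            for a, i in enumerate(idx):
                for b, j in enumerate(idx):
                    rows.append(i)
                    cols.append(j)
                    vals.append(contrib[a, b])
        # coo_matrix sums duplicate (i, j) entries on conversion.
        return coo_matrix((vals, (rows, cols)), shape=(N, N)).tocsr()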
There still remains the cost of computing the matrices A^{dd'} themselves. For the unscreened solver, the complexity of computing A^{dd'} is O(4^d), since each entry can be computed in constant time. Thus, the overall time complexity remains O(|P| · log |P|). For the screened solver, the complexity of computing A^{dd'} is O(|P|), since defining the coefficients requires accumulating the screening contribution from each of the points, and each point contributes to a constant number of rows. Thus, the overall time complexity is dominated by the cost of evaluating the coefficients of A^{dd'}, which is:

    \sum_{d=0}^{D} \sum_{d'=0}^{d-1} O(|P|) = O(|P| \cdot D^2) = O(|P| \cdot \log^2 |P|).

5.2 Hierarchical Clustering of Point Constraints

Our first modification is based on the observation that since the basis functions at coarser resolutions are smooth, it is unnecessary to constrain them at the precise sample locations. Instead, we cluster the weighted points as in [Rusinkiewicz and Levoy 2000]. Specifically, for each depth d, we define (w^d, P^d), where p_i ∈ P^d is the weighted average position of the points falling into octree node i at depth d, and w^d(p_i) is the sum of the associated weights.[3] If all input points have weight w(p) = 1, then w^d(p_i) is simply the number of points falling into node i.

[3] Note that the weight w^d(p) is unrelated to the screening weight 2^d introduced in Section 4.3 for scale-independent screening.

This alters the computation of the system matrix coefficients:

    A^{dd'}_{ij} = \langle \nabla B^d_i, \nabla B^{d'}_j \rangle_{[0,1]^3} + 2^d \alpha \langle B^d_i, B^{d'}_j \rangle_{(w^d, P^d)}.

Note that since d > d', the value \langle B^d_i, B^{d'}_j \rangle_{(w^d, P^d)} is obtained by summing over points stored with the finer resolution. In particular, the complexity of computing A^{dd'} for the screened solver becomes O(|P^d|) = O(4^d), which is the same as that of the unscreened solver, and both implementations now have an overall time complexity of O(|P| · log |P|). On typical examples, hierarchical clustering reduces execution time by a factor of almost two, and the reconstructed surface is visually indistinguishable.

5.3 Conforming Octrees

To account for the adaptivity of the octree, Algorithm 1 subtracts off the constraints met at all coarser resolutions before relaxing at a given depth (steps 2–3), resulting in an algorithm with log-linear time complexity. We obtain an implementation with linear complexity by forcing the octree to be conforming. Specifically, we define two octree cells to be mutually visible if the supports of their associated B-splines overlap, and we require that if a cell at depth d is in the octree, then all visible cells at depth d−1 must also be in the tree. Making the tree conforming requires the addition of new nodes at coarser depths, but this still results in O(4^d) nodes at depth d.

While the conforming octree does not satisfy the condition that a coarser solution can be prolonged into a finer one, it has the property that the solution obtained at depths {0, ..., d−1} that is visible to a node at depth d can be expressed entirely in terms of the coefficients at depth d−1. Using an accumulation vector to store the visible part of the solution, we obtain the linear-time implementation in Algorithm 2.
Algorithm 2: Conforming Cascadic Poisson Solver
    1  For d ∈ {0, ..., D}                    // iterate from coarse to fine
    2    x̂^{d−1} = P^{d−1}_{d−2} x̂^{d−2}     // upsample coarser accumulation vector
    3    x̂^{d−1} = x̂^{d−1} + x^{d−1}         // add in coarser solution
    4    b^d = b^d − A^d_{d−1} x̂^{d−1}        // remove constraints met at coarser depths
    5    Relax A^d x^d = b^d                   // adjust the system at depth d

Here, P^d_{d−1} is the B-spline prolongation operator, expressing a solution at depth d−1 in terms of coefficients at depth d. The number of nonzero entries in P^d_{d−1} is O(4^d), since each column has at most 4^3 nonzero entries, so steps 2–5 of Algorithm 2 all have complexity O(4^d). Thus, the overall complexity of both the unscreened and screened solvers becomes O(|P|).

5.4 Implementation Details

The algorithm is implemented in C++, using OpenMP for multithreaded parallelization. We use a conjugate-gradient solver to relax the system at each multigrid level. With the exception of the octree construction, most of the operations involved in the Poisson reconstruction can be categorized as operations that either "accumulate" or "distribute" information [Bolitho et al. 2007, 2009]. The former do not introduce write-on-write conflicts and are trivial to parallelize. The latter only involve linear operations, and are parallelized using a standard map-reduce approach: in the map phase we create a duplicate copy of the data for each thread to distribute values into, and in the reduce phase we merge the copies by taking their sum.

6. RESULTS

We evaluate the algorithm (Screened) by comparing its accuracy and computational efficiency with several prior methods: the original Poisson reconstruction of Kazhdan et al. [2006] (Poisson), the Wavelet reconstruction of Manson et al. [2008] (Wavelet), and the Smooth Signed Distance reconstruction of Calakli and Taubin [2011] (SSD).

For the new algorithm, we set the screening weight to α = 4 and use Neumann boundary conditions in all experiments. (Numerical results obtained using Dirichlet boundaries were indistinguishable.) For the prior methods, we set algorithmic parameters to values recommended by the authors, using Haar wavelets in the Wavelet reconstruction and setting the value/normal/Hessian weights to 1/1/0.25 in the SSD reconstruction. For Poisson, SSD, and Screened we set the "samples-per-node" parameter to 1 and the "bounding-box-scale" parameter to 1.1. (For Wavelet the bounding box scale is hard-coded at 1 and there is no parameter to adjust the sampling density.)

6.1 Accuracy

We run three different types of experiments.

Real Scanner Data. To evaluate the accuracy of the different reconstruction algorithms on real-world data, we gathered several scanned datasets: the Awakening (10M points), the Stanford Bunny (0.2M points), the David (11M points), the Lucy (1.0M points), and the Neptune (2.4M points). For each dataset, we randomly partitioned the points into two equal-sized subsets: input points for the reconstruction algorithms, and validation points to measure point-to-reconstruction distances.

Figure 4a shows reconstruction results for the Neptune and David models at depth 10. It also shows surface cross-sections overlaid with the validation points in their vicinity. These images reveal that the Poisson reconstruction (far left), and to a lesser extent the SSD reconstruction (center left), over-smooth the data, while the Wavelet reconstruction (center right) has apparent derivative discontinuities. In contrast, our screened Poisson approach (far right) provides a reconstruction that faithfully fits the samples without introducing noise.

Figure 4b shows quantitative results across all datasets, in the form of RMS errors, measured using the distances from the validation points to the reconstructed surface. (We also computed the maximum error, but found that its sensitivity to individual outlier points made it an unreliable and unindicative statistic.) As the figure indicates, the Screened Poisson reconstruction (blue) is always more accurate than both the original Poisson reconstruction algorithm (red) and the Wavelet reconstruction (purple), and
generates reconstructions whose RMS errors are comparable to or smaller than those of the SSD reconstruction (green).

Clean Uniformly Sampled Data. To evaluate reconstruction accuracy on clean data, we used the approach of Osada et al. [2001] to generate oriented point sets by uniformly sampling the surfaces of the Fandisk, Armadillo Man, Dragon, and Raptor models. For each model, we generated datasets of 100K and 1M points and reconstructed surfaces from each point set using the four different reconstruction algorithms.

As an example, Figure 5a shows the reconstructions of the Fandisk and Raptor models using 1M point samples at depth 10. Despite the lack of noise in the input data, the Wavelet reconstruction has spurious high-frequency detail. Focusing on the sharp edges in the model, we also observe that the screened Poisson reconstruction introduces less smoothing, providing a reconstruction that is truer to the original data than either the original Poisson or the SSD reconstructions.

Figure 5b plots RMS errors across all models, measured bidirectionally between the original surface and the reconstructed surface using the Metro tool [Cignoni and Scopigno 1998]. As in the case of real scanner data, screened Poisson reconstruction always outperforms the original Poisson and Wavelet reconstructions, and is comparable to or better than the SSD reconstruction.

Reconstruction Benchmark. We use the benchmark of Berger et al. [2011] to evaluate the accuracy of the algorithms under different simulations of scanner error, including nonuniform sampling, noise, and misalignment. The dataset consists of multiple virtual scans of implicit surfaces representing the Anchor, Dancing Children, Daratech, Gargoyle, and Quasimodo models. As an example, Figure 6a visualizes the error in the reconstructions of the Anchor model from a virtual scan consisting of 210K points (demarked with a dashed rectangle in Figure 6b) at depth 9. The error is visualized using a red-green-blue scale, with red signifying
Python Voronoi Diagrams: From Data Set to Visual Interpretation

Introduction: The Voronoi diagram is a classic method for spatial data analysis. It partitions a set of points in the plane into multiple regions, such that every location within a region is closer to that region's generating point than to any other.
In this article, we explore how to build Voronoi diagrams with the SciPy library in Python and how to analyze the data through visualization.
Step 1: Setup. First, we need to install Python and the SciPy library.
SciPy can be installed with the pip command; if Python is not yet installed, set up a Python environment first.
After installation, we can verify that it succeeded by importing SciPy.

    import scipy

    if scipy.__version__:
        print("SciPy successfully installed!")
    else:
        print("SciPy installation failed!")

Step 2: Generating a data set. Before using Voronoi diagrams, we need to prepare a data set.
Suppose we have a set of points in the plane; we can create the data set by random generation.

    import numpy as np

    np.random.seed(0)
    n_points = 100
    points = np.random.random((n_points, 2))

In this code, we use the NumPy library to generate a two-dimensional array of shape (100, 2), where every element is a random number between 0 and 1.
Step 3: Building the Voronoi diagram. Before building the diagram, we need to import the Voronoi class from the scipy.spatial module.

    from scipy.spatial import Voronoi, voronoi_plot_2d

    vor = Voronoi(points)

In this code, we pass the data set to the Voronoi class to create a Voronoi object.
Step 4: Visualizing the Voronoi diagram. To better understand the partitioning of the Voronoi diagram, we can use the Matplotlib library for visualization, as sketched below.
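The original visualization code is not included above; a minimal sketch using the voronoi_plot_2d helper already imported in Step 3 might look like this (the title string is an illustrative assumption):

    import matplotlib.pyplot as plt

    fig = voronoi_plot_2d(vor)   # draw ridges, vertices, and input points
    plt.title("Voronoi diagram of 100 random points")
    plt.show()

voronoi_plot_2d returns a Matplotlib figure, so the plot can also be customized or saved with the usual Matplotlib calls before plt.show().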
Software Guide (软件导刊), Vol. 22, No. 4, April 2023

Lightweight Image Super-resolution Reconstruction Using a Multi-scale Upsampling Method

CAI Jing, ZENG Sheng-qiang
(School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China)

Abstract: At present, most image super-resolution networks improve reconstruction ability by deepening the convolutional neural network and widening it, which greatly increases model complexity. To address this, a lightweight image super-resolution algorithm is proposed. A two-branch feature extraction structure lets the network fuse and output feature information at different scales in a single pass, and pixel attention branches add per-pixel weights, strengthening the feature expression of pixel-level detail at the cost of only a few extra parameters. In the upsampling stage, subpixel convolution is combined with neighborhood interpolation to extract feature-depth and spatial-scale information respectively and produce the final image; a subpixel convolution branch combined with an attention mechanism further reinforces important information, giving the output image better visual quality. Experiments show that with only 351K parameters the model achieves reconstruction performance similar to that of the CARN model with 1,592K parameters, and its SSIM on some test sets exceeds CARN's, confirming the effectiveness of the proposed method and providing a new approach to lightweight image super-resolution reconstruction.

Key words: image super-resolution; lightweight; pixel attention; multi-scale upsampling; image processing

DOI: 10.11907/rjdk.221516

0 Introduction

Image super-resolution reconstruction refers to reconstructing a low-resolution image into its corresponding high-resolution image, and it is a very important topic in machine vision and image processing.
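To make the pixel attention idea from the abstract concrete, here is a minimal PyTorch-style sketch of a pixel attention block of the general kind described. The layer sizes and the 1x1-convolution design are illustrative assumptions, not the paper's exact architecture:

    import torch
    import torch.nn as nn

    class PixelAttention(nn.Module):
        """Weight every spatial location of every channel individually."""
        def __init__(self, channels):
            super().__init__()
            self.conv = nn.Conv2d(channels, channels, kernel_size=1)
            self.sigmoid = nn.Sigmoid()

        def forward(self, x):
            attn = self.sigmoid(self.conv(x))  # per-pixel weights in (0, 1)
            return x * attn                    # reweight the feature map

    # Example: apply pixel attention to a 64-channel feature map.
    feat = torch.randn(1, 64, 32, 32)
    print(PixelAttention(64)(feat).shape)      # torch.Size([1, 64, 32, 32])

Because the attention map is produced by a single 1x1 convolution, the added parameter count is small, which matches the lightweight design goal stated in the abstract.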
Data Augmentation Techniques in the PyTorch DataLoader

Data augmentation applies a series of random transformations to the data during training, with the goal of increasing the diversity of training samples and improving the robustness and generalization ability of the model. In PyTorch, the DataLoader and transforms modules make these techniques easy to implement. Several common ones are introduced below; a sketch of how to combine them follows the list.

1. Random horizontal flip: flip the image horizontally at random, via transforms.RandomHorizontalFlip(). This increases sample diversity and works especially well for objects with left-right symmetry in images, such as vehicles and faces.

2. Random vertical flip: flip the image vertically at random, via transforms.RandomVerticalFlip(). Like horizontal flipping, this also increases sample diversity.

3. Random crop: crop a region of a given size from a random position in the image, via transforms.RandomCrop(). This simulates different shooting angles, or the target object appearing at different positions in the image, increasing sample variation.

4. Random rotation: rotate the image by a random angle, via transforms.RandomRotation(). This increases sample diversity and works especially well in applications where the target object is rotation-invariant, such as face recognition and object detection.

5. Random affine transform: apply a random affine transformation to the image, via transforms.RandomAffine(). This simulates geometric transformations such as rotation, translation, scaling, and shearing, increasing sample variation.

6. Random brightness and contrast adjustment: randomly adjust the image's brightness and contrast, via transforms.ColorJitter(). This simulates images taken under different lighting conditions, increasing sample diversity.

7. Random color jitter: randomly perturb the image's colors, including brightness, contrast, saturation, and hue, also via transforms.ColorJitter(). This simulates the characteristics of different cameras or image processing devices, increasing sample diversity.
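Here is a minimal sketch of how these transforms might be composed into a training pipeline. The dataset choice, crop size, and jitter strengths are illustrative assumptions:

    import torch
    from torchvision import datasets, transforms

    train_transform = transforms.Compose([
        transforms.RandomHorizontalFlip(),                      # technique 1
        transforms.RandomRotation(15),                          # technique 4, up to +/-15 degrees
        transforms.RandomCrop(32, padding=4),                   # technique 3
        transforms.ColorJitter(brightness=0.2, contrast=0.2),   # technique 6
        transforms.ToTensor(),
    ])

    # CIFAR-10 is used purely as an example dataset.
    train_set = datasets.CIFAR10("./data", train=True, download=True,
                                 transform=train_transform)
    train_loader = torch.utils.data.DataLoader(train_set, batch_size=64,
                                               shuffle=True, num_workers=2)

Because the transform is applied inside the dataset, every epoch sees a freshly randomized variant of each image, which is what gives augmentation its regularizing effect.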
Standard English Names for Model Hyperparameters

In machine learning and deep learning, hyperparameters are parameters set for the training process; their values must be chosen or tuned by hand before training begins. The standard English names of common hyperparameters are:

1. Learning rate
2. Batch size
3. Number of epochs
4. Hidden layer size
5. Dropout rate
6. Regularization strength
7. Number of layers
8. Activation function
9. Optimization algorithm
10. Weight initialization
11. Learning rate decay
12. Momentum
13. Loss function

These are some of the most common hyperparameters, and these English names are used widely in the machine learning and deep learning literature and in practice. Note that the exact names and formats may vary across algorithms, libraries, and frameworks, but the hyperparameters listed above are fairly universal and apply to most machine learning and deep learning tasks. A sketch of one common way to record them in code follows.
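As an illustration of how such hyperparameters are commonly recorded in code, here is a minimal sketch of a configuration dictionary; all values are arbitrary examples, not recommendations:

    # Hyperparameter configuration for a hypothetical training run.
    config = {
        "learning_rate": 1e-3,
        "batch_size": 64,
        "num_epochs": 30,
        "hidden_size": 256,
        "dropout_rate": 0.5,
        "weight_decay": 1e-4,     # regularization strength
        "num_layers": 4,
        "activation": "relu",
        "optimizer": "adam",
        "momentum": 0.9,          # used by SGD-style optimizers
        "lr_decay": 0.95,         # per-epoch learning rate decay
        "loss": "cross_entropy",
    }

    print(config["learning_rate"])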
Fine-to-Coarse Reconstruction: Overview and Explanation

1. Introduction

1.1 Overview

In computer vision, image reconstruction is an important task whose goal is to generate a high-quality, high-resolution image from a low-resolution input. The fine-to-coarse reconstruction algorithm is a commonly used image reconstruction algorithm: it rebuilds the image progressively across resolution levels, working through the scale hierarchy to obtain a clearer image with richer detail. The algorithm is widely applied in image processing and computer vision, where it can effectively improve image quality and the degree to which detail is restored. This article explains the principle, applications, and advantages of the fine-to-coarse reconstruction algorithm in detail, in the hope of providing guidance for understanding and applying it in depth.

1.2 Structure

This article consists of three parts: introduction, main body, and conclusion. The introduction gives an overview of the fine-to-coarse reconstruction algorithm and describes the structure and purpose of the article. The main body explains the principle of the algorithm in detail, together with its behavior in practical applications, focusing on its uses in fields such as image processing and computer vision and discussing its advantages and limitations. Finally, the conclusion summarizes the whole article, looks ahead to future directions for the development of the algorithm, and closes with some final thoughts. The structure is clear and layered, and should help readers gain a complete understanding of the importance and value of the fine-to-coarse reconstruction algorithm.

1.3 Objective

The goal of the fine-to-coarse reconstruction algorithm is to reconstruct an image or model efficiently by proceeding step by step from detail to the whole. Through stepwise iteration, the algorithm can preserve detail while improving the speed and accuracy of reconstruction. This article aims to examine the principle, applications, and advantages of the fine-to-coarse reconstruction algorithm in depth, in the hope of offering inspiration and help for related research and applications.
Abstract

Compressive sensing and sparse inversion methods have gained a significant amount of attention in recent years due to their capability to accurately reconstruct signals from measurements with significantly less data than previously possible. In this paper, a modified Gaussian frequency domain compressive sensing and sparse inversion method is proposed, which leverages the proven strengths of the traditional method to enhance its accuracy and performance. Simulation results demonstrate that the proposed method can achieve a higher signal-to-noise ratio and a better reconstruction quality than its traditional counterpart, while also reducing the computational complexity of the inversion procedure.

Introduction

Compressive sensing (CS) is an emerging field that has garnered significant interest in recent years because it leverages the sparsity of signals to reduce the number of measurements required to accurately reconstruct the signal. This has many advantages over traditional signal processing methods, including faster data acquisition times, reduced power consumption, and lower data storage requirements. CS has been successfully applied to a wide range of fields, including medical imaging, wireless communications, and surveillance.

One of the most commonly used methods in compressive sensing is the Gaussian frequency domain compressive sensing and sparse inversion (GFD-CS) method. In this method, compressive measurements are acquired by multiplying the original signal with a randomly generated sensing matrix. The measurements are then transformed into the frequency domain using the Fourier transform, and the sparse signal is reconstructed using a sparsity promoting algorithm.

In recent years, researchers have made numerous improvements to the GFD-CS method, with the goal of improving its reconstruction accuracy, reducing its computational complexity, and enhancing its robustness to noise. In this paper, we propose a modified GFD-CS method that combines several techniques to achieve these objectives.

Proposed Method

The proposed method builds upon the well-established GFD-CS method, with several key modifications. The first modification is the use of a hierarchical sparsity-promoting algorithm, which promotes sparsity at both the signal level and the transform level. This is achieved by applying the hierarchical thresholding technique to the coefficients corresponding to the higher frequency components of the transformed signal.

The second modification is the use of a novel error feedback mechanism, which reduces the impact of measurement noise on the reconstructed signal. Specifically, the proposed method utilizes an iterative algorithm that updates the measurement error based on the difference between the reconstructed signal and the measured signal. This feedback mechanism effectively increases the signal-to-noise ratio of the reconstructed signal, improving its accuracy and robustness to noise.

The third modification is the use of a low-rank approximation method, which reduces the computational complexity of the inversion algorithm while maintaining reconstruction accuracy. This is achieved by decomposing the sensing matrix into a product of two lower dimensional matrices, which can be subsequently inverted using a more efficient algorithm.

Simulation Results

To evaluate the effectiveness of the proposed method, we conducted simulations using synthetic data sets. Three different signal types were considered: a sinusoidal signal, a pulse signal, and an image signal.
The results of the simulations were compared to those obtained using the traditional GFD-CS method. The simulation results demonstrate that the proposed method outperforms the traditional GFD-CS method in terms of signal-to-noise ratio and reconstruction quality. Specifically, the proposed method achieves a higher signal-to-noise ratio and lower mean squared error for all three types of signals considered. Furthermore, the proposed method achieves these results with a reduced computational complexity compared to the traditional method.

Conclusion

The results of our simulations demonstrate the effectiveness of the proposed method in enhancing the accuracy and performance of the GFD-CS method. The combination of sparsity promotion, error feedback, and low-rank approximation techniques significantly improves the signal-to-noise ratio and reconstruction quality, while reducing the computational complexity of the inversion procedure. Our proposed method has potential applications in a wide range of fields, including medical imaging, wireless communications, and surveillance.
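For readers unfamiliar with the baseline pipeline described in the introduction, here is a minimal sketch of Gaussian-measurement compressive sensing with frequency-domain sparsity, reconstructed by plain iterative soft-thresholding. The signal sizes, step size, and threshold are illustrative assumptions, and this sketches the traditional approach rather than the paper's modified method:

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 256, 96                        # signal length, number of measurements

    # Test signal that is sparse in the DFT domain: a sum of two sinusoids.
    t = np.arange(n)
    x = np.sin(2 * np.pi * 5 * t / n) + 0.5 * np.sin(2 * np.pi * 12 * t / n)

    Phi = rng.normal(size=(m, n)) / np.sqrt(m)   # random Gaussian sensing matrix
    y = Phi @ x                                  # compressive measurements

    # Iterative soft-thresholding with sparsity enforced in the DFT domain.
    step, lam = 0.1, 1.0
    x_hat = np.zeros(n)
    for _ in range(500):
        x_hat = x_hat + step * Phi.T @ (y - Phi @ x_hat)   # gradient step on ||y - Phi x||^2
        X = np.fft.fft(x_hat)
        mag = np.abs(X)
        shrink = np.maximum(mag - lam, 0.0) / np.maximum(mag, 1e-12)
        x_hat = np.real(np.fft.ifft(X * shrink))           # soft-threshold the spectrum

    print("relative error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))

The paper's modifications would replace the single fixed threshold with hierarchical thresholding, add an error-feedback update of the residual, and factor the sensing matrix into a low-rank product to cheapen the inversion.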
Spotlight SAR Data Focusing Based on a Two-Step Processing Approach

Riccardo Lanari, Senior Member, IEEE, Manlio Tesauro, Eugenio Sansosti, Member, IEEE, and Gianfranco Fornaro

Abstract—We present a new spotlight SAR data-focusing algorithm based on a two-step processing strategy that combines the advantages of two commonly adopted processing approaches: the efficiency of SPECAN algorithms and the precision of stripmap focusing techniques. The first step of the proposed algorithm implements a linear and space-invariant azimuth filtering that is carried out via a deramping-based technique representing a simplified version of the SPECAN approach. This operation allows us to perform a bulk azimuth raw data compression and to achieve a pixel spacing smaller than (or equal to) the expected azimuth resolution of the fully focused image. Thus, the azimuth spectral folding phenomenon, typically affecting the spotlight data, is overcome, and the space-variant characteristics of the stripmap system transfer function are preserved. Accordingly, the residual and precise focusing of the SAR data is achieved by applying a conventional stripmap processing procedure requiring a minor modification and implemented in the frequency domain. The extension of the proposed technique to the case of high bandwidth transmitted chirp signals is also discussed. Experiments carried out on real and simulated data confirm the validity of the presented approach, which is mainly focused on spaceborne systems.

Index Terms—Raw data focusing, spectral analysis (SPECAN) processing algorithms.

Manuscript received March 30, 2000; revised November 29, 2000. This work was partially supported by the Italian Space Agency, Roma, Italy. The spotlight SIR-C data have been processed at the Jet Propulsion Laboratory, Pasadena, CA. R. Lanari, E. Sansosti, and G. Fornaro are with the Istituto di Ricerca per l'Elettromagnetismo e i Componenti Elettronici (IRECE), I-80124 Napoli, Italy. M. Tesauro is with the Dipartimento di Ingegneria dell'Innovazione, Università degli Studi di Lecce, I-73100 Lecce, Italy.

I. INTRODUCTION

Synthetic aperture radar (SAR) spotlight mode allows the generation of microwave images with high geometric resolutions [1], [2]. This result is achieved by steering the radar antenna beam, during the raw data acquisition interval, to always illuminate the same area on the ground (spot). Accordingly, from each target located in the lighted area, a large number of backscattered echoes is received, and their coherent combination allows us to obtain the required azimuth resolution. Similarly, high resolution in the range direction is achieved by transmitting a high bandwidth chirp followed by a further data processing step on each received echo.

The first algorithms proposed for spotlight raw data processing are based on the similarity between spotlight SAR systems and computer tomography: they are usually referred
Publisher Item Identifier S0196-2892(01)07625-2.to as polar format and convolution backprojection techniques [3]–[5].The former are computationally efficient but request a nontrivial interpolation step from a polar to rectangular grid: the image quality can be therefore affected by uncompensated range curvature effects[6]and interpolation errors.The latter allow overcoming these limitations but are generally inefficient if implementations on dedicated architectures are not consid-ered[5].Most recently,the development of spotlight raw data processing algorithms based on stripmap mode focusing techniques operating in the frequency domain has received increasing interest[7]–[10].Indeed,strip-mode processing procedures that are precise,efficient,and requiring less stringent approximations(compared to those involved in the tomographic approaches)are available[11]–[13].However,a relevant limitation to the straightforward application of these techniques to the spotlight data processing is represented by the fact that the raw signal azimuth bandwidth is,in the spotlight case,generally greater(often much greater)than the azimuth sampling frequency,referred to as pulse repetition frequency(prf).As a consequence,data processing carried out in the Fourier domain,as that involved in efficient strip-mode focusing,cannot be directly implemented on the full aperture because of the consequential azimuth spectrum folding effect.A way to overcome this limitation is based on partitioning the received signal into azimuth blocks whose block-bandwidths are smaller than the sampling frequency.Standard strip mode focusing techniques are then applied to each data block and the processed signals are then combined to generate the fully-resolved spotlight image[9],[10].Completely different pro-cessing solutions,based on a nontrivial reconstruction of the unfolded azimuth spectrum from the folded one associated to the raw signal,are also available[7],[8].On the other hand,a relatively simple spotlight processing al-gorithm can be implemented by applying the spectral analysis (SPECAN)technique[14].In this case,the received raw data are azimuth focused via the application of a deramping func-tion(a multiplication by a properly chosen chirp signal)fol-lowed by a final azimuth FT operation.The azimuth deramping factor is updated in range to allow for the compensation of the space-varying characteristic of the received data due to the(az-imuth)chirp rate range variation(focus depth).This procedure is attractive as far as computational efficiency and capability to overcome the azimuth spectral folding effect are concerned. 
However, its main limitation is represented by the lack of a precise range cell migration (RCM) compensation, which is often relevant in spotlight mode SAR systems due to high resolution requirements.

In this paper, we propose an alternative spotlight data focusing technique based on decoupling the overall focusing operation into two main steps. The key point of the proposed approach is to combine the advantages of efficient SPECAN and precise stripmap focusing approaches. In particular, the first processing step carries out a filtering operation aimed to achieve a bulk azimuth raw data compression and an output pixel spacing smaller than (or equal to) the expected final azimuth resolution. Similar to SPECAN processing algorithms, this filtering operation is efficiently carried out via a deramping-based approach [14] but, at variance with the former, the chirp rate of the deramping function is kept constant and properly fixed at a convenient value. This is a key point in the proposed processing procedure that allows preserving the space-variant characteristic of the residual system transfer function (STF). A discussion on the impact of the chirp rate selection on possible artifacts that may appear at the image borders is also provided.

The second processing step carries out the residual focusing of the data via the use of a conventional stripmap processing procedure implemented in the frequency domain and requiring only minor modifications in the available codes. This spectral domain focusing operation is now possible because, following the bulk azimuth compression, the folding effect of the raw signal azimuth spectrum has been totally overcome. More precisely, this second (residual) processing step performs the precise RCM compensation, the data range compression, and the residual azimuth data compression; the latter accounts for higher order terms not compensated in the bulk azimuth processing step. The minor modifications to be performed in available stripmap processing codes are essentially a change of the azimuth filter function, which accounts for the already compensated quadratic azimuth phase term, and a change in the azimuth pixel spacing of the input data.

It is worth noting the role that the bulk azimuth compression operation plays in our approach: a preprocessing step that extends the processing capability of conventional stripmap focusing procedures to spotlight data. In addition, the proposed processing algorithm does not require a specific manipulation and/or interpolation of the data, such as those necessary in azimuth block divisions or in unfolded signal spectrum reconstruction-based algorithms. Accordingly, we have finally achieved a processing procedure that is simple, precise, and computationally efficient, because it does not imply any significant increase of the raw data matrix dimensions and only includes fast Fourier transforms (FFTs) and matrix multiplications. Moreover, it can be easily extended to the case of high bandwidth transmitted signals, wherein spectral folding effects could appear in the range direction as well. In our case, the implemented solution is based again on a deramping approach that is, at variance with conventional focusing techniques, performed following the A/D conversion rather than before.

A number of experiments carried out on a simulated and a real data set, the latter acquired by the experimental C-band sensor of the SIR-C system during the SIR-C/X-SAR mission in 1994 [9], demonstrate the validity of the presented approach.

As a final remark, we
want to stress that the presented analysis is focused on spaceborne systems, typically characterized by small squint angles [15] during the acquisition.

II. SPOTLIGHT SYSTEM MODEL

Fig. 1: Spotlight system geometry.

The spotlight system geometry is shown in Fig. 1. The axis assumed coincident with the platform flight path is referred to as the azimuth direction; r and ϑ are the (closest approach) target range and look angle, respectively.[1] We assume in the following that the sensor, mounted onboard a platform moving at a constant velocity v, transmits, at regularly spaced times, the chirp pulse of (1), where ω₀ is the angular carrier frequency and α is the chirp rate, λ = 2πc/ω₀ being the system wavelength; w²(·) denotes the two-way antenna pattern factor, and L the azimuth dimension of the real, onboard antenna. Note that the assumed simplification on the antenna pattern allows avoiding the antenna footprint dependence on the platform location, whose impact is inessential for the following analysis. A more detailed discussion on this matter can be found in [10].

[1] Note that we have assumed the platform trajectory to be a straight line, which is appropriate for airborne but not for spaceborne sensors. However, it can be shown that spaceborne data can be processed in the same manner as airborne data if the closest approach distance and the azimuth velocity are properly considered [16] or, more precisely, via the appropriate sensor-target distance evaluation [17].

Let us now consider a point target within the spot. The signal received onboard is represented, after the heterodyne process that removes the fast-varying carrier term, by a delayed replica of the transmitted chirp whose azimuth phase history is dictated by the sensor-target distance (4a). Applying the Fourier transform (FT) along the azimuth direction, with ξ denoting the azimuth (spatial) frequency, leads to the azimuth spectrum expression of (7). Equation (7) shows that the azimuth spectrum is centered on a frequency that varies during the acquisition and that the signal bandwidth is increased with respect to the strip mode case, for which it would be dictated by the azimuth antenna dimension alone. In this case, we get the overall azimuth bandwidth expression of (8).

Since the bandwidth of (8) depends on the target range, its maximum value, i.e., that relative to the nearest range, should be considered. This is assumed hereafter, although we underline that in the spotlight case, due to the typically limited range extension of the illuminated spot, the range dependence of the bandwidth is weak. To avoid any azimuth spectral folding effect, the azimuth sampling frequency should satisfy the condition of (9), i.e., it should exceed the overall azimuth bandwidth [18]. On the other hand, this sampling frequency increase would lead to large data rates and could generate severe range ambiguity problems [15]. Accordingly, denoting the raw data and the focused image azimuth pixel dimensions, respectively, with the latter chosen in agreement with the Nyquist limit available from (8), we get from (9) the relation (10) between the two pixel dimensions.
We also underline that the raw signal range component, accounting for the range-independent (RI) and range-dependent (RD) RCM effects, is neglected in the azimuth convolution operation presented in this section. All these components are restored and accounted for during the subsequent, highly precise second processing step.

The azimuth convolution between the signal in (5), for the case of an isolated target, and the function in (11) gives the result in (12), which depends on the azimuth position of the target and on the value of r̃. The second line of (12) shows that this azimuth convolution is essentially a deramping-based (SPECAN) processing, involving a chirp multiplication of the azimuth signal, a subsequent FT, and a residual phase cancellation. Indeed, but for the above-mentioned approximations, this processing step allows us to achieve an azimuth compression which is full only for those targets located at the reference range r̃. This point can be clarified by reconsidering (12): if we assume the target range equal to r̃, we obtain the result in (13), wherein the imaged target is fully azimuth focused. For any other range, by assuming the validity of the SPM method,³ the resulting signal is centered around the target position and extends over a residual, only partially compressed support. However, because the range extension of the spot area is typically very small, this residual extension is limited and, even in this limiting case, a compression effect, although partial with respect to that achieved in (13), is obtained.

³Generalization to those cases where the SPM cannot be applied can be derived as in [19]. Here we are interested only in having a rough measure of the target echo extension following the first processing step.

The obtained results apply to the case of an isolated target; however, they can be easily extended to the case of an illuminated area. Accounting for the azimuth spot dimension, we obtain the overall azimuth extension of the bulk-compressed signal given in (16).

In the discrete domain, the bulk compression is carried out with a pixel spacing satisfying the Nyquist limit of the spotlight signal, see (8) and (9). Accordingly, similarly to what is shown in the continuous analysis presented above, (12) becomes (17), with the nearest-integer operator involved. Note that, due to (10), the condition in (18) holds, where the involved factor represents the output azimuth data replication. Under the validity of the inequality in (18), not only are the azimuth spectral folding effects avoided, see (16), but also no data wrap-around occurs in the azimuth direction. We note that the inequality in (18) is generally satisfied due to the presence of a slight azimuth data oversampling carried out on the spotlight signal with respect to the Nyquist rate that we would have with the system operating in the stripmap mode. This point can be clarified by accounting for (15) in the inequality in (18); this leads to a new inequality, see (19)-(21), which is satisfied for most real spotlight SAR systems. Of course, in the (rare) case of an insufficient oversampling factor, a balancing choice would be represented by setting r̃ at the midrange swath, thus leading to a resolution degradation at the image near- and far-range edges. Anyway, we remark that this is generally not a very critical issue because, due to the antenna beam steering, those targets would in any case be characterized by a lower resolution [10].

Based on (18), we can finally rewrite (17) as in (22).⁴

⁴Note also that, at variance with what is shown in (22), a more conventional expression of the DFT operation can be considered, implying n = −P/2, ..., P/2 − 1 [18]. In this case, a trivial manipulation of (22) is required.

Two further remarks are now in order. First, a compensation implemented via an appropriate substitution in (11) [and equivalently in (17)] is required. Secondly, due to the appearance of a significant range walk effect [15] in the RCM, an additional edge degradation could appear even at midrange.
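To make the deramping-based bulk compression concrete, the following is a minimal NumPy sketch of the operation in (12): multiplication of the azimuth signal by a constant-rate conjugate chirp followed by an azimuth FFT. The function name, the array layout, and the parameter names are our assumptions, not the authors' implementation; the residual phase cancellation and all range processing are omitted.

```python
import numpy as np

def bulk_azimuth_compression(raw, ka, prf):
    # raw: 2-D complex array, azimuth samples along axis 0, range bins along axis 1
    # ka:  constant deramping chirp rate, kept fixed for the whole scene [Hz/s]
    # prf: azimuth sampling (pulse repetition) frequency [Hz]
    n_az = raw.shape[0]
    t = (np.arange(n_az) - n_az // 2) / prf          # centered slow-time axis
    deramp = np.exp(-1j * np.pi * ka * t ** 2)       # conjugate quadratic-phase chirp
    # chirp multiplication followed by an azimuth FFT (SPECAN-style compression)
    return np.fft.fftshift(np.fft.fft(raw * deramp[:, None], axis=0), axes=0)
```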
Although the analysis is focused on low-squint-angle acquisitions, we stress that well-known procedures applied for mitigating the range walk effect in deramping-based focusing approaches could be considered [20]. However, this is left for future studies.

IV. RESIDUAL DATA FOCUSING VIA STRIPMAP PROCESSING TECHNIQUES

Let us concentrate on the result of the bulk azimuth compression step shown in (12). We underline that, following this operation, the folding effect influencing the azimuth spectrum is avoided and the space-variant characteristics of the system transfer function are maintained. Accordingly, it is possible to carry out the residual focusing of the data via the use of efficient and precise techniques originally designed for stripmap SAR data focusing and implemented in the frequency domain.

To clarify this point, we refer to the expression of the received signal over a distributed scene obtained by resorting to the linearity of the system, see (24), wherein the reflectivity function of the illuminated scene includes the fast-varying phase term. The received data spectrum can be written as in (26), involving the range (spatial) frequency, and the system transfer function can be found via the application of the SPM, leading to the expression in (27) [15]. The spectrum of the bulk-compressed data is then given by (29), wherein a phase factor accounts for the bulk compression step. By finally substituting (27) in (29), we get (30), with the azimuth filter function now accounting for the already compensated quadratic azimuth phase term [15]. Moreover, the folding effects influencing the azimuth raw signal spectrum (see Section II) have been avoided thanks to the already performed bulk azimuth compression step, leading to the new pixel spacing shown in (18). We also note that the space-variant characteristics of the system transfer function are preserved by the bulk compression. The nonlinear mapping of the range frequencies can be handled by simply accounting for the modified system transfer function component and by considering the new azimuth sampling frequency.

In particular, we have considered the stripmap processing approach described in [12]; the overall processing block diagram is shown in Fig. 2. In this case, the first step carries out the bulk azimuth compression, while the residual focusing is implemented as follows: the filtering operation, carried out in the two-dimensional (2-D) frequency domain via the filter function, allows us to fully focus the midspot area.
Fig. 2. Two-step focusing procedure block diagram. Note that i = −P/2, ..., P/2 − 1 and l = −M/2, ..., M/2 − 1; the corresponding frequency sample spacings are 1/(PΔx′) and 1/(MΔr).

Fig. 4. C-band VV-polarized image of the Sydney zone obtained by applying the focusing approach of Fig. 2 to the raw data set acquired in 1994 by the SIR-C system operating in an experimental spotlight mode. The expected azimuth resolution is about 1 m, but the image is represented with an azimuth pixel spacing of about 6.5 m to avoid the geometric distortions caused by the different dimensions of the pixel in the range and azimuth directions. The extension of the area is about 1.7 km × 4.5 km.

The range compression is also carried out via a deramping-based approach, involving the received signal (zero padded to increase its extension) and the deramping function in (31); the overall range processing is summarized in (33).

Fig. 5. Simulated image obtained after the bulk azimuth compression (range compression has also been implemented). The range corresponding to r̃ is highlighted.

Clearly, although not explicitly mentioned, the range pixel spacing resulting from the range focusing operation of (33) must also be considered for the implementation of the residual focusing step. As final remarks, we underline that all the operations involved in (33) are assumed, in our case, to be carried out after the A/D conversion in the receiver. Moreover, the computational efficiency of the procedure in Fig. 3 can be further improved by combining the range compression operation and the compensation of the scaling factor.

TABLE II. RESULTS OF THE ANALYSIS CARRIED OUT ON THE IMAGED POINT TARGETS OF FIG. 6

The real data set was acquired in 1994 by the C-band sensor of the SIR-C system operated in the experimental spotlight mode (see Table I for a description of the system parameters). The near, mid, and far range distances of the illuminated spot, together with the selected value of r̃, allow us to evaluate via (20) the minimum and maximum range distances that ensure the absence of degradation at the edges of the image; they are given in (36), thus guaranteeing the possibility of focusing the overall scene.

The image obtained by applying the procedure of Fig. 2 is presented in Fig. 4. It clearly shows the focusing capability of the proposed algorithm. However, the absence of known reference targets in the scene does not allow any significant quantitative measurement of the quality of the obtained image. Accordingly, in order to assess the performance of the proposed approach, we have generated a simulated data set representing the signal backscattered by a sequence of three point targets aligned in the range direction and located over an absorbing background. The system parameters are again those of Table I. To better clarify the effect of the bulk azimuth compression step, we show the result obtained by applying this operation (see Fig. 5). As expected, the achieved azimuth compression⁵ effect is more relevant for the target located at a range closer to r̃. We additionally remark that the azimuth extension of the bulk-compressed data is 2048 samples; it has been increased⁶ with respect to the raw data by about 20% (the azimuth raw data length was 1700 samples), but no additional data dimension increase is required in the residual focusing step.
⁵In order to improve the readability of the result, a range compression step has also been carried out.
⁶This allowed the use of highly efficient FFT codes with power-of-two data lengths [18].

Fig. 7. High-resolution simulated image obtained by applying the focusing procedure of Fig. 3. The contour plots of the three imaged point targets are also shown.

Note also in Fig. 5 the effect of the uncompensated range cell migration. The fully focused image is finally shown in Fig. 6. The results of the measurements carried out on the imaged point targets of Fig. 6 are summarized in Table II, wherein the theoretical azimuth resolution values are those pertinent to the selected point reflector. The inspection of Table II clarifies the high performance of the presented technique as far as the amplitude characteristics of the target responses are concerned. The phase accuracy has also been assessed; it is about 1.

TABLE III. RESULTS OF THE ANALYSIS CARRIED OUT ON THE IMAGED POINT TARGETS OF FIG. 7

Riccardo Lanari has been a visiting scientist at different foreign research institutes, such as the Institute of Space and Astronautical Science (ISAS), Tokyo, Japan, the German Aerospace Research Establishment (DLR), Oberpfaffenhofen, Germany, and the Jet Propulsion Laboratory (JPL), Pasadena, CA, where he received a NASA recognition for the innovative development of a ScanSAR processor for the SRTM mission. His main research activities are in the SAR data processing field as well as in IFSAR techniques. On this topic, he has authored 30 international journal papers and, more recently, a book, Synthetic Aperture Radar Processing (Boca Raton, FL: CRC). He also holds two patents on SAR raw data processing techniques. Dr. Lanari has been Chairman at several international conferences and was invited to join the technical program committee of the IGARSS Conference in 2000 and 2001.

Manlio Tesauro received the Laurea degree (summa cum laude) in electronic engineering and the Ph.D. degree in electronic engineering and computer science, both from the University of Napoli "Federico II," Napoli, Italy, in 1992 and 1998, respectively. In 1998 and 1999, he was with the Istituto di Ricerca per l'Elettromagnetismo ed i Componenti Elettronici (IRECE), Napoli, National Research Council (CNR), with a grant from Telespazio. Since 2000, he has been a Research Scientist with the Dipartimento di Ingegneria dell'Innovazione, University of Lecce, Lecce, Italy. In February 2000, he was a member of the Italian Team in the ASI Ground Data Processing Chain during the Shuttle Radar Topography Mission (SRTM) at the Jet Propulsion Laboratory, Pasadena, CA. His main interests are in the field of statistical signal processing with emphasis on SAR and IFSAR processing.

Eugenio Sansosti (M'96) received the Laurea degree (summa cum laude) in electronic engineering from the University of Napoli "Federico II," Napoli, Italy, in 1995. Since 1997, he has been with the Istituto di Ricerca per l'Elettromagnetismo e i Componenti Elettronici (IRECE), National Research Council (CNR), where he currently holds a Full Researcher position. He is also an Adjunct Professor of Electrical Communications at the University of Cassino, Cassino, Italy. He was a Guest Scientist with the Jet Propulsion Laboratory, Pasadena, CA, from August 1997 to February 1998, and again in February 2000 in support of the NASA Shuttle Radar Topography Mission. In November and December 2000, he worked as an Image Processing Adviser at the Instituto Tecnológico de Aeronáutica (ITA), São José dos Campos, SP, Brazil.
His main research interests are in airborne and spaceborne synthetic aperture radar (SAR) data processing, SAR interferometry, and differential SAR interferometry.

Gianfranco Fornaro received the Laurea degree in electronic engineering from the University of Napoli "Federico II," Napoli, Italy, in 1992, and the Ph.D. degree from the University of Rome "La Sapienza," Rome, Italy, in 1997. He is currently a Full Researcher at the Istituto di Ricerca per l'Elettromagnetismo e i Componenti Elettronici (IRECE), Italian National Research Council (CNR), and an Adjunct Professor of Communications at the University of Cassino, Cassino, Italy. He has been a Visiting Scientist with the German Aerospace Establishment (DLR), Oberpfaffenhofen, Germany, and the Politecnico di Milano, Milano, Italy, and has been a Lecturer with the Instituto Tecnológico de Aeronáutica (ITA), São José dos Campos, SP, Brazil. His main research interests are in the signal processing field with applications to synthetic aperture radar (SAR) data processing, SAR interferometry, and differential SAR interferometry. Dr. Fornaro was awarded the Mountbatten Premium Award by the Institution of Electrical Engineers (IEE) in 1997.
VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection

Yin Zhou
Apple Inc
****************

Oncel Tuzel
Apple Inc
****************

Abstract

Accurate detection of objects in 3D point clouds is a central problem in many applications, such as autonomous navigation, housekeeping robots, and augmented/virtual reality. To interface a highly sparse LiDAR point cloud with a region proposal network (RPN), most existing efforts have focused on hand-crafted feature representations, for example, a bird's eye view projection. In this work, we remove the need for manual feature engineering for 3D point clouds and propose VoxelNet, a generic 3D detection network that unifies feature extraction and bounding box prediction into a single-stage, end-to-end trainable deep network. Specifically, VoxelNet divides a point cloud into equally spaced 3D voxels and transforms a group of points within each voxel into a unified feature representation through the newly introduced voxel feature encoding (VFE) layer. In this way, the point cloud is encoded as a descriptive volumetric representation, which is then connected to an RPN to generate detections. Experiments on the KITTI car detection benchmark show that VoxelNet outperforms the state-of-the-art LiDAR-based 3D detection methods by a large margin. Furthermore, our network learns an effective discriminative representation of objects with various geometries, leading to encouraging results in 3D detection of pedestrians and cyclists, based on only LiDAR.

1. Introduction

Point cloud based 3D object detection is an important component of a variety of real-world applications, such as autonomous navigation [11, 14], housekeeping robots [26], and augmented/virtual reality [27]. Compared to image-based detection, LiDAR provides reliable depth information that can be used to accurately localize objects and characterize their shapes [21, 5]. However, unlike images, LiDAR point clouds are sparse and have highly variable point density, due to factors such as non-uniform sampling of the 3D space, effective range of the sensors, occlusion, and the relative pose. To handle these challenges, many approaches manually crafted feature representations for point clouds that are tuned for 3D object detection.

Figure 1. VoxelNet directly operates on the raw point cloud (no need for feature engineering) and produces the 3D detection results using a single end-to-end trainable network.

Several methods project point clouds into a perspective view and apply image-based feature extraction techniques [28, 15, 22]. Other approaches rasterize point clouds into a 3D voxel grid and encode each voxel with hand-crafted features [41, 9, 37, 38, 21, 5]. However, these manual design choices introduce an information bottleneck that prevents these approaches from effectively exploiting 3D shape information and the required invariances for the detection task. A major breakthrough in recognition [20] and detection [13] tasks on images was due to moving from hand-crafted features to machine-learned features.

Recently, Qi et al. [29] proposed PointNet, an end-to-end deep neural network that learns point-wise features directly from point clouds. This approach demonstrated impressive results on 3D object recognition, 3D object part segmentation, and point-wise semantic segmentation tasks. In [30], an improved version of PointNet was introduced, which enabled the network to learn local structures at different scales. To achieve satisfactory results, these two approaches trained feature transformer networks on all input points (∼1k points).
Since typical point clouds obtained using LiDARs contain ∼100k points, training the architectures as in [29, 30] results in high computational and memory requirements. Scaling up 3D feature learning networks to orders of magnitude more points and to 3D detection tasks are the main challenges that we address in this paper.

Figure 2. VoxelNet architecture. The feature learning network takes a raw point cloud as input, partitions the space into voxels, and transforms points within each voxel to a vector representation characterizing the shape information. The space is represented as a sparse 4D tensor. The convolutional middle layers process the 4D tensor to aggregate spatial context. Finally, an RPN generates the 3D detection.

Region proposal network (RPN) [32] is a highly optimized algorithm for efficient object detection [17, 5, 31, 24]. However, this approach requires data to be dense and organized in a tensor structure (e.g. image, video), which is not the case for typical LiDAR point clouds. In this paper, we close the gap between point set feature learning and RPN for the 3D detection task.

We present VoxelNet, a generic 3D detection framework that simultaneously learns a discriminative feature representation from point clouds and predicts accurate 3D bounding boxes, in an end-to-end fashion, as shown in Figure 2. We design a novel voxel feature encoding (VFE) layer, which enables inter-point interaction within a voxel by combining point-wise features with a locally aggregated feature. Stacking multiple VFE layers allows learning complex features for characterizing local 3D shape information. Specifically, VoxelNet divides the point cloud into equally spaced 3D voxels, encodes each voxel via stacked VFE layers, and then 3D convolution further aggregates local voxel features, transforming the point cloud into a high-dimensional volumetric representation. Finally, an RPN consumes the volumetric representation and yields the detection result. This efficient algorithm benefits both from the sparse point structure and from efficient parallel processing on the voxel grid.

We evaluate VoxelNet on the bird's eye view detection and the full 3D detection tasks provided by the KITTI benchmark [11]. Experimental results show that VoxelNet outperforms the state-of-the-art LiDAR-based 3D detection methods by a large margin. We also demonstrate that VoxelNet achieves highly encouraging results in detecting pedestrians and cyclists from LiDAR point clouds.

1.1. Related Work

Rapid development of 3D sensor technology has motivated researchers to develop efficient representations to detect and localize objects in point clouds. Some of the earlier methods for feature representation are [39, 8, 7, 19, 40, 33, 6, 25, 1, 34, 2]. These hand-crafted features yield satisfactory results when rich and detailed 3D shape information is available. However, their inability to adapt to more complex shapes and scenes, and to learn the required invariances from data, resulted in limited success for uncontrolled scenarios such as autonomous navigation.

Given that images provide detailed texture information, many algorithms inferred the 3D bounding boxes from 2D images [4, 3, 42, 43, 44, 36]. However, the accuracy of image-based 3D detection approaches is bounded by the accuracy of the depth estimation.

Several LiDAR-based 3D object detection techniques utilize a voxel grid representation. [41, 9] encode each nonempty voxel with 6 statistical quantities derived from all the points contained within the voxel. [37] fuses multiple local statistics to represent each voxel. [38] computes the truncated signed distance on the voxel grid.
[21] uses binary encoding for the 3D voxel grid. [5] introduces a multi-view representation for a LiDAR point cloud by computing a multi-channel feature map in the bird's eye view and the cylindrical coordinates in the frontal view. Several other studies project point clouds onto a perspective view and then use image-based feature encoding schemes [28, 15, 22].

There are also several multi-modal fusion methods that combine images and LiDAR to improve detection accuracy [10, 16, 5]. These methods provide improved performance compared to LiDAR-only 3D detection, particularly for small objects (pedestrians, cyclists) or when the objects are far away, since cameras provide an order of magnitude more measurements than LiDAR. However, the need for an additional camera that is time-synchronized and calibrated with the LiDAR restricts their use and makes the solution more sensitive to sensor failure modes. In this work we focus on LiDAR-only detection.

1.2. Contributions

• We propose a novel end-to-end trainable deep architecture for point-cloud-based 3D detection, VoxelNet, that directly operates on sparse 3D points and avoids information bottlenecks introduced by manual feature engineering.
• We present an efficient method to implement VoxelNet which benefits both from the sparse point structure and efficient parallel processing on the voxel grid.
• We conduct experiments on the KITTI benchmark and show that VoxelNet produces state-of-the-art results in LiDAR-based car, pedestrian, and cyclist detection benchmarks.

2. VoxelNet

In this section we explain the architecture of VoxelNet, the loss function used for training, and an efficient algorithm to implement the network.

2.1. VoxelNet Architecture

The proposed VoxelNet consists of three functional blocks: (1) feature learning network, (2) convolutional middle layers, and (3) region proposal network [32], as illustrated in Figure 2. We provide a detailed introduction of VoxelNet in the following sections.

2.1.1 Feature Learning Network

Voxel Partition. Given a point cloud, we subdivide the 3D space into equally spaced voxels as shown in Figure 2. Suppose the point cloud encompasses 3D space with range D, H, W along the Z, Y, X axes respectively. We define each voxel of size v_D, v_H, and v_W accordingly. The resulting 3D voxel grid is of size D′ = D/v_D, H′ = H/v_H, W′ = W/v_W. Here, for simplicity, we assume D, H, W are multiples of v_D, v_H, v_W.

Figure 3. Voxel feature encoding layer.

Grouping. We group the points according to the voxel they reside in. Due to factors such as distance, occlusion, the object's relative pose, and non-uniform sampling, the LiDAR point cloud is sparse and has highly variable point density throughout the space.
reduces the sampling bias,and adds more variation to training.Stacked Voxel Feature Encoding The key innovation is the chain of VFE layers.For simplicity,Figure2illustrates the hierarchical feature encoding process for one voxel. Without loss of generality,we use VFE Layer-1to describe the details in the following paragraph.Figure3shows the architecture for VFE Layer-1.Denote V={p i=[x i,y i,z i,r i]T∈R4}i=1...t as a non-empty voxel containing t≤T LiDAR points,where p i contains XYZ coordinates for the i-th point and r i is the received reflectance.Wefirst compute the local mean as the centroid of all the points in V,denoted as(v x,v y,v z). Then we augment each point p i with the relative offset w.r.t. the centroid and obtain the input feature set V in={ˆp i= [x i,y i,z i,r i,x i−v x,y i−v y,z i−v z]T∈R7}i=1...t.Next, eachˆp i is transformed through the fully connected network (FCN)into a feature space,where we can aggregate in-formation from the point features f i∈R m to encode the shape of the surface contained within the voxel.The FCN is composed of a linear layer,a batch normalization(BN) layer,and a rectified linear unit(ReLU)layer.After obtain-ing point-wise feature representations,we use element-wise MaxPooling across all f i associated to V to get the locally aggregated feature˜f∈R m for V.Finally,we augmenteach f i with˜f to form the point-wise concatenated featureas f outi =[f T i,˜f T]T∈R2m.Thus we obtain the outputfeature set V out={f outi }i...t.All non-empty voxels areencoded in the same way and they share the same set of parameters in FCN.We use VFE-i(c in,c out)to represent the i-th VFE layer that transforms input features of dimension c in into output features of dimension c out.The linear layer learns a ma-trix of size c in×(c out/2),and the point-wise concatenation yields the output of dimension c out.Because the output feature combines both point-wise features and locally aggregated feature,stacking VFE lay-ers encodes point interactions within a voxel and enables thefinal feature representation to learn descriptive shape information.The voxel-wise feature is obtained by trans-forming the output of VFE-n into R C via FCN and apply-ing element-wise Maxpool where C is the dimension of the voxel-wise feature,as shown in Figure2.Sparse Tensor Representation By processing only the non-empty voxels,we obtain a list of voxel features,each uniquely associated to the spatial coordinates of a particu-lar non-empty voxel.The obtained list of voxel-wise fea-tures can be represented as a sparse4D tensor,of size C×D ×H ×W as shown in Figure2.Although the point cloud contains∼100k points,more than90%of vox-els typically are empty.Representing non-empty voxel fea-tures as a sparse tensor greatly reduces the memory usage and computation cost during backpropagation,and it is a critical step in our efficient implementation.2.1.2Convolutional Middle LayersWe use Conv M D(c in,c out,k,s,p)to represent an M-dimensional convolution operator where c in and c out are the number of input and output channels,k,s,and p are the M-dimensional vectors corresponding to kernel size,stride size and padding size respectively.When the size across the M-dimensions are the same,we use a scalar to represent the size e.g.k for k=(k,k,k).Each convolutional middle layer applies3D convolution,BN layer,and ReLU layer sequentially.The convolutional middle layers aggregate voxel-wise features within a pro-gressively expanding receptivefield,adding more context to the shape description.The detailed sizes of 
Sparse Tensor Representation. By processing only the non-empty voxels, we obtain a list of voxel features, each uniquely associated with the spatial coordinates of a particular non-empty voxel. The obtained list of voxel-wise features can be represented as a sparse 4D tensor of size C × D′ × H′ × W′, as shown in Figure 2. Although the point cloud contains ∼100k points, more than 90% of the voxels typically are empty. Representing non-empty voxel features as a sparse tensor greatly reduces the memory usage and computation cost during backpropagation, and it is a critical step in our efficient implementation.

2.1.2 Convolutional Middle Layers

We use ConvMD(c_in, c_out, k, s, p) to represent an M-dimensional convolution operator, where c_in and c_out are the numbers of input and output channels, and k, s, and p are the M-dimensional vectors corresponding to the kernel size, stride size, and padding size respectively. When the sizes across the M dimensions are the same, we use a scalar to represent the size, e.g. k for k = (k, k, k). Each convolutional middle layer applies 3D convolution, a BN layer, and a ReLU layer sequentially. The convolutional middle layers aggregate voxel-wise features within a progressively expanding receptive field, adding more context to the shape description. The detailed sizes of the filters in the convolutional middle layers are explained in Section 3.

2.1.3 Region Proposal Network

Recently, region proposal networks [32] have become an important building block of top-performing object detection frameworks [38, 5, 23]. In this work, we make several key modifications to the RPN architecture proposed in [32], and combine it with the feature learning network and convolutional middle layers to form an end-to-end trainable pipeline.

The input to our RPN is the feature map provided by the convolutional middle layers. The architecture of this network is illustrated in Figure 4. The network has three blocks of fully convolutional layers. The first layer of each block downsamples the feature map by half via a convolution with a stride size of 2, followed by a sequence of convolutions of stride 1 (×q means q applications of the filter). After each convolution layer, BN and ReLU operations are applied. We then upsample the output of every block to a fixed size and concatenate the outputs to construct the high-resolution feature map. Finally, this feature map is mapped to the desired learning targets: (1) a probability score map and (2) a regression map.

2.2. Loss Function

Let {a_i^pos}_{i=1...N_pos} be the set of N_pos positive anchors and {a_j^neg}_{j=1...N_neg} be the set of N_neg negative anchors. We parameterize a 3D ground truth box as (x_c^g, y_c^g, z_c^g, l^g, w^g, h^g, θ^g), where x_c^g, y_c^g, z_c^g represent the center location, l^g, w^g, h^g are the length, width, and height of the box, and θ^g is the yaw rotation around the Z-axis. To retrieve the ground truth box from a matching positive anchor parameterized as (x_c^a, y_c^a, z_c^a, l^a, w^a, h^a, θ^a), we define the residual vector u* ∈ R^7 containing the 7 regression targets corresponding to the center location ∆x, ∆y, ∆z, the three dimensions ∆l, ∆w, ∆h, and the rotation ∆θ, which are computed as:

∆x = (x_c^g − x_c^a) / d^a,  ∆y = (y_c^g − y_c^a) / d^a,  ∆z = (z_c^g − z_c^a) / h^a,
∆l = log(l^g / l^a),  ∆w = log(w^g / w^a),  ∆h = log(h^g / h^a),  ∆θ = θ^g − θ^a    (1)

where d^a = sqrt((l^a)^2 + (w^a)^2) is the diagonal of the base of the anchor box. Here, we aim to directly estimate the oriented 3D box and normalize ∆x and ∆y homogeneously with the diagonal d^a, which is different from [32, 38, 22, 21, 4, 3, 5]. We define the loss function as follows:

L = α (1/N_pos) Σ_i L_cls(p_i^pos, 1) + β (1/N_neg) Σ_j L_cls(p_j^neg, 0) + (1/N_pos) Σ_i L_reg(u_i, u_i*)    (2)

where p_i^pos and p_j^neg represent the softmax outputs for the positive anchor a_i^pos and the negative anchor a_j^neg respectively, while u_i ∈ R^7 and u_i* ∈ R^7 are the regression output and ground truth for the positive anchor a_i^pos. The first two terms are the normalized classification losses for {a_i^pos}_{i=1...N_pos} and {a_j^neg}_{j=1...N_neg}, where L_cls stands for the binary cross-entropy loss and α, β are positive constants balancing their relative importance. The last term L_reg is the regression loss, for which we use the SmoothL1 function [12, 32].
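As a sanity check on Eq. (2), here is a minimal PyTorch sketch of the normalized classification and regression terms. The function name, the tensor shapes, and the assumption that anchor probabilities have already been computed are ours, not the paper's.

```python
import torch
import torch.nn.functional as F

def voxelnet_loss(p_pos, p_neg, u, u_star, alpha=1.5, beta=1.0):
    # p_pos: (N_pos,) predicted probabilities for positive anchors
    # p_neg: (N_neg,) predicted probabilities for negative anchors
    # u, u_star: (N_pos, 7) regression outputs and targets from Eq. (1)
    n_pos = max(p_pos.numel(), 1)
    cls_pos = F.binary_cross_entropy(p_pos, torch.ones_like(p_pos))   # mean over N_pos
    cls_neg = F.binary_cross_entropy(p_neg, torch.zeros_like(p_neg))  # mean over N_neg
    reg = F.smooth_l1_loss(u, u_star, reduction="sum") / n_pos        # normalized by N_pos
    return alpha * cls_pos + beta * cls_neg + reg
```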
2.3. Efficient Implementation

GPUs are optimized for processing dense tensor structures. The problem with working directly on the point cloud is that the points are sparsely distributed across space and each voxel has a variable number of points. We devised a method that converts the point cloud into a dense tensor structure where stacked VFE operations can be processed in parallel across points and voxels.

Figure 5. Illustration of the efficient implementation.

The method is summarized in Figure 5. We initialize a K × T × 7 dimensional tensor structure to store the voxel input feature buffer, where K is the maximum number of non-empty voxels, T is the maximum number of points per voxel, and 7 is the input encoding dimension for each point. The points are randomized before processing. For each point in the point cloud, we check whether the corresponding voxel already exists. This lookup operation is done efficiently in O(1) using a hash table where the voxel coordinate is used as the hash key. If the voxel is already initialized, we insert the point at the voxel location if there are fewer than T points; otherwise the point is ignored. If the voxel is not initialized, we initialize a new voxel, store its coordinate in the voxel coordinate buffer, and insert the point at this voxel location. The voxel input feature and coordinate buffers can be constructed via a single pass over the point list, so their construction is O(n). To further improve memory/compute efficiency, it is possible to store only a limited number of voxels (K) and ignore points coming from voxels with few points.

After the voxel input buffer is constructed, the stacked VFE only involves point-level and voxel-level dense operations which can be computed on a GPU in parallel. Note that, after the concatenation operations in VFE, we reset the features corresponding to empty points to zero so that they do not affect the computed voxel features. Finally, using the stored coordinate buffer, we reorganize the computed sparse voxel-wise structures into the dense voxel grid. The following convolutional middle layers and RPN operations work on a dense voxel grid, which can be efficiently implemented on a GPU.
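The buffer construction can be sketched in plain Python/NumPy as below; a dict plays the role of the hash table keyed by the integer voxel coordinate. The function name, the (N, 4) point layout, and the final centroid-offset step are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def build_voxel_buffers(points, voxel_size, K=16000, T=35):
    # points: (N, 4) array of [x, y, z, reflectance]; voxel_size: (3,) array
    buf = np.zeros((K, T, 7), dtype=np.float32)   # voxel input feature buffer
    coords = np.zeros((K, 3), dtype=np.int32)     # voxel coordinate buffer
    counts = np.zeros(K, dtype=np.int32)
    table = {}                                    # hash table: voxel coord -> buffer row
    np.random.shuffle(points)                     # randomize points before processing
    for p in points:                              # single O(n) pass over the point list
        key = tuple((p[:3] // voxel_size).astype(np.int32))
        row = table.get(key)
        if row is None:
            if len(table) == K:                   # buffer full: ignore further voxels
                continue
            row = len(table)
            table[key] = row
            coords[row] = key
        if counts[row] < T:                       # keep at most T points per voxel
            buf[row, counts[row], :4] = p
            counts[row] += 1
    for row in range(len(table)):                 # append offsets from the voxel centroid
        n = counts[row]
        if n > 0:
            buf[row, :n, 4:7] = buf[row, :n, :3] - buf[row, :n, :3].mean(axis=0)
    return buf[:len(table)], coords[:len(table)], counts[:len(table)]
```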
3. Training Details

In this section, we explain the implementation details of VoxelNet and the training procedure.

3.1. Network Details

Our experimental setup is based on the LiDAR specifications of the KITTI dataset [11].

Car Detection. For this task, we consider point clouds within the range [−3, 1] × [−40, 40] × [0, 70.4] meters along the Z, Y, X axes respectively. Points that are projected outside of the image boundaries are removed [5]. We choose a voxel size of v_D = 0.4, v_H = 0.2, v_W = 0.2 meters, which leads to D′ = 10, H′ = 400, W′ = 352. We set T = 35 as the maximum number of randomly sampled points in each non-empty voxel. We use two VFE layers, VFE-1(7, 32) and VFE-2(32, 128). The final FCN maps the VFE-2 output to R^128. Thus our feature learning net generates a sparse tensor of shape 128 × 10 × 400 × 352. To aggregate voxel-wise features, we employ three convolutional middle layers sequentially as Conv3D(128, 64, 3, (2,1,1), (1,1,1)), Conv3D(64, 64, 3, (1,1,1), (0,1,1)), and Conv3D(64, 64, 3, (2,1,1), (1,1,1)), which yields a 4D tensor of size 64 × 2 × 400 × 352. After reshaping, the input to the RPN is a feature map of size 128 × 400 × 352, where the dimensions correspond to the channel, height, and width of the 3D tensor. Figure 4 illustrates the detailed network architecture for this task. Unlike [5], we use only one anchor size, l^a = 3.9, w^a = 1.6, h^a = 1.56 meters, centered at z_c^a = −1.0 meters, with two rotations, 0 and 90 degrees.

Our anchor matching criteria are as follows: An anchor is considered positive if it has the highest Intersection over Union (IoU) with a ground truth box, or if its IoU with a ground truth box is above 0.6 (in bird's eye view). An anchor is considered negative if its IoU with all ground truth boxes is less than 0.45. We treat anchors as don't-care if they have 0.45 ≤ IoU ≤ 0.6 with any ground truth box. We set α = 1.5 and β = 1 in Eq. (2).

Pedestrian and Cyclist Detection. The input range¹ is [−3, 1] × [−20, 20] × [0, 48] meters along the Z, Y, X axes respectively. We use the same voxel size as for car detection, which yields D′ = 10, H′ = 200, W′ = 240. We set T = 45 in order to obtain more LiDAR points for better capturing shape information. The feature learning network and convolutional middle layers are identical to the networks used in the car detection task. For the RPN, we make one modification to block 1 in Figure 4 by changing the stride size of the first 2D convolution from 2 to 1. This allows finer resolution in anchor matching, which is necessary for detecting pedestrians and cyclists. We use an anchor size of l^a = 0.8, w^a = 0.6, h^a = 1.73 meters centered at z_c^a = −0.6 meters with 0 and 90 degrees rotation for pedestrian detection, and an anchor size of l^a = 1.76, w^a = 0.6, h^a = 1.73 meters centered at z_c^a = −0.6 with 0 and 90 degrees rotation for cyclist detection. The specific anchor matching criteria are as follows: We assign an anchor as positive if it has the highest IoU with a ground truth, or if its IoU with a ground truth is above 0.5. An anchor is considered negative if its IoU with every ground truth is less than 0.35. For anchors having 0.35 ≤ IoU ≤ 0.5 with any ground truth, we treat them as don't-care.

¹Our empirical observation suggests that beyond this range, LiDAR returns from pedestrians and cyclists become very sparse, and therefore detection results will be unreliable.

During training, we use stochastic gradient descent (SGD) with a learning rate of 0.01 for the first 150 epochs and decrease the learning rate to 0.001 for the last 10 epochs. We use a batch size of 16 point clouds.

3.2. Data Augmentation

With fewer than 4000 training point clouds, training our network from scratch will inevitably suffer from overfitting. To reduce this issue, we introduce three different forms of data augmentation. The augmented training data are generated on-the-fly without the need to be stored on disk [20].

Define the set M = {p_i = [x_i, y_i, z_i, r_i]^T ∈ R^4}_{i=1,...,N} as the whole point cloud, consisting of N points. We parameterize a 3D bounding box b_i as (x_c, y_c, z_c, l, w, h, θ), where x_c, y_c, z_c are the center locations, l, w, h are the length, width, and height, and θ is the yaw rotation around the Z-axis. We define Ω_i = {p | x ∈ [x_c − l/2, x_c + l/2], y ∈ [y_c − w/2, y_c + w/2], z ∈ [z_c − h/2, z_c + h/2], p ∈ M} as the set containing all LiDAR points within b_i, where p = [x, y, z, r] denotes a particular LiDAR point in the whole set M.

The first form of data augmentation applies perturbation independently to each ground truth 3D bounding box together with those LiDAR points within the box. Specifically, around the Z-axis we rotate b_i and the associated Ω_i with respect to (x_c, y_c, z_c) by a uniformly distributed random variable ∆θ ∈ [−π/10, +π/10]. Then we add a translation (∆x, ∆y, ∆z) to the XYZ components of b_i and to each point in Ω_i, where ∆x, ∆y, ∆z are drawn independently from a Gaussian distribution with mean zero and standard deviation 1.0. To avoid physically impossible outcomes, we perform a collision test between any two boxes after the perturbation and revert to the original if a collision is detected.
Since the perturbation is applied to each ground truth box and the associated LiDAR points independently, the network is able to learn from substantially more variations than from the original training data.

Secondly, we apply global scaling to all ground truth boxes b_i and to the whole point cloud M. Specifically, we multiply the XYZ coordinates and the three dimensions of each b_i, and the XYZ coordinates of all points in M, by a random variable drawn from the uniform distribution [0.95, 1.05]. Introducing global scale augmentation improves the robustness of the network for detecting objects with various sizes and distances, as shown in image-based classification [35, 18] and detection tasks [12, 17].

Finally, we apply global rotation to all ground truth boxes b_i and to the whole point cloud M. The rotation is applied along the Z-axis and around (0, 0, 0). The global rotation offset is determined by sampling from the uniform distribution [−π/4, +π/4]. By rotating the entire point cloud, we simulate the vehicle making a turn.
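As an illustration of the first augmentation form, the following NumPy sketch rotates one ground-truth box and its interior points about the Z-axis through the box center and then translates both. The collision test is omitted, and the function name and array layouts are our assumptions, not the authors' code.

```python
import numpy as np

def perturb_box(points_in_box, box):
    # points_in_box: (n, 4) LiDAR points inside the box
    # box: array [x_c, y_c, z_c, l, w, h, theta]
    dtheta = np.random.uniform(-np.pi / 10, np.pi / 10)
    c, s = np.cos(dtheta), np.sin(dtheta)
    R = np.array([[c, -s], [s, c]])
    xy = points_in_box[:, :2] - box[:2]
    points_in_box[:, :2] = xy @ R.T + box[:2]      # rotate about Z through the box center
    shift = np.random.normal(0.0, 1.0, size=3)     # N(0, 1) translation, per the text
    points_in_box[:, :3] += shift                  # translate the points and the box
    box[:3] += shift
    box[6] += dtheta
    return points_in_box, box
```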
strong baseline that is derived from the V ox-elNet architecture but uses hand-crafted features instead of the proposed feature learning network.We call this model the hand-crafted baseline(HC-baseline).HC-baseline uses the bird’s eye view features described in[5]which are computed at0.1m resolution.Different from[5],we in-crease the number of height channels from4to16to cap-ture more detailed shape information–further increasing the number of height channels did not lead to performance improvement.We replace the convolutional middle lay-ers of V oxelNet with similar size2D convolutional layers, which are Conv2D(16,32,3,1,1),Conv2D(32,64,3,2, 1),Conv2D(64,128,3,1,1).Finally RPN is identical in V oxelNet and HC-baseline.The total number of parame-ters in HC-baseline and V oxelNet are very similar.We train the HC-baseline using the same training procedure and data augmentation described in Section3.4.1.Evaluation on KITTI Validation SetMetrics We follow the official KITTI evaluation protocol, where the IoU threshold is0.7for class Car and is0.5for class Pedestrian and Cyclist.The IoU threshold is the same for both bird’s eye view and full3D evaluation.We compare the methods using the average precision(AP)metric. Evaluation in Bird’s Eye View The evaluation result is presented in Table1.V oxelNet consistently outperforms all the competing approaches across all three difficulty levels. HC-baseline also achieves satisfactory performance com-pared to the state-of-the-art[5],which shows that our base region proposal network(RPN)is effective.For Pedestrian and Cyclist detection tasks in bird’s eye view,we compare the proposed V oxelNet with HC-baseline.V oxelNet yields substantially higher AP than the HC-baseline for these more challenging categories,which shows that end-to-end learn-ing is essential for point-cloud based detection.We would like to note that[21]reported88.9%,77.3%, and72.7%for easy,moderate,and hard levels respectively, but these results are obtained based on a different split of 6,000training frames and∼1,500validation frames,and they are not directly comparable with algorithms in Table1. Therefore,we do not include these results in the table. Evaluation in3D Compared to the bird’s eye view de-tection,which requires only accurate localization of ob-jects in the2D plane,3D detection is a more challeng-ing task as it requiresfiner localization of shapes in3D space.Table2summarizes the comparison.For the class Car,V oxelNet significantly outperforms all other ap-proaches in AP across all difficulty levels.Specifically, using only LiDAR,V oxelNet significantly outperforms the。
Geometric Modeling

Geometric modeling is a crucial aspect of computer graphics and design, playing a significant role in various fields such as engineering, architecture, animation, and gaming. It involves the creation and manipulation of geometric shapes and structures in a digital environment, allowing for the visualization and representation of complex objects and scenes. However, despite its importance, geometric modeling presents several challenges and limitations that need to be addressed in order to improve its efficiency and effectiveness.

One of the primary issues in geometric modeling is the complexity of representing real-world objects and environments in a digital format. The process of converting physical objects into digital models involves capturing and processing a vast amount of data, which can be time-consuming and resource-intensive. This is particularly challenging when dealing with intricate and irregular shapes, as it requires advanced techniques such as surface reconstruction and mesh generation to accurately capture the details of the object. As a result, geometric modeling often requires a balance between precision and efficiency, as the level of detail in the model directly impacts its computational cost and performance.

Another challenge in geometric modeling is the need for seamless integration with other design and simulation tools. In many applications, geometric models are used as a basis for further analysis and manipulation, such as finite element analysis in engineering or physics-based simulations in animation. Therefore, it is essential for geometric modeling software to be compatible with other software and data formats, allowing for the transfer and utilization of geometric models across different platforms. This interoperability is crucial for streamlining the design and production process, as it enables seamless collaboration and data exchange between different teams and disciplines.

Furthermore, geometric modeling also faces challenges related to the representation and manipulation of geometric data. Traditional modeling techniques, such as boundary representation (B-rep) and constructive solid geometry (CSG), have limitations in representing complex and organic shapes, often leading to issues such as geometric inaccuracies and topological errors. To address this, advanced modeling techniques such as non-uniform rational B-splines (NURBS) and subdivision surfaces have been developed to provide more flexible and accurate representations of geometric shapes. However, these techniques also come with their own set of challenges, such as increased computational complexity and difficulty in controlling the shape of the model.

In addition to technical challenges, geometric modeling also raises ethical and societal considerations, particularly in the context of digital representation and manipulation. As the boundary between physical and digital reality becomes increasingly blurred, issues such as intellectual property rights, privacy, and the authenticity of digital models have become more prominent. For example, the unauthorized use and reproduction of digital models can lead to copyright infringement and legal disputes, highlighting the need for robust mechanisms to protect the intellectual property of digital content creators. Similarly, the rise of deepfakes and digital forgeries has raised concerns about the potential misuse of geometric modeling technology for malicious purposes, such as misinformation and identity theft.
It is crucial for the industry to address these ethical concerns and develop standards and regulations to ensure the responsible use of geometric modeling technology.

Despite these challenges, the field of geometric modeling continues to evolve and advance, driven by the growing demand for realistic and interactive digital experiences. Recent developments in machine learning and artificial intelligence have shown promise in addressing some of the technical limitations of geometric modeling, such as automated feature recognition and shape optimization. Furthermore, the increasing availability of powerful hardware and software tools has enabled more efficient and accessible geometric modeling workflows, empowering designers and artists to create intricate and immersive digital content. With ongoing research and innovation, it is likely that many of the current challenges in geometric modeling will be overcome, leading to more sophisticated and versatile tools for digital design and visualization.

In conclusion, geometric modeling is a critical component of modern digital design and visualization, enabling the creation and manipulation of complex geometric shapes and structures. However, the field faces several challenges related to the representation, integration, and ethical implications of geometric models. By addressing these challenges through technological innovation and ethical considerations, the industry can continue to push the boundaries of what is possible in digital design and create more immersive and impactful experiences for users.
Title: The Importance of Agriculture

Agriculture, the backbone of civilization, is a pivotal sector that encompasses the production of food, fibers, and other products derived from plants and animals. It is an intricate science that involves understanding soil management, crop rotation, pest control, and the use of technology to enhance yields.

Sustainable agriculture aims to balance productivity with environmental stewardship, ensuring that farming practices do not deplete resources but rather replenish them for future generations. This includes techniques like organic farming, which eschews chemical fertilizers and pesticides in favor of natural alternatives, and permaculture, which designs agricultural systems mimicking the patterns found in nature.

The role of agriculture extends beyond food production; it influences economic growth, social structures, and even political stability. Rural communities rely on this sector as a primary source of income and livelihood.

Innovations in precision agriculture, such as the use of drones, satellite imaging, and data analytics, are transforming the field by enabling farmers to make informed decisions, optimize resource use, and increase efficiency.

Understanding and valuing the complexities of agriculture is crucial for supporting sustainable development and securing our food supply well into the future.
Mosaic data augmentation: how it is computed

Data augmentation is a technique used to increase the size and diversity of a dataset by applying various transformations to the original data. In the context of mosaic data augmentation, the goal is to create new images by combining patches or tiles taken from multiple images.

To compute a mosaic augmentation, we first need to select a set of source images that will be used to build the mosaic. These source images can come from the same dataset or from different datasets, depending on the specific task or application. Once the source images are selected, we can proceed with the mosaic augmentation process. Here are the steps involved:

1. Patch extraction: We extract patches or tiles from the source images. The size and shape of the patches can vary depending on the requirements; for example, we can extract square patches of size 64x64 pixels.

2. Patch placement: After extracting the patches, we place them in a mosaic grid. The grid can have a fixed size or be adjusted dynamically based on the number of patches and their sizes. The patches can be placed randomly or in a specific order.

3. Patch blending: To create a seamless mosaic, we blend the patches together using techniques such as alpha blending or feathering. The goal is to make the transitions between patches smooth and natural.

4. Data augmentation: Once the mosaic is created, we can apply additional augmentation techniques to further increase the diversity of the dataset, such as random rotations, flips, or color transformations.

5. Output generation: Finally, we generate the augmented dataset by saving the mosaic images along with their corresponding labels or annotations. These images can then be used for training machine learning models or other tasks.

For example, say we have a dataset of flower images. To create a mosaic augmentation, we select four source images from the dataset, extract 64x64-pixel patches from each image, and place them in a 2x2 grid. We blend the patches together using alpha blending to create a seamless mosaic, and we can then apply random rotations and flips to the mosaic image to further augment the dataset. The resulting augmented dataset is larger and more diverse, which can improve the performance of our models.
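The core steps can be condensed into a short NumPy sketch. The function name, patch size, and grid size are illustrative; the blending step is omitted for brevity (the patches are simply tiled), and the source images are assumed to be at least patch-sized.

```python
import numpy as np

def make_mosaic(images, patch=64, grid=2):
    # Assemble a (grid*patch) x (grid*patch) mosaic from randomly sampled patches.
    mosaic = np.zeros((grid * patch, grid * patch, 3), dtype=images[0].dtype)
    for cell in range(grid * grid):
        img = images[np.random.randint(len(images))]          # pick a source image
        y = np.random.randint(0, img.shape[0] - patch + 1)    # random patch origin
        x = np.random.randint(0, img.shape[1] - patch + 1)
        r, c = divmod(cell, grid)                             # grid cell for this patch
        mosaic[r*patch:(r+1)*patch, c*patch:(c+1)*patch] = img[y:y+patch, x:x+patch]
    return mosaic
```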
Hyperparameter Optimization in AI Training: Methods for Finding the Best Parameters

1. Introduction

In machine learning and deep learning, hyperparameters are parameters that cannot be learned by the model itself. They must be tuned manually before training begins so that the model reaches its best performance. Hyperparameters include the learning rate, the batch size, the number of hidden units, and so on. Hyperparameter optimization is the process of searching for the best hyperparameters, and it plays a crucial role in the model's convergence speed, generalization ability, and robustness.

2. Traditional Hyperparameter Optimization Methods

Traditional hyperparameter optimization usually relies on two strategies: grid search and random search.

Grid search: Grid search is an exhaustive search method. It builds a hyperparameter space from predefined combinations of hyperparameters and traverses every combination in that space to find the best one. Although this method is simple and intuitive, the size of the search space grows exponentially as the number of hyperparameters increases, making the search inefficient.

Random search: Random search looks for the best hyperparameters by sampling randomly from the hyperparameter space. Compared with grid search, random search is more efficient, especially when the hyperparameter space is large. However, it may miss some promising hyperparameter combinations. A minimal sketch of random search is given below.
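The following Python sketch implements random search under the assumption of a user-supplied train_and_eval routine that trains a model with the given hyperparameters and returns a validation score; the search space shown is illustrative.

```python
import random

def random_search(train_and_eval, space, n_trials=20):
    # Randomly sample hyperparameter combinations and keep the best-scoring one.
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: random.choice(values) for name, values in space.items()}
        score = train_and_eval(**params)        # user-supplied training routine
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Illustrative search space:
space = {"lr": [1e-4, 1e-3, 1e-2], "batch_size": [16, 32, 64], "hidden": [64, 128, 256]}
```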
3. Optimization-Algorithm-Based Hyperparameter Optimization

Because traditional hyperparameter optimization methods have these limitations, researchers have proposed methods based on optimization algorithms, the most typical being Bayesian optimization and genetic algorithms.

Bayesian optimization: Bayesian optimization models and samples the hyperparameter space to capture the relationship between hyperparameters and performance, and it finds the best hyperparameters by balancing exploration and exploitation as the search proceeds. It uses prior inference based on historical data to guide the search, thereby obtaining better results with a limited number of samples.

Genetic algorithms: Genetic algorithms simulate the process of evolution, repeatedly updating combinations of hyperparameters to minimize or maximize the objective function. They progressively optimize the hyperparameters by simulating operations such as natural selection, crossover, and mutation, and thus find the best hyperparameters. The ideas behind genetic algorithms have also inspired many other optimization algorithms, such as particle swarm optimization and ant colony optimization.

4. Automated Machine Learning Tools (AutoML)

Automated machine learning (AutoML) tools use machine learning and optimization algorithms to automatically carry out tasks such as model selection, hyperparameter optimization, and feature engineering.
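As an example of such tooling, the sketch below uses the Optuna library, whose default sampler is a Bayesian-style TPE method; train_and_eval again stands in for the user's own training-and-validation routine, and the parameter ranges are illustrative.

```python
import optuna

def objective(trial):
    # Sample hyperparameters from the search space defined per trial.
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    batch_size = trial.suggest_categorical("batch_size", [16, 32, 64])
    # train_and_eval is a placeholder for the user's training routine.
    return train_and_eval(lr=lr, batch_size=batch_size)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```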
Point cloud surface reconstruction in Python

What is point cloud surface reconstruction?

Point cloud surface reconstruction is the process of recovering a continuous surface model from discrete 3D point cloud data. A point cloud is a collection of 3D points, each carrying its coordinates in space. Point cloud surface reconstruction is widely used in many fields, such as computer graphics, robot vision, and geological exploration.

Why is point cloud surface reconstruction needed?

Point cloud data is obtained from 3D sensors such as laser scanners and depth cameras. However, point cloud data is usually irregular, sparse, and contaminated by noise. For further analysis and application, the point cloud often needs to be converted into a continuous surface model. Surface reconstruction provides a more compact and structured representation and facilitates subsequent processing and analysis.

What methods exist for point cloud surface reconstruction?

Many point cloud surface reconstruction methods have been proposed and are widely used. They can be divided into two categories: surface-based methods and voxel-based methods.

1. Surface-based methods: These methods typically extract a triangle mesh from the point cloud by projecting the points onto a 2D plane, meshing them, and establishing the topological relations, thereby reconstructing the surface model. Commonly used approaches include:
- Triangulation methods: e.g. Delaunay triangulation, which partitions the point cloud into a set of mutually connected triangles according to a given criterion.
- Implicit surface methods: e.g. Moving Least Squares (MLS) and Gaussian Process Regression (GPR), which obtain a surface model through a parametric fit to the point cloud data.
- Curvature-estimation methods: e.g. normal estimation and curvature estimation, which build the surface model from the computed normal and curvature information of the point cloud.

2. Voxel-based methods: These methods convert the point cloud into a 3D voxel grid and reconstruct a continuous surface model via voxel representation and interpolation. Voxel-based reconstruction methods are robust and flexible; commonly used approaches include:
- Poisson reconstruction: computes the gradient information of the voxel field and recovers a continuous surface model by solving the Poisson equation.
- Grid methods: based on the voxel grid, obtain the surface model through interpolation and subdivision.
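As a minimal illustration, Poisson reconstruction of a point cloud can be run in Python with the Open3D library as sketched below; the file names and parameter values (normal-estimation radius, octree depth) are illustrative choices, not prescriptions.

```python
import open3d as o3d

# Load a point cloud (illustrative file name) and estimate per-point normals,
# which Poisson reconstruction requires.
pcd = o3d.io.read_point_cloud("points.ply")
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))

# Solve the Poisson equation over an octree of the given depth; higher depth
# yields finer meshes at higher cost.
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=9)
o3d.io.write_triangle_mesh("mesh.ply", mesh)
```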
Siemens PLM Software

Parasolid

Summary
Parasolid® software is the world's premier 3D geometric modeling component, selected by leading application vendors and end-user organizations spanning multiple industries as their preferred platform for delivering innovative 3D solutions with unparalleled modeling power, versatility and interoperability. A key offering within Siemens PLM Software's PLM Components family of software products, Parasolid is targeted at a broad range of applications across the product lifecycle and provides robust, high-quality functionality that is easy to use and cost-effective to implement.

World-class geometric modeling for demanding 3D applications
Parasolid supports solid modeling, facet modeling, generalized cellular modeling, direct modeling and freeform surface/sheet modeling within an integrated framework. Parasolid is available in three commercial packages – Designer, Editor and Communicator – each of which is offered with convergent modeling technology as an option, and is also available to the academic community via an Educator package. The functional scope and typical application at each level are outlined below.

Parasolid Designer delivers the full power of Parasolid functionality for unlimited creation, manipulation, interrogation and storage of 3D models. Over 900 object-based API functions provide the most comprehensive and robust 3D modeling platform for demanding 3D applications.

Parasolid Editor provides an extended subset of Parasolid functionality that is ideal for analysis, manufacturing and other downstream applications that need to easily manipulate, edit, repair or simplify 3D models without the need for advanced modeling operations.

Parasolid benefits
• Provides ideal foundation for innovative 3D application development
• Reduces development costs and risks by providing a proven 3D modeling solution
• Ensures state-of-the-art quality and robustness
• Convergent modeling technology seamlessly integrates classic b-rep and facet b-rep modeling operations in a unified architecture
• Offers world-class technical support for rapid time-to-market
• Enables instant compatibility with other Parasolid-based applications through translation-free exchange of 3D data
Parasolid is deployed across a wide range of PLM application domains, including:
• Mechanical CAD
• CAM/manufacturing
• CAE/validation
• AEC
• Visualization
• Data exchange
• Interoperability
• Knowledge-based engineering
• CMM/inspection
• CNC/machine tools
• Corporate R&D
• Academic R&D

Foundation capabilities
Parasolid is built on critical foundation capabilities that enable Parasolid to be deployed successfully in a wide variety of software applications. Enabled across all relevant functionality, Parasolid foundation capabilities include:
• Tolerant modeling for intrinsically reliable modeling with imported data
• Convergent modeling technology, available as a licensed option, which seamlessly integrates classic b-rep and facet b-rep modeling operations in a unified architecture
• Attributes and callbacks for application-specific characteristics and behavior
• Session and partitioned rollback for flexible history and undo/redo implementation
• Data management and tracking for managing models and associated data as they evolve
• Thread safety and symmetric multi-processing support for optimal performance on multi-processor machines
• Model storage in forwards and backwards compatible native XT format
• .NET binding to integrate Parasolid into .NET applications written in C#
• Broad platform coverage including comprehensive support for Windows, Linux, Unix and Mac

Getting started
Parasolid is delivered with a comprehensive set of documentation and developer resources, including a complete Jumpstart Kit of tools that promote easy integration of Parasolid into new and existing applications:
• Full Product Documentation Suite in html and pdf formats
• Parasolid Workshop prototyping environment for Windows
• Example Application Resources to get you up and running
• Code Example Suite illustrating best implementation practice
• Parasolid 'Getting Started' Guide answering your questions
• Parasolid Overview summarizing Parasolid capabilities
• Parasolid API Training Materials to educate the team

Support, training and consulting
Parasolid has a renowned technical support, training and consulting team, dedicated to helping customers achieve the best possible implementation by providing expert advice on all matters related to Parasolid usage. Responsive telephone and email support is backed by an online support center that provides round-the-clock access to frequent product updates, as well as customer-specific issue reporting and tracking. In addition, specialized training and consulting services are available that can be tailored to customer requirements. Whether you are starting fresh, extending an existing application or transitioning from other modeling technology, the Parasolid support, training and consulting team is with you every step of the way.

Interoperability products
The Parasolid product suite is augmented by a range of add-on products that provide high-quality interoperability with third-party CAD data. These include Parasolid Bodyshop, a specialized tool for boosting the success of 3D data exchange by cleaning and repairing imported models, and Parasolid Translator toolkits for converting model data between Parasolid and other major standard and proprietary CAD formats, including STEP, IGES, Catia V4, Catia V5, Pro/Engineer and ACIS (SAT). Siemens PLM Software partners with Tech Soft 3D to offer Hoops Exchange.
This highly-integrated and industry-proven 3D data collaboration solution for Parasolid provides high-performance import, export, healing and visualization tools for a wide range of 3D file formats.

[Figure: Convergent modeling – facet model of a knee joint with a b-rep surgical guide, modeled in a single architecture.]
Parameter Update Process for Pre-training Large Models

Pre-training large models has become a popular approach in natural language processing and computer vision tasks. These models are first trained on massive datasets to learn general patterns and representations of the data before being fine-tuned on specific tasks. One of the key challenges in maintaining large pre-trained models is updating their parameters efficiently without losing previously learned knowledge.
When updating the parameters of a large pre-trained model, it is crucial to strike a balance between retaining previous knowledge and adapting to new data. This requires careful management of the learning rate, batch size, and training steps to prevent catastrophic forgetting. One common approach is to use techniques like gradual unfreezing of layers, where the lower layers are frozen initially and gradually unfrozen as training progresses, as in the sketch below.
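The following is a minimal sketch of gradual unfreezing in PyTorch. The tiny `nn.Sequential` stands in for a real pre-trained network, and the epoch count, learning rate, and one-module-per-epoch schedule are illustrative assumptions, not a prescribed recipe.

```python
# Minimal sketch: gradual unfreezing during fine-tuning (PyTorch).
import torch
import torch.nn as nn

# Stand-in for a pre-trained network; lower layers come first.
model = nn.Sequential(
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 10),  # task head
)

# Start with everything frozen except the task head.
for p in model.parameters():
    p.requires_grad = False
for p in model[-1].parameters():
    p.requires_grad = True

def unfreeze_top(net, n):
    # Make the top-n modules trainable, from the output downwards.
    for module in list(net)[-n:]:
        for p in module.parameters():
            p.requires_grad = True

for epoch in range(5):
    unfreeze_top(model, n=epoch + 1)  # one more module each epoch
    # Rebuild the optimizer over the currently trainable parameters;
    # a small learning rate helps limit catastrophic forgetting.
    optimizer = torch.optim.AdamW(
        (p for p in model.parameters() if p.requires_grad), lr=1e-5)
    # ... run one epoch of fine-tuning with `optimizer` here ...
```

In practice the unfreezing schedule is usually tied to validation metrics rather than a fixed epoch count, and discriminative learning rates (smaller for lower layers) are often combined with it.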
arXiv:astro-ph/0505329v4 28 Mar 2007
Mon. Not. R. Astron. Soc. 000, 000–000 (0000) Printed 2 February 2008 (MN LaTeX style file v2.2)

Smoothing Supernova Data to Reconstruct the Expansion History of the Universe and its Age

Arman Shafieloo 1,3, Ujjaini Alam 1,4, Varun Sahni 1,5 and Alexei A. Starobinsky 2,6
1 Inter University Centre for Astronomy & Astrophysics, Pune, India
2 Landau Institute for Theoretical Physics, 119334 Moscow, Russia
3 arman@iucaa.ernet.in  4 ujjaini@iucaa.ernet.in  5 varun@iucaa.ernet.in  6 alstar@landau.ac.ru

2 February 2008

ABSTRACT
We propose a non-parametric method of smoothing supernova data over redshift using a Gaussian kernel in order to reconstruct important cosmological quantities including H(z) and w(z) in a model-independent manner. This method is shown to be successful in discriminating between different models of dark energy when the quality of data is commensurate with that expected from the future SuperNova Acceleration Probe (SNAP). We find that the Hubble parameter is especially well-determined and useful for this purpose. The look-back time of the universe may also be determined to a very high degree of accuracy (≲ 0.2%) in this method. By refining the method, it is also possible to obtain reasonable bounds on the equation of state of dark energy. We explore a new diagnostic of dark energy – the 'w-probe' – which can be calculated from the first derivative of the data. We find that this diagnostic is reconstructed extremely accurately for different reconstruction methods even if Ω_{0m} is marginalized over. The w-probe can be used to successfully distinguish between ΛCDM and other models of dark energy to a high degree of accuracy.

Key words: cosmology: theory – cosmological parameters – statistics

1 INTRODUCTION

The nature of dark energy has been the subject of much debate over the past decade (for reviews see Sahni & Starobinsky (2000); Carroll (2001); Peebles & Ratra (2003); Padmanabhan (2003); Sahni (2004)). The supernova (SNe) type Ia data, which gave the first indications of the accelerated expansion of the universe, are expected to throw further light on this intriguing question as their quality steadily improves. While the number of SNe available to us has increased two-fold over the past couple of years (at present there are about 150 SNe between redshifts of 0 and 1.75, with 10 SNe above a redshift of unity) (Riess et al. 1998; Perlmutter et al. 1999; Knop et al. 2003; Tonry et al. 2003; Riess et al. 2004), the SNe data are still not of a quality to firmly distinguish different models of dark energy. In this connection, an important role in our quest for a deeper understanding of the nature of dark energy has been played by the 'reconstruction program'. Commencing from the first theoretical exposition of the reconstruction idea – Starobinsky (1998); Huterer & Turner (1999); Nakamura & Chiba (1999), and Saini et al. (2000) which applied it to an early supernova data set – there have been many attempts to reconstruct the properties of dark energy directly from observational data without assuming any particular microscopic/phenomenological model for the former. When using SNe data for this purpose, the main obstacle is the necessity to: (i) differentiate the data once to pass from the luminosity distance d_L to the Hubble parameter H(t) ≡ ȧ(t)/a(t) and to the effective energy density of dark energy ε_DE; (ii) differentiate the data a second time in order to obtain the deceleration parameter q ≡ −ä a/ȧ², the dark energy effective pressure p_DE, and the equation of state parameter w(t) ≡ p_DE/ε_DE. Here, a(t) is the scale
factor of a Friedmann-Robertson-Walker (FRW) isotropic cosmological model which we further assume to be spatially flat, as predicted by the simplest variants of the inflationary scenario of the early Universe and confirmed by observational CMB data. To get around this obstacle, some kind of smoothing of d_L data with respect to its argument – the redshift z(t) – is needed. One possible way is to parameterize the quantity which is of interest (H(z), w(z), etc.) by some functional form containing a few free parameters and then determine the value of these parameters which produce the best fit to the data. This implies an implicit smoothing of d_L with a characteristic smoothing scale defined by the number of parameters, and with a weight depending on the form of parameterization. Different parameterizations have been used for: d_L (Huterer & Turner 1999; Saini et al. 2000; Chiba & Nakamura 2000), H(z) (Sahni et al. 2003; Alam et al. 2004; Alam, Sahni & Starobinsky 2004a), w(z) (Chevallier & Polarski 2001; Weller & Albrecht 2002; Gerke & Efstathiou 2002; Maor et al. 2002; Corasaniti & Copeland 2003; Linder 2003; Wang & Mukherjee 2004; Saini, Weller & Bridle 2004; Nesseris & Perivolaropoulos 2004; Gong 2005a; Lazkoz, Nesseris & Perivolaropoulos 2005) and V(z) (Simon, Verde & Jimenez 2005; Guo, Ohta & Zhang 2005). In Huterer & Turner (1999), a polynomial expansion of the luminosity distance was used to reconstruct the equation of state. However, Weller & Albrecht (2002) showed this ansatz to be inadequate since it needed an arbitrarily large number of parameters to fit even the simplest ΛCDM equation of state. They proposed instead a polynomial ansatz for the equation of state which worked somewhat better. In Saini et al. (2000) a rational Padé-type ansatz for d_L was proposed, which gave good results. In recent times there have been many more attempts at parameterizing dark energy. In Chevallier & Polarski (2001) and Linder (2003) an ansatz of the form w = w_0 + w_a(1 − a) was suggested for the equation of state. Corasaniti & Copeland (2003) suggested a four-parameter ansatz for the equation of state. Sahni et al. (2003) proposed a slightly different approach in which the dark energy density was expanded in a polynomial ansatz, the properties of which were examined in (Alam et al. 2004; Alam, Sahni & Starobinsky 2004a; Alam et al. 2004b). See Alam et al. (2003); Gong (2005b); Bassett, Corasaniti & Kunz (2004) for a summary of different approaches to the reconstruction program and for a more extensive list of references. In spite of some ambiguity in the form of these different parameterizations, it is reassuring that they produce consistent results for the best-fit curve over the range 0.1 ≲ z ≲ 1 where we have a sufficient amount of data (see, e.g., Fig. 10 in Gong (2005b)). However it is necessary to point out that the current SNe data are not of a quality that could allow us to unambiguously differentiate ΛCDM from evolving dark energy. That is why our focus in this paper will be on better quality data (from the SNAP experiment) which should be able to successfully address this important issue. A different, non-parametric smoothing procedure involves directly smoothing either d_L, or any other quantity defined within redshift bins, with some characteristic smoothing scale. Different forms of this approach have been elaborated in Wang & Lovelace (2001); Huterer & Starkman (2003); Saini (2003); Daly & Djorgovski (2003, 2004); Wang & Tegmark (2005); Espana-Bonet & Ruiz-Lapuente (2005). One of the advantages of this approach is that the dependence of the results on the size of the smoothing scale becomes explicit.
We emphasize again that the present consensus seems to be that, while the cosmological constant remains a good fit to the data, more exotic models of dark energy are by no means ruled out (though their diversity has been significantly narrowed already). Thus, until the quality of data improves dramatically, the final judgment on the nature of dark energy cannot yet be pronounced.

In this paper, we develop a new reconstruction method which formally belongs to the second category, and which is complementary to the approach of fitting a parametric ansatz to the dark energy density or the equation of state. Most of the papers using the non-parametric approach cited above exploited a kind of top-hat smoothing in redshift space. Instead, we follow a procedure which is well known and frequently used in the analysis of large-scale structure (Coles & Lucchin 1995; Martinez & Saar 2002); namely, we attempt to smooth noisy data directly using a Gaussian smoothing function. Then, from the smoothed data, we calculate different cosmological functions and, thus, extract information about dark energy. This method allows us to avoid additional noise due to sharp borders between bins. Furthermore, since our method does not assume any definite parametric representation of dark energy, it does not bias results towards any particular model. We therefore expect this method to give us model-independent estimates of cosmological functions, in particular, the Hubble parameter H(z) ≡ ȧ(t)/a(t). On the basis of data expected from the SNAP satellite mission, we show that the Gaussian smoothing ansatz proposed in this paper can successfully distinguish between rival cosmological models and help shed light on the nature of dark energy.

2 METHODOLOGY

It is useful to recall that, in the context of structure formation, it is often advantageous to obtain a smoothed density field δ_S(x) from a fluctuating 'raw' density field, δ(x′), using a low pass filter F having a characteristic scale R_f (Coles & Lucchin 1995):

$$\delta_S(\mathbf{x}, R_f) = \int \delta(\mathbf{x}')\, F(|\mathbf{x}-\mathbf{x}'|; R_f)\, d\mathbf{x}'. \qquad (1)$$

Commonly used filters include: (i) the 'top-hat' filter, which has a sharp cutoff, F_TH ∝ Θ(1 − |x − x′|/R_TH), where Θ is the Heaviside step function (Θ(z) = 0 for z ⩽ 0, Θ(z) = 1 for z > 0), and (ii) the Gaussian filter, F_G ∝ exp(−|x − x′|²/2R_G²). For our purpose, we shall find it useful to apply a variant of the Gaussian filter to reconstruct the properties of dark energy from supernova data. In other words, we apply Gaussian smoothing to supernova data (which is of the form {ln d_L(z_i), z_i}) in order to extract information about important cosmological parameters such as H(z) and w(z). The smoothing algorithm calculates the luminosity distance at any arbitrary redshift z to be

$$\ln d_L(z,\Delta)^s = \ln d_L(z)^g + N(z) \sum_i \left[\ln d_L(z_i) - \ln d_L(z_i)^g\right] \exp\left[-\frac{\ln^2\!\left(\frac{1+z_i}{1+z}\right)}{2\Delta^2}\right], \qquad (2)$$

$$N(z)^{-1} = \sum_i \exp\left[-\frac{\ln^2\!\left(\frac{1+z_i}{1+z}\right)}{2\Delta^2}\right].$$

Here, ln d_L(z,∆)^s is the smoothed luminosity distance at any redshift z, which depends on the luminosity distances of the individual supernovae and on a guess model ln d_L(z)^g.

Table 1. Expected number of supernovae per redshift bin from the SNAP experiment

∆z   0.1–0.2   0.2–0.3   0.3–0.4   0.4–0.5   0.5–0.6   0.6–0.7   0.7–0.8   0.8–0.9
N    170       155       142       130       119       107       94        80

From the smoothed luminosity distance, the Hubble parameter follows (for a spatially flat universe, in units where c = 1) as

$$H(z) = \left[\frac{d}{dz}\left(\frac{d_L(z)}{1+z}\right)\right]^{-1}, \qquad (3)$$

and the equation of state as

$$w(z) = \frac{\tfrac{2}{3}(1+z)\,H'/H - 1}{1 - (H_0/H)^2\,\Omega_{0m}(1+z)^3}. \qquad (4)$$

The results will clearly depend upon the value of the scale ∆ in (2). A large value of ∆ produces a smooth result, but the accuracy of reconstruction worsens, while a small ∆ gives a more accurate, but noisy, result. Note that, for |z − z_i| ≪ 1, the exponent in Eq. (2) reduces to the form −(z − z_i)²/[2∆²(1+z)²]. Thus, the effective Gaussian smoothing scale for this algorithm is ∆(1+z). We expect to obtain an optimum value of ∆ for which both smoothness and accuracy are reasonable.
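To make the kernel in Eq. (2) concrete, here is a minimal numerical sketch in Python/NumPy. The array names, the guess-model callable, and the default ∆ are illustrative assumptions; this is a sketch of the published scheme, not the authors' code.

```python
# Minimal sketch of the Gaussian smoothing kernel of Eq. (2).
import numpy as np

def smooth_lndl(z_eval, z_data, lndl_data, lndl_guess, delta=0.24):
    """Kernel-smoothed ln d_L at the redshifts z_eval.

    z_data, lndl_data : arrays of SNe redshifts and noisy ln d_L
    lndl_guess        : callable giving the guess model ln d_L(z)^g
    """
    z_eval = np.atleast_1d(np.asarray(z_eval, dtype=float))
    resid = lndl_data - lndl_guess(z_data)   # data minus guess model
    out = np.empty_like(z_eval)
    for k, z in enumerate(z_eval):
        # Gaussian kernel in ln(1+z) with width delta, as in Eq. (2)
        w = np.exp(-np.log((1.0 + z_data) / (1.0 + z)) ** 2
                   / (2.0 * delta ** 2))
        out[k] = lndl_guess(z) + np.sum(w * resid) / np.sum(w)
    return out
```

Note that the normalization N(z) is just the inverse of the kernel sum, so it is absorbed into the `np.sum(w * resid) / np.sum(w)` term.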
The Hubble parameter can also be used to obtain the weighted average of w,

$$1+\bar{w} = \frac{1}{3}\,\frac{\delta \ln \tilde{\rho}_{DE}}{\delta \ln(1+z)}, \qquad (5)$$

where ρ̃_DE is the dimensionless dark energy density, ρ̃_DE = ρ_DE/ρ_{0c} (ρ_{0c} = 3H_0²/8πG). We shall show in Section 5 that w̄, which we call the w-probe, acts as an excellent diagnostic of dark energy, and can differentiate between different models of dark energy with greater accuracy than the equation of state.

To check our method, we use data simulated according to the SuperNova Acceleration Probe (SNAP) experiment. This space-based mission is expected to observe close to 6000 supernovae, of which about 2000 supernovae can be used for cosmological purposes (Aldering et al. 2004). We propose to use a distribution of 1998 supernovae between redshifts of 0.1 and 1.7 obtained from Aldering et al. (2004). This distribution of 1998 supernovae is shown in Table 1. Although SNAP will not be measuring supernovae at redshifts below z = 0.1, it is not unreasonable to assume that, by the time SNAP comes up, we can expect high quality data at low redshifts from other supernova surveys such as the Nearby SN Factory. Hence, in the low redshift region z < 0.1, we add 25 more supernovae of equivalent errors to the SNAP distribution, so that our data sample now consists of 2023 supernovae. Using this distribution of data, we check whether the method is successful in reconstructing different cosmological parameters, and also if it can help discriminate different models of dark energy. We simulate 1000 realizations of data using the SNAP distribution with the error in the luminosity distance given by σ_{ln d_L} = 0.07 – the expected error for SNAP. We also consider the possible effect of weak lensing on high redshift supernovae by adding an uncertainty of σ_lens(z) ≈ 0.46(0.00311 + 0.08687z − 0.00950z²) (as in Wang & Tegmark (2005)). Initially, we use a simple model of dark energy when simulating data – an evolving model of dark energy with w = −a/a_0 = −1/(1+z) and Ω_{0m} = 0.3. It will clearly be of interest to see whether this model can be reconstructed accurately and discriminated from ΛCDM using this method. From the SNAP distribution, we obtain smoothed data at 2000 points taken uniformly between the minimum and maximum of the distributions used. Once we are assured of the efficacy of our method, we shall also attempt to reconstruct other models of dark energy. Among these, one is the standard cosmological constant (ΛCDM) model with w = −1. The other is a model with a constant equation of state, w = −0.5. Such models with constant equation of state are known as quiessence models of dark energy (Alam et al. 2003) and we shall refer to this model as the "quiessence model" throughout the paper. These three models are complementary to each other. For the ΛCDM model, the equation of state is constant at w = −1; w remains constant at −0.5 for the quiessence model; and for the evolving model, w(z) varies rapidly, increasing in value from w_0 = −1 at the present epoch to w ≃ 0 at high redshifts.

Figure 1. The smoothing scheme of equation (2) is used to determine H(z) and w(z) from 1000 realizations of the SNAP dataset. The smoothing scale is ∆ = 0.24. The dashed line in each panel represents the fiducial w = −1/(1+z) 'metamorphosis' model while the solid lines show the mean Hubble parameter (left), the mean equation of state (right), and 1σ limits around these quantities. The dotted line in both panels is ΛCDM. Note that the mean Hubble parameter is reconstructed so accurately that the fiducial model (dashed line) is not visible in the left panel.
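A short sketch of how Eqs. (3) and (4) can be applied to the smoothed output, using simple finite differences on a redshift grid; a flat universe is assumed, d_L is taken in units of c/H0 so that h = H/H0 is dimensionless, and all names are illustrative.

```python
# Sketch: H(z) and w(z) from smoothed ln d_L via Eqs. (3)-(4).
import numpy as np

def hubble_from_lndl(z, lndl):
    """Eq. (3): h(z) = [d/dz (d_L/(1+z))]^(-1), with d_L in units
    of c/H0 so that the result is h = H/H0 (dimensionless)."""
    D = np.exp(lndl) / (1.0 + z)   # dimensionless comoving distance
    return 1.0 / np.gradient(D, z)

def eos_from_hubble(z, h, Om0=0.3):
    """Eq. (4): w(z) from h(z) and its first derivative."""
    dhdz = np.gradient(h, z)
    num = (2.0 / 3.0) * (1.0 + z) * dhdz / h - 1.0
    den = 1.0 - Om0 * (1.0 + z) ** 3 / h ** 2
    return num / den
```

Because w(z) involves a second derivative of the data (one differentiation inside h, another in h′), it is intrinsically noisier than h(z), which motivates the double-smoothing refinement of Section 4.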
3 RESULTS

In this section we show the results obtained when our smoothing scheme is applied to data expected from the SNAP experiment. The first issue we need to consider is that of the guess model. As mentioned earlier, the guess model in equation (2) is arbitrary. Using a guess model will naturally cause the results to be somewhat biased towards the guess model at low and high redshifts where there is a paucity of data. Therefore we use an iterative method to estimate the guess model from an initial guess.

Iterative process to obtain guess model

To estimate the guess model for our smoothing scheme, we use the following iterative method. We start with a simple cosmological model, such as ΛCDM, as our initial guess model – ln d_L^{g0} = ln d_L^{ΛCDM}. The result obtained from this analysis, ln d_L^1, is expected to be closer to the real model than the initial guess. We now use this result as our next guess model – ln d_L^{g1} = ln d_L^1 – and obtain the next result ln d_L^2. With each iteration, we expect the guess model to become more accurate, thus giving a result that is less and less biased towards the initial guess model used. A few points about the iterative method should be noted here.

• Using different models for the initial guess does not affect the final result provided the process is iterated several times. For example, if we use a w = −1/(1+z) 'metamorphosis' model to simulate the data and use either ΛCDM or the w = −0.5 quiessence model as our initial guess, the results for the two cases converge by ≳ 5 iterations.

• Using a very small value of ∆ will result in an accurate but noisy guess model; therefore, after a few iterations, the result will become too noisy to be of any use. We should therefore use a large ∆ for this process in order to obtain smoother results.

• The bias of the final result will decrease with each iteration, since with each iteration we get closer to the true model. The bias decreases non-linearly with the number of iterations M. Generally, after about 10 iterations, for moderate values of ∆, the bias is acceptably small. Beyond this, the bias still decreases with the number of iterations, but the decrease is negligible while the process takes more time and results in larger errors on the parameters.

• It is important to choose a value of ∆ which gives a small value of bias and also reasonably small errors on the derived cosmological parameters.

To estimate the value of ∆ in (2), we consider the following relation between the reconstructed results, the quality and quantity of the data, and the smoothing parameters. One can show that the relative error bars on H(z) scale as (Tegmark 2002)

$$\frac{\delta H}{H} \propto \frac{\sigma}{N^{1/2}\,\Delta^{3/2}}, \qquad (6)$$

where N is the total number of supernovae (for an approximately uniform distribution of supernovae over the redshift range) and σ is the noise of the data. From the above equation we see that a larger number of supernovae or a larger width of smoothing, ∆, will decrease the error bars on the reconstructed H, but as we shall show in Appendix A, the bias of the method is approximately related to ∆². This implies that, by increasing ∆, we will also increase the bias of the results. We attempt to estimate ∆ such that the error bars on H are of the same order as σ, which is a reasonable expectation. If we consider a single iteration of our method, then for N ≃ 2000 we get ∆_0 ≃ N^{−1/3} ≃ 0.08. However, with each iteration the errors on the parameters will increase.
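The iterative guess-model refinement described above can be sketched numerically as follows, reusing the `smooth_lndl` function from the sketch in Section 2. The ΛCDM initial guess, grid choices, and iteration count are illustrative assumptions.

```python
# Sketch: iterative guess-model refinement (requires smooth_lndl
# from the earlier sketch). All grids and choices are illustrative.
import numpy as np

def lcdm_lndl_factory(Om0=0.3, zmax=1.7, n=500):
    """Return a callable ln d_L(z) for flat LCDM, d_L in units of c/H0."""
    zg = np.linspace(0.0, zmax, n)
    hz = np.sqrt(Om0 * (1.0 + zg) ** 3 + 1.0 - Om0)
    # comoving distance D(z) by cumulative trapezoidal integration
    D = np.concatenate(([0.0], np.cumsum(
        0.5 * (1.0 / hz[1:] + 1.0 / hz[:-1]) * np.diff(zg))))
    return lambda z: np.log((1.0 + z) * np.interp(z, zg, D))

def iterate_guess(z_data, lndl_data, n_iter=10, delta=0.24):
    """Boot-strap the guess model, starting from LCDM."""
    z_grid = np.linspace(0.01, 1.7, 500)
    guess = lcdm_lndl_factory()
    for _ in range(n_iter):
        smoothed = smooth_lndl(z_grid, z_data, lndl_data, guess, delta)
        # the smoothed result becomes the next guess model
        guess = lambda z, s=smoothed: np.interp(z, z_grid, s)
    return guess
```

Stopping after about 10 iterations with ∆ ≃ 0.24 mirrors the trade-off discussed above: more iterations reduce the bias towards the initial guess but inflate the statistical errors.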
Figure 2. The smoothing scheme of equation (2) is used to determine the look-back time of the universe, T(z) = t(0) − t(z), from 1000 realizations of the SNAP dataset for a w = −1/(1+z) 'metamorphosis' model. The smoothing scale is ∆ = 0.24. The solid lines show the mean look-back time and the 1σ limits around it. The look-back time for the fiducial model matches exactly with the mean for the smoothing scheme. The dotted line shows the ΛCDM model. (The vertical axis is H_0 T; the horizontal axis is z.)

Therefore, using this value of ∆ when we use an iterative process to find the guess model will result in such large errors on the cosmological parameters as to render the reconstruction exercise meaningless. It shall be shown in Appendix A that at the M-th iteration the error on ln d_L grows approximately as √M. Therefore, if we wish to stop the boot-strapping after 10 iterations, then ∆_optimal ≃ 3∆_0 ≃ 0.24. This is the optimal value of ∆ we shall use for best results with our smoothing procedure.

Considering all these factors, we use a smoothing scale ∆ = 0.24 for the smoothing procedure of Eq. (2) with an iterative method for finding the guess model (with ΛCDM as the initial guess). The boot-strapping is stopped after 10 iterations. We will see that the results reconstructed using these parameters do not contain noticeable bias and the errors on the parameters are also satisfactory.

Figure 1 shows the reconstructed H(z) and w(z) with 1σ errors for the w = −1/(1+z) evolving model of dark energy. From this figure we can see that the Hubble parameter is reconstructed quite accurately and can successfully be used to differentiate the model from ΛCDM. The equation of state, however, is somewhat noisier. There is also a slight bias in the equation of state at low and high redshifts. Since the w = −1/(1+z) model has an equation of state which is very close to w = −1 at low redshifts, we see that w(z) cannot discriminate ΛCDM from the fiducial model at z ≲ 0.2 at the 1σ confidence level.

Age of the Universe

We may also use this smoothing scheme to calculate other cosmological parameters of interest, such as the age of the universe at a redshift z,

$$t(z) = H_0^{-1} \int_z^{\infty} \frac{dz'}{(1+z')\,h(z')}, \qquad h \equiv H/H_0, \qquad (7)$$

and the look-back time,

$$T(z) = t(0) - t(z) = H_0^{-1} \int_0^{z} \frac{dz'}{(1+z')\,h(z')}. \qquad (8)$$

Figure 2 shows the reconstructed T(z) with 1σ errors for the w = −1/(1+z) 'metamorphosis' model using the SNAP distribution. For this model the current age of the universe is about 13 Gyrs and the look-back time at z ≃ 1.7 is about 9 Gyrs for a Hubble parameter of H_0 = 70 km/s/Mpc. We see that the look-back time is reconstructed extremely well. Using this method we may predict this parameter with a high degree of success and distinguish between the fiducial look-back time and that for ΛCDM even at the 10σ confidence level. Indeed, any cosmological parameter which can be obtained by integrating the Hubble parameter will be reconstructed without problem, since integration involves a further smoothing of the results.

Looking at these results, we draw the conclusion that the method of smoothing supernova data can be expected to work quite well for future SNAP data as far as the Hubble parameter is concerned. Using this method, we may reconstruct the Hubble parameter and therefore the expansion history of the universe accurately. We find that the method is very efficient in reproducing H(z) to an accuracy of ≲ 2% within the redshift interval 0 < z < 1, and to ≲ 4% at z ≃ 1.7, as demonstrated in figure 1. Furthermore, using the Hubble parameter, one may expect to discriminate between different families of models such as the metamorphosis model w = −1/(1+z) and ΛCDM. This method also reproduces very accurately the look-back time for a given model, as seen in fig 2. It reconstructs the look-back time to an accuracy of ≲ 0.2% at z ≃ 1.7.
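A minimal sketch of the look-back time integral of Eq. (8), computed by cumulative trapezoidal integration of a reconstructed h(z) = H(z)/H0 on a grid; the names and grid are illustrative.

```python
# Sketch: dimensionless look-back time H0*T(z) from h(z), Eq. (8).
import numpy as np

def lookback_time(z, h):
    """Cumulative H0*T(z) = int_0^z dz' / [(1+z') h(z')]."""
    integrand = 1.0 / ((1.0 + z) * h)
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(z)
    return np.concatenate(([0.0], np.cumsum(steps)))
```

For H0 = 70 km/s/Mpc, one Hubble time is 1/H0 ≈ 14 Gyr, so multiplying the dimensionless H0·T by this factor gives the look-back time in Gyr.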
4 REDUCING NOISE THROUGH DOUBLE SMOOTHING

As we saw in the preceding section, the method of smoothing supernova data to extract information on cosmological parameters works very well if we employ the first derivative of the data to reconstruct the Hubble parameter. It also works reasonably for the second derivative, which is used to determine w(z), but the errors on w(z) are somewhat large. In this section, we examine a possible way in which the equation of state may be extracted from the data to give slightly better results. The noise in each parameter translates into larger noise levels on its successive derivatives. We have seen earlier that, using the smoothing scheme (2), one can obtain H(z) from the smoothed d_L(z) fairly successfully. However, small noises in H(z) propagate into larger noises in w(z). Therefore, it is logical to assume that if H(z) were smoother, the resultant w(z) might also have smaller errors. So, we attempt to smooth H(z) a second time after obtaining it from d_L(z).

Figure 3. The double smoothing scheme of equations (2) and (9) has been used to obtain H(z) and w(z) from 1000 realizations of the SNAP dataset. The smoothing scale is ∆ = 0.24. The dashed line in each panel represents the fiducial w = −1/(1+z) 'metamorphosis' model while the solid lines represent the mean and 1σ limits around it. The dotted line in both panels is ΛCDM. In the left panel H(z) for the fiducial model matches exactly with the mean for the smoothing scheme.

Figure 4. The double smoothing scheme of equations (2) and (9) has been used to obtain H(z) and w(z) from 1000 realizations of the SNAP dataset. The smoothing scale is ∆ = 0.24. The dashed line in each panel represents the fiducial ΛCDM model with w = −1 while the solid lines represent the mean and 1σ limits around it. In the left panel H(z) for the fiducial model matches exactly with the mean for the smoothing scheme.
(z ).Thus,errors on the Hubble parameter decrease slightly and errors on w (z )also become somewhat smaller.We now explore this scheme further for other models of dark energy.We first consider a w =−1ΛCDM model.In figure 4,we show the results for this model.We find that the Hubble parameter accurately reconstructed and even w is well reconstructed,with a little bias at high redshift.The next model we reconstruct is a w =−0.5quiessence model.The results for double smoothing are shown in fig 5.There is a little bias for this model at the low redshifts,although it is still well within the error bars.We note that in all three cases,a slight bias is notice-able at low or high redshifts.This is primarily due to edge effects–since at low (high)redshift,any particular point will have less (more)number of supernovae to the left than to the right.Even by estimating the guess model through an iterative process,it is difficult to completely get rid of this effect.In order to get rid of this effect,we would require to use much larger number of iterations for the guess model,but this would result in very large errors on the parameters.However,this bias is so small as to be negligible and cannot affect the results in any way.Looking at these three figures,we can draw the follow-ing conclusions.The Hubble parameter is quite well recon-structed by the method of double smoothing in all three cases while the errors on the equation of state also decrease.At low and high redshifts,a very slight bias persists.Despite this,the equation of state is reconstructed quite accurately.Also,since the average error in w (z )is somewhat less than that in the single smoothing scheme (figure 1),the equation of state may be used with better success in discriminating different models of dark energy using the double smoothing procedure.5THE w -PROBEIn this section we explore the possibility of extracting infor-mation about the equation of state from the reconstructed Hubble parameter by considering a weighted average of the equation of state,which we call the w -probe .An important advantage of this approach is that there is no need to go to the second derivative of the luminosity distance for in-formation on the equation of state.Instead,we consider the weighted average of the equation of state (Alam et al.2004)1+¯w =11+z,(10)which can be directly expressed in terms of the differ-ence in dark energy density ˜ρDE =ρDE /ρ0c (where ρ0c =3H 20/8πG )over a range of redshift as 1+¯w (z 1,z 2)=1δln(1+z )=1H 2(z 2)−Ω0m (1+z 2)3ln1+z 1。