Two Examples of Docking Algorithms
Senior-year basic algorithms: English reading comprehension, 30 questions (Part 1)

Background passage: Algorithm is a fundamental concept in computer science. It can be defined as a set of computational steps and rules for performing a specific task. In simple terms, an algorithm is like a recipe that tells a computer exactly what to do.

There are various types of algorithms. One of the most common types is the sorting algorithm. Sorting algorithms are used to arrange data in a particular order, such as ascending or descending order. For example, the bubble sort algorithm repeatedly steps through the list to be sorted, compares adjacent elements and swaps them if they are in the wrong order. Another important type is the search algorithm. Search algorithms are designed to find a particular item within a data set. Binary search, for instance, is a very efficient search algorithm when dealing with sorted data.

The importance of algorithms in computer science cannot be overstated. They are the building blocks of all software applications. Whether it is a simple calculator program or a complex operating system, algorithms are at work behind the scenes. Good algorithms can significantly improve the performance and efficiency of a program. For example, a well-designed sorting algorithm can reduce the time it takes to sort a large amount of data from hours to seconds.

The development of algorithms has a long history. In the early days of computing, algorithms were relatively simple due to the limited computing power available. As technology advanced, more complex and sophisticated algorithms were developed. For example, in the field of artificial intelligence, new algorithms are constantly being developed to enable machines to learn and make decisions like humans.

Question 1: What is an algorithm according to the passage?
A. A set of random numbers.
B. A set of computational steps and rules for a specific task.
C. A type of computer hardware.
D. A programming language.
Answer: B.
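The bubble sort that the passage describes is easy to make concrete. The short C sketch below is not part of the original exercise (the function and array names are illustrative); it implements exactly the behaviour described: repeatedly sweep the list, compare adjacent elements, and swap them when they are out of order.

#include <stdio.h>

/* Bubble sort as described in the passage: repeatedly sweep the list,
   comparing adjacent elements and swapping them when out of order. */
static void bubble_sort(int *a, int n) {
    for (int pass = 0; pass < n - 1; pass++) {
        for (int i = 0; i + 1 < n - pass; i++) {
            if (a[i] > a[i + 1]) {          /* wrong order: swap */
                int tmp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = tmp;
            }
        }
    }
}

int main(void) {
    int data[] = {5, 2, 9, 1, 7};
    bubble_sort(data, 5);
    for (int i = 0; i < 5; i++)
        printf("%d ", data[i]);             /* prints: 1 2 5 7 9 */
    printf("\n");
    return 0;
}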
An Efficient Approximation Scheme for Data Mining Tasks

George Kollios, Boston University, gkollios@
Dimitrios Gunopulos, UC Riverside, dg@
Nick Koudas, AT&T Laboratories, koudas@
Stefan Berchtold, stb gmbh, berchtold@stb-gmbh.de

This research has been supported by NSF CAREER Award 9984729 and NSF IIS-9907477.

Abstract

We investigate the use of biased sampling according to the density of the dataset, to speed up the operation of general data mining tasks, such as clustering and outlier detection in large multidimensional datasets. In density-biased sampling, the probability that a given point will be included in the sample depends on the local density of the dataset. We propose a general technique for density-biased sampling that can factor in user requirements to sample for properties of interest, and can be tuned for specific data mining tasks. This allows great flexibility and improved accuracy of the results over simple random sampling. We describe our approach in detail, we analytically evaluate it, and show how it can be optimized for approximate clustering and outlier detection. Finally we present a thorough experimental evaluation of the proposed method, applying density-biased sampling on real and synthetic data sets, and employing clustering and outlier detection algorithms, thus highlighting the utility of our approach.

1 Introduction

Clustering and detecting outliers in large multidimensional data sets are important data analysis tasks. A variety of algorithmic techniques have been proposed for their solution. The inherent complexity of both problems however is high, and the application of known techniques on large data sets is time and resource consuming. Many data intensive organizations, such as AT&T, collect lots of different data sets of cumulative size in the order of hundreds of gigabytes over specific time frames. Application of known techniques to analyze the data collected even over a single time frame would require a large resource commitment and would be time consuming. Additionally, from a data analysis point of view not all data sets are of equal importance. Some data sets might contain clusters or outliers, others might not. To date, one has no way of knowing that, unless one resorts to executing the appropriate algorithms on the data. A powerful paradigm to improve the efficiency of a given data analysis task is to reduce the size of the dataset (often called Data Reduction [3]), while keeping the accuracy loss as small as possible. The difficulty lies in designing the appropriate approximation technique for the specific task and datasets.

In this paper, we propose to use biased sampling as a data reduction technique to efficiently provide approximate solutions to data mining operations such as cluster and outlier detection on large data set collections. Biased sampling is a generalization of uniform random sampling, which has been used extensively to speed up the execution of data mining tasks. In uniform random sampling, every point has the same probability to be included in the sample [26]. In contrast, in biased sampling, the probability that a data point is added to the sample is different from point to point. In our case, this probability depends on the value of prespecified data characteristics and the specific analysis requirements. Once we derive a sampling probability distribution that takes into account the data characteristics and the analysis objectives, we can extract a biased sample from the dataset and apply standard algorithmic techniques on the sample. This allows the user to find approximate solutions
to different analysis tasks, and gives a quick way to decide if the dataset is worthy of further exploration.

Our main observation is that the density of the dataset contains enough information to derive the desired sampling probability distribution. To identify the clusters when the level of noise is high, or to identify and collect reliable statistics for the large clusters, we can oversample the dense regions. To identify all possible clusters, we can oversample the sparser regions. It now becomes much more likely to identify sparser or smaller clusters. On the other hand, we can make sure that denser regions in the original dataset continue to be denser in the sample, so that we do not lose any of the dense clusters. To identify outliers we can sample the very sparse regions.

We propose a new, flexible, and tunable technique to perform density-biased sampling. In this technique we calculate, for each point, the density of the local space around the point, and we put the point into the sample with probability that is a function of this local density. For succinct (but approximate) representation of the density of a dataset, we choose kernel density estimation methods. We apply our technique to the problems of cluster and outlier detection, and report our experimental results. Our approach has the following benefits: (1) Flexibility: the probability that a point is included in the sample can be optimized for different data mining tasks. (2) Effectiveness: density-biased sampling can be significantly more effective than random sampling as a data reduction technique. (3) Efficiency: it makes use of efficient density estimation techniques (requiring one dataset pass), and requires one or two additional passes to collect the biased sample.

In Section 2 we analytically evaluate the benefits of the proposed approach over random sampling and present our approach. In Section 3 we discuss the issues involved when combining our sampling technique with clustering and outlier detection algorithms in order to speed up the corresponding tasks. In Section 4 we present the results of a detailed experimental evaluation using real and synthetic datasets, analyzing the benefits of our approach for various parameters of interest.

1.1 Related Work

Clustering and outlier detection are very important and widely used techniques, with a large variety of techniques proposed for their solution. However, the approach closest to ours is the technique of Palmer et al. [22]. They developed an algorithm to sample for clusters by using density information, under the assumption that clusters have a Zipfian distribution. Their technique is designed to find clusters when they differ a lot in size and density. The technique undersamples dense areas and oversamples sparse areas. Their approach tightly couples the extraction of density information and the actual sampling. It uses a hash based approach, and the quality of the sample degrades with collisions implicit to any hash based approach. The approach presented herein decouples density estimation and biased sampling.
Our use of kernels for density estimation is orthogonal to our framework, which is independent of any assumptions about cluster distribution. Any density estimation technique can be used instead, and we suggest a hash-free approach to eliminate the impact of collisions on density estimation. In addition, our approach is tunable for different cases, such as when there is a lot of noise, or when we are looking for outliers. We present an experimental comparison between the techniques in Section 4.

2 A Flexible and Efficient Technique for Density Biased Sampling

Sampling [27, 29] is a well recognized and widely used statistical technique. In the context of clustering in databases, both partitional [20] and hierarchical [8] clustering techniques employ uniform random sampling to improve the efficiency of the techniques. Toivonen [28] examined the problem of using sampling during the discovery of association rules. Sampling has also been recently successfully applied in query optimization [6, 5], as well as approximate query answering [7, 1].

The focus of this paper is the design of alternative forms of sampling that can be used to expedite data mining tasks such as cluster and outlier detection. A natural question arises regarding the potential benefits of such sampling techniques. In other words, why is uniform random sampling not sufficient for such a purpose? Guha et al. [8] presented a theorem linking the sample size with the probability that a fraction of the cluster is included in the sample. Let D be a data set of size N, and C a cluster of size |C|. Guha et al. consider a cluster to be included in the sample when more than f|C| points from the cluster are in the sample, for 0 <= f <= 1. Their theorem gives the sample size required by uniform random sampling to guarantee that C is included in the sample with failure probability less than a given delta. For example, in order to guarantee with probability 90% that a fraction of a cluster with 1000 points is in the sample, we need to sample 25% of the dataset. Clearly, one must be willing to include a very large fraction of the dataset in the sample in order to provide good guarantees with uniform random sampling. This requirement, however, defies the motivation for sampling.

Instead of employing uniform sampling, let us assume that we sample according to the following rule. Let the points in the dataset be given, and let C be a cluster.
Assume that we had the ability to bias our sampling process so that points of the cluster C are sampled at a higher rate than the rest of the dataset. We can state the following theorem (the proof appears in [15]):

Theorem 1. Let D be a data set of size N, and C a cluster of size |C|. A cluster is included in the sample when more than f|C| points from the cluster are in the sample, for 0 <= f <= 1. Let s_b be the sample size required by the biased sampling rule above to guarantee that C is included in the sample with failure probability less than delta, and let s_u be the sample size required by uniform random sampling to provide the same guarantee. Then s_b is smaller than s_u.

Theorem 1 provides a means to overcome the limitations of uniform random sampling, since we can assure, with the same probability, that the same fraction of the cluster is in the sample by taking a sample of smaller size. The serious limitation, however, is that we have no clear way to bias our sampling towards points in the clusters, since we do not know these points a priori. We observe that knowledge of the probability density function of the underlying dataset can provide enough information to locate those points. Clusters are located in dense areas of the dataset and outliers in sparse areas. By adapting our sampling according to the density of the dataset, we can bias our sampling probabilities towards dense areas in the case of clusters, and sparse areas in the case of outliers. All that is needed is a mapping function that maps density to sampling probability, in a way that depends on the objective of our sampling process. For the case of clusters, this function can be as simple as sampling according to the probability density function of the underlying data set. For outliers, it can be as simple as sampling according to the inverse of this function. Since the sample is biased towards specific areas, it has the potential, due to Theorem 1, to require smaller sample sizes to give certain guarantees.

2.1 Density Estimation Techniques

An efficient and accurate technique for estimating the density of a dataset is paramount for the success of our method. There is a number of methods suggested in the literature employing different ways to find the density estimator for multi-dimensional datasets. These include computing multi-dimensional histograms [23][6][16][2]; using various transforms, like the wavelet transformation [30][19], SVD [23] or the discrete cosine transform [17], on the data; using kernel estimators [4]; as well as sampling [21][18][10].

Although our biased-sampling technique can use any density estimation method, kernel density estimators provide the best solution because they are accurate and can be computed efficiently. Work on query approximation [9] and in the statistics literature [25], [24] shows that kernel functions always estimate the density of the dataset more accurately than using a random sample of the dataset. Finding a kernel density estimator can be done in one dataset pass. The technique is presented in detail in [9].
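To make the kernel-density step concrete, the sketch below evaluates a density estimate over a set of kernel centers using a product Epanechnikov kernel (the kernel named later in Section 4.2). The per-dimension bandwidths and the normalization are illustrative assumptions for this sketch, not the exact estimator of [9].

/* Kernel density estimate over m kernel centers in d dimensions, using a
   product Epanechnikov kernel. The centers would typically be a uniform
   sample of the dataset; bandwidth choice (h) is left to the caller. */
double kde_estimate(const double *q,        /* query point, length d */
                    const double *centers,  /* m x d kernel centers, row major */
                    const double *h,        /* per-dimension bandwidths */
                    int m, int d)
{
    double sum = 0.0;
    for (int j = 0; j < m; j++) {
        double k = 1.0;
        for (int t = 0; t < d; t++) {
            double u = (q[t] - centers[j * d + t]) / h[t];
            if (u <= -1.0 || u >= 1.0) { k = 0.0; break; }
            k *= 0.75 * (1.0 - u * u) / h[t];   /* 1-D Epanechnikov kernel */
        }
        sum += k;
    }
    return sum / m;   /* estimate integrates to 1 over the data space */
}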
2.2 The Proposed Biased Sampling Technique

In this section we outline our technique for density-biased sampling. Let D be a d-dimensional dataset with n points. For simplicity we assume that the space domain is the unit hypercube; otherwise we can scale the attributes. Let f be a density estimator for the dataset; that is, for a given region R, the integral of f over R is approximately equal to the number of points in R. We want to produce a biased sample of D with the following properties:

Property 1. The probability that a point in D will be included in the sample is a function of the local density of the data space around the point.

Property 2. The expected size of the sample should be M, where M is a parameter set by the user.

Let's consider the case of uniform sampling first. A point in D will be in the sample with probability M/n. By the definition of f, if the value of f at a point is larger than the average value of f over the data space, then the local density around that point is larger than the average density of the space.

In density biased sampling, we want the probability of a given point to be included in the sample to be a function of the local density of the space around the point. This is given by the value of f at the point. In addition we want the technique to be tunable, so we introduce the function f(x)^a for some exponent a. The basic idea of the sampling technique is to sample a point x with probability that is proportional to f(x)^a. The value of a controls the biased sampling process. Let b be the sum of f(x)^a over all points x in D. We define p(x) = M f(x)^a / b. Thus p has the additional property that the expected sample size, the sum of p(x) over all points, equals M. Having defined p thus, we include each point x in the sample with probability p(x).

Given a dataset D, a density estimator f for D, a value of a, and a sample size M set by the user:
1. Compute in one pass the value of b, where b is the sum of f(x)^a over all points x in D.
2. Add each point x to the sample with probability M f(x)^a / b.

Figure 1. The Proposed Biased Sampling Technique.

Our technique for density-biased sampling is shown in Figure 1. It can be shown that this algorithm satisfies Properties 1 and 2; the proof of this and the following lemmas appears in [15]. One of the main advantages of the proposed technique is that it is very flexible, and can be tuned for different tasks. Using different values of a we can oversample either the dense regions (for a > 0) or the sparse regions (for a < 0). We consider the following cases:

1. a = 0: each point is included in the sample with probability M/n, and therefore we have uniform sampling.

2. a > 0: in this case p(x) increases with the local density f(x). It follows that the regions of higher density are sampled at a higher rate than regions of lower density. In particular, regions with density that is higher than the average density in the data space are sampled at a rate that is higher than the uniform sampling rate, and regions where the density is lower than the data space average are sampled at a rate that is lower than the uniform sampling rate.

3. a < 0: in this case the opposite is true. Since p(x) now decreases as f(x) increases, the regions with lower density than average are sampled at a higher rate than uniform, and the regions of higher density are sampled at a lower rate than uniform.

4. a = -1: there is the same expected number of sample points in any two regions of the same volume.

The sampling technique we described has an additional fundamental property, namely it preserves, in the sample, the relative densities of different regions in D. If two regions in D are dense, they should appear dense in the sample too. Moreover, if one has higher density than the other, then the same should be true in the sample. We state and prove the following lemma:

Lemma 1. If a > -1, relative densities are preserved with high probability. That is, if a region has higher density than another region in the data set, then, with high probability (depending on the size of the regions), the same is true for the sample.

Lemma 1 allows us to use density-biased sampling to discover small clusters that are dominated by very large and very dense clusters. In the case of a > 0, dense areas are oversampled, but areas of lower density are sampled enough that their density relative to the sampled dense areas is preserved. By setting -1 < a < 0, we oversample areas that are less dense, but still sample the very dense areas enough so that they remain dense in the sample.

In our presentation of the sampling technique we chose, for simplicity, to separate the computation of the biased sampling probability from the actual sampling step. It is possible to integrate both steps into one, thus deriving the biased sample in a single pass over the database. In this case, however, we only compute an approximation of the sampling probability. The details of this integration are presented in [15].
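A compact sketch of the two-pass procedure of Figure 1 is given below. The normalization p(x) = M f(x)^a / b follows the reconstruction above; the function names, the in-memory point array, and the use of rand() are illustrative assumptions rather than the authors' implementation.

#include <math.h>
#include <stdlib.h>

/* Two-pass density-biased sampling: pass 1 accumulates b = sum_i f(x_i)^a,
   pass 2 keeps point i with probability M * f(x_i)^a / b. 'density' is any
   estimator of the local density, e.g. the kernel estimate sketched above. */
int biased_sample(const double *points, int n, int d,
                  double (*density)(const double *x, int d),
                  double a, int M,
                  int *keep /* out: 0/1 flags, length n */)
{
    double b = 0.0;
    for (int i = 0; i < n; i++)                     /* pass 1 */
        b += pow(density(&points[i * d], d), a);

    int sampled = 0;
    for (int i = 0; i < n; i++) {                   /* pass 2 */
        double p = M * pow(density(&points[i * d], d), a) / b;
        if (p > 1.0) p = 1.0;                       /* clamp; expected size stays <= M */
        keep[i] = (rand() < p * RAND_MAX);
        sampled += keep[i];
    }
    return sampled;   /* actual sample size; its expectation is about M */
}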
3 Using Biased Sampling for Cluster and Outlier Detection

We describe how density biased sampling can be integrated with cluster and outlier detection algorithms. Our emphasis is on evaluating how biased sampling can be used in conjunction with "off the shelf" clustering and outlier detection algorithms to provide accurate approximate solutions to these problems.

3.1 Clustering

We run clustering algorithms on the density-biased sample to evaluate the accuracy of the proposed technique. To cluster the sample points we use a hierarchical clustering algorithm. The algorithm starts with each point in a separate cluster and at each step merges the closest clusters together. The technique is similar to CURE [8] in that it keeps a set of points as a representative of each cluster of points.

The hierarchical clustering approach has several advantages over k-means or k-medoids algorithms. It has been shown experimentally that it can be used to discover clusters of arbitrary shape, unlike k-means or k-medoids techniques, which typically perform best when the clusters are spherical [8]. It has also been shown that a hierarchical clustering approach is insensitive to the size of the cluster. On the other hand, k-means or k-medoids algorithms tend to split up very large clusters, and merge the pieces with smaller clusters. The main disadvantage of the hierarchical approach is that the run time complexity is quadratic. Running a quadratic algorithm even on a few thousand data points can be problematic. A viable approach to overcome this difficulty is to reduce the size of the dataset in a manner that preserves the information of interest to the analysis without harming the overall technique.

A hierarchical clustering algorithm that is running off the biased sample is in effect an approximation algorithm for the clustering problem on the original dataset. In this respect, our technique is analogous to the new results on approximation clustering algorithms [12][11], since these algorithms also run on a (uniform random) sample to efficiently obtain the approximate
clusterings. The techniques are not directly comparable, however, because these algorithms approximate the optimal k-medoids partitioning, which can be very different from the hierarchical clusterings we find.

We note here that we can also use biased sampling in conjunction with k-means or k-medoids algorithms if this is desirable. However, k-means or k-medoids algorithms optimize a criterion that puts equal weight on each point in the dataset. For example, for the k-means clustering algorithm, assuming k clusters, the optimization criterion is minimization of the sum, over all clusters i and over all points assigned to cluster i, of the distance between each point and the cluster center c_i, where c_i for the k-means algorithm is the mean of cluster i, and for the k-medoids algorithm is the medoid of cluster i. To use density biased sampling in this case, we have to weight the sample points with the inverse of the probability that each was sampled.

3.2 Distance Based Outliers

Distance based outliers were introduced by Knorr and Ng [13, 14].

Definition 1 [13]. An object O in a dataset T is a (k, D)-outlier if at most k objects in T lie at distance at most D from O.

Knorr and Ng [13] show that this definition is compatible with empirical definitions of what an outlier is, and it also generalizes statistical definitions of outliers. Note that in the definition of a (k, D)-outlier, the values of k and D are parameters set by the user. The number of objects that can be at distance at most D can also be specified as a fraction of the dataset size. The technique we present has the additional advantage that it can estimate the number of (k, D)-outliers in a dataset very efficiently, in one dataset pass. This gives the opportunity for experimental exploration of k and D, in order to derive sufficient values for the specific dataset and the task at hand.

To facilitate the description of our algorithm, we also define the function nn_D(O) to be the number of points that lie at distance at most D from the point O in the dataset.

The basic idea of the algorithm is to sample the regions in which the data point density is very low. Since we are dealing with datasets with numerical attributes, we assume, in the description of the algorithm, that the distance function between points is the Euclidean distance. However, different distance metrics (for example the Manhattan metric) can be used equally well.

Let T be a dataset with N points and d numerical dimensions. Let f be a density estimator of the dataset. Given input parameters D (the maximum distance of neighboring points) and k (the maximum number of dataset points that can be at distance at most D away), we compute, for each point, the expected number of points in a ball with radius D that is centered at the point. Thus we estimate the value of nn_D(O) by integrating f over Ball(O, D). We keep the points that have a smaller expected number of neighbors than k; these are the likely outliers. Then we make another pass over the data and verify the number of neighbors for each of the likely outliers. For more details about this algorithm we refer to [15].
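The candidate-filtering pass described above can be sketched as follows. Approximating the integral of f over Ball(O, D) by f(O) times the volume of the ball is a deliberate simplification for illustration (the paper integrates the density estimator over the ball), all names are illustrative, and f is assumed to be scaled so that it integrates to the dataset size.

#include <math.h>

/* First pass of the outlier filter: estimate, for each point, the expected
   number of neighbors within distance D and keep those with an estimate
   below k as likely outliers; a second pass over the data would then verify
   the true neighbor counts. */
static double ball_volume(int d, double D) {
    const double PI = 3.14159265358979323846;
    /* volume of a d-dimensional ball of radius D: pi^(d/2) D^d / Gamma(d/2 + 1) */
    return pow(PI, d / 2.0) * pow(D, d) / tgamma(d / 2.0 + 1.0);
}

int flag_likely_outliers(const double *points, int n, int d,
                         double (*density)(const double *x, int d),
                         double D, double k,
                         int *candidate /* out: 0/1 flags, length n */)
{
    double vol = ball_volume(d, D);
    int count = 0;
    for (int i = 0; i < n; i++) {
        /* crude estimate of nn_D: local density times the ball volume */
        double expected_neighbors = density(&points[i * d], d) * vol;
        candidate[i] = (expected_neighbors < k);
        count += candidate[i];
    }
    return count;   /* number of likely outliers to verify in a second pass */
}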
4 Experiments

We present an experimental evaluation of our approach. Due to space constraints, we present here only a small subset of our evaluation and we refer to [15] for more results. We run clustering and outlier detection tasks on both synthetic and real datasets, and we intend to explore the following issues: (i) determine whether biased sampling indeed offers advantages over uniform random sampling for cluster and outlier detection, and quantify this advantage; (ii) determine the accuracy of the approximate solution that we obtain if we run cluster and outlier detection algorithms on only the biased sample, as opposed to running on the entire dataset.

We use the same algorithm for clustering the density-biased sample and the uniform random sample in order to make the comparison objective. We used a hierarchical clustering algorithm based on CURE [8], but not the original implementation by the authors. In this algorithm each cluster is represented by a set of points that have been carefully selected in order to represent the shape of the cluster (well scattered points). Therefore, when the hierarchical algorithm is running on the uniform random sample, the clusters it produces are the ones CURE would have found. We compare these results with the ones obtained when the hierarchical algorithm is run on a density-biased sample (with different values of a). We also include the BIRCH [31] clustering algorithm in the comparison, since it is generally considered one of the best clustering methods proposed so far for very large databases. We allow BIRCH to use as much space as the size of the sample to keep the CF-tree (Section 2) which describes the dataset. Although we constrain the size of the CF-tree, we allow BIRCH to work on the entire dataset for clustering; that is, we do not run BIRCH on our samples. By comparing with BIRCH, we intend to compare the effectiveness of our summarization technique for identifying clusters to that of BIRCH.

4.1 Description of Datasets

Synthetic Datasets. We used a synthetic data generator to produce two to five dimensional datasets. The number of clusters is a parameter, varied from 10 to 100 in our experiments. Due to space constraints we report only the results for 10 clusters. The rest of our results, with datasets containing a larger number of clusters, are qualitatively similar. Each cluster is defined as a hyper-rectangle, and the points in the interior of the cluster are uniformly distributed. The clusters can have non-spherical shapes, different sizes (number of points in each cluster) and different average densities.
We also generate and use in our experiments datasets with noise, as follows. Starting from a synthetically generated clustered dataset, we add uniformly distributed points as noise and we say that the dataset contains that fraction of noise. We vary the noise fraction from 5% to 80% in our experiments. In addition we used synthetic datasets from [8] (see Figure 3).

Geospatial Datasets. We also used real life datasets from spatial databases. The NorthEastern dataset represents 130000 postal addresses in the North East States. The California dataset represents postal addresses in California and has a size of 62553 points. Both datasets contain two dimensional points. Another dataset is the Forest Cover dataset (59000 points) from the UCI KDD archive (available from /summary.data.type.html). This was obtained from the US Forest Service (USFS).

4.2 Implementation Details and Parameter Setting

For the density approximation function, we use the kernel density estimator technique. In our experiments we use 1000 sample points to initialize the kernel centers and we use the Epanechnikov kernel function. We performed extensive experiments [15] evaluating the impact of the number of kernels on our approach. The value of 1000 gave very good results across a wide range of datasets with different statistical characteristics and sizes, and we propose the use of this value in practice.

We extract two types of samples:

Uniform sampling: We extract a uniform random sample of size M by first reading the size, n, of the dataset and then sequentially scanning the dataset and choosing each point with probability M/n. Thus, the expected size of the sample is M.

Biased sampling: We implemented the algorithm described in Section 2.2. There are three parameters affecting this algorithm: (a) the number of sample points that we use for the kernel estimation, (b) the number of sample points M, and (c) the parameter a, which controls the rate of sampling for areas with different densities. We experiment with these parameters in the sequel.

We also compare our technique with the technique of [22] (the code is available at /People/crpalmer/dbs-current.tar.gz). Since this technique is designed specifically to identify clusters of different sizes and densities, by oversampling sparse regions and undersampling dense regions, we run the comparison for this case only.

We employ the following clustering algorithms:

BIRCH: We used an implementation of the BIRCH algorithm, provided to us by the authors [31]. We set the parameter values as suggested in [31]. Thus, we set the page size to 1024 bytes, the initial threshold to 0 and the input range to 1000.

Hierarchical clustering algorithm: We follow the suggestions of the experimental study in [8] and set the parameters accordingly. The shrink factor is set to the value suggested in [8], we use one partition, and the default value for the representative points is 10.

4.3 Clustering Experiments

Running time experiments. In the first set of experiments we investigated the effect of the number of kernels and the size of the dataset on the running time of our algorithm. Not surprisingly, our algorithm scales linearly with the number of kernels and the size of the datasets. For more results we refer to [15].

Figure 2. Running time of the clustering algorithm (time in seconds versus number of samples, for BS-CURE and RS-CURE).

In Figure 2 we plot the total running time of the clustering algorithm for biased and random sampling. We used a dataset with 1 million points and 1000 kernels. Note that the running time of the hierarchical algorithm is quadratic.
Also note that the time we spend building the density estimator and doing another pass for the actual sampling is more than offset by the time we save, because we can run a quadratic algorithm on a smaller sample. For example, the cost of the clustering step dominates already when we run the algorithm on a 1% uniform random sample, and the same holds when we run the clustering algorithm on a biased sample of comparable size. For one million points, our method with a small biased sample is faster than the hierarchical algorithm with a uniform random sample of the corresponding size.

Biased Sampling vs Uniform Random Sampling. In this experiment we explore the effects of using density-biased sampling as opposed to uniform random sampling in order to identify clusters. We used a dataset (dataset 1) used in [8]. This dataset has 5 clusters with different shapes and densities. We draw a uniform random sample as well as a biased sample, both of size 1000, and we run the hierarchical clustering method on both samples.
Hubei University Undergraduate Thesis (Design)
Title: Shortest Path Algorithms and Their Applications
Name / Student ID / Major and Year / Advisor and Title
April 20, 2011

Contents
Introduction
1 Basic Concepts of Graphs
1.1 Definitions Related to Graphs
1.2 Storage Structures for Graphs
1.2.1 The Adjacency Matrix Representation
1.2.2 Conclusions about the Adjacency Matrix
2 The Shortest Path Problem
2.1 Shortest Paths
2.2 Shortest Path Algorithms
2.2.1 Dijkstra's Algorithm
2.2.2 Floyd's Algorithm
3 Application Examples
3.1 Application of Dijkstra's Algorithm to a Public Transit Network
3.1.1 Description of the Practical Problem
3.1.2 Building the Mathematical Model
3.1.3 Abstracting the Practical Problem
3.1.4 Applying the Algorithm
3.2 Application of Floyd's Algorithm to Locating a Logistics Center
3.2.1 Problem Description and Mathematical Modeling
3.2.2 Abstracting the Practical Problem
3.2.3 Applying the Algorithm
References
Appendix

Shortest Path Algorithms and Their Applications

Abstract: Research on shortest path algorithms is a popular topic in computer science; it has important theoretical significance as well as great practical value. The shortest path problem has wide applications, for example in transportation systems, emergency rescue systems, electronic navigation systems and other fields. It can also be extended to the fastest path problem, the minimum cost problem and so on, but the core algorithm in each case is a shortest path algorithm. The classical shortest path algorithms, Dijkstra's and Floyd's, are the theoretical foundation currently used for shortest path problems. This thesis mainly presents and analyzes the Dijkstra and Floyd algorithms, and then applies these two algorithms to solve two simple practical problems.

Keywords: shortest path, Dijkstra's algorithm, Floyd's algorithm, graph theory

Shortest Path Algorithms and Their Applications

Abstract (English version): Research on the shortest path is a hot topic in computer science. It has both important theoretical significance and practical value. The shortest path problem has a broad range of applications, such as transport systems, rescue systems, electronic navigation systems and so on. It can be extended to the fastest path problem and the minimum cost problem, but their core algorithms are all shortest path algorithms. The classical shortest path algorithms, Dijkstra and Floyd, are the theoretical basis for solving shortest path problems. This article mainly presents and analyzes the Dijkstra and Floyd algorithms, and then uses them to solve two simple practical problems.

Keywords: shortest path, Dijkstra algorithm, Floyd algorithm, graph

Introduction

With the arrival of the knowledge economy, information is becoming a source of wealth for human society, and the rapid development and wide application of network technology have driven demand for information technology throughout society. The shortest path problem, as the foundation of optimal-choice problems in many fields, occupies an important position in electronic navigation, transportation and tourism, urban planning, and the layout design of various networks and pipelines such as power grids and communication systems.
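Since the thesis builds on the standard formulations of these algorithms, a compact sketch of Dijkstra's algorithm over an adjacency matrix (the storage structure introduced in Chapter 1) is included here for reference; it is illustrative code, not the program in the thesis appendix.

#include <limits.h>

#define NV  6               /* number of vertices (illustrative) */
#define INF INT_MAX         /* "no edge" marker in the adjacency matrix */

/* Dijkstra's algorithm on an adjacency matrix: dist[v] ends up holding the
   length of the shortest path from src to v. O(NV^2), which matches the
   adjacency-matrix storage discussed in Chapter 1. */
void dijkstra(const int g[NV][NV], int src, int dist[NV]) {
    int done[NV] = {0};
    for (int v = 0; v < NV; v++)
        dist[v] = (v == src) ? 0 : INF;

    for (int iter = 0; iter < NV; iter++) {
        int u = -1;
        for (int v = 0; v < NV; v++)        /* pick the closest unfinished vertex */
            if (!done[v] && dist[v] != INF && (u == -1 || dist[v] < dist[u]))
                u = v;
        if (u == -1) break;                  /* remaining vertices are unreachable */
        done[u] = 1;
        for (int v = 0; v < NV; v++)         /* relax edges leaving u */
            if (!done[v] && g[u][v] != INF && dist[u] + g[u][v] < dist[v])
                dist[v] = dist[u] + g[u][v];
    }
}

Floyd's algorithm, used in Section 3.2, is the all-pairs analogue: a triple loop that relaxes dist[i][j] through every intermediate vertex k.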
Contents
Chapter 2 Linear Programming (with Solutions)
Chapter 3 Simplex (with Solutions)
Chapter 4 Sensitivity Analysis and Duality (with Solutions)
Chapter 5 Network (with Solutions)
Chapter 6 Integer Programming (with Solutions)
Chapter 7 Nonlinear Programming (with Solutions)
Chapter 8 Decision Making under Uncertainty (with Solutions)
Chapter 9 Game Theory (with Solutions)
Chapter 10 Markov Chains (with Solutions)
Chapter 11 Deterministic Dynamic Programming (with Solutions)
Expanded Projects

Chapter 2 Linear Programming

1. A firm manufactures chicken feed by mixing three different ingredients. Each ingredient contains three key nutrients: protein, fat and vitamins. The amount of each nutrient contained in 1 kilogram of each ingredient is summarized in the following table:

Ingredient   Protein (grams)   Fat (grams)   Vitamins (units)
1            25                11            235
2            45                10            160
3            32                7             190

The costs per kilogram of Ingredients 1, 2, and 3 are $0.55, $0.42 and $0.38, respectively. Each kilogram of the feed must contain at least 35 grams of protein, a minimum of 8 grams of fat and a maximum of 10 grams of fat, and at least 200 units of vitamins. Formulate a linear programming model for finding the feed mix that has the minimum cost per kilogram.

2. For a supermarket, the following clerks are required:

Day    Min. number of clerks
Mon    20
Tue    16
Wed    13
Thu    16
Fri    19
Sat    14
Sun    12

Each clerk works 5 consecutive days per week and may start working on Monday, Wednesday or Friday. The objective is to find the smallest number of clerks required to comply with the above requirements. Formulate the problem as a linear programming model.

3. Consider the following LP problem:

Max Z = 6x1 + 8x2
subject to
x1 + 4x2 <= 16
3x1 + 4x2 <= 24
3x1 - 4x2 <= 12
x1, x2 >= 0

(a) Sketch the feasible region.
(b) Find two alternative optimal extreme (corner) points.
(c) Find an infinite set of optimal solutions.

4. A power plant has three thermal generators. The generators' generation costs are $36/MW, $30/MW, and $25/MW, respectively. The output limits for the generators are shown in the table:

Generator   Min output (MW)   Max output (MW)
1           50                200
2           50                150
3           50                150

At a certain moment, the power demand for this plant is 360 MW. Set up an LP optimization model and find the optimal output for each generator (with the lowest operation cost).

5. Use the graphical solution to find the optimal solution to the following LP:

Max z = -4x1 + x2
subject to
3x1 + x2 <= 6
-x1 + 2x2 <= 0
x1, x2 >= 0

Solution:

1. Let x1, x2, x3 be the amounts of Ingredients 1, 2 and 3 mixed into 1 kilogram of chicken feed. The LP model is:

Min Z = 0.55x1 + 0.42x2 + 0.38x3
subject to
25x1 + 45x2 + 32x3 >= 35
11x1 + 10x2 + 7x3 >= 8
11x1 + 10x2 + 7x3 <= 10
235x1 + 160x2 + 190x3 >= 200
x1 + x2 + x3 = 1
x1, x2, x3 >= 0

2. Let x1 = number of clerks who start working on Monday, x2 = number who start on Wednesday, x3 = number who start on Friday. The LP model is:

Min Z = x1 + x2 + x3
subject to
x1 + x3 >= 20   (Mon)
x1 + x3 >= 16   (Tue)
x1 + x2 >= 13   (Wed)
x1 + x2 >= 16   (Thu)
x1 + x2 + x3 >= 19   (Fri)
x2 + x3 >= 14   (Sat)
x2 + x3 >= 12   (Sun)
x1, x2, x3 >= 0

3. (a) The sketch of the feasible region is omitted here.
(b) The two alternative optimal extreme points are (4, 3) and (6, 3/2).
(c) An infinite set of optimal solutions: {λ(4, 3) + (1 - λ)(6, 3/2) : 0 <= λ <= 1}.

4. Model:

Min z = 36x1 + 30x2 + 25x3
subject to
x1 + x2 + x3 = 360
50 <= x1 <= 200
50 <= x2 <= 150
50 <= x3 <= 150

Solution: x1 = 60 (MW), x2 = 150 (MW), x3 = 150 (MW).

5. According to the figure, the solution is x1 = 0, x2 = 0.

Chapter 3 Simplex

1.
Show that if ties are broken in favor of lower-numbered rows, then cyclingoccurs when the simplex method is used to solve the following LP: 123123412341234369920/32/3099210(1,2,3,4)i Max Z x x x Subject tox x x x x x x x x x x x x i =-+-+--≤+--≤--++≤≥= 2. Use the simplex algorithm to find two optimal solutions to the following LP:123123123123max 53.. 36 53615 ,,0z x x x s t x x x x x x x x x =++++≤++≤≥3. Use the Big M method to find the optimal solution to the following LP:1212121212max 5.. 26 4 25 ,0z x x s t x x x x x x x x =-+=+≤+≤≥4. Use the simplex algorithm to find two optimal solutions to the following LP .123123123123max 53.. 3653615 ,,0z x x x s t x x x x x x x x x =++++≤++≤≥5. For a linear programming problem:1212121234241232850(1,2)i Max Z x x Subject tox x x x x x x i =++≤+≤+≤≥= Find the optimal solution using the simplex algorithm.Solution:1.Here are the pivots:BV={S1,S2,S3}.BV={X2,S2,S3}.We now enter X3 into the basis in Row 2.BV={X2,X3,S3}.We now enter X4 into the basis in Row 1.BV={X4,X3,S3}.X1 now enters basis in Row 2.BV={X4,X1,S3}.We now choose to enter S1 in Row 1.BV={S1,X1,S3}.S2 would now enter basis in Row 2. This will bring us back to the initial tableau, so cycling has occurred. 2. Standard form:1231231123212312max 53.. 36 53615 ,,,,0z x x x s t x x x s x x x s x x x s s =+++++=+++=≥Tableau:So: z=15; x 1=3 ; x 2=0;x 3=03. Standard form:12121211221212max 5.. 26 4 25 ,,,0z x x s t x x x x s x x s x x s s =-+=++=++=≥=>12112112112212121max 5.. 26 4 25 ,,,,0z x x a M s t x x a x x s x x s x x s s a =--++=++=++=≥ Tableau: => => So, the solution is z=15, x 1=3, x 2=04. Standard form:1231231123212312max 53.. 36 53615 ,,,,0z x x x s t x x x s x x x s x x x s s =+++++=+++=≥So, the solution is z=15,x 1=0,x 2=5 or z=15,x 1=3,x 2=0 5. Optimal solution:Chapter 4 Sensitivity Analysis and duality1. Consider the following linear program (LP):1212232420(1,2)i Max Z x x Subject tox x x x i =++≤≤≥=(a). De termin e the shadow price for b 2, the right-hand side of the constrai n t x 2 ≤ b 2. (b). De t e rmin e th e allowable r ange to s tay optimal for c 1, the co e ffic i e n t of x 1 in theob jec tiv e function Z = c 1x 1 + 3x 2.(c). De termin e the allowable range to stay feasible for b 1, th e right-hand side of theconstrai n t 2x 1 + x 2 ≤ b 1.2. There is a LP model as following,1212121234524123280(1,2)i Max Z x x Subject tox x x x x x x i =++≤+≤+≤≥= The optimal simplex tableau is1) Give the dual problem of the primal problem.2) If C2 increases from 4 to 5, will the optimal solution change? Why? 3) If b2 changes from 12 to 15, will the optimal solution change? Why? 3. There is a LP model as following12312312312236222333280(1,2)j Min Z x x x Subject tox x x x x x x x x j =++++≥-++≤-+≤≥= 1) give its dual problem.2) Use the graphical solution to solve the dual problem.4. You have a constraint that limits the amount of labor available to 40 hours perweek. If your shadow price is $10/hour for the labor constraint, and the market price for the labor is $11/hour. Should you pay to obtain additional labor? 5. Consider the following LP model of a production plan of tables and chairs:Max 3T + 2C (profit) Subject to the constraints:2T + C ≤100 (carpentry hrs) T + C ≤80 (painting hrs)T ≤ 40T, C ≥ 0 (non-negativity)1) Draw the feasible region. 2) Find the optimal solution.3)Does the optimal solution change if the profit contribution for tables changed from $3 to $4 per table?4) What if painting hours available changed from 80 to 100?6. 
For a linear programming problem:11221212121234524123280(1,2)i Max Z c x c x x x Subject tox x x x x x x i =+=++≤+≤+≤≥=Suppose C2 rising from 4 to 5, if the optimal solution will change? Explain the reason. 7. For a linear programming problem:112212121221234524123280(1,2)i Max Z c x c x x x Subject tox x x x b x x x i =+=++≤+≤=+≤≥=Suppose b2 rising from 12 to 15, if the optimal solution will change? Explain thereason.8. For a linear programming problem:112212121221234524123280(1,2)i Max Z c x c x x x Subject tox x x x b x x x i =+=++≤+≤=+≤≥=Calculate the shadow price of all of the three constraints. 9.1) Use the simplex algorithm to find the optimal solution to the model below(10 points)1212125231250(1,2)i Max Z x x Subject tox x x x x i =++≤+≤≥=2) For which objective function coefficient value ranges of x 1 and x 2 does thesolution remain optimal? (10 points) 3) Find the dual of the model; (5 points)4) Find the shadow prices of constraints. (5 points)5) If x1 and x2 are all integers, using the branch-and-bound to solve it.( 15points)10. A factory is going to produce Products I, II and III by using raw materials A and B.1) Please arrange production plan to make the profit maximization. (15) 2) Write the dual problem of the primal problem. (5)3) If one more kg of raw material A is available, how much the total profit will be increased? (5) 4) If the profit of product II changes from 1 to 2,will the optimal solution change? (5)Solution :1.(a) T h e shadow pr ic e for b 2 is 2.5. Replace th e constrai n t x 2 ≤ 2 by the constrain t x 2 ≤ 3.The new optimal solution is (x 1, x 2) = (0.5, 3) with Z = 9.5. Thus, a unit increas e in b 2 leads t o a 2.5 unit increase in Z .(b) The all o wabl e range to s tay optimal i s 0 ≤ c 1 ≤ 6. The ob j e ctiv e fun c t ion Z =c 1x 1 + 3x 2 is p arall e l to th e c on s tr ain t boundary equation 2x 1 + x 2 = 4 when c 1 = 6. The ob j e ctiv e function Z = c 1x 1 + 3x 2 is parallel to t he c on s tr ain t boundary equation x 2 = 2 wh e n c 1 = 0.(c) T h e allowable range to stay feasible is 2 ≤ b 1 < ∞. The righ t -h and sideb 1 can b e decreased un t il thec on s tr ain t boundary e qu ation 2x 1 + x 2 = 4 intersects th e solution (x 1, x 2) = (0, 2). This occurs when b 1 = 2. T h e right-hand side b 1 can b e in c r e ase d w i thou t i nte r s ec t ing a s olu tion .2.1) the dual problem:123123123125128..233424,0Min w y y y S ty y y y y y y y =++++≥++≥≥2) when C2 changes from 4 to 5, the optimal basic variable will not change, because the coefficient of the nonbasic variable remain positive.3) when b2 changes from 12 to 15, the optimal basic variable will not change. 3.1) the dual problem of the primal problem is :121212121223..222336,0Max w y y S ty y y y y y y y =--≤+≤+≤≥ 2) using the graphical solution, the optimal solution of the dual problem is: w= 19/5, y1=8/5, y2=-1/5.4. No. If you obtain one additional labor, you should pay $11. But by the shadowprice, you can only earn $10. So we should not pay to obtain additional labor. 5.2) The optimal solution is T=20, C=60 and the maximum profit is 180.3) If the profit contribution for tables changed from $3 to $4 per table, therewill be two optimal solutions, says T=20, C=60 and T=40, C=20, and the maximum profit is 200.4) Because painting hrs is a constraint condition for T=20, C=60, so theoptimal solution will change. The new optimal solution is T=0, C=100, and the maximum profit is 200.6. 
Parameter is calculated below:1212311211[,,][,][0,4,3][0,0]11104202311/81/403/81/401/41/2111240320001001BV NBV s j BV NBVBV s x x NBV s s C C B B a a a N c c B N c --====⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦--⎡⎤⎢⎥=-⎢⎥⎢⎥-⎣⎦⎡⎤⎡⎤⎡⎤⎢⎥⎢⎥⎢⎥===⎢⎥⎢⎥⎢⎥⎢⎥⎢⎥⎢⎥⎣⎦⎣⎦⎣⎦⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦=-If c2 rising from 4 to 5, then ,and >0,so the optimal solution will not change.7. If b2 rising from 12 to 15, every element of =[9/8,29/8,1/4] is large thenzero,so the optimal solution will not change. 8. Shadow price is calculated by 。
Machine Learning Exercise Bank

Part 1: Maximum Likelihood

1. ML estimation of exponential model (10)

A Gaussian distribution is often used to model data on the real line, but is sometimes inappropriate when the data are often close to zero but constrained to be nonnegative. In such cases one can fit an exponential distribution, whose probability density function is given by

p(x) = (1/b) exp(-x/b)

Given N observations x_i drawn from such a distribution:
(a) Write down the likelihood as a function of the scale parameter b.
(b) Write down the derivative of the log likelihood.
(c) Give a simple expression for the ML estimate for b.
(A worked derivation is sketched after this exercise set.)

2. The same, with a Poisson distribution instead:

p(x | θ) = θ^x e^{-θ} / x!,   x = 0, 1, 2, ...

l(θ) = Σ_{i=1}^{N} log p(x_i | θ)
     = Σ_{i=1}^{N} [ x_i log θ - θ - log(x_i!) ]
     = (log θ) Σ_{i=1}^{N} x_i - Nθ - Σ_{i=1}^{N} log(x_i!)

3.

Part 2: Bayes

Suppose that in a multiple-choice exam, a candidate knows the correct answer with probability p and guesses with probability 1 - p. Assume that a candidate who knows the answer answers correctly with probability 1, and a candidate who guesses answers correctly with probability 1/m, where m is the number of choices.
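For reference, parts (a) through (c) of the exponential-model exercise can be worked as follows (this derivation is not part of the original question bank):

(a)  L(b) = Π_{i=1}^{N} (1/b) exp(-x_i/b) = b^{-N} exp( -(1/b) Σ_{i=1}^{N} x_i )

(b)  l(b) = log L(b) = -N log b - (1/b) Σ_{i=1}^{N} x_i,
     dl/db = -N/b + (1/b^2) Σ_{i=1}^{N} x_i

(c)  Setting dl/db = 0 gives  b_ML = (1/N) Σ_{i=1}^{N} x_i,  the sample mean.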
The Hoshen-Kopelman Algorithm

The Hoshen-Kopelman Algorithm is a simple algorithm for labeling clusters on a grid, where a grid is a regular network of cells, where each cell may be either "occupied" or "unoccupied". The HK algorithm is an efficient means of identifying clusters of contiguous cells.

The algorithm was originally described in "Percolation and cluster distribution. I. Cluster multiple labeling technique and critical concentration algorithm," by J. Hoshen and R. Kopelman and printed in Phys. Rev. B. 1(14):3438-3445 in October 1976. The article is available in PDF from the Physical Review Online Archive (subscription required). However, the HK algorithm is really just a special application of the Union-Find algorithm, well known to computer scientists. The use of the union/find abstraction also makes description of the H-K algorithm far more transparent than without.

The general idea of the H-K algorithm is that we scan through the grid looking for occupied cells. To each occupied cell we wish to assign a label corresponding to the cluster to which the cell belongs. If the cell has zero occupied neighbors, then we assign to it a cluster label we have not yet used (it's a new cluster). If the cell has one occupied neighbor, then we assign to the current cell the same label as the occupied neighbor (they're part of the same cluster). If the cell has more than one occupied neighboring cell, then we choose the lowest-numbered cluster label of the occupied neighbors to use as the label for the current cell. Furthermore, if these neighboring cells have differing labels, we must make a note that these different labels correspond to the same cluster.

The Union-Find algorithm is a simple method for computing equivalence classes. Calling the function union(x,y) specifies that items x and y are members of the same equivalence class. Because equivalence relations are transitive, all items equivalent to x are equivalent to all items equivalent to y. Thus for any item x, there is a set of items which are all equivalent to x; this set is the equivalence class of which x is a member. A second function find(x) returns a representative member of the equivalence class to which x belongs.

It is easy to describe the H-K algorithm in terms of union and find operations, and coding the algorithm with reference to union and find subroutines is more likely to result in a correct program than a more haphazard implementation technique.

The HK algorithm consists of a raster scan of the grid in question. Each time an occupied cell is encountered, a check is done to see whether this cell has any neighboring cells which have already been scanned. If so, first a union operation is performed, to specify that these neighboring cells are in fact members of the same equivalence class. Then a find operation is performed to find a representative member of that equivalence class with which to label the current cell. If, on the other hand, the current cell has no neighbors, it is assigned a new, previously unused, label. The entire grid is processed in this way. The grid can then be raster-scanned a second time, performing only find operations at each cell, to re-label the cells with their final assignment of a representative element.
This is easy to describe in pseudocode:

largest_label = 0;
for x in 0 to n_columns {
    for y in 0 to n_rows {
        if occupied[x,y] then
            left  = label[x-1,y];
            above = label[x,y-1];
            if (left == 0) and (above == 0) then   /* neither neighbor is labeled */
                largest_label = largest_label + 1; /* start a new cluster */
                label[x,y] = largest_label;
            else {
                if (left != 0) {
                    if (above != 0)
                        union(left, above);        /* the two labels name one cluster */
                    label[x,y] = find(left);
                } else
                    label[x,y] = find(above);
            }
    }
}

One application is in the modeling of percolation or electrical conduction. If occupied cells are made of copper and unoccupied cells of glass, then a cluster is a group of electrically connected cells. Cells touch in the four cardinal directions, but not diagonally. Here's an example:

[Example: a 9 x 15 occupancy grid of 0s and 1s, followed by the same grid with each occupied cell labeled by its cluster number. The two diagrams did not survive conversion to plain text.]

to do: re-render the above diagrams using CSS; automatically adjust the cell dimensions to be uniform and square

Inner Workings of Union-Find

This is a description of an implementation of the Union-Find algorithm. We begin by assuming that there are a maximum of N equivalence classes. Note that this is the maximum number of intermediate equivalence classes, which may be greater than the final number of equivalence classes; an extreme upper bound for the number of equivalence classes is the number of items (grid cells in the case of the HK algorithm) which are being sorted into equivalence classes. (I suppose this is a form of the pigeonhole principle: if you have X things in Y classes, then Y is less than X.)

We maintain an array of N integers, called "labels", whose elements have the following meaning: if labels[a] == b then we know that labels a and b correspond to the same cluster; initially we set labels[x] = x for all x (initially, each element is in its own equivalence class; initially each element is not equivalent to any other). Furthermore we impose the requirement that a >= b. In this way, we can set up chains of equivalent labels. Simple versions of the find and union functions are immediately apparent:

int labels[N];   // initialized to labels[n] = n for all n

int find(int x) {                // naive (but correct) implementation of find
    while (labels[x] != x)
        x = labels[x];
    return x;
}

The union function makes two labels equivalent by linking their respective chains of aliases:

void union(int x, int y) {
    labels[find(x)] = find(y);
}

Note that the result x = find(x) will have the property labels[x] == x, which is the defining property for x to be the representative member of its equivalence class. The correctness of the union function can be gleaned from this fact.

[The original HK algorithm used the convention that negative label[x] values indicated that x was an alias of another label, while a positive value indicated that label[x] was the canonical label. This positive value was incremented every time an element was added to the equivalence class; the result was that the labels array would give not only the structure of the equivalence classes, but the total number of elements in each one as well. It's a good idea, and the implementation given here could easily be modified to do that. However, it's probably simpler to just loop over the final labelled matrix and count the number of sites in each class.]
An improvement is to allow find to collapse the tree of aliases:

int find(int x) {
    int y = x;
    while (labels[y] != y)
        y = labels[y];
    while (labels[x] != x) {      // second sweep: point every alias on the path
        int z = labels[x];        //   directly at the representative y
        labels[x] = y;
        x = z;
    }
    return y;
}

to do: add a diagram showing how the label aliases form a set of trees

Implementation

Given an implementation of the union-find algorithm in the functions uf_find, uf_union, and uf_make_set, the Hoshen-Kopelman algorithm becomes very simple:

for (int i=0; i<m; i++)
    for (int j=0; j<n; j++)
        if (matrix[i][j]) {                          // if occupied ...
            int up   = (i==0 ? 0 : matrix[i-1][j]);  // look up
            int left = (j==0 ? 0 : matrix[i][j-1]);  // look left
            switch (!!up + !!left) {
                case 0:
                    matrix[i][j] = uf_make_set();    // a new cluster
                    break;
                case 1:                              // part of an existing cluster
                    matrix[i][j] = max(up,left);     // whichever is nonzero is labelled
                    break;
                case 2:                              // this site binds two clusters
                    matrix[i][j] = uf_union(up, left);
                    break;
            }
        }

[A small worked example accompanies this code: a matrix in which 1's represent occupied cells and 0's are unoccupied cells, shown alongside the result of applying the Hoshen-Kopelman algorithm to it, with contiguous clusters labeled. The two grids did not survive conversion to plain text.]

Notes

It should be obvious how to generalise this for higher dimensions.

A complete implementation in the C language is given here in the file hk.c.

In MATLAB, the image processing toolbox comes with a function bwlabel that does cluster labelling.

I don't have any FORTRAN implementation (as has often been requested). If someone would like to contribute one, I would include the code here.

Example experiments

One might be interested in the relationship between site occupation probability and the resulting number of clusters. The following program generates random matrices, each with a randomly chosen site occupation probability, and outputs on each line the site occupation probability that was used and the number of clusters in the resulting matrix.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int m = 100;
    int n = 100;
    int n_trials = 10000;

    // allocate our matrix
    int **matrix = (int **)calloc(m, sizeof(int*));
    for (int i=0; i<m; i++)
        matrix[i] = (int *)calloc(n, sizeof(int));

    for (int trial = 0; trial < n_trials; trial++) {
        float p = rand()/(float)RAND_MAX;

        // make a random matrix with site occupancy probability p
        for (int i=0; i<m; i++)
            for (int j=0; j<n; j++)
                matrix[i][j] = (rand() < p*RAND_MAX);

        // count the clusters
        int clusters = hoshen_kopelman(matrix,m,n);
        printf("%f %d\n",p,clusters);
    }

    return 0;
}

Here's a visualisation of the output, made using gnuplot (the plot itself is not reproduced here).

It might also be interesting to plot the size of the largest cluster versus site occupation probability, or the distribution of cluster sizes for a given probability, etc. Another project would be to adapt the HK algorithm to work on a non-square grid (such as a hexagonal grid).

Copyright © 2000 by Tobin Fricke. Last modified 21 April 2004.