Marching Cubes Module

Description

This module is longer and more complex than many of the other modules. Most of the modules involve using VTK to visualize data. VTK is a general-purpose toolkit that supports many different data types and many different algorithms. Its generic visualization pipeline structure allows different algorithms to be pieced together easily. The downside of this is that it is not optimized for any specific task, and is thus slower and uses more memory than an algorithm written for one specific visualization task. This module examines many of the memory and low-level programming details that VTK hides by having the student develop a complete program for visualizing isosurfaces.

Problem Statement

The goal of this module is to develop a complete system that extracts an isosurface from a data set and displays it using OpenGL. This project can easily be split into two parts: extracting the isosurface and rendering it in an OpenGL window. If the student has previously worked with OpenGL, the rendering part will be very easy, since all it needs to do is open a window and display triangles, ideally rendered with smooth shading, vertex normals, and lighting. Extracting the isosurface using the marching cubes algorithm is straightforward, but actually writing the code and optimizing it is a fair amount of work.

Introduction

Compared to many visualization algorithms, the implementation of the marching cubes algorithm is fairly straightforward and relatively simple and short. Because of this, it is a good choice for a first algorithm that a student implements from scratch. It can be implemented in a brute-force fashion, or optimizations can be added to limit memory usage and reduce duplicate computations. Using the OpenGL API, the triangles it produces can be rendered for a complete visualization system in about 10-15 pages of C++ code. An optimized version with an advanced viewer that allows viewpoint changes and zooming can be implemented in less
than 30 pages of code total. Depending on the number of students working together on the project, their expertise in OpenGL, and the amount of time available, the instructor may provide some of the code for viewing the resulting isosurface.

Background

The marching cubes algorithm was originally developed by Lorensen and Cline and published in the 1987 Siggraph Proceedings (pp. 163-169). The VTK book also describes the algorithm. A brief description of the algorithm from the course notes is copied here as the next two paragraphs for convenience.

See Figure 6-4 on page 159 of the VTK book for a two-dimensional example which generates an isoline for a specified value. The two-dimensional algorithm for extracting an isoline is known as marching squares. See Figure 6-5 on page 160 of the VTK book for the 16 possible cases (2 choices for each of 4 vertices, so 2^4) for an isoline passing through a square. Also note that cases 1 and 14 are equivalent (i.e., it does not matter whether three of the vertices have a value above the isovalue and one below, or three of the vertices have a value below the isovalue and one above; these are referred to as complementary cases). If you also allow rotations, cases 1, 2, 4, 7, 8, 11, 13, and 14 are all equivalent. Also note that cases 5 and 10 are ambiguous with regard to where the isoline occurs.

The extension of this method to three dimensions is known as marching cubes. In three dimensions there are 256 cases (2^8). Figure 6-6 on page 161 of the VTK book shows the 15 topologically unique cases, obtained using the same strategy that reduces the number of two-dimensional cases. The three-dimensional case also suffers from the ambiguity problem and requires handling some of the complementary cases (where the vertices above and below the isosurface are swapped) differently. See Figure 6-10 on page 164 of the VTK book.

Implementing the marching cubes algorithm is straightforward but tedious and error-prone. There are 256 cases for a voxel, since the value at each of the 8 voxel vertices can be either above or below the isosurface value. The common approach
to implementing the marching cubes algorithm is to generate a 256-entry lookup table in which each entry specifies the triangles for that specific case. Then, as each voxel is processed, the case index is calculated and the triangles are retrieved from the lookup table. This table could be produced by hand, but this is one of the areas where errors are likely to be introduced. If the lookup table specifies the wrong triangles for any one case, the resulting isosurface (assuming that case occurs) will be wrong. Trying to track these errors down by looking at the image is practically impossible; the only way to fix them is to recheck each of the 256 entries in the lookup table. Another approach is to enter the 15 topologically unique cases along with the triangle information for each of the 15 cases, and then write a program that uses this information to generate the 256-entry lookup table. This is less error-prone since fewer entries have to be hand-entered, although there is more code to write and debug. This is the approach the module author used, and the code for it is in the instructor's manual. To implement this, consider how many different ways you can place a cube down on a flat surface. You may want to take a box, number the vertices from 0 to 7, and experiment with placing the box down on different faces and rotating it. Once you determine this, you can create a table that specifies which of the 8 positions each of the 8 vertices is in for each of the configurations. You will also want to number the edges from 0 to 11 and keep track of where each edge is for the configurations, or just keep track of the two vertices each edge connects. Other required information is which vertices are above the isosurface value for each of the 15 unique cases and, of course, the edges that contain the triangle vertices for each of the cases. Given all this information, a program can be written to generate the 256-entry lookup table containing the triangles to generate for each case. A reasonable size for current data
sets is 256x256x256 using 2 bytes per data value. This requires approximately 33MB of disk space/memory, so it is not unreasonable to load the entire data set into memory and then also create the extracted triangles. As MRI/CT scanners continue to improve their resolution, data sets that are 1024x1024x1024 are expected. With 2 bytes per data value, this is over 2GB of data, so reading the entire data set into memory is not realistic. The size of memory will continue to grow, but since the data sets are cubic, the amount of memory needed will increase more quickly. When writing your program, try to minimize the number of slices you store in memory.

The location of each triangle vertex is calculated by interpolating the data values at the two edge endpoints, where one is above and one is below the specified isovalue. As you process each pair of data slices, the "brute force" method of generating the triangles would be to calculate and store the vertices for each triangle; however, each vertex is shared by a number of triangles, so ideally you want to calculate and store the coordinates for each vertex only once and reuse them for each triangle that shares the vertex. One of the module questions asks you to examine the relationship between the number of triangles and vertices.

The standard computer graphics data structure for storing a polyhedron is known as a polymesh. It includes an array of points and an array of polygons. Each polygon has an array that indexes the points that form that polygon. Thus, only one integer is required to store each polygon vertex (instead of 3 float/double values), plus the overhead of storing each vertex once. For the marching cubes algorithm, all the polygons are triangles, so there is no need to store the number of vertices in each polygon. Other information that is often stored is a normal for each polygon and a "vertex normal". The vertex normal is the average of the normals of all the polygons that share that vertex.
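As a concrete illustration of the case lookup and edge interpolation described above, here is a minimal C++ sketch. The names `caseIndex` and `interpolate` and the corner-numbering convention are my assumptions for illustration, not the instructor's code:

```cpp
#include <cassert>

// A point on the isosurface.
struct Point { float x, y, z; };

// Build the 8-bit case index: bit i is set when cube corner i is above the
// isovalue.  The result (0..255) indexes the 256-entry lookup table.
int caseIndex(const float corner[8], float iso) {
    int idx = 0;
    for (int i = 0; i < 8; ++i)
        if (corner[i] > iso) idx |= 1 << i;
    return idx;
}

// Linearly interpolate the isosurface crossing along one cube edge whose
// endpoints a and b carry data values va and vb (one above iso, one below).
Point interpolate(const Point& a, const Point& b, float va, float vb, float iso) {
    float t = (iso - va) / (vb - va);  // safe: va and vb straddle iso
    return { a.x + t * (b.x - a.x),
             a.y + t * (b.y - a.y),
             a.z + t * (b.z - a.z) };
}
```

To share vertices between neighboring voxels, interpolated points like these are typically cached per edge (for example in a map keyed by the edge's grid indices) so each coordinate is computed and stored only once, matching the polymesh layout described above.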
The normal for each triangle can be calculated by taking the cross product of two of its edges and normalizing the result. For each vertex, set the vertex normal to zero, add the normals of all the polygons using that vertex, and normalize the total. For lighting purposes, you will want to be consistent in specifying the vertex order for your triangles in the lookup table so that the vertices of a triangle are always specified in clockwise or counter-clockwise order. This allows the normals to be calculated so that the normal is always "outward facing" for each triangle.

Once all the triangles have been calculated along with vertex normals, we need to display the results. Currently, the most common graphics API is OpenGL. The simplest method for creating an interface for an OpenGL window is GLUT (the GL Utility Toolkit). GLUT was originally developed by SGI, and there is now an open source version known as "freeglut". Your instructor will let you know how to create an OpenGL window on your computer systems. OpenGL is a finite state machine: the state (the values of various settings) controls how objects are drawn, and the state is changed by calling various OpenGL functions.
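The normal computation described above can be sketched as follows. This is a minimal illustration under the stated conventions; the `Vec3` struct and function names are my assumptions:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 {
    float x, y, z;
    Vec3 operator-(const Vec3& o) const { return {x - o.x, y - o.y, z - o.z}; }
    Vec3 operator+(const Vec3& o) const { return {x + o.x, y + o.y, z + o.z}; }
};

Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

Vec3 normalize(const Vec3& v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

// Face normal: cross product of two edges, normalized.  With counter-clockwise
// vertex order this points "outward" under the right-hand rule.
Vec3 faceNormal(const Vec3& p0, const Vec3& p1, const Vec3& p2) {
    return normalize(cross(p1 - p0, p2 - p0));
}

// Vertex normals: zero them, accumulate each incident face normal, renormalize.
// tris holds 3 point indices per triangle, as in the polymesh described above.
std::vector<Vec3> vertexNormals(const std::vector<Vec3>& pts,
                                const std::vector<int>& tris) {
    std::vector<Vec3> n(pts.size(), {0, 0, 0});
    for (std::size_t t = 0; t + 2 < tris.size(); t += 3) {
        Vec3 fn = faceNormal(pts[tris[t]], pts[tris[t + 1]], pts[tris[t + 2]]);
        n[tris[t]]     = n[tris[t]]     + fn;
        n[tris[t + 1]] = n[tris[t + 1]] + fn;
        n[tris[t + 2]] = n[tris[t + 2]] + fn;
    }
    for (auto& v : n) v = normalize(v);
    return n;
}
```

Note that the accumulate-then-normalize step only works if every triangle in the lookup table uses a consistent winding order, as discussed above; a triangle wound the wrong way would subtract its normal from the average.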
Below is an example of setting up the viewing parameters for a 400 by 400 OpenGL window:

    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluPerspective(45.0, 1, 0.1, 100);
    glViewport(0, 0, 400, 400);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    gluLookAt(0, 0, 5,    // eye/camera location
              0, 0, 0,    // center of interest
              0, 1, 0);   // up vector that specifies the camera tilt
    glEnable(GL_LIGHTING);
    glEnable(GL_LIGHT0);
    glShadeModel(GL_SMOOTH);
    glEnable(GL_DEPTH_TEST);
    glEnable(GL_NORMALIZE);
    float pos[4] = {10, 20, 30, 0};
    float intensity[4] = {0.8, 0.4, 0.4, 1};
    glLightfv(GL_LIGHT0, GL_POSITION, pos);
    glLightfv(GL_LIGHT0, GL_AMBIENT, intensity);

Once the viewing parameters have been set up, drawing triangles using OpenGL is very easy. The vertices and normals are specified with function calls. Below is an example:

    glBegin(GL_TRIANGLES);
    glNormal3f(n1.x, n1.y, n1.z);  // the vertex normal for point 1
    glVertex3f(p1.x, p1.y, p1.z);  // the coordinates for point 1
    glNormal3f(n2.x, n2.y, n2.z);  // the vertex normal for point 2
    glVertex3f(p2.x, p2.y, p2.z);  // the coordinates for point 2
    glNormal3f(n3.x, n3.y, n3.z);  // the vertex normal for point 3
    glVertex3f(p3.x, p3.y, p3.z);  // the coordinates for point 3
    glEnd();

When creating your viewer, allow the isovalue to be specified on the command line so you can easily extract different isovalues without recompiling your program. You may also want the filename to be specified on the command line. Your instructor may provide you with a "TrackBall" class to allow the viewing parameters to be changed using the mouse, similar to the VTK interactors. The TrackBall class modifies the eye and center of interest and creates a viewing transformation matrix that can be applied in the OpenGL pipeline. If your program allows different viewing parameters to be specified interactively, you may want to consider using an OpenGL "display list" for all the triangles and vertex normals.
Without a display list, the calls to glNormal3f and glVertex3f have to be repeated (placed in your display callback) each time the scene is redrawn. The overhead of all these function calls can be substantial. With a display list, all the functions are called once and OpenGL stores the information in memory; each time the scene is redrawn, only one OpenGL call, to execute the display list, needs to be performed. The display list trades memory for speed.

Questions

1. How many occurrences of each of the 15 topologically unique cases occur in the 256-entry lookup table?
2. For an isosurface that produces a polyhedron with no holes in it, how many vertices and edges are there if there are n triangles? If none of the vertices were shared, there would be 3n vertices, but that is obviously not the case. On average, by how many triangles is each vertex shared? Hint: Euler may help you answer this question.
3. How many data slices did you keep in memory as you read the data file? What is the minimum number of slices that need to be in memory to implement the algorithm?
4. Compare the performance of your implementation to VTK's marching cubes algorithm. Describe how you did the comparison and list the results.
5. If you were able to implement OpenGL display lists with the TrackBall for interactive viewing parameter changes, compare the rendering speed with and without display lists.

Prerequisites

This module is more challenging than many of the others. It does not require a knowledge of VTK, but does require significantly more programming expertise. The basic data structures covered in CS1 and CS2 are all that is required (arrays, classes/structs, and dynamic memory); however, because of the difficulty, at least one student in each group should probably have completed an advanced data structures and algorithms course. Additionally, a very basic knowledge of OpenGL and some window framework for it (such as GLUT) is necessary to visualize the extracted isosurface. Going over a simple GLUT/OpenGL example that
draws polygons, and possibly smooth-shaded polygons using vertex normals, should provide a student with enough knowledge. The basic concepts of smooth shading are covered in the course notes. Although Python or Java could be used for this assignment, it is more appropriate to use C/C++ so the student can experience and appreciate the optimizations and memory/speed improvements that a lower-level language provides. If you have a large group of students working on the project, one or two students could develop the interface in Python and then use Boost or SWIG to call the C++ code from Python.

Instructor

The instructor's manual includes code for a complete visualization system using the marching cubes algorithm, OpenGL, and GLUT. The MarchingCubesTable class creates a lookup table specifying the triangles for each of the 256 cases. The code is heavily commented. Depending on the ability and experience of your students, you may want to suggest the data structures used in it or let them create their own. The MarchingCubes class uses the MarchingCubesTable to extract triangles from a data set. It is set up to read data sets with the same format as the quarter-resolution VTK head data set used in the Pipeline module. That data set consists of 93 files, where each file contains a 64 by 64 slice with 16 bits per pixel. It can be viewed using:

    ./main /usr/local/vtkdata/headsq/quarter 500

where /usr/local/vtkdata/headsq contains the files quarter.1, quarter.2, ..., quarter.93. The Viewer class initializes GLUT and OpenGL. It also uses the TrackBall class to handle viewing parameter changes using the mouse (left button rotates, middle button zooms, and right button translates). The TrackBall class uses quaternions to achieve the results and generates a 4x4 transformation that can be inserted into the OpenGL pipeline using the glMultMatrix function. The module author suggests you supply the students with the TrackBall and Viewer classes and have the students write the
other classes. If your students are experienced with OpenGL, or you have enough students that part of the group could work on the Viewer class while others write the marching cubes classes, you could let them write the Viewer class as well. The TrackBall class is not strictly needed, but without it the viewing parameters cannot be adjusted with the mouse. The code in the MarchingCubes class also includes methods for creating an OpenGL display list, as discussed in the Background section. Also note the code to handle different endians: the head data set is stored in little-endian format, so if you are reading the raw data file on a big-endian machine, the bytes of each short data value must be swapped.
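The byte swapping mentioned above can be done with a small helper like the following. This is a sketch assuming the 16-bit samples are read as raw unsigned shorts; the function name is my own, not the instructor's code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Swap the two bytes of each 16-bit sample in place.  This is needed when the
// file's byte order (little-endian for the head data set) differs from the
// machine's native byte order.
void swapShortBytes(uint16_t* data, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i)
        data[i] = static_cast<uint16_t>((data[i] >> 8) | (data[i] << 8));
}
```

A typical use is to read one slice into a buffer and call this on the whole buffer once, only when a runtime or compile-time check shows the machine is big-endian.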
A review on time series data mining

Tak-chung Fu
Department of Computing, Hong Kong Polytechnic University, Hunghom, Kowloon, Hong Kong

Article history: Received 19 February 2008; received in revised form 14 March 2010; accepted 4 September 2010.

Keywords: Time series data mining; Representation; Similarity measure; Segmentation; Visualization

Abstract

Time series is an important class of temporal data objects and it can be easily obtained from scientific and financial applications. A time series is a collection of observations made chronologically. The nature of time series data includes: large data size, high dimensionality and the need to update continuously. Moreover, time series data, which is characterized by its numerical and continuous nature, is always considered as a whole instead of as individual numerical fields. The increasing use of time series data has initiated a great deal of research and development attempts in the field of data mining. The abundant research on time series data mining in the last decade could hamper the entry of interested researchers, due to its complexity. In this paper, a comprehensive review of the existing time series data mining research is given. The studies are generally categorized into representation and indexing, similarity measure, segmentation, visualization and mining. Moreover, state-of-the-art research issues are also highlighted. The primary objective of this paper is to serve as a glossary for interested researchers to have an overall picture of current time series data mining development and to identify their potential research directions for further investigation.

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

Recently, the increasing use of temporal data, in particular time series data, has initiated various research and development attempts in the field of data mining. Time series is an important class of temporal data objects, and it can be easily obtained from scientific and financial applications (e.g. electrocardiogram (ECG), daily temperature, weekly sales
totals, and prices of mutual funds and stocks). A time series is a collection of observations made chronologically. The nature of time series data includes: large data size, high dimensionality and the need to update continuously. Moreover, time series data, which is characterized by its numerical and continuous nature, is always considered as a whole instead of as individual numerical fields. Therefore, unlike traditional databases where similarity search is exact match based, similarity search in time series data is typically carried out in an approximate manner.

There are various kinds of time series data related research, for example, finding similar time series (Agrawal et al., 1993a; Berndt and Clifford, 1996; Chan and Fu, 1999), subsequence searching in time series (Faloutsos et al., 1994), dimensionality reduction (Keogh, 1997b; Keogh et al., 2000) and segmentation (Abonyi et al., 2005). These topics have been studied in considerable detail by both the database and pattern recognition communities for different domains of time series data (Keogh and Kasetty, 2002).

In the context of time series data mining, the fundamental problem is how to represent the time series data. One of the common approaches is transforming the time series to another domain for dimensionality reduction, followed by an indexing mechanism. Moreover, similarity measure between time series or time series subsequences and segmentation are two core tasks for various time series mining tasks. Based on the time series representation, different mining tasks can be found in the literature, and they can be roughly classified into four fields: pattern discovery and clustering, classification, rule discovery and summarization. Some of the research concentrates on one of these fields, while other work may focus on more than one of the above processes. In this paper, a comprehensive review of the existing time series data mining research is given. Three state-of-the-art time series data mining issues, streaming, multi-attribute time series data and privacy, are also briefly introduced. The
remaining part of this paper is organized as follows: Section 2 contains a discussion of time series representation and indexing. The concept of similarity measure, which includes both whole time series and subsequence matching, based on the raw time series data or the transformed domain, will be reviewed in Section 3. The research work on time series segmentation and visualization will be discussed in Sections 4 and 5, respectively. In Section 6, various time series data mining tasks and recent time series data mining directions will be reviewed, whereas the conclusion will be made in Section 7.

Engineering Applications of Artificial Intelligence 24 (2011) 164–181; doi:10.1016/j.engappai.2010.09.007

2. Time series representation and indexing

One of the major reasons for time series representation is to reduce the dimension (i.e. the number of data points) of the original data. The simplest method perhaps is sampling (Astrom, 1969). In this method, a rate of m/n is used, where m is the length of a time series P and n is the dimension after dimensionality reduction (Fig. 1). However, the sampling method has the drawback of distorting the shape of the sampled/compressed time series if the sampling rate is too low. An enhanced method is to use the average (mean) value of each segment to represent the corresponding set of data points.
Again, with time series P = (p_1, ..., p_m) and n the dimension after dimensionality reduction, the "compressed" time series P̂ = (p̂_1, ..., p̂_n) can be obtained by

p̂_k = (1 / (e_k − s_k + 1)) · Σ_{i = s_k .. e_k} p_i    (1)

where s_k and e_k denote the starting and ending data points of the k-th segment in the time series P, respectively (Fig. 2). That is, the segmented means are used to represent the time series (Yi and Faloutsos, 2000). This method is also called piecewise aggregate approximation (PAA) by Keogh et al. (2000) (it was originally called piecewise constant approximation; Keogh and Pazzani, 2000a). Keogh et al. (2001a) propose an extended version called adaptive piecewise constant approximation (APCA), in which the length of each segment is not fixed, but adaptive to the shape of the series. A signature technique with similar ideas is proposed by Faloutsos et al. (1997). Besides using the mean to represent each segment, other methods have been proposed. For example, Lee et al. (2003) propose to use the segmented sum of variation (SSV) to represent each segment of the time series. Furthermore, a bit-level approximation is proposed by Ratanamahatana et al. (2005) and Bagnall et al. (2006), which uses a bit to represent each data point.

To reduce the dimension of time series data, another approach is to approximate a time series with straight lines. Two major categories are involved. The first one is linear interpolation. A common method is piecewise linear representation (PLR), also called piecewise linear approximation (PLA) (Keogh, 1997b; Keogh and Smyth, 1997; Smyth and Keogh, 1997).
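As an illustration, the segment-mean (PAA) scheme of Eq. (1) can be sketched in a few lines of Python. The function name `paa` and the near-equal-width segmentation are assumptions of this sketch, not code from the cited papers:

```python
import numpy as np

def paa(series, n_segments):
    """Piecewise aggregate approximation: represent each of n_segments
    (near-)equal-width segments of the series by its mean value."""
    series = np.asarray(series, dtype=float)
    # Split the index range into n_segments chunks and average each chunk,
    # i.e. p_hat_k is the mean of the points between s_k and e_k in Eq. (1).
    chunks = np.array_split(np.arange(len(series)), n_segments)
    return np.array([series[c].mean() for c in chunks])
```

For example, `paa([1, 2, 3, 4, 5, 6, 7, 8], 4)` yields the four segment means `[1.5, 3.5, 5.5, 7.5]`.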
The approximating line for the subsequence P(p_i, ..., p_j) is simply the line connecting the data points p_i and p_j. It tends to closely align the endpoints of consecutive segments, giving a piecewise approximation with connected lines. PLR is a bottom-up algorithm. It begins by creating a fine approximation of the time series, so that m/2 segments are used to approximate the m-length time series, and iteratively merges the lowest cost pair of segments until it meets the required number of segments. When the pair of adjacent segments S_i and S_{i+1} are merged, the cost of merging the new segment with its right neighbor and the cost of merging the segment S_{i−1} with its new larger neighbor are calculated. Ge (1998) extends PLR to a hierarchical structure. Furthermore, Keogh and Pazzani enhance PLR by considering the weights of the segments (Keogh and Pazzani, 1998) and relevance feedback from the user (Keogh and Pazzani, 1999). The second approach is linear regression, which represents the subsequences with the best fitting lines (Shatkay and Zdonik, 1996).

Furthermore, reducing the dimension by preserving the salient points is a promising method. These points are called perceptually important points (PIPs). The PIP identification process is first introduced by Chung et al. (2001) and used for pattern matching of technical (analysis) patterns in financial applications.
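The bottom-up merging scheme just described can be sketched as follows. The helper names and the residual-sum-of-squares merge cost are assumptions of this sketch rather than details fixed by the cited papers, and costs are recomputed from scratch each pass for clarity, not efficiency:

```python
import numpy as np

def _line_error(y):
    """Sum of squared residuals of the best-fit line through points y
    (used here as the merge cost; an assumption of this sketch)."""
    x = np.arange(len(y))
    coeffs, res, *_ = np.polyfit(x, y, 1, full=True)
    return float(res[0]) if len(res) else 0.0

def bottom_up_plr(series, n_segments):
    """Bottom-up PLR: start from the finest segmentation (one segment per
    pair of points) and repeatedly merge the cheapest adjacent pair."""
    series = np.asarray(series, dtype=float)
    segs = [list(range(i, min(i + 2, len(series))))
            for i in range(0, len(series), 2)]
    while len(segs) > n_segments:
        # Cost of merging each adjacent pair of segments.
        costs = [_line_error(series[segs[i] + segs[i + 1]])
                 for i in range(len(segs) - 1)]
        i = int(np.argmin(costs))
        segs[i:i + 2] = [segs[i] + segs[i + 1]]
    return segs
```

On a series made of two linear ramps, e.g. `[0, 1, 2, 3, 10, 11, 12, 13]` with two target segments, the merges settle on the two ramps, `[[0, 1, 2, 3], [4, 5, 6, 7]]` (index lists).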
With the time series P, there are n data points: P_1, P_2, ..., P_n. All the data points in P can be reordered by importance by going through the PIP identification process. The first data point P_1 and the last data point P_n in the time series are the first two PIPs, respectively. The next PIP found will be the point in P with maximum distance to the first two PIPs. The fourth PIP found will be the point in P with maximum vertical distance to the line joining its two adjacent PIPs, either in between the first and second PIPs or in between the second and the last PIPs. The PIP location process continues until all the points in P are attached to a reordered list L or the required number of PIPs is reached (i.e. reduced to the required dimension). Seven PIPs are identified from the sample time series in Fig. 3. Detailed treatment can be found in Fu et al. (2008c).

Fig. 1. Time series dimensionality reduction by sampling. The time series on the left is sampled regularly (denoted by dotted lines) and displayed on the right with a large distortion.

Fig. 2. Time series dimensionality reduction by PAA. The horizontal dotted lines show the mean of each segment.

The idea is similar to a technique proposed about 30 years ago by Douglas and Peucker (1973) for reducing the number of points required to represent a line (see also Hershberger and Snoeyink, 1992). Perng et al. (2000) use a landmark model to identify the important points in the time series for similarity measure. Man and Wong (2001) propose a lattice structure to represent the identified peaks and troughs (called control points) in the time series. Pratt and Fink (2002) and Fink et al. (2003) define extrema as minima and maxima in a time series and compress the time series by selecting only certain important extrema and
dropping the other points. The idea is to discard minor fluctuations and keep major minima and maxima. The compression is controlled by the compression ratio with a parameter R, which is always greater than one; an increase of R leads to the selection of fewer points. That is, given indices i and j, where i ≤ x ≤ j, a point p_x of a series P is an important minimum if p_x is the minimum among p_i, ..., p_j, and p_i/p_x ≥ R and p_j/p_x ≥ R. Similarly, p_x is an important maximum if p_x is the maximum among p_i, ..., p_j, and p_x/p_i ≥ R and p_x/p_j ≥ R. This algorithm takes linear time and constant memory. It outputs the values and indices of all important points, as well as the first and last points of the series. The algorithm can also process new points as they arrive, without storing the original series. It identifies important points based on local information of each segment (subsequence) of the time series. Recently, a critical point model (CPM) (Bao, 2008) and a high-level representation based on a sequence of critical points (Bao and Yang, 2008) have been proposed for financial data analysis. On the other hand, special points are introduced to restrict the error on PLR (Jia et al., 2008). Key points are suggested to represent time series in Leng et al. (2009) for anomaly detection.

Another common family of time series representation approaches converts the numeric time series to symbolic form. That is, first discretizing the time series into segments, then converting each segment into a symbol (Yang and Zhao, 1998; Yang et al., 1999; Motoyoshi et al., 2002; Aref et al., 2004). Lin et al. (2003, 2007) propose a method called symbolic aggregate approximation (SAX) to convert the result from PAA to a symbol string. The distribution space (y-axis) is divided into equiprobable regions. Each region is represented by a symbol, and each segment can then be mapped into the symbol corresponding to the region in which it resides. The transformed time series P̂ using PAA is finally converted to a symbol string SS(s_1, ..., s_w). Two parameters must be specified for the conversion: the length of subsequence w and the alphabet size A (the number of symbols used).

Besides using the means of the segments to build the alphabets, other methods use volatility change to build the alphabets. Jonsson and Badal (1997) use the "Shape Description Alphabet (SDA)", with example symbols such as highly increasing transition, stable transition and slightly decreasing transition. Qu et al. (1998) use gradient alphabets like upward, flat and downward as symbols. Huang and Yu (1999) suggest transforming the time series to a symbol string using the change ratio between contiguous data points. Megalooikonomou et al. (2004) propose to represent each segment by a codeword from a codebook of key-sequences. This work has been extended to multi-resolution consideration (Megalooikonomou et al., 2005). Morchen and Ultsch (2005) propose an unsupervised discretization process based on quality score and persisting states. Instead of ignoring the temporal order of values like many other methods, the Persist algorithm incorporates temporal information. Furthermore, subsequence clustering is a common method to generate the symbols (Das et al., 1998; Li et al., 2000a; Hugueney and Meunier, 2001; Hebrail and Hugueney, 2001). A multiple abstraction level mining (MALM) approach is proposed by Li et al. (1998), which is based on the symbolic form of the time series. The symbols in this approach are determined by clustering the features of each segment, such as regression coefficients, mean square error and higher order statistics based on the histogram of the regression residuals.

Most of the methods described so far represent time series in the time domain directly. Representing time series in a transformation domain is another large family of approaches. One of the popular transformation techniques in time series data mining is the discrete Fourier transform (DFT), since first being proposed for use in this context by Agrawal et al. (1993a). Rafiei
and Mendelzon (2000) develop similarity-based queries using DFT. Janacek et al. (2005) propose to use likelihood ratio statistics to test the hypothesis of difference between series instead of a Euclidean distance in the transformed domain. Recent research uses the wavelet transform to represent time series (Struzik and Siebes, 1998). In particular, the discrete wavelet transform (DWT) has been found to be effective in replacing DFT (Chan and Fu, 1999), and the Haar transform is usually selected (Struzik and Siebes, 1999; Wang and Wang, 2000). The Haar transform is a series of averaging and differencing operations on a time series (Chan and Fu, 1999). The average and difference between every two adjacent data points are computed. For example, given a time series P = (1, 3, 7, 5), a dimension of 4 data points is the full resolution (i.e. the original time series); in the dimension of two coefficients, the averages are (2, 6) with the coefficients (−1, 1), and in the dimension of one coefficient, the average is 4 with coefficient (−2). A multi-level representation of the wavelet transform is proposed by Shahabi et al. (2000). Popivanov and Miller (2002) show that a large class of wavelet transformations can be used for time series representation. Dasha et al. (2007) compare different wavelet feature vectors. On the other hand, a comparison between DFT and DWT can be found in Wu et al. (2000b) and Morchen (2003), and a combined use of Fourier and wavelet transforms is presented in Kawagoe and Ueda (2002). An ensemble index is proposed by Keogh et al. (2001b) and Vlachos et al. (2006), which ensembles two or more representations for indexing.

Principal component analysis (PCA) is a popular multivariate technique used for developing multivariate statistical process monitoring methods (Yang and Shahabi, 2005b; Yoon et al., 2005), and it is applied to analyze financial time series by Lesch et al. (1999). In most of the related works, PCA is used to eliminate the less significant components or sensors and reduce the data representation only to the most
significant ones and to plot the data in two dimensions. The PCA model defines a linear hyperplane, so it can be considered as the multivariate extension of PLR. PCA maps the multivariate data into a lower dimensional space, which is useful in the analysis and visualization of correlated high-dimensional data. Singular value decomposition (SVD) (Korn et al., 1997) is another transformation-based approach. Other time series representation methods include modeling time series using hidden Markov models (HMMs) (Azzouzi and Nabney, 1998), and a compression technique for multiple streams is proposed by Deligiannakis et al. (2004). It is based on a base signal, which encodes piecewise linear correlations among the collected data values. In addition, a recent biased dimension reduction technique is proposed by Zhao and Zhang (2006) and Zhao et al. (2006).

Fig. 3. Time series compression by data point importance. The time series on the left is represented by seven PIPs on the right.

Moreover, many of the representation schemes described above are incorporated with different indexing methods. A common approach is to adopt an existing multidimensional indexing structure (e.g. the R-tree proposed by Guttman (1984)) for the representation. Agrawal et al. (1993a) propose an F-index, which adopts the R*-tree (Beckmann et al., 1990) to index the first few DFT coefficients. An ST-index is further proposed by Faloutsos et al. (1994), which extends the previous work for subsequence handling. Agrawal et al. (1995a) adopt both the R*- and R+-tree (Sellis et al., 1987) as the indexing structures. A multi-level distance based index structure is proposed (Yang and Shahabi, 2005a) for indexing time series represented by PCA. Vlachos et al. (2005a) propose a Multi-Metric (MM) tree, which is a hybrid indexing structure on Euclidean and periodic spaces. The minimum bounding rectangle (MBR) is also a common technique for time series indexing (Chu and Wong, 1999; Vlachos et
al., 2003). An MBR is adopted in Rafiei (1999), in which an MT-index is developed based on the Fourier transform, and in Kahveci and Singh (2004), in which a multi-resolution index is proposed based on the wavelet transform. Chen et al. (2007a) propose an indexing mechanism for the PLR representation. On the other hand, Kim et al. (1996) propose an index structure called the TIP-index (TIme series Pattern index) for manipulating time series pattern databases. The TIP-index is developed by improving the extended multidimensional dynamic index file (EMDF) (Kim et al., 1994). An iSAX (Shieh and Keogh, 2009) is proposed to index massive time series, which is developed based on SAX. A multi-resolution indexing structure, which can be adapted to different representations, is proposed by Li et al. (2004).

To sum up, for a given index structure, the efficiency of indexing depends only on the precision of the approximation in the reduced dimensionality space. However, in choosing a dimensionality reduction technique, we cannot simply choose an arbitrary compression algorithm; a technique that produces an indexable representation is required. For example, many time series can be efficiently compressed by delta encoding, but this representation does not lend itself to indexing. In contrast, SVD, DFT, DWT and PAA all lend themselves naturally to indexing, with each eigenwave, Fourier coefficient, wavelet coefficient or aggregate segment mapping onto one dimension of an index tree.
Post-processing is then performed by computing the actual distance between sequences in the time domain and discarding any false matches.

3. Similarity measure

Similarity measure is of fundamental importance for a variety of time series analysis and data mining tasks. Most of the representation approaches discussed in Section 2 also propose a similarity measure method on the transformed representation scheme. In traditional databases, similarity measure is exact match based. However, in time series data, which is characterized by its numerical and continuous nature, similarity measure is typically carried out in an approximate manner. Considering stock time series, one may expect queries like:

Query 1: find all stocks which behave "similar" to stock A.

Query 2: find all "head and shoulders" patterns lasting for a month in the closing prices of all high-tech stocks.

The query results are expected to provide useful information for different stock analysis activities. Queries like Query 2 are in fact tightly coupled with the patterns frequently used in technical analysis, e.g. double top/bottom, ascending triangle, flag and rounded top/bottom.

In the time series domain, devising an appropriate similarity function is by no means trivial. There are essentially two ways the data might be organized and processed (Agrawal et al., 1993a). In whole sequence matching, the whole length of all time series is considered during the similarity search. It requires comparing the query sequence to each candidate series by evaluating the distance function and keeping track of the sequence with the smallest distance. In subsequence matching, where a query sequence Q and a longer sequence P are given, the task is to find the subsequences in P which match Q.
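The two query modes just distinguished can be sketched as follows, using the Euclidean distance; the function names are illustrative only:

```python
import numpy as np

def euclidean(q, p):
    """Euclidean distance between two equal-length sequences."""
    q, p = np.asarray(q, dtype=float), np.asarray(p, dtype=float)
    return float(np.sqrt(np.sum((q - p) ** 2)))

def best_whole_match(query, candidates):
    """Whole sequence matching: return the candidate series (all of the
    same length as the query) with the smallest distance to the query."""
    return min(candidates, key=lambda c: euclidean(query, c))

def best_subsequence_offset(query, series):
    """Subsequence matching: slide the query over every possible offset
    of the longer series and return the offset of the best match."""
    m = len(query)
    dists = [euclidean(query, series[i:i + m])
             for i in range(len(series) - m + 1)]
    return int(np.argmin(dists))
```

For instance, `best_subsequence_offset([5, 6, 7], [0, 1, 5, 6, 7, 2])` returns `2`, the offset where the query aligns exactly.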
Subsequence matching requires that the query sequence Q be placed at every possible offset within the longer sequence P. With respect to Query 1 and Query 2 above, they can be considered as whole sequence matching and subsequence matching, respectively. Gavrilov et al. (2000) study the usefulness of different similarity measures for clustering similar stock time series.

3.1. Whole sequence matching

To measure the similarity/dissimilarity between two time series, the most popular approach is to evaluate the Euclidean distance on a transformed representation like the DFT coefficients (Agrawal et al., 1993a) and the DWT coefficients (Chan and Fu, 1999). Although most of these approaches guarantee a lower bound on the Euclidean distance of the original data, the Euclidean distance is not always a suitable distance function in specific domains (Keogh, 1997a; Perng et al., 2000; Megalooikonomou et al., 2005). For example, stock time series have their own characteristics compared with other time series data (e.g. data from scientific areas like ECG), in which the salient points are important.

Besides Euclidean-based distance measures, other distance measures can easily be found in the literature. A constraint-based similarity query is proposed by Goldin and Kanellakis (1995), which extends the work of Agrawal et al. (1993a). Das et al. (1997) apply computational geometry methods for similarity measure. Bozkaya et al. (1997) use a modified edit distance function for time series matching and retrieval. Chu et al. (1998) propose to measure the distance based on the slopes of the segments for handling amplitude and time scaling problems. A projection algorithm is proposed by Lam and Wong (1998). A pattern recognition method based on the building blocks of the primitives of the time series is proposed by Morrill (1998).
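The lower-bound guarantee mentioned above follows from Parseval's theorem: with an orthonormally scaled DFT, a distance computed on any subset of coefficients can never exceed the true Euclidean distance, so index probes on the truncated features cannot cause false dismissals (false hits are removed in the post-processing step described earlier). A small Python sketch of this GEMINI-style feature extraction, with names assumed for illustration:

```python
import numpy as np

def dft_features(series, k):
    """First k coefficients of the orthonormally scaled DFT of the series."""
    x = np.asarray(series, dtype=float)
    # The 1/sqrt(n) scaling makes the transform norm-preserving (Parseval),
    # so truncating to k coefficients can only shrink distances.
    return (np.fft.fft(x) / np.sqrt(len(x)))[:k]

def feature_distance(f, g):
    """Euclidean distance between complex feature vectors."""
    return float(np.sqrt(np.sum(np.abs(f - g) ** 2)))
```

With all n coefficients kept, the feature-space distance equals the time-domain Euclidean distance exactly; with k < n it is a lower bound.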
Ruspini and Zwir (1999) are devoted to the automated identification of significant qualitative features of complex objects. They propose the process of discovery and representation of interesting relations between those features, the generation of structured indexes and textual annotations describing features and their relations. The discovery of knowledge by an analysis of collections of qualitative descriptions is then achieved. They focus on methods for the succinct description of interesting features lying on an effective frontier. Generalized clustering is used for extracting features that interest domain experts. Generalized Markov models are adopted for waveform matching in Ge and Smyth (2000). A content-based query-by-example retrieval model called FALCON, which incorporates a feedback mechanism, is proposed by Wu et al. (2000a).

Indeed, one of the most popular and field-tested similarity measures is the "time warping" distance measure. Based on the dynamic time warping (DTW) technique, the method proposed in Berndt and Clifford (1994) predefines some patterns to serve as templates for the purpose of pattern detection. To align two time series, P and Q, using DTW, an n-by-m matrix M is first constructed. The (i-th, j-th) element of the matrix, m_ij, contains the distance d(q_i, p_j) between the two points q_i and p_j, and the Euclidean distance is typically used, i.e. d(q_i, p_j) = (q_i − p_j)^2. It corresponds to the alignment between the points q_i and p_j. A warping path, W, is a contiguous set of matrix elements that defines a mapping between Q and P. Its k-th element is defined as w_k = (i_k, j_k), and

W = w_1, w_2, ..., w_k, ..., w_K    (2)

where max(m, n) ≤ K < m + n − 1.

The warping path is typically subjected to the following constraints: boundary conditions, continuity and monotonicity. The boundary conditions are w_1 = (1, 1) and w_K = (m, n). This requires the warping path to start and finish in diagonally opposite corner cells. The next constraint is continuity. Given w_k = (a, b), then w_{k−1} = (a', b'), where a − a' ≤ 1 and b − b' ≤ 1. This restricts the allowable steps in the warping path to the adjacent cells, including the diagonally adjacent cell. Also, the constraints a − a' ≥ 0 and b − b' ≥ 0 force the points in W to be monotonically spaced in time. There is an exponential number of warping paths satisfying the above conditions. However, only the path that minimizes the warping cost is of interest. This path can be efficiently found by using dynamic programming (Berndt and Clifford, 1996) to evaluate the following recurrence, which defines the cumulative distance g(i, j) as the distance d(i, j) found in the current cell plus the minimum of the cumulative distances of the adjacent elements:

g(i, j) = d(q_i, p_j) + min{g(i−1, j−1), g(i−1, j), g(i, j−1)}    (3)

A warping path W such that the "distance" between the two series is minimized can be calculated by

DTW(Q, P) = min_W [ Σ_{k = 1 .. K} d(w_k) ]    (4)

where d(w_k) can be defined as

d(w_k) = d(q_{i_k}, p_{j_k}) = (q_{i_k} − p_{j_k})^2    (5)

Detailed treatment can be found in Kruskall and Liberman (1983).

As DTW is computationally expensive, different methods have been proposed to speed up the DTW matching process. Different constraint (banding) methods, which control the subset of the matrix that the warping path is allowed to visit, are reviewed in Ratanamahatana and Keogh (2004). Yi et al. (1998) introduce a technique for approximate indexing of DTW that utilizes a FastMap technique, which filters the non-qualifying series. Kim et al. (2001) propose an indexing approach under the DTW similarity measure. Keogh and Pazzani (2000b) introduce a modification of DTW which integrates with PAA and operates on a higher level abstraction of the time series. An exact indexing approach based on representing the time series by PAA for the DTW similarity measure is further proposed by Keogh (2002). An iterative deepening dynamic time warping (IDDTW) is suggested by Chu et al. (2002), which is based on a probabilistic model of the approximate errors for all levels of approximation prior to the
query process. Chan et al. (2003) propose a filtering process based on the Haar wavelet transformation from low resolution approximation of the real-time warping distance. Shou et al. (2005) use an APCA approximation to compute the lower bounds for the DTW distance. They improve the global bound proposed by Kim et al. (2001), which can be used to index the segments, and propose a multi-step query processing technique. A FastDTW is proposed by Salvador and Chan (2004). This method uses a multi-level approach that recursively projects a solution from a coarse resolution and refines the projected solution. Similarly, a fast DTW search method, FTW, is proposed by Sakurai et al. (2005) for efficiently pruning a significant number of search candidates. Ratanamahatana and Keogh (2005) clarify some points about DTW which are related to lower bounding and speed. Euachongprasit and Ratanamahatana (2008) also focus on this problem. A sequentially indexed structure (SIS) is proposed by Ruengronghirunya et al. (2009) to balance the tradeoff between indexing efficiency and I/O cost during DTW similarity measure. A lower bounding function for groups of time series, LBG, is adopted.

On the other hand, Keogh and Pazzani (2001) point out the potential problems of DTW: it can lead to unintuitive alignments, where a single point on one time series maps onto a large subsection of another time series. Also, DTW may fail to find obvious and natural alignments in two time series because of a single feature (i.e. peak, valley, inflection point, plateau, etc.).
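The DTW recurrence above can be evaluated directly with the classic O(nm) dynamic program; the sketch below fills the cumulative distance matrix g and returns the minimal (squared-distance) warping cost, without any of the banding or lower-bounding speedups just surveyed:

```python
import numpy as np

def dtw(q, p):
    """Evaluate g(i, j) = d(q_i, p_j) + min(g(i-1, j-1), g(i-1, j), g(i, j-1))
    over the full n-by-m matrix and return the total warping cost."""
    n, m = len(q), len(p)
    g = np.full((n + 1, m + 1), np.inf)
    g[0, 0] = 0.0  # boundary condition: the path starts at (1, 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (q[i - 1] - p[j - 1]) ** 2  # squared pointwise distance
            g[i, j] = d + min(g[i - 1, j - 1], g[i - 1, j], g[i, j - 1])
    return float(g[n, m])  # the path ends at (n, m)
```

For example, `dtw([1, 2, 3], [1, 2, 2, 3])` is `0.0`: the repeated value 2 is absorbed by the warping, whereas the pointwise Euclidean distance is not even defined for these unequal lengths.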
One of the causes is the great difference between the lengths of the compared series. Therefore, besides improving the performance of DTW, methods have also been proposed to improve the accuracy of DTW. Keogh and Pazzani (2001) propose a modification of DTW that considers the higher level feature of shape for better alignment. Ratanamahatana and Keogh (2004) propose to learn arbitrary constraints on the warping path. Regression time warping (RTW) is proposed by Lei and Govindaraju (2004) to address the challenges of shifting, scaling, robustness and complexity. Latecki et al. (2005) propose a method called minimal variance matching (MVM) for elastic matching. It determines a subsequence of the time series that best matches a query series by finding the cheapest path in a directed acyclic graph. A segment-wise time warping distance (STW) is proposed by Zhou and Wong (2005) for time scaling search. Fu et al. (2008a) propose a scaled and warped matching (SWM) approach for handling both DTW and uniform scaling simultaneously. Different customized DTW techniques have been applied to the field of music research for query by humming (Zhu and Shasha, 2003; Arentz et al., 2005).

Focusing on similar problems as DTW, the longest common subsequence (LCSS) model (Vlachos et al., 2002) is proposed. The LCSS is a variation of the edit distance, and the basic idea is to match two sequences by allowing them to stretch, without rearranging the sequence of the elements, but allowing some elements to be unmatched. One of the important advantages of LCSS over DTW is the consideration of outliers. Chen et al. (2005a) further introduce a distance function based on edit distance on real sequence (EDR), which is robust against data imperfection. Morse and Patel (2007) propose a Fast Time Series Evaluation (FTSE) method which can be used to evaluate the threshold value of these kinds of techniques in a faster way. Threshold-based distance functions are proposed by Aßfalg et al.
(2006). The proposed function considers the intervals during which a time series exceeds a certain threshold, comparing these rather than the exact time series values. A T-Time application is developed (Aßfalg et al., 2008) to demonstrate its usage. Fu et al. (2007) further suggest introducing rules to govern the pattern matching process if a priori knowledge exists in the given domain. A parameter-light distance measure method based on Kolmogorov complexity theory is suggested in Keogh et al. (2007b). Compression-based dissimilarity measure (CDM, proposed by Keogh et al. (2004) to compare the co-compressibility of data sets) is adopted in this paper. Chen et al. (2005b) present a histogram-based representation for similarity measurement. Similarly, a histogram-based similarity measure, bag-of-patterns (BOP), is proposed by Lin and Li (2009), based on the frequency of occurrence of each pattern in the time series.
Tak-chung Fu / Engineering Applications of Artificial Intelligence 24 (2011) 164–181
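The LCSS idea described above — match elements that fall within a tolerance ε, leave outliers unmatched — can be sketched as a small dynamic program. This is an illustrative reading of the model, not the cited implementation; the ε default, the optional time-window parameter, and the normalization by the shorter length are our own choices.

```python
def lcss(a, b, eps=0.5, delta=None):
    """Normalized LCSS similarity between two real-valued sequences.

    Two points match when they differ by at most eps (and, if delta is
    given, when their indices differ by at most delta). Returns the
    matched length divided by the shorter sequence length, in [0, 1].
    """
    n, m = len(a), len(b)
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            close = abs(a[i - 1] - b[j - 1]) <= eps
            in_window = delta is None or abs(i - j) <= delta
            if close and in_window:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                # Skip (leave unmatched) whichever element is cheaper to drop.
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L[n][m] / min(n, m)

# An outlier (100) is simply left unmatched, so the similarity stays 1.0:
print(lcss([1, 2, 100, 3], [1, 2, 3], eps=0.1))
```

This illustrates the advantage over DTW noted above: DTW must align every point, so a single outlier contaminates the distance, whereas LCSS can skip it.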
Parrots are among the most fascinating and intelligent birds in the world. They are known for their vibrant colors, social behavior, and, most notably, their ability to mimic human speech. Here is a detailed introduction to these remarkable creatures:
Origin and Habitat: Parrots are native to tropical and subtropical regions, with the majority of species found in South America, Central America, Africa, and Southeast Asia. They inhabit a variety of environments, including rainforests, savannas, and mangroves.
Physical Characteristics: Parrots are characterized by their strong curved beaks, which are adapted for cracking seeds and nuts. They have strong legs with zygodactyl feet, meaning two toes point forward and two backward, an adaptation for perching. Their plumage is typically colorful and can vary greatly between species, from the bright green of the budgerigar to the striking blue and yellow of the macaw.
Diet: Parrots are omnivorous, with their diet consisting mainly of fruits, nuts, seeds, and sometimes insects. Some species have even been known to eat small animals and snails.
Social Behavior: Parrots are highly social birds. They often live in flocks and communicate with a variety of calls and sounds. They are known for their playful and curious nature, which makes them popular pets.
Reproduction: Parrots usually mate for life. They build nests in tree cavities or on cliffs, and the female lays a clutch of eggs which both parents take turns incubating. After hatching, the chicks are fed by the parents with a special nutrient-rich food called crop milk.
Cognitive Abilities: Parrots are renowned for their intelligence. They have the ability to learn and mimic sounds, including human speech. Some species, like the African Grey parrot, are known to have a vocabulary of several hundred words and can understand the context in which words are used.
Conservation Status: Unfortunately, many parrot species are threatened by habitat loss, poaching for the pet trade, and the illegal wildlife trade. Conservation efforts are in place to protect these birds and their habitats.
Caring for Parrots as Pets: If you decide to have a parrot as a pet, it's important to provide a large cage with plenty of space to fly, a varied diet, and plenty of mental stimulation. Parrots are social creatures and require interaction to stay happy and healthy.
Cultural Significance: In many cultures, parrots are seen as symbols of freedom, love, and beauty. They are also featured in various mythologies and folklore, often as messengers or guides.
In conclusion, parrots are not just beautiful to look at but are also intelligent and social creatures with unique behaviors and needs. Understanding and respecting these aspects of their nature is crucial for anyone who wishes to share their life with a parrot.
DISTRIBUTED INTERACTIVE SIMULATION FOR GROUP-DISTANCE EXERCISES ON THE WEB
Erik Berglund and Henrik Eriksson
Department of Computer and Information Science
Linköping University
S-581 83 Linköping, Sweden
E-mail: {eribe, her}@ida.liu.se
KEYWORDS: Distributed Interactive Simulation, Distance Education, Network, Internet, Personal Computer
ABSTRACT
In distributed-interactive simulation (DIS), simulators act as elements of a bigger distributed simulation. A group-distance exercise (GDE) based on the DIS approach can therefore enable group training for group members participating from different locations. Our GDE approach, unlike full-scale DIS systems, uses affordable simulators designed for standard hardware available in homes and offices. ERCIS (group distance exERCISe) is a prototype GDE system that we have implemented. It takes advantage of the Internet and Java to provide distributed simulation at a fraction of the cost of full-scale DIS systems. ERCIS illustrates that distributed simulation can bring advanced training to office and home computers in the form of GDE systems. The focus of this paper is to discuss the possibilities and the problems of GDE and of web-based distributed simulation as a means to provide GDE.
INTRODUCTION
Simulators can be valuable tools in education. They can reduce the cost of training and can allow training in hazardous situations (Berkum & Jong 1991). Distributed-interactive simulation (DIS) originated in military applications, where simulators from different types of forces were connected to form full battle situations. In DIS, individual simulators act as elements of a bigger distributed simulation (Loper & Seidensticker 1993). Thus, DIS could be used to create a group-distance exercise (GDE), where the participants perform a group exercise from different locations.
Even though DIS systems based on complex special-hardware simulators provide impressive training tools, the cost and immobility of these systems prohibit mass training. ERCIS (group distance exERCISe) is a prototype GDE system that uses Internet technologies to provide affordable DIS support. Internet (or intranet) technologies form a solid platform for GDE systems because they are readily available, and because they provide a high level of support for network communication and for graphical simulation. ERCIS, therefore, takes advantage of the programming language Java to combine group training, distance education and real-time interaction at a fraction of the cost of full-scale DIS systems. In this paper we discuss the possibilities and the problems of GDE and of web-based distributed simulation as a means to provide GDE. We do this by discussing and drawing conclusions from the ERCIS project.
BACKGROUND
Let us first provide some background on GDE, DIS, distributed objects, ERCIS's military application, and related work.
Group-Distance Exercise (GDE)
The purpose of GDE is to enable group training in distance education through the use of DIS. Unlike full-scale DIS systems, our GDE approach assumes simulators designed for standard hardware available in homes and offices. This approach calls for software-based simulators, which are less expensive to use, can be multiplied virtually limitlessly, and can enable training with expensive, dangerous and/or non-existent equipment. A thorough background on the GDE concept can be found in Computer-Based Group-Distance Exercise (Berglund 1997).
Distributed Interactive Simulation (DIS)
DIS originated as a means to utilize military simulators in full battle situations by connecting them (Loper & Seidensticker 1993).
As a result, it becomes possible to combine the use of advanced simulators and group training. The different parts of DIS systems communicate according to predefined data packets (IEEE 1995) that describe all necessary data at the bit level. The implementation of the communication is, therefore, built into DIS systems.
Distributed Objects
Distributed objects (Orfali et al. 1996) can be characterized as network-transient objects: objects that can bridge networks. Two issues must be addressed when dealing with distributed objects: locating them over the network, and transforming them from abstract data to a transportation format and vice versa. The common object request broker architecture (CORBA) is a standard protocol for distributed objects, developed by the Object Management Group (OMG). CORBA is used to cross both networks and programming languages. In CORBA, all objects are distributed via the object request broker (ORB). Objects requesting service of a CORBA object have no knowledge about the location or the implementation of the CORBA object (Vinoski 1997). Remote method invocation (RMI) is Java's support for distributed objects among Java programs. Unlike CORBA, RMI only provides the protocol to locate and distribute abstract data. In the ERCIS project we chose RMI because ERCIS is an all-Java application. It also provided us with an opportunity to assess Java's support for distributed objects.
RBS-70 Missile Unit
ERCIS supports training of the RBS-70 missile unit of the Swedish anti-aircraft defense. The RBS-70 missile unit's main purpose is to defend objects, for instance bridges, against enemy aircraft attacks; see Figure 1. The RBS-70 missile unit is composed of sub-units: two intelligence units and nine combat units. The intelligence units use radar to discover and calculate flight data of hostile aircraft. Guided by the intelligence unit, the combat units engage the aircraft with RBS-70 missiles.
(Personal Communication).
Figure 1. The RBS-70 missile unit. Its main purpose is to defend ground targets.
During training, the RBS-70 unit uses simulators, for instance, to simulate radar images. All or part of the RBS-70 unit's actual equipment is still used (Personal Communication).
Related Work
In military applications, there are several examples of group training conducted using distributed simulation, for instance, MIND (Jenvald 1996), Janus, Eagle, the Brigade/Battalion Simulation (Loper & Seidensticker 1993), and ABS 2000. C3Fire (Granlund 1997) is an example of a tool for group training, in the area of emergency management, that uses distributed simulation. A common high-level architecture for modeling and simulation, which will address a broader range of simulators than DIS, is being developed (Duncan 1996). There are several educational applets on the web that use simulation; see for instance the Gamelan applet repository. These applets are, however, generally small and there are few, if any, distributed simulations.
ERCIS
ERCIS is a prototype GDE system implemented in Java for the Internet (or intranets). It supports education of the RBS-70 missile unit by creating a DIS of that group's environment. We have used RMI to implement the distribution of the system. ERCIS has two principal components: the equipment simulators and the simulator server; see Figure 2. A maximum of 11 group members can participate in a single ERCIS session, representing the 11 sub-unit leaders; see Figure 3. The group members join by loading an HTML document with an embedded equipment-simulator applet.
Figure 2. The principal parts of ERCIS. The equipment simulators are connected to one another through the simulator server.
Figure 3. A full ERCIS session. 11 equipment simulators can be active in one ERCIS session at a time.
The equipment simulators communicate via the simulator server.
Simulator Server
The simulator server controls a microworld of the group's environment, including simulated aircraft, the exercise scenario and geographical information. The simulator server also distributes network communication among the equipment simulators. The reason for this client-server type of communication is that Java applets are generally only allowed to connect to the computer they were loaded from (Flanagan 1997). Finally, the simulator server functions as a point of reference by which the distributed parts locate one another. The simulator-server computer is therefore the only computer specified prior to an ERCIS session.
Equipment Simulators
The equipment simulators simulate equipment used by the RBS-70 sub-units and also function as user interfaces for the group members. There are two types of equipment simulators: the intelligence-unit equipment simulator and the combat-unit equipment simulator.
Intelligence Unit Equipment Simulator
The intelligence-unit equipment simulator (see Figure 4) contains a radar simulator and a target-tracking simulator. The radar simulator monitors the air space. The target-tracking simulator performs part of the work of three intelligence-unit personnel to approximate the position, speed and course of three aircraft simultaneously. The intelligence-unit leader's task is to distribute the hostile aircraft among the combat units and to send approximated information to them.
Figure 4. The user interface of the intelligence unit equipment simulator, running on a Sun Solaris Applet Viewer.
Combat Unit Equipment Simulator
The combat unit equipment simulator (see Figure 5) contains simulators of the target-data receiver and of the RBS-70 missile launcher and operator. The target-data receiver presents information sent from the intelligence unit.
The information is recalculated relative to the combat unit's position and shows information such as the distance and direction to the target. The RBS-70 missile launcher and operator simulator represents the missile launcher and its crew. Based on the intelligence unit's information, it can locate, track and fire upon the target. The combat-unit leader's task is to assess the situation and to grant permission to fire if all criteria are met.
Figure 5. The user interface of the combat unit equipment simulator, running on a Windows 95 Applet Viewer.
DISCUSSION
Let us then, with experience from the ERCIS project, discuss the problems and possibilities of GDE through DIS and of the support Internet and Java provide for GDE.
Pedagogical value
Educators can use GDE to introduce group training at an early stage by, for instance, simplifying equipment and thereby abstracting from technical details. Thus, GDE can focus the training on group performance. GDE could also be used to support knowledge recapitulation for expert practitioners. Reliving the group's tasks and environment can provide a more vivid experience than notes and books. GDE systems can automate the collection and evaluation of performance statistics. It is possible to log sessions and replay them to visualize comments on performance in after-action reviews (Jenvald 1996). To fully assess the value of GDE, however, real-world testing is required.
Security
Access control, software authentication, and communication encryption are examples of security issues that concern GDE and distributed simulators. Java 1.1 provides basic security, which includes fine-grained access control and signed applets. The use of dedicated GDE intranets would increase security, especially access control. It would, however, reduce the ability to choose the location from which to participate. Security restrictions, motivated or not, limit applications.
ERCIS was designed with a client-server type of communication because web browsers enforce security restrictions on applets (Flanagan 1997). Peer-to-peer communication would have been more suitable, from the perspective of both implementation and scalability. We are not saying that web browsers should give applets total freedom. In ERCIS, it would have sufficed if applets were allowed to make remote calls to RMI objects regardless of their location.
Performance characteristics
Our initial concerns about the speed of RMI calls and of data transfer over the Internet proved to be unfounded. The speed of communication is not a limiting factor for ERCIS. For instance, a modem link (28.8 kilobits per second) is sufficient to participate in exercises. Instead, the speed of animation limits ERCIS. To provide smooth animation, ERCIS requires more than the standard hardware of today, for instance a Pentium Pro machine or better.
Scalability
In response to increased network load, ERCIS scales relatively well, because the volume of the data transmitted among the distributed parts is very small, on the order of 1 Kbyte. Incorporating new and better simulators in ERCIS requires considerable programming effort. In a full-scale GDE system it could be beneficial to modularize the simulators in a plug-and-play fashion, to allow variable simulator complexity.
Download time
The download time for applets the size of ERCIS's equipment simulator can be very long. One way to overcome this problem is to create Java archive files (JAR files). JAR files aggregate many files into one and also compress them, decreasing the download time considerably. Push technology such as Marimba's Castanet could also be used to provide automatic distribution of the equipment-simulator software.
Distributed objects
Distributed objects, such as RMI, provide a high level of abstraction in network communication compared to the DIS protocol.
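RMI itself is Java-specific, but the two issues the Distributed Objects section identifies — locating a remote object and marshalling data to a transport format and back — can be illustrated in a language-neutral way with Python's standard xmlrpc module. This is a rough analogy only, not how ERCIS works; the loopback address and port are arbitrary choices of ours.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Server side: register an object whose methods become remotely callable.
# The well-known address plays the role a naming/registry service plays
# in RMI or CORBA: it is how clients locate the object over the network.
server = SimpleXMLRPCServer(("127.0.0.1", 8901), logRequests=False)
server.register_function(lambda x, y: x + y, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the proxy marshals arguments into a transport format,
# ships them over the network, and unmarshals the result -- the caller
# never sees the remote object's location or implementation.
proxy = ServerProxy("http://127.0.0.1:8901")
print(proxy.add(2, 3))
```

The abstraction is the point: the client-side call looks like a local one, whereas the DIS protocol requires each system to implement bit-level packet handling itself.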
There are several examples of typical distributed applications that do not utilize distributed objects but that would benefit greatly from this approach. Two examples are the Nuclear Power Plant applet (Eriksson 1997), and NASA's distributed control of the Sojourner.
CONCLUSION
ERCIS is a GDE prototype that can be used in training under teacher supervision or as part of a web site where web pages provide additional information. The system illustrates that the GDE approach can provide equipment-free mass training, which is beneficial, especially in military applications where training can be extremely expensive. Java proved to be a valuable tool for the implementation of ERCIS. Java's level of abstraction is high in the two areas that concern ERCIS: animation and distributed objects. Java's speed of animation is, however, too slow to enable acceptable performance for highly graphics-oriented simulators. Apart from this, Java supplied the support that can be expected from a programming language such as C++. Using RMI to implement distribution was straightforward. Compared to the DIS protocol, RMI provides a flexible and dynamic communication protocol. In conclusion, ERCIS illustrates that it is possible to use Internet technologies to develop affordable DIS systems. It also shows that distributed simulations can bring advanced training to office and home computers in the form of GDE systems.
Acknowledgments
We would like to thank Major Per Bergström at the Center for Anti-Aircraft Defense in Norrtälje, Sweden for supplying domain knowledge of the RBS-70 missile unit. This work has been supported in part by the Swedish National Board for Industrial and Technical Development (Nutek) grant no. 93-3233, and by the Swedish Research Council for Engineering Science (TFR) grant no. 95-186.
REFERENCES
Berglund E. (1997) Computer-Based Group Distance Exercise, M.Sc. thesis no.
97/36, Department of Computer and Information Science, Linköping University (http://www.ida.liu.se/~eribe/publication/GDE.zip: compressed postscript file).
van Berkum J, de Jong T. (1991) Instructional environments for simulations, Education & Computing vol. 6: 305-358.
Duncan C. (1996) The DoD High Level Architecture and the Next Generation of DIS, Proceedings of the Fourteenth Workshop on Interoperability of Distributed Simulation, Orlando, Florida.
Eriksson H. (1996) Expert Systems as Knowledge Servers, IEEE Expert vol. 11 no. 3: 14-19.
Flanagan D. (1997) Java in a Nutshell, 2nd Edition, O'Reilly, Sebastopol, CA.
Granlund R. (1997) C3Fire: A Microworld Supporting Emergency Management Training, licentiate thesis no. 598, Department of Computer and Information Science, Linköping University.
IEEE (1995) IEEE Standard for Distributed Interactive Simulation--Application Protocols, IEEE 1278.1-1995 (Standard): IEEE.
Jenvald J. (1996) Simulation and Data Collection in Battle Training, licentiate thesis no. 567, Department of Computer and Information Science, Linköping University.
Loper M, Seidensticker S. (1994) The DIS Vision: A Map to the Future of Distributed Simulation, Orlando, Florida: Institute for Simulation & Training (/SISO/dis/library/vision.doc)
Orfali R, Harkey D, Edwards J. (1996) The Essential Distributed Objects Survival Guide, John Wiley, New York.
Vinoski S. (1997) CORBA: Integrating Diverse Applications Within Distributed Heterogeneous Environments, IEEE Communications, vol. 14, no.
2.
RESOURCES ON THE WEB
The OMG home page: /CORBA
JavaSoft's Java 1.1 documentation: /products/jdk/1.1/docs/index.html
The Gamelan applet repository: /
Marimba's Castanet home page: http://www.marimba.com/products/castanet.html
The Nuclear Power Plant Applet (Eriksson 1995): http://www.ida.liu.se/~her/npp/demo.html
NASA's Sojourner, the technical details on the control distribution of NASA's Sojourner: /features/1997/july/juicy.wits.details.html
Authors
Erik Berglund is a doctoral student of computer science at Linköping University. His research interests include knowledge acquisition, program understanding, software engineering, and computer-supported education. He received his M.Sc. at Linköping University in 1997.
Henrik Eriksson is an assistant professor of computer science at Linköping University. His research interests include expert systems, knowledge acquisition, reusable problem-solving methods and medical informatics. He received his M.Sc. and Ph.D. at Linköping University in 1987 and 1991. He was a postdoctoral fellow and research scientist at Stanford University between 1991 and 1994. Since 1996, he has been a guest researcher at the Swedish Institute for Computer Science (SICS).
arXiv:0806.1256v1 [physics.soc-ph] 7 Jun 2008
Understanding individual human mobility patterns
Marta C. González,1,2 César A. Hidalgo,1 and Albert-László Barabási1,2,3
1 Center for Complex Network Research and Department of Physics and Computer Science, University of Notre Dame, Notre Dame IN 46556.
2 Center for Complex Network Research and Department of Physics, Biology and Computer Science, Northeastern University, Boston MA 02115.
3 Center for Cancer Systems Biology, Dana Farber Cancer Institute, Boston, MA 02115.
(Dated: June 7, 2008)
Despite their importance for urban planning [1], traffic forecasting [2], and the spread of biological [3,4,5] and mobile viruses [6], our understanding of the basic laws governing human motion remains limited owing to the lack of tools to monitor the time-resolved location of individuals. Here we study the trajectory of 100,000 anonymized mobile phone users whose position is tracked for a six-month period. We find that, in contrast with the random trajectories predicted by the prevailing Lévy flight and random walk models [7], human trajectories show a high degree of temporal and spatial regularity, each individual being characterized by a time-independent characteristic length scale and a significant probability to return to a few highly frequented locations. After correcting for differences in travel distances and the inherent anisotropy of each trajectory, the individual travel patterns collapse into a single spatial probability distribution, indicating that, despite the diversity of their travel history, humans follow simple reproducible patterns. This inherent similarity in travel patterns could impact all phenomena driven by human mobility, from epidemic prevention to emergency response, urban planning and agent-based modeling.
Given the many unknown factors that influence a population's mobility patterns, ranging from means of transportation to job- and family-imposed restrictions and priorities, human trajectories are often
approximated with various random walk or diffusion models [7,8]. Indeed, early measurements on albatrosses, bumblebees, deer and monkeys [9,10] and more recent ones on marine predators [11] suggested that animal trajectories are approximated by a Lévy flight [12,13], a random walk whose step size ∆r follows a power-law distribution P(∆r) ∼ ∆r^(−(1+β)) with β < 2. While the Lévy statistics for some animals require further study [14], Brockmann et al. [7] generalized this finding to humans, documenting that the distribution of distances between consecutive sightings of nearly half a million bank notes is fat-tailed. Given that money is carried by individuals, bank note dispersal is a proxy for human movement, suggesting that human trajectories are best modeled as a continuous-time random walk with fat-tailed displacement and waiting time distributions [7]. A particle following a Lévy flight has a significant probability to travel very long distances in a single step [12,13], which appears to be consistent with human travel patterns: most of the time we travel only over short distances, between home and work, while occasionally we take longer trips.
Each consecutive sighting of a bank note reflects the composite motion of two or more individuals, who owned the bill between two reported sightings. Thus it is not clear if the observed distribution reflects the motion of individual users, or some hitherto unknown convolution between population-based heterogeneities and individual human trajectories. Contrary to bank notes, mobile phones are carried by the same individual during his/her daily routine, offering the best proxy to capture individual human trajectories [15,16,17,18,19].
We used two data sets to explore the mobility pattern of individuals. The first (D1) consists of the mobility patterns recorded over a six-month period for 100,000 individuals selected randomly from a sample of over 6 million anonymized mobile phone users. Each time a user initiates or receives a call or SMS, the location of the
tower routing the communication is recorded, allowing us to reconstruct the user's time-resolved trajectory (Figs. 1a and b). The time between consecutive calls follows a bursty pattern [20] (see Fig. S1 in the SM), indicating that while most consecutive calls are placed soon after a previous call, occasionally there are long periods without any call activity. To make sure that the obtained results are not affected by the irregular call pattern, we also study a data set (D2) that captures the location of 206 mobile phone users, recorded every two hours for an entire week. In both datasets the spatial resolution is determined by the local density of the more than 10^4 mobile towers, registering movement only when the user moves between areas serviced by different towers. The average service area of each tower is approximately 3 km^2 and over 30% of the towers cover an area of 1 km^2 or less.
To explore the statistical properties of the population's mobility patterns we measured the distance between users' positions at consecutive calls, capturing 16,264,308 displacements for the D1 and 10,407 displacements for the D2 datasets. We find that the distribution of displacements over all users is well approximated by a truncated power law
P(∆r) = (∆r + ∆r_0)^(−β) exp(−∆r/κ),   (1)
with β = 1.75 ± 0.15, ∆r_0 = 1.5 km and cutoff values κ|D1 = 400 km and κ|D2 = 80 km (Fig. 1c, see the SM for statistical validation). Note that the observed scaling exponent is not far from β_B = 1.59 observed in Ref. [7] for bank note dispersal, suggesting that the two distributions may capture the same fundamental mechanism driving human mobility patterns.
Equation (1) suggests that human motion follows a truncated Lévy flight [7]. Yet, the observed shape of P(∆r) could be explained by three distinct hypotheses: A. Each individual follows a Lévy trajectory with jump size distribution given by (1). B. The observed distribution captures a population-based heterogeneity, corresponding to the inherent differences between individuals. C. A population-based heterogeneity coexists with
individual Lévy trajectories, hence (1) represents a convolution of hypotheses A and B.
To distinguish between hypotheses A, B and C we calculated the radius of gyration for each user (see Methods), interpreted as the typical distance traveled by user a when observed up to time t (Fig. 1b). Next, we determined the radius of gyration distribution P(r_g) by calculating r_g for all users in samples D1 and D2, finding that they also can be approximated with a truncated power law
P(r_g) = (r_g + r_g^0)^(−β_r) exp(−r_g/κ),   (2)
with r_g^0 = 5.8 km, β_r = 1.65 ± 0.15 and κ = 350 km (Fig. 1d, see SM for statistical validation). Lévy flights are characterized by a high degree of intrinsic heterogeneity, raising the possibility that (2) could emerge from an ensemble of identical agents, each following a Lévy trajectory. Therefore, we determined P(r_g) for an ensemble of agents following a Random Walk (RW), Lévy Flight (LF) or Truncated Lévy Flight (TLF) (Figure 1d) [8,12,13]. We find that an ensemble of Lévy agents displays a significant degree of heterogeneity in r_g, yet it is not sufficient to explain the truncated power-law distribution P(r_g) exhibited by the mobile phone users. Taken together, Figs. 1c and d suggest that the difference in the range of typical mobility patterns of individuals (r_g) has a strong impact on the truncated Lévy behavior seen in (1), ruling out hypothesis A.
If individual trajectories are described by an LF or TLF, then the radius of gyration should increase in time as r_g(t) ∼ t^(3/(2+β)) [21,22], while for a RW r_g(t) ∼ t^(1/2). That is, the longer we observe a user, the higher the chances that she/he will travel to areas not visited before. To check the validity of these predictions we measured the time dependence of the radius of gyration for users whose gyration radius would be considered small (r_g(T) ≤ 3 km), medium (20 < r_g(T) ≤ 30 km) or large (r_g(T) > 100 km) at the end of our observation period (T = 6 months). The results indicate that the time dependence of the average radius of gyration of mobile phone users is better
approximated by a logarithmic increase, not only a manifestly slower dependence than the one predicted by a power law, but one that may appear similar to a saturation process (Fig. 2a and Fig. S4).
In Fig. 2b, we have chosen users with similar asymptotic r_g(T) after T = 6 months, and measured the jump size distribution P(∆r|r_g) for each group. As the inset of Fig. 2b shows, users with small r_g travel mostly over small distances, whereas those with large r_g tend to display a combination of many small and a few larger jump sizes. Once we rescale the distributions with r_g (Fig. 2b), we find that the data collapses onto a single curve, suggesting that a single jump size distribution characterizes all users, independent of their r_g. This indicates that P(∆r|r_g) ∼ r_g^(−α) F(∆r/r_g), where α ≈ 1.2 ± 0.1 and F(x) is an r_g-independent function with asymptotic behavior F(x < 1) ∼ x^(−α), rapidly decreasing for x ≫ 1. Therefore the travel patterns of individual users may be approximated by a Lévy flight up to a distance characterized by r_g.
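The truncated power law of Eq. (1) can be sampled numerically, which makes its short-displacement dominance concrete. This is an illustrative sketch using simple rejection sampling with the D1 parameters quoted in the text; the proposal range, seed and sample count are arbitrary choices of ours.

```python
import math
import random

# Parameters reported in the text for dataset D1 (dr0 and kappa in km).
BETA, DR0, KAPPA = 1.75, 1.5, 400.0

def p(dr):
    """Unnormalized P(dr) = (dr + dr0)^(-beta) * exp(-dr / kappa), Eq. (1)."""
    return (dr + DR0) ** (-BETA) * math.exp(-dr / KAPPA)

def sample(rng, dr_max=1000.0):
    """Rejection-sample one displacement from p on [0, dr_max].

    p is monotonically decreasing, so its maximum is at dr = 0, which
    gives a valid (if inefficient) envelope for a uniform proposal.
    """
    peak = p(0.0)
    while True:
        x = rng.uniform(0.0, dr_max)
        if rng.uniform(0.0, peak) < p(x):
            return x

rng = random.Random(42)
draws = [sample(rng) for _ in range(2000)]
# Most displacements are short, as in the data: the median is a few km,
# even though the fat tail occasionally produces very long jumps.
print(sorted(draws)[len(draws) // 2])
```

With these parameters the distribution is dominated by short trips, matching the observation above that we mostly travel between nearby locations and only occasionally take long trips.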
Most important, however, is the fact that the individual trajectories are bounded beyond r_g; thus large displacements, which are the source of the distinct and anomalous nature of Lévy flights, are statistically absent. To understand the relationship between the different exponents, we note that the measured probability distributions are related by P(∆r) = ∫_0^∞ P(∆r|r_g) P(r_g) dr_g, which suggests (see SM) that up to the leading order we have β = β_r + α − 1, consistent, within error bars, with the measured exponents. This indicates that the observed jump size distribution P(∆r) is in fact the convolution between the statistics of individual trajectories P(∆r|r_g) and the population heterogeneity P(r_g), consistent with hypothesis C.
To uncover the mechanism stabilizing r_g we measured the return probability for each individual, F_pt(t) [22], defined as the probability that a user returns to the position where he/she was first observed after t hours (Fig. 2c). For a two-dimensional random walk F_pt(t) should follow ∼ 1/(t ln(t)^2) [22]. In contrast, we find that the return probability is characterized by several peaks at 24 h, 48 h, and 72 h, capturing a strong tendency of humans to return to locations they visited before, describing the recurrence and temporal periodicity inherent to human mobility [23,24].
To explore if individuals return to the same location over and over, we ranked each location based on the number of times an individual was recorded in its vicinity, such that a location with L = 3 represents the third most visited location for the selected individual. We find that the probability of finding a user at a location with a given rank L is well approximated by P(L) ∼ 1/L, independent of the number of locations visited by the user (Fig. 2d). Therefore people devote most of their time to a few locations, while spending their remaining time in 5 to 50 places, visited with diminished regularity. Therefore, the observed logarithmic saturation of r_g(t) is rooted in the high degree of regularity in their daily travel
patterns, captured by the high return probabilities (Fig. 2c) to a few highly frequented locations (Fig. 2d).

An important quantity for modeling human mobility patterns is the probability Φ_a(x, y) of finding an individual a at a given position (x, y). As is evident from Fig. 1b, individuals live and travel in different regions, yet each user can be assigned to a well-defined area, defined by home and workplace, where she or he can be found most of the time. We can compare the trajectories of different users by diagonalizing each trajectory's inertia tensor, providing the probability of finding a user at a given position (see Fig. 3a) in the user's intrinsic reference frame (see SM for the details). A striking feature of Φ(x, y) is its prominent spatial anisotropy in this intrinsic reference frame (note the different scales in Fig. 3a), and we find that the larger an individual's r_g the more pronounced is this anisotropy. To quantify this effect we defined the anisotropy ratio S ≡ σ_y/σ_x, where σ_x and σ_y represent the standard deviation of the trajectory measured in the user's intrinsic reference frame (see SM). We find that S decreases monotonically with r_g (Fig. 3c), being well approximated by S ~ r_g^(-η), with η ≈ 0.12. Given the small value of the scaling exponent, other functional forms may offer an equally good fit, thus mechanistic models are required to identify whether this represents a true scaling law or only a reasonable approximation to the data.

To compare the trajectories of different users we remove the individual anisotropies, rescaling each user trajectory with its respective σ_x and σ_y. The rescaled Φ̃(x/σ_x, y/σ_y) distribution (Fig. 3b) is similar for groups of users with considerably different r_g, i.e., after the anisotropy and the r_g dependence are removed all individuals appear to follow the same universal Φ̃(x̃, ỹ) probability distribution. This is particularly evident in Fig. 3d, where we show the cross section Φ̃(x/σ_x, 0) for the three groups of users, finding that apart from the noise in the data the curves are
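The diagonalization of the inertia (gyration) tensor described above can be sketched compactly. The following Python function is an illustrative reimplementation (not the authors' code): it rotates a 2-D trajectory into its intrinsic reference frame and returns the anisotropy ratio S = σ_y/σ_x:

```python
import numpy as np

def intrinsic_frame(points):
    """Rotate a 2-D trajectory into its principal (intrinsic) axes.

    Diagonalises the gyration tensor of the visited points and returns
    the rotated coordinates plus the anisotropy ratio S = sigma_y/sigma_x,
    with sigma_x taken along the major axis so that S <= 1.
    """
    pts = np.asarray(points, dtype=float)
    centred = pts - pts.mean(axis=0)
    tensor = centred.T @ centred / len(pts)   # 2x2 gyration (inertia) tensor
    evals, evecs = np.linalg.eigh(tensor)     # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]           # put the major axis first
    rotated = centred @ evecs[:, order]
    sigma_x, sigma_y = rotated.std(axis=0)
    return rotated, sigma_y / sigma_x
```

For an ellipse with semi-axes 3 and 1, for instance, the function recovers S = 1/3 regardless of how the trajectory is oriented in the original (x, y) frame.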
indistinguishable.

Taken together, our results suggest that the Lévy statistics observed in bank note measurements capture a convolution of the population heterogeneity (2) and the motion of individual users. Individuals display significant regularity, as they return to a few highly frequented locations, like home or work. This regularity does not apply to the bank notes: a bill always follows the trajectory of its current owner, i.e. dollar bills diffuse, but humans do not.

The fact that individual trajectories are characterized by the same r_g-independent two-dimensional probability distribution Φ̃(x/σ_x, y/σ_y) suggests that key statistical characteristics of individual trajectories are largely indistinguishable after rescaling. Therefore, our results establish the basic ingredients of realistic agent-based models, requiring us to place users in numbers proportional to the population density of a given region and assign each user an r_g taken from the observed P(r_g) distribution. Using the predicted anisotropic rescaling, combined with the density function Φ̃(x, y), whose shape is provided as Table 1 in the SM, we can obtain the likelihood of finding a user in any location. Given the known correlations between spatial proximity and social links, our results could help quantify the role of space in network development and evolution [25, 26, 27, 28, 29] and improve our understanding of diffusion processes [8, 30].

We thank D. Brockmann, T. Geisel, J. Park, S. Redner, Z. Toroczkai and P. Wang for discussions and comments on the manuscript. This work was supported by the James S. McDonnell Foundation 21st Century Initiative in Studying Complex Systems, the National Science Foundation within the DDDAS (CNS-0540348), ITR (DMR-0426737) and IIS-0513650 programs, and the U.S. Office of Naval Research Award N00014-07-C. Data analysis was performed on the Notre Dame Biocomplexity Cluster supported in part by NSF MRI Grant No. DBI-0420980. C. A. Hidalgo acknowledges support from the Kellogg Institute at Notre Dame.

Supplementary Information is
linked to the online version of the paper at /nature.

Author Information Correspondence and requests for materials should be addressed to A.-L.B. (e-mail: alb@)

[1] Horner, M. W. & O'Kelly, M. E. Embedding economies of scale concepts for hub network design. Journal of Transportation Geography 9, 255-265 (2001).
[2] Kitamura, R., Chen, C., Pendyala, R. M. & Narayaran, R. Micro-simulation of daily activity-travel patterns for travel demand forecasting. Transportation 27, 25-51 (2000).
[3] Colizza, V., Barrat, A., Barthélémy, M., Valleron, A.-J. & Vespignani, A. Modeling the worldwide spread of pandemic influenza: baseline case and containment interventions. PLoS Medicine 4, 095-0110 (2007).
[4] Eubank, S., Guclu, H., Kumar, V. S. A., Marathe, M. V., Srinivasan, A., Toroczkai, Z. & Wang, N. Controlling epidemics in realistic urban social networks. Nature 429, 180 (2004).
[5] Hufnagel, L., Brockmann, D. & Geisel, T. Forecast and control of epidemics in a globalized world. Proceedings of the National Academy of Sciences of the United States of America 101, 15124-15129 (2004).
[6] Kleinberg, J. The wireless epidemic. Nature 449, 287-288 (2007).
[7] Brockmann, D., Hufnagel, L. & Geisel, T. The scaling laws of human travel. Nature 439, 462-465 (2006).
[8] Havlin, S. & ben-Avraham, D. Diffusion in disordered media. Advances in Physics 51, 187-292 (2002).
[9] Viswanathan, G. M., Afanasyev, V., Buldyrev, S. V., Murphy, E. J., Prince, P. A. & Stanley, H. E. Lévy flight search patterns of wandering albatrosses. Nature 381, 413-415 (1996).
[10] Ramos-Fernandez, G., Mateos, J. L., Miramontes, O., Cocho, G., Larralde, H. & Ayala-Orozco, B. Lévy walk patterns in the foraging movements of spider monkeys (Ateles geoffroyi). Behavioral Ecology and Sociobiology 55, 223-230 (2004).
[11] Sims, D. W. et al. Scaling laws of marine predator search behaviour. Nature 451, 1098-1102 (2008).
[12] Klafter, J., Shlesinger, M. F. & Zumofen, G. Beyond Brownian motion. Physics Today 49, 33-39 (1996).
[13] Mantegna, R. N. & Stanley, H. E. Stochastic process with ultraslow convergence to a Gaussian: the truncated Lévy flight. Physical Review
Letters 73, 2946-2949 (1994).
[14] Edwards, A. M., Phillips, R. A., Watkins, N. W., Freeman, M. P., Murphy, E. J., Afanasyev, V., Buldyrev, S. V., da Luz, M. G. E., Raposo, E. P., Stanley, H. E. & Viswanathan, G. M. Revisiting Lévy flight search patterns of wandering albatrosses, bumblebees and deer. Nature 449, 1044-1049 (2007).
[15] Sohn, T., Varshavsky, A., LaMarca, A., Chen, M. Y., Choudhury, T., Smith, I., Consolvo, S., Hightower, J., Griswold, W. G. & de Lara, E. Lecture Notes in Computer Science: Proc. 8th International Conference UbiComp 2006 (Springer, Berlin, 2006).
[16] Onnela, J.-P., Saramäki, J., Hyvönen, J., Szabó, G., Lazer, D., Kaski, K., Kertész, K. & Barabási, A.-L. Structure and tie strengths in mobile communication networks. Proceedings of the National Academy of Sciences of the United States of America 104, 7332-7336 (2007).
[17] González, M. C. & Barabási, A.-L. Complex networks: from data to models. Nature Physics 3, 224-225 (2007).
[18] Palla, G., Barabási, A.-L. & Vicsek, T. Quantifying social group evolution. Nature 446, 664-667 (2007).
[19] Hidalgo, C. A. & Rodriguez-Sickert, C. The dynamics of a mobile phone network. Physica A 387, 3017-3024 (2008).
[20] Barabási, A.-L. The origin of bursts and heavy tails in human dynamics. Nature 435, 207-211 (2005).
[21] Hughes, B. D. Random Walks and Random Environments (Oxford University Press, USA, 1995).
[22] Redner, S. A Guide to First-Passage Processes (Cambridge University Press, UK, 2001).
[23] Schlich, R. & Axhausen, K. W. Habitual travel behaviour: evidence from a six-week travel diary. Transportation 30, 13-36 (2003).
[24] Eagle, N. & Pentland, A. Eigenbehaviors: identifying structure in routine. Submitted to Behavioral Ecology and Sociobiology (2007).
[25] Yook, S.-H., Jeong, H. & Barabási, A.-L. Modeling the Internet's large-scale topology. Proceedings of the National Academy of Sciences 99, 13382-13386 (2002).
[26] Caldarelli, G. Scale-Free Networks: Complex Webs in Nature and Technology (Oxford University Press, USA, 2007).
[27] Dorogovtsev, S. N. & Mendes, J. F. F. Evolution of Networks: From Biological Nets to the Internet and WWW (Oxford University Press, USA, 2003).
[28] Song, C. M., Havlin,
S. & Makse, H. A. Self-similarity of complex networks. Nature 433, 392-395 (2005).
[29] González, M. C., Lind, P. G. & Herrmann, H. J. A system of mobile agents to model social networks. Physical Review Letters 96, 088702 (2006).
[30] Cecconi, F., Marsili, M., Banavar, J. R. & Maritan, A. Diffusion, peer pressure, and tailed distributions. Physical Review Letters 89, 088102 (2002).

FIG. 1: Basic human mobility patterns. a, Week-long trajectories of 40 mobile phone users indicate that most individuals travel only over short distances, but a few regularly move over hundreds of kilometers. Panel b displays the detailed trajectory of a single user. The different phone towers are shown as green dots, and the Voronoi lattice in grey marks the approximate reception area of each tower. The dataset studied by us records only the identity of the closest tower to a mobile user; thus we can not identify the position of a user within a Voronoi cell. The trajectory of the user shown in b is constructed from 186 two-hourly reports, during which the user visited a total of 12 different locations (tower vicinities). Among these, the user is found on 96 and 67 occasions at the two most preferred locations, the frequency of visits for each location being shown as a vertical bar. The circle represents the radius of gyration centered on the trajectory's center of mass. c, Probability density function P(Δr) of travel distances obtained for the two studied datasets D1 and D2. The solid line indicates a truncated power law whose parameters are provided in the text (see Eq. 1). d, The distribution P(r_g) of the radius of gyration measured for the users, where r_g(T) was measured after T = 6 months of observation. The solid line represents a similar truncated power-law fit (see Eq. 2). The dotted, dashed and dot-dashed curves show P(r_g) obtained from the standard null models (RW, LF and TLF), where for the TLF we used the same step size distribution as the one measured for the mobile phone users.

FIG. 2: The bounded nature of human trajectories. a, Radius of gyration ⟨r_g(t)⟩ vs time for
mobile phone users separated into three groups according to their final r_g(T), where T = 6 months. The black curves correspond to the analytical predictions for the random walk models, increasing in time as ⟨r_g(t)⟩_LF,TLF ~ t^(3/(2+β)) (solid) and ⟨r_g(t)⟩_RW ~ t^0.5 (dotted). The dashed curves correspond to a logarithmic fit of the form A + B ln(t), where A and B depend on r_g. b, Probability density function of individual travel distances P(Δr|r_g) for users with r_g = 4, 10, 40, 100 and 200 km. As the inset shows, each group displays a quite different P(Δr|r_g) distribution. After rescaling the distance and the distribution with r_g (main panel), the different curves collapse. The solid line (power law) is shown as a guide to the eye. c, Return probability distribution F_pt(t). The prominent peaks capture the tendency of humans to regularly return to the locations they visited before, in contrast with the smooth asymptotic behavior ~1/(t ln²(t)) (solid line) predicted for random walks. d, A Zipf plot showing the frequency of visits to different locations. The symbols correspond to users that have been observed to visit n_L = 5, 10, 30, and 50 different locations. Denoting by L the rank of a location listed in order of visit frequency, the data are well approximated by R(L) ~ L^(-1).
The inset is the same plot in linear scale, illustrating that 40% of the time individuals are found at their first two preferred locations.

FIG. 3: The shape of human trajectories. a, The probability density function Φ(x, y) of finding a mobile phone user at a location (x, y) in the user's intrinsic reference frame (see SM for details). The three plots, from left to right, were generated for 10,000 users with: r_g ≤ 3, 20 < r_g ≤ 30 and r_g > 100 km. The trajectories become more anisotropic as r_g increases. b, After scaling each position with σ_x and σ_y the resulting Φ̃(x/σ_x, y/σ_y) has approximately the same shape for each group. c, The change in the shape of Φ(x, y) can be quantified by calculating the isotropy ratio S ≡ σ_y/σ_x as a function of r_g, which decreases as S ~ r_g^(-0.12) (solid line). Error bars represent the standard error. d, Φ̃(x/σ_x, 0), representing the x-axis cross section of the rescaled distribution Φ̃(x/σ_x, y/σ_y) shown in b.
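The 40% figure quoted above follows directly from the Zipf law P(L) ~ 1/L of Fig. 2d. A minimal Python check (illustrative only, using nothing but the normalized 1/L weights):

```python
import numpy as np

def top_k_share(n_locations, k=2):
    """Fraction of time spent at the k most-visited locations,
    assuming visitation frequencies P(L) proportional to 1/L."""
    p = 1.0 / np.arange(1, n_locations + 1)
    p /= p.sum()          # normalise the Zipf weights
    return p[:k].sum()

# Share of time at the two top locations for users visiting 5..50 places
for n in (5, 10, 30, 50):
    print(n, round(top_k_share(n), 3))
```

For users visiting between 10 and 50 locations the share falls between roughly a third and a half, bracketing the 40% quoted in the caption.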
Example analysis of biodiversity survey data with R package gradientForest

C. Roland Pitcher, Nick Ellis, Stephen J. Smith

March 21, 2011

Contents
1 Introduction
2 Gradient Forest basics
  2.1 Data
  2.2 gradientForest analysis
  2.3 gradientForest plots
3 Gradient Forest predictions
  3.1 Transforming predictors
  3.2 Biplot of the biological space
  3.3 Mapping in geographic space
  3.4 A clustered version
4 Session information
References

1 Introduction

R package gradientForest [Ellis et al., 2010] develops flexible non-parametric functions to quantify multi-species compositional turnover along environmental gradients. The flexibility comes from the method's origins in Random Forests [Breiman, 2001]; specifically R package randomForest [Liaw and Wiener, 2002]. This document provides an example to demonstrate the use of gradientForest for ecological analysis of biodiversity survey data. A multi-regional application is provided by [Pitcher et al., 2010]. The document assumes some familiarity both with R and with community analysis. The example has some analogues with constrained ordination methods and with Generalized Dissimilarity Modelling [Ferrier et al., 2007], which are both complementary. Package gradientForest includes functions for plotting non-linear responses in composition along environmental gradients, and for using these responses to transform environmental data layers to biological scales. The transformed multi-dimensional biological space can be represented as a biplot and can be mapped in geographic space. This example demonstrates typical scripts for a gradient forest analysis, and provides in the package a sites-by-species (rows-by-columns) matrix and a matching sites-by-environment (rows-by-columns) dataframe. The number of rows and their order must match between these two data objects. The data should not include NAs. It is assumed that users will be familiar with the data-processing steps necessary to produce such data objects.

2 Gradient Forest basics

2.1 Data

The example data provided in
package gradientForest are real species data from a cross-shelf survey in the far northern Great Barrier Reef of northeast Australia [Poiner et al., 1998, Burridge et al., 2006]. Of >1,000 species observed, a subset of 110 are included from 197 of the sites sampled. The environmental data include 28 predictors, either measured at each site or attributed to each site by interpolation [Pitcher et al., 2002].

> require(gradientForest)
> load("GZ.sps.mat.Rdata")
> dim(Sp_mat)
[1] 197 110
> load("GZ.phys.site.Rdata")
> dim(Phys_site)
[1] 197 28

2.2 gradientForest analysis

The function gradientForest is a wrapper function that calls extendedForest, a modified version of randomForest, and collates its output across all the species in the data matrix. The key modification in extendedForest extracts the numerous tree split values along each predictor gradient and their associated fit improvement, for each predictor in each tree in the forests, and returns that information to gradientForest.

Like randomForest, extendedForest assesses the importance of each variable for prediction accuracy; information that is further collated and processed by gradientForest. Often, however, predictor variables are correlated. The standard approach in random forests assesses the marginal importance of a predictor by randomly permuting each predictor in turn, across all sites in the dataset, and calculating the degradation in prediction performance of each tree. Package extendedForest can account for correlated predictors by implementing conditional permutation [Ellis et al., 2010], following the strategy outlined by Strobl et al. [2008]. In conditional permutation, the predictor to be assessed is permuted only within blocks of the dataset defined by splits in the given tree on any other predictors correlated above a certain threshold (e.g.
r = 0.5) and up to a maximum number of splits set by the maxLevel option (if required).

> nSites <- dim(Sp_mat)[1]
> nSpecs <- dim(Sp_mat)[2]
> lev <- floor(log2(nSites * 0.368/2))
> lev
[1] 5

The gradientForest call may take several minutes to run. Other options that can be set include the number of trees (typically 500), whether the splits should be compacted into bins (advised to prevent memory problems for large datasets) and the number of bins, and the correlation threshold for conditional permutation. The summary shows the number of species with positive R², i.e. those species that could be predicted to any extent by the available predictors. The returned object is a list containing the data, predictor importances, species R²'s and other information described in the html help pages under Value.

> gf <- gradientForest(cbind(Phys_site, Sp_mat),
+     predictor.vars = colnames(Phys_site), response.vars = colnames(Sp_mat),
+     ntree = 500, transform = NULL, compact = T,
+     nbin = 201, maxLevel = lev, corr.threshold = 0.5)
> gf
A forest of 500 regression trees for each of 90 species
Call:
gradientForest(data = cbind(Phys_site, Sp_mat), predictor.vars = colnames(Phys_site),
    response.vars = colnames(Sp_mat), ntree = 500, transform = NULL,
    maxLevel = lev, corr.threshold = 0.5, compact = T, nbin = 201)
Important variables:
[1] BSTRESS MUD     S_AV    Si_AV   CHLA_AV
> names(gf)
 [1] "X"               "Y"               "result"
 [4] "overall.imp"     "overall.imp2"    "ntree"
 [7] "imp.rsq"         "species.pos.rsq" "ranForest.type"
[10] "res"             "res.u"           "dens"
[13] "call"

2.3 gradientForest plots

Several types of plots are available for the gradientForest object. The first is the predictor overall importance plot. This shows the mean accuracy importance and the mean importance weighted by species R²; in this example, both are conditional importance. Seabed stress and sediment mud fraction are clearly the most important variables across these 89 species.

> plot(gf, plot.type = "O")

[Figure: predictor overall importance plot, showing accuracy importance and R²-weighted importance for each predictor; BSTRESS and MUD rank highest on both measures.]
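The conditional permutation scheme described in Section 2.2 can be illustrated outside of R. The Python fragment below is a simplified stand-in (not extendedForest's implementation, which uses blocks defined by the actual tree splits): it permutes one predictor only within blocks defined by binning a correlated predictor, so the permuted values stay consistent with that correlated variable:

```python
import numpy as np

def conditional_permute(xj, xk, n_bins=4, seed=None):
    """Permute predictor xj only within blocks defined by binning the
    correlated predictor xk (a simplified analogue of permuting within
    tree-split blocks, as in conditional variable importance)."""
    rng = np.random.default_rng(seed)
    xj = np.asarray(xj)
    # Quantile bin edges on the correlated predictor define the blocks
    edges = np.quantile(xk, np.linspace(0, 1, n_bins + 1)[1:-1])
    blocks = np.digitize(xk, edges)
    out = np.empty_like(xj)
    for b in np.unique(blocks):
        idx = np.where(blocks == b)[0]
        out[idx] = rng.permutation(xj[idx])  # shuffle inside the block only
    return out
```

Importance is then measured as the drop in prediction accuracy when xj is replaced by conditional_permute(xj, xk); because the permutation respects the blocks, only the contribution of xj beyond its correlation with xk is destroyed.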
The predictor gradient plots are best presented in order of importance; in this example the top 25 predictors are presented in 5 by 5 panels.

> most_important <- names(importance(gf))[1:25]
> par(mgp = c(2, 0.75, 0))

The second plot is the splits density plot (plot.type = "S"), which shows binned split importance and location on each gradient (spikes), kernel density of splits (black lines), of observations (red lines) and of splits standardised by observations density (blue lines). Each distribution integrates to predictor importance. These show where important changes in the abundance of multiple species are occurring along the gradient; they indicate a composition change rate. Many of the usual plot options can be set in the call.

> plot(gf, plot.type = "S", imp.vars = most_important,
+     leg.posn = "topright", cex.legend = 0.4, cex.axis = 0.6,
+     b = 0.7, line.ylab = 0.9, par.args = list(mgp = c(1.5,
+         0.5, 0), mar = c(3.1, 1.5, 0.1, 1)))

[Figure: splits density plots for the 25 most important predictors, one panel per predictor.]

The third plot is the species cumulative plot
(plot.type = "C", show.overall = F), which for each species shows cumulative importance distributions of splits improvement scaled by R²-weighted importance, and standardised by density of observations. These show cumulative change in abundance of individual species, where changes occur on the gradient, and the species changing most on each gradient. Again many of the usual plot options can be set in the call; in this example the legend identifies the top 5 most responsive species for each predictor.

> plot(gf, plot.type = "C", imp.vars = most_important,
+     show.overall = F, legend = T, leg.posn = "topleft",
+     leg.nspecies = 5, b = 0.7, cex.legend = 0.4,
+     cex.axis = 0.6, line.ylab = 0.9, par.args = list(mgp = c(1.5,
+         0.5, 0), mar = c(2.5, 1, 0.1, 0.5), omi = c(0,
+         0.3, 0, 0)))

[Figure: species cumulative importance curves for the 25 most important predictors, with the five most responsive species identified in each panel legend.]

The fourth plot is the predictor cumulative plot (plot.type = "C", show.species = F), which for each predictor shows cumulative importance distributions of splits improvement scaled by R²-weighted importance, and standardised by density of observations, averaged over all species. These show cumulative change in overall composition of the community, where changes occur on the gradient. Again many of the
usual plot options can be set in the call; in this example common.scale = T ensures that plots for all predictors have the same y-scale as the most important predictor.

> plot(gf, plot.type = "C", imp.vars = most_important,
+     show.species = F, common.scale = T, cex.axis = 0.6,
+     b = 0.7, line.ylab = 0.9, par.args = list(mgp = c(1.5,
+         0.5, 0), mar = c(2.5, 1, 0.1, 0.5), omi = c(0,
+         0.3, 0, 0)))

[Figure: predictor cumulative importance curves for the 25 most important predictors, on a common y-scale.]

The fifth plot shows the R² measure of the fit of the random forest model for each species, ordered in various ways.

> plot(gf, plot.type = "P", s = F, horizontal = F,
+     cex.axis = 1, bels = 0.7, line = 2.5)

[Figure: overall performance of random forests over species, plotting R² against species performance rank.]

Several other alternative formats of the R² fit performance plot are available, e.g.:

> plot(gf, plot.type = "P", s = T, horizontal = F,
+     cex.axis = 1, bels = 0.7, line = 2.5)
> plot(gf, plot.type = "P", s = F, horizontal = T,
+     cex.axis = 1, bels = 0.6, line = 2.5)
> plot(gf, plot.type = "P", s = T, horizontal = T,
+     cex.axis = 1, bels = 0.6, line = 2.5)

3 Gradient Forest predictions

In addition to examining compositional change along environmental gradients, the predictor cumulative functions can also be used to transform grid data layers of environmental variables to a common biological importance scale. This transformation of the multi-dimensional environment space is to a biological space in which coordinate position represents composition associated with the predictors. These inferred compositional patterns can be mapped in biological space and geographic space in a manner analogous to ordination, but taking into account the non-linear and sometimes threshold changes that occur along gradients.

3.1 Transforming predictors

The example provided includes gridded environmental variables for a roughly 10,000 km² area of the far northern Great Barrier Reef where the biological surveys were conducted. The data include North and East coordinates plus 28 predictors at 8,682 grid cells. The grid data must include the same predictors with the same names as the sites included in the gradientForest call.

> load("GZ.phys.grid.Rdata")
> dim(Phys_grid)
[1] 8682 30
> names(Phys_grid)
 [1] "NORTH"   "EAST"    "BATHY"   "SLOPE"   "ASPECT"
 [6] "BSTRESS" "CRBNT"   "GRAVEL"  "SAND"    "MUD"
[11] "NO3_AV"  "NO3_SR"  "PO4_AV"  "PO4_SR"  "O2_AV"
[16] "O2_SR"   "S_AV"    "S_SR"    "T_AV"    "T_SR"
[21] "Si_AV"   "Si_SR"   "CHLA_AV" "CHLA_SR" "K490_AV"
[26] "K490_SR" "SST_AV"  "SST_SR"  "BIR_AV"  "BIR_SR"

The grid variables are transformed using the gradientForest predict function.

> imp.vars <- names(importance(gf))
> Trns_grid <- cbind(Phys_grid[, c("EAST", "NORTH")],
+     predict(gf, Phys_grid[, imp.vars]))

It is useful to also transform the site environmental predictors, which are available from gf$X.

> Trns_site <- predict(gf)

3.2 Biplot of the biological space

The multi-dimensional biological space can most effectively be represented by taking the principal components of
the transformed grid and presenting the first two dimensions in a biplot. It must be acknowledged that while most variation in patterns is captured by the first dimensions, additional compositional pattern contained in the higher dimensions is not shown. A user-defined RGB colour palette is set up based on the first 3 dimensions.

> PCs <- prcomp(Trns_grid[, imp.vars])
> a1 <- PCs$x[, 1]
> a2 <- PCs$x[, 2]
> a3 <- PCs$x[, 3]
> r <- a1 + a2
> g <- -a2
> b <- a3 + a2 - a1
> r <- (r - min(r))/(max(r) - min(r)) * 255
> g <- (g - min(g))/(max(g) - min(g)) * 255
> b <- (b - min(b))/(max(b) - min(b)) * 255

The environmental variables may be shown as vectors, perhaps limited to the most important predictors; in this example, variables to show as vectors are selected.

> nvs <- dim(PCs$rotation)[1]
> vec <- c("BSTRESS", "MUD", "SST_AV", "T_AV", "CHLA_AV",
+     "SAND", "CRBNT", "GRAVEL")
> lv <- length(vec)
> vind <- rownames(PCs$rotation) %in% vec
> scal <- 40
> xrng <- range(PCs$x[, 1], PCs$rotation[, 1]/scal) * 1.1
> yrng <- range(PCs$x[, 2], PCs$rotation[, 2]/scal) * 1.1
> plot((PCs$x[, 1:2]), xlim = xrng, ylim = yrng,
+     pch = ".", cex = 4, col = rgb(r, g, b, max = 255),
+     asp = 1)
> points(PCs$rotation[!vind, 1:2]/scal, pch = "+")
> arrows(rep(0, lv), rep(0, lv), PCs$rotation[vec,
+     1]/scal, PCs$rotation[vec, 2]/scal, length = 0.0625)
> jit <- 0.0015
> text(PCs$rotation[vec, 1]/scal + jit * sign(PCs$rotation[vec,
+     1]), PCs$rotation[vec, 2]/scal + jit * sign(PCs$rotation[vec,
+     2]), labels = vec)

Different coordinate positions in the biplot represent differing compositions, as associated with the predictors. Further information may be added to the biplot including the location of sites in biological space, the weighted mean location of species, and selected species may be identified interactively.

[Figure: biplot of the biological space, coloured by the RGB palette, with predictor vectors overlaid.]

> PCsites <- predict(PCs, Trns_site[, imp.vars])
> points(PCsites[, 1:2])
> SpsWtd <- sweep(gf$Y, 2, apply(gf$Y, 2, min),
+     "-")
> SpsWtdPCs <- (t(SpsWtd) %*% (PCsites[, 1:2]))/colSums(SpsWtd)
> points(SpsWtdPCs, col = "red", pch = "+")

If required, the abundance of any given species may be plotted on the biplot. For example the first species from gf$Y
= A1010102, an alga from the family Caulerpaceae that appears to prefer carbonate gravelly sand areas with moderate bedstress and lower temperature.

> sp <- colnames(SpsWtd)[1]
> points(PCsites[, 1:2], col = "blue", cex = SpsWtd[,
+     sp]/2)

Alternatively, specifically named examples could be plotted: e.g. E4030373, a Fungiid coral; M2020101, a Strombid mollusc; or S1010671, a Perophorid ascidian, to name a few.

3.3 Mapping in geographic space

The biplot and the colour palette in the previous section can be used as a key to visualise compositional patterns mapped in geographic space. The following map plots predicted PC scores in geographic coordinates, using the same colour palette as above, and represents continuous changes in inferred compositional patterns associated with the predictors.

> plot(Trns_grid[, c("EAST", "NORTH")], pch = ".",
+     cex = 3, asp = 1, col = rgb(r, g, b, max = 255))

[Figure: map of inferred compositional pattern in geographic coordinates (EAST vs NORTH).]

3.4 A clustered version

Some applications may require a hard clustered output, representing inferred assemblages, rather than a continuous representation of biodiversity composition. The following example uses clara to make 8 clusters. This is a fast clustering algorithm suitable for large data sets. The medoids are labelled and the colour key takes the value for each medoid. Other clustering methods may be used (for example, pam would take several minutes) as alternatives, and their various cluster diagnostics may provide a guide to the appropriate number of
clusters.

> require(cluster)
> ncl <- 8
> clPCs <- clara(PCs$x, ncl, sampsize = 1000)
> medcolR <- r[clPCs$i.med]
> medcolG <- g[clPCs$i.med]
> medcolB <- b[clPCs$i.med]
> plot((PCs$x[, 1:2]), xlim = xrng, ylim = yrng,
+     pch = ".", cex = 4, col = rgb(medcolR[clPCs$clustering],
+         medcolG[clPCs$clustering], medcolB[clPCs$clustering],
+         max = 255), asp = 1)
> points(PCs$rotation[!vind, 1:2]/scal, pch = "+")
> arrows(rep(0, lv), rep(0, lv), PCs$rotation[vec,
+     1]/scal, PCs$rotation[vec, 2]/scal, length = 0.0625)
> text(PCs$rotation[vec, 1]/scal + jit * sign(PCs$rotation[vec,
+     1]), PCs$rotation[vec, 2]/scal + jit * sign(PCs$rotation[vec,
+     2]), labels = vec)
> text(clPCs$medoids[, 1:2], labels = seq(1, ncl))
> legend("bottomleft", as.character(seq(1, ncl)),
+     pch = 15, cex = 1, col = rgb(medcolR, medcolG,
+         medcolB, max = 255))

[Figure: clustered biplot, with cluster medoids labelled 1-8 and a colour key.]

> plot(Trns_grid[, c("EAST", "NORTH")], pch = ".",
+     cex = 3, asp = 1, col = rgb(medcolR[clPCs$clustering],
+         medcolG[clPCs$clustering], medcolB[clPCs$clustering],
+         max = 255))
> points(Trns_grid[clPCs$i.med, c("EAST", "NORTH")],
+     pch = as.character(seq(1, ncl)))
> legend("bottomleft", as.character(seq(1, ncl)),
+     pch = 15, cex = 1, col = rgb(medcolR, medcolG,
+         medcolB, max = 255))

[Figure: clustered map in geographic coordinates, with medoid locations labelled 1-8.]

4 Session information

The simulation and output in this document were generated in the following computing environment.

- R version 2.12.2 Patched (2011-03-18 r54866), x86_64-unknown-linux-gnu
- Locale: LC_CTYPE=en_US.UTF-8, LC_NUMERIC=C, LC_TIME=en_US.UTF-8, LC_COLLATE=C, LC_MONETARY=C, LC_MESSAGES=en_US.UTF-8, LC_PAPER=en_US.UTF-8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.UTF-8, LC_IDENTIFICATION=C
- Base packages: base, datasets, grDevices, graphics, methods, stats, utils
- Other packages: cluster 1.13.3, extendedForest 1.4, gradientForest 0.1-11, lattice 0.19-17
- Loaded via a namespace (and not attached): grid 2.12.2, tools 2.12.2

References

L. Breiman. Random Forests. Machine Learning, 45(1):5-32, 2001.

C. Y. Burridge, C. R. Pitcher, B. J. Hill, T. J. Wassenberg, and I. R. Poiner. A comparison of demersal communities in an area closed to trawling
with those in adjacent areas open to trawling: a study in the Great Barrier Reef Marine Park, Australia. Fisheries Research, 79:64-74, 2006.

N. Ellis, S. J. Smith, and C. R. Pitcher. Gradient forests: calculating importance gradients on physical predictors. Submitted manuscript, 2010.

S. Ferrier, G. Manion, J. Elith, and K. Richardson. Using generalized dissimilarity modelling to analyse and predict patterns of beta diversity in regional biodiversity assessment. Diversity and Distributions, 13(3):252-264, 2007.

A. Liaw and M. Wiener. Classification and regression by randomForest. R News, 2(3):18-22, 2002. URL /doc/Rnews/.

C. R. Pitcher, W. Venables, N. Ellis, I. McLeod, M. Cappo, F. Pantus, M. Austin, P. Doherty, and N. Gribble. GBR seabed biodiversity mapping project: Phase 1 report to CRC-Reef. Technical report, CSIRO/AIMS/QDPI Report, 2002. URL .au/resprogram/programC/seabed/Seabedphase1rpt.htm.

C. R. Pitcher, P. Lawton, N. Ellis, S. J. Smith, L. S. Incze, C-L. Wei, M. E. Greenlaw, N. H. Wolff, J. Sameoto, and P. V. R. Snelgrove. The role of physical environmental variables in shaping patterns of biodiversity composition in seabed assemblages. Submitted manuscript, 2010.

I. R. Poiner, J. Glaister, C. R. Pitcher, C. Burridge, T. Wassenberg, N. Gribble, B. Hill, S. J. M. Blaber, D. A. Milton, D. Brewer, et al. The environmental effects of prawn trawling in the far northern section of the Great Barrier Reef Marine Park: 1991-1996. Final Report to GBRMPA and FRDC, 1998. URL http://www.publish.csiro.au/books/bookpage.cfm?PID=2419.

C. Strobl, A. L. Boulesteix, T. Kneib, T. Augustin, and A. Zeileis. Conditional variable importance for random forests. BMC Bioinformatics, 9(1):307, 2008.
UNIT 4 Academic Quality Test
Multiple-choice section
Part One: Listening Comprehension (two sections, 30 points)
Section One (5 questions; ___ points each; ___ points in total)
Listen to the following 5 conversations.
After each conversation there is one question; choose the best answer from the three options A, B and C given for that question.
After each conversation you will have 10 seconds to answer the question and to read the next one.
Each conversation will be played only once.
1. How much should the woman pay? __B__
   A. £2.85.   B. £.   C. £.
2. What is the probable relationship between the speakers? __A__
   A. Teacher and student.   B. Doctor and patient.   C. Passenger and conductor.
3. What is the weather like now? __A__
   A. It's raining.   B. It's clear.   C. It's windy.
4. Why did the man go back to the office? __A__
   A. He wanted to get the important things.
   B. He wanted to find the lost key.
   C. He went back to lock the office door.
5. Where does the conversation probably take place? __A__
   A. At an office.   B. In a library.   C. In a park.
Section Two (15 questions; ___ points each; ___ points in total)
Listen to the following 5 conversations or monologues.
Each conversation or monologue is followed by several questions; choose the best answer from the three options A, B and C given for each question.
Before listening to each conversation or monologue you will have time to read the questions, 5 seconds per question; after listening, 5 seconds will be allowed for answering each question.
Each conversation or monologue will be played twice.
Listen to Material 6 and answer Questions 6 and 7.
6. What did the man use to take pictures? __B__
   A. A helicopter.   B. A drone.   C. A smartphone.
7. What takes a lot of practice for the man? __C__
   A. Driving a helicopter.   B. Taking pictures.   C. Controlling a drone.
Listen to Material 7 and answer Questions 8 and 9.
Songjiang District, 2023 academic year, second semester mock examination (quality monitoring paper)
Senior Three English (total score: 140 points; time allowed: 120 minutes)  April 2024
Notes for candidates:
1. This examination consists of a question paper and an answer sheet. The question paper contains the questions and the answering requirements. All answers must be marked (multiple-choice questions) or written (non-multiple-choice questions) on the answer sheet; answers written on the question paper will receive no credit.
2. Before answering, be sure to fill in your school, class, name and examination number on the answer sheet.
3. The answer sheet and the question paper correspond one-to-one in question numbering; take special care when answering not to put answers in the wrong places.
Ⅰ.Listening ComprehensionSection ADirections: In Section A, you will hear ten short conversations between two speakers. At the end of each conversation, a question will be asked about what was said. The conversations and the questions will be spoken only once. After you hear a conversation and the question about it, read the four possible answers on your paper, and decide which one is the best answer to the question you have heard.1.A.At 8:00. B.At 8:15. C.At 8:30. D.At 8:45.2.A.A professor. B.A coach. C.An engineer. D.A nurse.3.A.In a restaurant. B.In a hairdres ser’s.C.At a cinema. D.At a tailor’s. 4.A.Ways to visit a university. B.Two student tour guides.C.A tour of Fudan University. D.The campus of Fudan University.5.A.They did not make it there finally.B.They were not well received there.C.They experienced something unpleasant on the way.D.They had a wonderful time before they arrived there.6.A.Excited. B.Interested. C.Confused. D.Annoyed. 7.A.Practice the presentation in front of him. B.Watch how he makes a presentation. C.Reduce the time spent in practicing. D.Find out who her audience will be.8.A.She is always absent-minded. B.She forgot to tell the man about it.C.She is unclear about Sophie’s plan.D.She slipped in the neighboring town. 9.A.Because it took him much time to go to work.B.Because he had to save money for his journey.C.Because the job arranged many business journeys.D.Because he considered it unlucky to have that job.10.A.Buy a new printer with less noise. B.Ask the man to borrow a printer.C.Read a book on how to fix the printer. D.Get someone to repair the printer.Section BDirections: In Section B, you will hear two passages and one longer conversation. After each passage or conversation, you will be asked several questions. The passages and the conversation will be read twice, but the questions will be spoken only once. 
When you hear a question, read the four possible answers on your paper, and decide which one is the best answer to the question you have heard.Questions 11 through 13 are based on the following passage.11.A.How encores came into existence. B.How bands perform encores properly.C.Why audiences used to need encores. D.Why encores are part of a performance. 12.A.The 17th century. B.The 18th century. C.The 19th century. D.The 20th century. 13.A.French people were more interested in encores than others.B.Bands usually prepare more than two encores for each show.C.Recording technology boosted audiences’ needs for encores.D.Musicians can get recharged during the break before encores.Questions 14 through 16 are based on the following passage.14.A.Because of the rule for the class. B.Because of the course materials.C.Because the speaker changed his topics. D.Because the speaker disliked technology. 15.A.The students do not assess the speaker’s class fairly.B.The students are satisfied with the class environment.C.The speaker did not favor leaving technology at the door.D.The speaker were worried about students’ evaluation on him.16.A.It will stop students getting on well together.B.It may help students better understand themes.C.It will improve teaching effect by giving students more help.D.It may distract students from digging deep within themselves.Questions 17 through 20 are based on the following conversation.17.A.Doctor and patient. B.Salesman and customer.C.Teacher and student. D.Employer and employee.18.A.Fishing industry. B.Statistics. C.Computer modeling. D.Note-taking. 19.A.She is good at making model computers. B.She has decided on the title of the essay. C.She is uninterested in coping with statistics. D.She has always been weak at note-taking. 20.A.Learn to take notes immediately. B.Find out possible strategies alone.C.Read for more useful information. D.Work on her weaknesses by herself. 
Ⅱ.Grammar and VocabularySection ADirections: After reading the passage below, fill in the blanks to make the passage coherent and grammatically correct. For the blanks with a given word, fill in each blank with the proper form of the given word; for the other blanks, use one word that best fits each blank.Remote Work Slows Senior Housing Market RecoveryWith the rise of remote work, the market for senior housing has met with problems in its recovery. Only a few old people choose to live in senior-living communities (21)______the growing senior population and the cancelation of COVID-19 restrictions once making family visits difficult.(22)______ this trend suggests is that people’s shift to remote work contributes to the slow rebound of the senior housing market. That is, remote work is keeping many older Americans from moving into senior-living communities once warmly (23)______(welcome).When more adults began working remotely during the pandemic(流行病), they were able to check in on aging parents easily —they (24)______ take care of their parents’ issues on short notice.Experts have been analyzing the phenomenon in different ways. Some found that the greater flexibility to care for parents (25)______(mean)people’s delay in sending aged parents to expensive senior-housing accommodations. Therefore, markets with high levels of people working from home usually have lower senior-housing occupancy rates. Others said remote work might have some effect but also pointed to different factors. For instance, many seniors think that their family wallets are getting thinner, making some of them reluctant (26)______(send)to senior-living communities.The age at which people enter senior housing is also increasing, (27)______serves as another sign that shows people are choosing to delay transitioning. The rising cost of senior living weighs heavily on that decision. 
The CPI (consumer-price index)for nursing homes and adult day services rose 4.5% last May compared with (28)______in May, 2022.Still, many senior-housing operators are optimistic. When (29)______(illustrate)their point, they showed an increase in the number of people turning 80 years old over the following years and the actual wealth they have collected. Moreover, they find remote work arrangements are decreasing in some parts of the country, (30)______ employees there have seen their lowered productivity while working from home.Section BDirections: After reading the passage below, fill in each blank with a proper word chosen from the box. Each word can be used only once. Note that there is one word more than you need.A.accompanied B.allowed C.feasibly D.fueled E.intensity F.option G.prompting H.routine I.surgically J.underlying K.variedBrain Signals for Lasting PainBrain signals that reveal how much pain a person is in have been discovered by scientists who say the work is a step towards new treatments for people living with lasting pain.It is the first time researchers have decoded the brain activity 31 patients’ lasting pain. That has raised the hope that brain stimulation treatment alre ady used for Parkinson’s and major depression can help those running out of any other 32 . “We’ve learned that lasting pain can be tracked and predicted in the real world,” said Prasad Shirvalkar, lead researcher on the project at the University of California.Lasting pain affects nearly 28 million adults in the UK alone, and the causes are 33 . ranging from cancer to back problems. That being the case, lasting pain has 34 a rise in taking powerful painkillers. But nomedical treatments work well for the condition, 35 experts to call for a complete rethink in how health services handle patients with lasting pain.For the latest study, Shirvalkar and his colleagues 36 implanted electrodes(电极)into four patients with lasting pain hard to deal with after the loss of legs. 
The devices 37 the patients to record activity and collect data in two brain regions—the ACC and the OFC—at the press of one button on a remote handset. Several times a day, the volunteers were asked to complete short surveys on the 38 of pain, meaning how strong the pain was, and then record their brain activity. These scientists, armed with the survey responses and brain recordings, found they could use computers to predict a person’s pa in based on the electrical signals in their OFC. “We found very different brain activity 39 severe pain and have developed an objective biomarker for that kind of pain,” said Shirvalkar. The finding may explain, at least in part, why 40 painkillers are less effective for lasting pain. “The hope is that we can use the information to develop personalized brain stimulation treatment for the most severe forms of pain.”Ⅲ.Reading ComprehensionSection ADirections: For each blank in the following passage there are four words or phrases marked A, B, C and D. Fill in each blank with the word or phrase that best fits the context.The way of recording things has never ceased to develop. In the 1980s, as sales of video recorders went up, old 8mm home movies were gradually replaced by VHS (video home system)tapes. Later, video tapes of family holidays lost their appeal and the use of DVDs 41 . Those, too, have had their day. Even those holding their childhood memories in digital files on their laptops now know these files face the risk of 42 .Digitising historical documents brings huge benefits—files can be 43 and distributed, reducing the risk of their entire loss through physical damage caused by fire or flooding. And developing digital versions reduces44 on the original items. 
The International Dunhuang Project, 45 , has digitised items like manuscripts(手稿)from the Mogao caves in China, enabling scholars from around the world to access records easily without touching the real items.But the news that the Ministry of Justice of the UK is proposing to scan the 110 million people’s wills it holds and then destroy a handful of 46 after 25 years has shocked historians. The ministry cites this as a way of providing easier access for researchers. But that only justifies digitisation, not the 47 of the paper copies. The officials note the change will be economically efficient (saving around £4.5m a year)while keeping all the essential information.Scholars 48 . Most significantly, physical records can themselves carry important information — the kind of ink or paper used may be part of the history that historians are 49 . and error s are often made in scanning. Besides, digital copies are arguably more 50 than the material items, just in different ways. The attack from the Internet on the British Library last October has prevented scholars from 51 digitised materials it holds: imagine if researchers could not return to the originals. Some even think digitised information can easily be lost within decades no matter what 52 are put in place.The government says that it will save the original wills of “famous people for historic record”, such as that of Princess Diana’s. However, assuming that we know who will 53 to future generations is extraordinarilyproud. Mary Seacole, a pioneering nurse who now appears on the national school course in the UK, was largely54 for almost a century.The digitisation of old documents is a valuable, even essential measure. 
But to destroy the originals once they have been scanned, is not a matter of great 55 , but of huge damage.41.A.paused B.boomed C.recovered D.disappeared 42.A.getting outdated B.coming into style C.being fined D.making an error 43.A.deleted B.named C.copied D.altered 44.A.fight or flight B.life or death C.wear and tear D.awe and wonder 45.A.unfortunately B.additionally C.in summary D.for example 46.A.the originals B.the essentials C.the visualised D.the digitised 47.A.preservation B.classification C.publication D.destruction 48.A.applaud B.disagree C.discriminate D.withdraw 49.A.revising B.abandoning C.uncovering D.enduring 50.A.meaningful B.favourable C.resistant D.delicate 51.A.inventing B.adjusting C.accessing D.damaging 52.A.outcomes B.safeguards C.deadlines D.byproducts 53.A.matter B.respond C.lose D.live 54.A.spared B.discussed C.forgotten D.protected 55.A.sacrifice B.courage C.efficiency D.admirationSection BDirections: Read the following three passages. Each passage is followed by several questions or unfinished statements. For each of them there are four choices marked A, B, C and D. Choose the one that fits best according to the information given in the passage you have just read.(A)Charles Robert Darwin was born on 12 February 1809 in Shropshire, England. Darwin’s childhood passion was science, and his interest in chemistry, however, was clear; he was even nicknamed ‘Gas’ by his classmates.In 1825, his father sent him to study medicine at Edinburgh University, where he learned how to classify plants. Darwin became passionate about natural history and this became his focus while he studied at Cambridge. Darwin went on a voyage together with Robert Fitzroy, the captain of HMS Beagle, to South America to facilitate British trade in Patagonia. The journey was life-changing. Darwin spent much of the trip on land collecting samples of plants, animals and rocks, which helped him to develop an understanding of the processes that shape the Earth’s surface. 
Darwin’s analysis of the plants and animals that he gathered led him to express doubts on former explanations about how species formed and evolved over time.Darwin’s work convinced him that natural selection was key to understanding the development of the natural world. The theory of natural selection says that individuals of a species are more likely to survive when they inherit (经遗传获得)characteristics best suited for that specific environment. These features then become more widespread and can lead eventually to the development of a new species. With natural selection, Darwin argued how a wide variety of life forms developed over time from a single common ancestor.Darwin married his cousin, Emma Wedgwood, in 1839. When Darwin’s eldest daughter, Annie, died from a sudden illness in 1851, he lost his belief in God. His tenth and final child, Charles Waring Darwin, was born in 1856.Significantly for Darwin, this baby was disabled, altering how Darwin thought about the human species. Darwin had previously thought that species remained adapted until the environment changed; he now believed that every new variation was imperfect and that a struggle to survive was what drove species to adapt.Though rejected at the beginning, Darwin’s theory of evolution by natural selection is nowadays well acc epted by the scientific community as the best evidence-based explanation for the diversity and complexity of life on Earth. 
The Natural History Museum’s library alone has 478 editions of his On the Origin of Species in 38 languages.56.What made Darwin reconsider the origin and development of species?A.Examining plants and animals collected.B.His desire for a voyage to different continents.C.Classifying samples in a journey to South America.D.His passion for natural history at Edinburgh University.57.We can learn from paragraphs 1 to 3 that Darwin ______.A.used natural selection to develop new speciesB.enjoyed being called nicknames related to scienceC.learned some knowledge about plants when studying medicineD.argued with others over the diversity of life forms for a long period58.Which of the following changed Darwin’s view on the human species?A.That he had ten children in all. B.His youngest son’s being disabled.C.That he lost his eldest daughter. D.His marriage with Emma Wedgwood.59.This passage is mainly about ______.A.Darwin’s passion for medical science B.Darwin’s theory and experimentsC.Charles Darwin’s changing interest D.Charles Darwin’s life and work(B)Welcome to Muir Woods! This rare ancient forest is a kingdom of coast redwoods, many over 600 years old. How to get here?People using personal vehicles must have reservations before arriving at the park. (Details at.)Muir Woods National Monument is open daily, 8 a. m. to sunset. Stop by Visitor Center to get trails(路线)and program information, and to take in exhibits.What’s your path?Enjoy a walk on the paved Redwood Creek Trail (also called Main Trail). Choose short, medium, or long loops(环线). Other trails go deep into Muir Woods and Mount Tamalpais State Park.(Refer to the map of Muir Woods on the right for details.)Ready to explore more?Muir Woods is part of Golden Gate National Recreation Area, which includes Marin Headlands, Alcatraz, the Presidio, and Ocean Beach. Download the app at /goga.Stay safe and protect your park.Wi-Fi and cell service are not available. ·Watch for poisonous plants and falling branches. 
·Do not feed or disturb animals. ·Fishing is prohibited in the park. ·Do not mark or remove trees, flowers, or other natural features. ·Go to the park website for more safety tips and regulations.AccessibilityWe make a great effort to make facilities, services, and programs accessible to all. For information, go to Visitor Center, ask a ranger, call, or check our website.More InformationMuir Woods National Monument /muwo Mill Valley, CA 94941-269660.Muir Woods will probably attract ______.①redwood lovers ②hunting lovers ③fishing lovers ④hiking loversA.①②B.③④C.①④D.②③61.What can be learned from the passage?A.Muir woods is surrounded by highland and ocean beaches.B.Visitors can read electronic maps using Wi-Fi in Muir Woods.C.Visitors are advised to call Visitor Center for safety tips and regulations.D.Reservations should be made if visitors drive private cars to Muir Woods.62.According to the map of Muir Woods, ______.A.Bridge 4 is the farthest from the parking lots of all bridgesB.Mill Valley is located on the southwest side of Muir BeachC.Bootjack Trail can lead one to Visitor Center from Bridge 3D.food and gifts can be bought on various sites in Muir Woods(C)Precognitive dreams are dreams that seemingly predict the future which cannot be inferred from actually available information. Former US President Abraham Lincoln once revealed the frightening dream to his law partner and friend Ward Hill Lamon, “…Then I heard people weep… ‘Who is dead in the White House?’ I demanded. ‘The President,’ ‘he was killed!’…” The killing did happen later.Christopher French, Professor in the Department of Psychology at Goldsmiths, stated the most likely explanation for such a phenomenon was coincidence(巧合). “In addition to pure coincidences we must also consider the unreliability of memory”, he added. 
Asked what criteria would have to be met for him to accept that precognitive dreams were a reality, he said, “The primary problem with tests of the claim is that the subjects are unable to tell when the event(s)they’ve dreamed about will happen.”However, some claimed to make such tests practicable. Professor Caroline Watt at the University of Edinburgh, has conducted studies into precognitive dreaming. She stated that knowing future through dreams challenged the basic assumption of science — causality (relationship of cause and effect).Dick Bierman, a retired physicist and psychologist, who has worked at the Universities of Amsterdam, Utrecht and Groningen, has put forward a theory that may explain precognitive dreams. It is based on the fact that when scientists use certain mathematical descriptions to talk about things like electromagnetism(电磁学), these descriptions favour the belief that time only moves in one direction. However, in practice the wave that is running backwards in time does exist. This concept is called the time symmetry, meaning that the laws of physics look the same when time runs forward or backward. But he believes that time symmetry breaks down due to external conditions. “The key of the theory is that it assumes that there is a special context that restores th e broken time-symmetry, if the waves running backwards are ‘absorbed’ by a consistent multi-particle(多粒子)system. The brain under a dream state may be such a system where broken time-symmetry is partially restored. This is still not a full explanation for precognitive dreams but it shows where physics might be adjusted to accommodate the phenomenon,” he explains.Although Bierman’s explanation is still based on guesses and has not accepted by mainstream science, Watt does think it is worth considering. For now, believing that it’s possible to predict future with dreams remains an act of faith. 
Yet, it’s possible that one day we’ll wake up to a true understanding of this fascinating phenomenon. 63.According to French, what makes it difficult to test precognitive dreams?A.Unavailability of people’s dreams.B.That coincidences happen a lot in reality.C.That criteria for dream reliability are not trustworthy.D.People’s inability to tell when dreamt events will happen.64.Believers in precognitive dreams may question the truth of ______.A.the assumption of causality B.the time symmetryC.memories of ordinary people D.modern scientific tests65.We can infer from the passage that ______.A.Lincoln was warned of the killing by his friendB.Watt carried out several experiments on causalityC.researches on electromagnetism are based on the time symmetryD.time’s moving in two directions may justify precognitive dreams66.Which might be the best title of the passage?A.Should Dreams Be Assessed?B.Can Dreams Predict the Future?C.How Can Physics Be Changed to Explain Dreams?D.Why Should Scientists Study Precognitive Dreams?Section CDirections: Read the following passage. Fill in each blank with a proper sentence given in the box. Each sentence can only be used once. Note that there are two sentences more than you need.A.Labeling poses even more of a problem when it comes to kids.B.It can be helpful for those not quite able to understand why they feel the way they do.C.There seems to be a desire to see negative emotions as something requiring intervention or diagnosis. D.Labeling leads to children’s overcoming their addiction to what is posted online.E.Someone has had only a certain experience and judges all behavior with that experience.F.The basic function of a diagnosis is to give you a name for those behaviors once felt unusual.Addiction to LabelingMaybe you’ve noticed it in the comments section of popular social media posts about anxiety. depression or things alike, with a number of people claiming to pick these labels for themselves.These days, labeling is everywhere. 
(67)______ However, the negative part is that it’s easy for someone to identify with the characteristics without truly recognizing the context in which these characteristics would require diagnosis, according to Charlotte Armitage, a registered integrative psychotherapist and psychologist.If you have done your research and genuinely feel that you have some form of mental health concern, then finally having a name for your behaviors can be great. But the risk is that many people will seek labels and intervention for any behavior, pattern or emotion that is outside of the permanent happy group that society has set as the norm. “(68)______ Then the saying ‘a little bit of knowledge is dangerous’ springs to my mind,” Armitage adds.(69)______“Children are still developing and evolving, and many childhood behavioral features may seem like those of a disorder when there’re other potential explanations for that behavior,” Armitage notes. Ideally, a diagnosis for a child should be carried out by a qualified mental health professional. So it is with an adult.Nevertheless, the most important thing to bear in mind is that diagnosis doesn’t mean to indicate that you are broken or less capable.(70)______ And if you go deeper, it can alert you to the fact that you are not alone, and that many people experience life in the same way as you do.Ⅳ.Summary WritingDirections: Read the following passage. Summarize the main idea and the main point(s)of the passage in no more than 60 words. Use your own words as far as possible.71.Why Willing to Wait?First it was the fried chicken. Then a variety of fancy milkshakes. No matter what time it is or how bad the streets smell, there are plenty of people waiting in line for hours to get their hands on the food that everyone’s talking about. If you are not the type of person crazy for trendy foods, you probably wonder why someone would like to wait in a long line just to get a taste of a popular cream tea. 
There is a bit of psychology behind the craze of waiting before getting one's chopsticks on a trendy food.

People are born curiosity hunters, especially for fresh ideas, according to some experts. At the sight of a long waiting line, they just can't help having a try. And when the trendy foods are novel in looks and flavors, even innovative in their sales environment, the desire for them is upgraded. All those stimulate people to investigate more, to deal with their curiosity.

In addition, having access to something that is sought out but hard to possess equips people with a feeling that improves their self-definitions. When someone is envied due to something he gained with efforts, his self-worth gets enhanced. Although it is yet to be determined whether the number of likes he receives on the photos of foods he's posted online is connected with the level of envy from on-lookers, that feeling automatically becomes stronger.

Even more, "mob psychology" comes into play: when many people are doing something (waiting in line for the sought-after milkshakes, for instance), others are eager to be part of the group and share such a type of social familiarity, kind of like the natural pursuit of a sense of belonging. Tasting the same wait-worthy food has something in common.

Ⅴ. Translation
Directions: Translate the following sentences into English, using the words given in the brackets.
72. 大多数中国人喜欢在生日的时候吃碗面。(Most Chinese people like to eat a bowl of noodles on their birthday.)
Principles of the Marching Cubes Algorithm

1.1.1 Overview of the Marching Cubes Algorithm

Surface rendering extracts, according to a preset threshold, a set of triangular facets representing a surface from the volume data, and then renders those triangles with a lighting model to form a three-dimensional image.
Surface rendering methods fall mainly into two categories: methods based on slice contours and methods based on voxels.
Contour-based methods first extract the contours of the region of interest on each slice and then construct triangular facets between the contours of adjacent slices. When a single slice contains multiple contours this introduces ambiguity, because the contours of adjacent slices are then difficult to match up.
User intervention can resolve some of this ambiguity, but it greatly increases the complexity of the operation.
For this reason the approach is not widely adopted.
Voxel-based methods are represented by the Marching Cubes (MC) algorithm.
The Marching Cubes algorithm is the classic algorithm among surface-rendering methods; it is also known as "iso-surface extraction".
In essence, it treats a series of two-dimensional slices as a three-dimensional data field, extracts from that field the material at a given threshold value, and connects the result into triangular facets with a certain topology.

Basic principle of the algorithm: the MC algorithm processes the cells (voxels) of the volume data field one by one, and determines how the iso-surface is constructed inside each cell from the values at the cell's corners. Constructing the iso-surface within a cell involves two main computations:
1. computing the triangular facets that approximate the iso-surface inside the cell;
2. computing the normal vector at each vertex of those triangles.
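The two per-cell computations can be sketched as follows (an illustrative sketch only; the function names and the corner ordering are my own, not taken from this text):

```python
def cube_index(corner_values, iso):
    """8-bit cell configuration: bit i is set when corner i lies at or above
    the iso-value. The resulting index (0-255) is what MC uses to look up
    which cell edges the iso-surface crosses."""
    index = 0
    for i, v in enumerate(corner_values):
        if v >= iso:
            index |= 1 << i
    return index

def edge_vertex(p1, p2, v1, v2, iso):
    """Linearly interpolate the point where the iso-surface crosses the
    edge p1-p2. v1 and v2 are the scalar values at the two corners; on a
    crossing edge they lie on opposite sides of iso, so v1 != v2."""
    t = (iso - v1) / (v2 - v1)
    return tuple(a + t * (b - a) for a, b in zip(p1, p2))
```

The second computation, the vertex normal, is typically obtained by central differences of the scalar field at the two corners and interpolated along the edge with the same weight t.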
1.1.2 Preliminaries (the voxel model and iso-surfaces)

1. The voxel model. A voxel is generally defined in one of two ways. One definition is analogous to that of a pixel in a two-dimensional image:
the sample points of the volume data are taken directly as voxels. The other defines a voxel as the region enclosed by eight adjacent sample points.
Suppose we sample within some region of three-dimensional space, with the sample points distributed uniformly along the x, y and z directions
at sampling intervals Δx, Δy and Δz; the volume data can then be represented as a three-dimensional numeric matrix.
Each cubic region bounded by eight adjacent sample points is defined as one voxel,
and those eight sample points are called the corner points of the voxel.
Their coordinates are (i,j,k), (i+1,j,k), (i,j+1,k), (i+1,j+1,k), (i,j,k+1), (i+1,j,k+1), (i,j+1,k+1) and (i+1,j+1,k+1), as shown in Figure 1; such a voxel is a boundary voxel.
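Under the second definition, reading the eight corner values of voxel (i, j, k) out of the sampled 3-D matrix can be sketched like this (illustrative code; the array name `volume` and the helper function are my own, not from this text):

```python
import numpy as np

# Offsets of the eight corner points of voxel (i, j, k), in the order listed above.
CORNER_OFFSETS = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0),
                  (0, 0, 1), (1, 0, 1), (0, 1, 1), (1, 1, 1)]

def corner_values(volume, i, j, k):
    """Scalar values at the eight corners of voxel (i, j, k).

    `volume` is the three-dimensional numeric matrix of samples; a grid of
    (nx, ny, nz) samples therefore contains (nx-1)*(ny-1)*(nz-1) voxels."""
    return [float(volume[i + di, j + dj, k + dk]) for di, dj, dk in CORNER_OFFSETS]
```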
EDCircles:A real-time circle detector with a false detection controlCuneyt Akinlar n,Cihan TopalDepartment of Computer Engineering,Anadolu University,Eskisehir26470,Turkeya r t i c l e i n f oArticle history:Received9April2012Received in revised form21September2012Accepted26September2012Available online3October2012Keywords:Circle detectionEllipse detectionReal-time image processingHelmholtz PrincipleNFAa b s t r a c tWe propose a real-time,parameter-free circle detection algorithm that has high detection rates,producesaccurate results and controls the number of false circle detections.The algorithm makes use of thecontiguous(connected)set of edge segments produced by our parameter-free edge segment detector,theEdge Drawing Parameter Free(EDPF)algorithm;hence the name EDCircles.The proposed algorithmfirstcomputes the edge segments in a given image using EDPF,which are then converted into line segments.The detected line segments are converted into circular arcs,which are joined together using two heuristicalgorithms to detect candidate circles and near-circular ellipses.The candidates arefinally validated by an acontrario validation step due to the Helmholtz principle,which eliminates false detections leaving onlyvalid circles and near-circular ellipses.We show through experimentation that EDCircles works real-time(10–20ms for640Â480images),has high detection rates,produces accurate results,and is very suitablefor the next generation real-time vision applications including automatic inspection of manufacturedproducts,eye pupil detection,circular traffic sign detection,etc.&2012Elsevier Ltd.All rights reserved.1.IntroductionDetection of circular objects in digital images is an importantand recurring problem in image processing[1]and computervision[2],and has many applications especially in such automa-tion problems as automatic inspection of manufactured products[3],aided vectorization of line drawing images[4,5],pupil and irisdetection[6–8],circular traffic sign 
detection[9–11],and manyothers.An ideal circle detection algorithm should run with afixed set ofinternal parameters for all images,i.e.,require no parameter tuningfor different images,be very fast(real-time if possible),detectmultiple small and large circles,work with synthetic,natural andnoisy images,have high detection rate and good accuracy,andproduce a few or no false detections.The circle detection algorithmpresented in this paper satisfies all of these properties.Traditionally,the most popular circle detection techniques arebased on the famous circle Hough transform(CHT)[12–16].Thesetechniquesfirst compute an edge map of the image using atraditional edge detector such as Canny[17],map the edge pixelsinto the three dimensional Hough circle space(x,y,r)and extractcircles that contain a certain number of edge pixels.Not onlyCHT-based techniques are very slow and memory-demanding,butthey also produce many false detections especially in the presenceof noise.Additionally,these methods have many parameters thatmust be preset by the user,which greatly limits their use.To overcome the limitations of the classical CHT-based meth-ods,many variants have been proposed including probabilistic HT[18,19],randomized HT[20,21],fuzzy HT[22],etc.There are alsoapproaches based on HT and hypothesisfiltering[23–25].Allthese methods try to correct different shortcomings of CHT,butare still memory-demanding and slow to be of any use in real-time applications.Apart from the CHT-based methods,there are several rando-mized algorithms for circle detection.Chen et al.[26]propose arandomized circle detection(RCD)algorithm that randomly selectsfour pixels from the edge map of an image,uses a distance criterionto determine whether there is a possible circle in the image.Theythen use an evidence-collecting step to test if the candidate circle isa real-circle.RCD produces good results,but is slow.Recently,Chunget al.[27,28]have proposed efficient sampling and refinementstrategies to speed up 
RCD and increase the accuracy of RCD’sresults.Although the new RCD variants named GRCD-R,GLRCD-R[28]have good detection rates and produce accurate results,theystill are far from being real-time.Furthermore,all RCD-variantswork on the edge map of an image computed by a traditional edgedetector such as the Sobelfilter or the Canny edge detector,whichhave many parameters that must be set by the user.Recently,many efforts have concentrated on using geneticalgorithms and evolutionary computation techniques in circledetection[29–36].Ayala-Ramirez et al.[30]proposed a geneticalgorithm(GA)for circle detection,which is capable of detectingmultiple circles but fails frequently to detect small or imperfectContents lists available at SciVerse ScienceDirectjournal homepage:/locate/prPattern Recognition0031-3203/$-see front matter&2012Elsevier Ltd.All rights reserved./10.1016/j.patcog.2012.09.020n Corresponding author.Tel.:þ902223213550x6553.E-mail addresses:cakinlar@.tr(C.Akinlar),cihant@.tr(C.Topal).Pattern Recognition46(2013)725–740circles.Dasgupta et al.[31–33]developed a swarm intelligence technique named adaptive bacterial foraging optimization (ABFO)for circle detection.Their algorithm produces good results but is sensitive to noise.Cuevas et e discrete differential evolution (DDE)optimization [34],harmony search optimization (HSA)[35]and an artificial immune system optimization technique named Clonal Selection Algorithm (CSA)[36]for circle detection.Although these evolutionary computation techniques have good detection rates and accurate results,they usually require multiple runs to detect multiple circles,and are quite slow to be suitable for real-time applications.Just like RCD,these algorithms work on an edge map pre-computed by a traditional edge detection algorithm with many parameters.Frosio et al.[37]propose a real-time circle detection algorithm based on maximum likelihood.Their method is fast andcan detect partially occluded circular objects,but requires that 
the radius of the circles to be detected be predefined, which greatly limits its applications. Wu et al. [41] present a circle detection algorithm that runs at 7 frames/s on 640×480 images. The authors claim to achieve a high success rate, but there is not much experimental validation to back their claims. Zhang et al. [38] propose an ellipse detection algorithm that can be used for real-time face detection. Liu et al. [39] present an ellipse detector for noisy images, and Prasad et al. [40] present an ellipse detector using edge curvature and convexity information. While both algorithms produce good results, they are slow and not suitable for real-time applications.

Vizireanu et al. [42–44] make use of mathematical morphology for shape decomposition of an image and use the morphological shape decomposition representation of the image for recognition of different shapes and patterns in the image. While their algorithms are good for the detection of general shapes in an image, they are not suitable for real-time applications.

Desolneux et al. [60] were the first to discuss a contrario circular arc detection. Recently, Patraucean et al. [45,46] propose a parameter-free ellipse detection algorithm based on the a contrario framework of Desolneux et al. [58]. The authors extend the line segment detector (LSD) by Grompone von Gioi et al. [63] to detect circular and elliptic arcs in a given image without requiring any parameters, while controlling the number of false detections by the Helmholtz principle [58]. They then use the proposed algorithm (named ELSD [46]) for the detection and identification of Bubble Tags [47].

In this paper, we present a real-time (10–20 ms on 640×480 images), parameter-free circle detection algorithm that has high detection rates, produces accurate results, and has an a contrario validation step due to the Helmholtz principle that lets it control the number of false detections. The proposed algorithm makes use of the contiguous (connected) set of edge segments produced by our parameter-free edge segment detector, edge drawing parameter free (EDPF) [48–53]; hence the name EDCircles [54,55]. Given an input image, EDCircles first computes the edge segments of the image using EDPF. Next, the resulting edge segments are turned into line segments using our line segment detector, EDLines [56,57]. Computed lines are then converted into arcs, which are combined together using two heuristic algorithms to generate many candidate circles and near-circular ellipses. Finally, the candidates are validated by the Helmholtz principle [58–63], which eliminates false detections, leaving only valid circles and near-circular ellipses.

2. The proposed algorithm: EDCircles

EDCircles follows several steps to compute the circles in a given image. The general idea is to extract line segments in an image, convert them into circular arcs, and then combine these arcs to detect circles and near-circular ellipses. The general outline of the EDCircles algorithm is presented in Algorithm 1, and we describe each step of EDCircles in detail in the following sections.

Algorithm 1. Steps of the EDCircles algorithm.
1. Detect edge segments by EDPF and extract complete circles and ellipses.
2. Convert the remaining edge segments into line segments.
3. Detect arcs by combining line segments.
4. Join arcs to detect circle candidates.
5. Join the remaining arcs to detect near-circular ellipse candidates.
6. Validate the candidate circles/ellipses using the Helmholtz principle.
7. Output the remaining valid circles/ellipses.

2.1. Edge segment detection by edge drawing parameter free (EDPF)

Given an image, the first step of EDCircles is the detection of the edge segments in the image. To achieve this, we employ our recently proposed real-time edge/edge segment detector, edge drawing (ED) [48–51]. Unlike traditional edge detectors, e.g., Canny [17], which work by identifying a set of potential edge pixels in an image and eliminating non-edge pixels through operations such as non-maximal suppression, hysteresis thresholding, erosion, etc., ED follows a
proactive approach and works by first identifying a set of points in the image, called the anchors, and then joining these anchors using a smart routing procedure; that is, ED literally draws edges in an image. ED outputs not only a binary edge map similar to those output by traditional edge detectors, but it also outputs the result as a set of edge segments, each of which is a contiguous (connected) pixel chain [49].

ED has many parameters that must be set by the user, which requires the tuning of ED's parameters for different types of images. Ideally, one would want to have a real-time edge/edge segment detector which runs with a fixed set of internal parameters for all types of images and requires no parameter tuning. To achieve this goal, we have recently incorporated ED with the a contrario edge validation mechanism due to the Helmholtz principle [58–60], and obtained a real-time parameter-free edge segment detector, which we name edge drawing parameter free (EDPF) [52,53]. EDPF works by running ED with all of ED's parameters at their extremes, which detects all possible edge segments in a given image with many false positives. We then validate the extracted edge segments by the Helmholtz principle, which eliminates false detections, leaving only perceptually meaningful edge segments with respect to the a contrario approach.

Fig. 1(a) shows a 424×436 grayscale synthetic image containing a big circle obstructed by four rectangular blocks, a small ellipse obstructed by three rectangular blocks, a small circle, an ellipse and an arbitrary polygon-like object. When this image is fed into EDPF, the edge segments shown in Fig. 1(b) are produced. Each color in the edge map represents a different edge segment, each of which is a contiguous chain of pixels. For this image, EDPF outputs 15 edge segments in just 3.7 ms on a PC with a 2.2 GHz Intel 2670QM CPU. Notice the high-quality nature of the edge map, with all details clearly visible. Each edge segment traces the boundary of one or more objects in the figure. While the boundary of an object may be traced by a single edge segment, as the small circle, the ellipse and the polygonal object are in Fig. 1(b), it is also possible that an object's boundary be traced by many different edge segments. This is the case for the big circle, whose boundary is traced by four different edge segments, and the small obstructed ellipse, which is traced by three different edge segments. The result totally depends on the structure of the objects, and the amount of obstruction and noise in the image. That is, there is no one-to-one correspondence between the objects and the edge segments.

For the small circle, the ellipse and the polygon, the entire boundary of an object in the image is returned as a closed curve; that is, the edge segment starts at a pixel on the boundary of an object, traces its entire boundary and ends where it starts. In other words, the first and last pixels of the edge segment are neighbors of each other. It is highly likely that such a closed edge segment traces the boundary of a circle, an ellipse or a polygonal shape, as is the case in Fig. 1. So, as the first step after the detection of the edge segments, we go over all edge segments, take the closed ones and see if the closed edge segment traces the entire boundary of a circle or an ellipse.

Processing of a closed edge segment follows a very simple idea: we first fit a circle to the entire list of pixels in the edge segment using the least squares circle fit algorithm [64] and compute the root mean square error. If the circle fit error, i.e., the root mean square error, is smaller than some threshold (fixed at 1.5 pixels for the proposed algorithm), then we add the circle to the list of circle candidates. Just because the circle fit error is small does not mean that the edge segment is an actual circle; it is just a candidate yet, and needs to go through circle validation by the Helmholtz principle to be returned as a real circle. Section 2.6 describes the details of circle validation. If the circle fit fails, then we try fitting an ellipse to the pixels
of the edge segment. We use the ellipse fit algorithm described in [65], which returns an ellipse equation of the form Ax² + Bxy + Cy² + Dx + Ey + F = 0. If the ellipse fit error, i.e., the root mean square error, is smaller than a certain threshold (fixed at 1.5 pixels for the proposed algorithm), then we add the ellipse to the list of ellipse candidates, which also needs to go through validation by the Helmholtz principle before being returned as a real ellipse. If the edge segment is accepted either as a circle or an ellipse candidate, it is removed from the list of edge segments and is not processed any further. Otherwise, the edge segment is used in further processing along with the other non-closed edge segments.

2.2. Conversion of edge segments into line segments

After the removal of the closed edge segments, which are taken as circle or ellipse candidates, the remaining edge segments are converted into line segments (lines for short in the rest of the paper). The motivation for this step comes from the observation that any circular shape is approximated by a consecutive set of lines (as seen in Fig. 1(c)), and these lines can easily be turned into circular arcs by a simple post-processing step, as described in the next section.

Conversion of an edge segment into a set of lines follows the algorithm given in our line detector, EDLines [56,57]. The idea is to start with a short line that satisfies a certain straightness criterion, and extend the line for as long as the root mean square error is smaller than a certain threshold, i.e., 1 pixel error. Refer to EDLines [56,57] for the details of line segment extraction, where we validate the lines after detection using the Helmholtz principle to eliminate invalid detections. In EDCircles, though, we do not validate the lines after detection. The reason for this decision comes from our observation that the line segment validation algorithm due to the Helmholtz principle usually eliminates many short lines, which may be valuable for the detection of small circles in an image. So, unlike EDLines, we do not eliminate any detected lines, and use all detected lines for further processing and detection of arcs. Fig. 1(c) shows the lines extracted from the image shown in Fig. 1(a). Clearly, circular objects are approximated by a set of consecutive lines. In the next section, we describe how these lines can be converted into circular arcs by processing of consecutive lines.

2.3. Circular arc detection

We define a circular arc to be at least three consecutive lines that turn in the same direction. Using this definition, given the list of lines making up an edge segment, we simply walk over the lines and compute the angle between consecutive lines and the direction of turn from one line to the next. If at least three lines turn in the same direction and the angle between the lines is in between certain thresholds, then these lines may form a circular arc.

Fig. 2 illustrates a hypothetical edge segment being approximated by 18 consecutive line segments, labeled l1 through l18. To compute the angle between two consecutive lines, we simply treat each line as a vector and compute the vector dot product. Similarly, to compute the direction of turn from one line to the next, we simply compute the vector cross product and use the sign of the result as the turn direction. Fig. 3(a) illustrates the approximation of the blue right vertical edge segment in Fig. 1(b) by 11 consecutive line segments, labeled v1 through v11. Fig. 3(b) shows the details of the 11 lines: their lengths, the angle between consecutive lines and the direction of the turn going from one line to the next, where a '+' denotes a left turn, and a '−' denotes a right turn.

Our arc detection algorithm is based on the following idea: for a set of lines to be a potential arc candidate, they all must have the same turn direction (to the left or to the right), and the angle between consecutive lines must be in between certain thresholds. If the angle is too small, we assume that the lines are collinear, so they cannot be part of an arc; if the angle is too big, we assume that the
lines are part of a strictly turning object such as a square, a rectangle, etc. For the purposes of our current implementation, we fix the low angle threshold to 6°, and the high angle threshold to 60°. These values have been obtained by experimentation on a variety of images containing various circular objects.

Fig. 1. (a) A sample image (424×436). (b) Edge segments (each a contiguous chain of pixels) extracted by EDPF. Each color represents a different edge segment. EDPF outputs 15 edge segments in 3.7 milliseconds (ms). (c) Lines approximating the edge segments. A total of 98 lines are extracted. (For interpretation of the references to color in this figure caption, the reader is referred to the web version of this article.)

The bottom part of Fig. 2 depicts the angles between consecutive lines of the edge segment shown at the top of Fig. 2, and the direction of turn from one line to the next. The angles smaller than the low angle threshold or bigger than the high angle threshold, e.g., θ1, θ2, θ9, and θ16, have been colored red; all other angles have been colored either blue or green depending on the direction of turn. Specifically, if the next line turns to the left, the angle has been colored blue, and if the next line turns to the right, then the angle has been colored green.

Having computed the angles and the turn direction information, we simply walk over the lines of an edge segment looking for a set of at least three consecutive lines which all turn in the same direction, where the turn angle from one line to the next is in between the low and high angle thresholds. In Fig. 2, lines v3 through v7 satisfy our criterion and form a potential arc candidate. Similarly, lines v10 through v16 make up another arc candidate.

Given a set of at least three lines that satisfy our arc candidate constraints, we first try fitting a circle to all pixels making up the lines using the circle fit algorithm in [64]. If the circle fit succeeds, i.e., if the root mean square error is less than 1.5 pixels, then the extracted arc is simply added to the list of arcs, and we are done. Otherwise, we start with a short arc consisting of only three lines and extend it line by line by fitting a new circle [64] until the root mean square error exceeds 1.5 pixels. At this point, the arc is added to the list of arcs, and we continue processing the remaining lines to detect more circular arcs. Using this algorithm, in Fig. 2: lines v3 through v7 form arc A1 with center (x_cA1, y_cA1) and radius r_A1; similarly, lines v10 through v16 form arc A2 with center (x_cA2, y_cA2) and radius r_A2.

In a complex image consisting of many edge segments, we will have hundreds of arcs. Fig. 4 shows the arcs computed from the lines of Fig. 1(c), and Table 1 gives the details of these arcs. An arc spans a part between (StartAngle, EndAngle) of the great circle specified by (Center X, Center Y, Radius). The arc is assumed to move counter-clockwise from StartAngle to EndAngle over the great circle. As an example, A2 covers a total of 61°, from 91° to 152°, of the great circle with center coordinates (210.6, 211.3) and radius = 182.6.

2.4. Candidate circle detection by arc join

After the computation of the arcs, the next step is to join the arcs into circle candidates. To do this, we first sort all arcs with respect to their length in descending order, and start extending the longest arc first. The motivation for this decision comes from the observation that the longest arc is the closest to a full circle, so it must be extended and completed into a full circle before the other arcs.

Fig. 3. (a) An illustration of the blue right vertical segment in Fig. 1(b) being approximated by 11 consecutive line segments labeled v1 through v11. The angles between line segments v1 and v2 (θ1), v3 and v4 (θ3), v7 and v8 (θ7), and v10 and v11 (θ10) are also illustrated. (b) Lines making up the blue right vertical segment in Fig. 1(b).

Fig. 2. (a) A hypothetical edge segment being approximated by 18 consecutive line segments labeled l1 through l18. (b) The angles θi between vi and vi+1 are illustrated and colored with red, green or blue. If the angle is bigger than a high threshold, e.g., θ1, θ2 and θ9 (colored red), or if the angle is smaller than a low threshold, e.g., θ16 (also colored red), then these lines cannot be part of an arc. Otherwise, if three or more consecutive lines turn to the left, e.g., lines v3 through v7 (angles colored blue), then these lines may form an arc. Similarly, if three or more consecutive lines turn to the right, e.g., lines v10 through v16 (angles colored green), then these lines may form an arc. (For interpretation of the references to color in this figure caption, the reader is referred to the web version of this article.)

During the extension of an arc, the idea is to look for arcs having similar radii and close centers, and collect a list of candidate arcs that may be combined with the current arc. Given an arc A1 to extend into a full circle, we go over all detected arcs and generate a set of candidate arcs that may be joined with A1. We have two criteria for arc join:

(1) Radius difference constraint: the radius difference between A1 and the candidate arc A2 must be within some threshold. Specifically, if A2's radius is within 25% of A1's radius, then A2 is taken as a candidate for join; otherwise A2 cannot be joined with A1. As an example, if A1's radius is 100, then all arcs whose radii are between 75 and 125 would be taken as candidates for arc join.

(2) Center distance constraint: the distance between the center of A1 and the center of the candidate arc A2 must be within some threshold. Specifically, we require that the distance between the centers of A1 and A2 must not exceed 25% of A1's radius. As an example, if A1's radius is 100, then all arcs whose centers are within 25 pixels of A1's center would be taken as candidates for arc join, assuming they also satisfy the radius difference constraint.

Fig. 5 illustrates possible scenarios during arc join for circle
detection. In Fig. 5(a), we illustrate a case where all potential arc candidates satisfy the center distance constraint, but one fails the radius difference constraint. Here, A1 is the arc to be extended, with A2, A3 and A4 as potential candidates for arc join. As illustrated, the centers of all arcs are very close to each other; that is, the distances of the centers of A2, A3 and A4 from the center of A1 are all within the center distance threshold r_T. As for the radius difference constraint, only A3 and A4 satisfy it, while A2's radius falls out of the radius difference range. So in Fig. 5(a), only arcs A3 and A4 would be selected as candidates for joining with A1.

In Fig. 5(b), we illustrate a case where all potential arc candidates satisfy the radius difference constraint, but one fails the center distance constraint. Here, A1 is the arc to be extended, with A2, A3 and A4 as potential candidates for arc join. As illustrated, the radii of all arcs are very close to each other, so they all satisfy the radius difference constraint. As for the center distance constraint, only A2 and A4 satisfy it, while A3's center falls out of the center distance threshold r_T. So in Fig. 5(b), only arcs A2 and A4 would be selected as candidates for joining with A1.

After the computation of the candidate arcs, the next step is to combine them one by one with the extended arc A1 by fitting a new circle to the pixels making up both of the arcs. Instead of trying the join in random order, we start with the arc whose either end-point is the closest to either end-point of A1. The motivation for this decision comes from the observation that if there is more than one arc that is part of the same great circle, it is better to start the join with the arc closest to the extended arc A1. In Fig. 5(a) for example, we would first join A1 with A4 and then with A3. Similarly, in Fig. 5(b) we would first join A1 with A2 and then A4.

After an arc A1 is extended with other arcs on the same great circle, we decide at the last step whether to make the extended arc a circle candidate. Here, we take the view that if an arc spans at least 50% of the circumference of its great circle, then we make the arc a circle candidate. Otherwise, the arc is left for near-circular ellipse detection. In Fig. 5(a) for example, when A1 is joined with A4 and A3, the extended arc would span more than 50% of the circumference of its great circle, so the extended arc would be made a circle candidate. In Fig. 5(c) however, when A1, A2 and A3 are joined together, we observe that the extended arc does not span at least 50% of the circumference of its great circle, i.e., θ1 + θ2 + θ3 < π; so the extended arc is not taken as a circle candidate. Computation of the total arc span is performed by simply looking at the ratio of the total number of pixels making up the joined arcs to the circumference of the newly fitted circle. If this ratio is greater than 50%, then the extended arc is taken as a circle candidate.

To exemplify the ideas presented above, here is how the seven arcs depicted in Fig. 4(a) and detailed in Table 1 would be processed: we first take A1, the longest arc, as the arc to be extended, with A2, A3, A4, A5, A6 and A7 as the remaining arcs. Since the radii of A2, A3 and A4 are within 25% of A1's radius and their center distances are within the center distance threshold, only these three arcs would be taken as candidates for join. We next join A1 and A2, since A2's end-point is closest to A1 (refer to Fig. 4(a)). After A1 and A2 are joined, the extended arc would now be joined with A3, since A3's end-point would now be closest to the extended arc. Finally, A4 would be joined. Since the final extended arc covers more than 50% of its great circle, it is taken as a circle candidate. Continuing similarly, the next longest remaining arc is A5, so we try extending A5, with A6 and A7 being the only remaining arcs in our list of arcs. The only candidate

Fig. 4. (a) Arcs computed from the lines of Fig. 1(c). (b) Candidate circles and ellipses before validation (overlaid on top of the image in red). (For interpretation of the references to color in this figure caption, the reader is referred to the web version of this article.)

Table 1. Details of the arcs shown in Fig. 4(a). An arc spans a part between (StartAngle, EndAngle) of a great circle specified by (Center X, Center Y, Radius). The arc moves counter-clockwise from StartAngle to EndAngle over the circle.

Arc | Center X | Center Y | Radius | Start angle (deg.) | End angle (deg.)
A1  | 210.3    | 211.8    | 182.2  | 325                | 86
A2  | 210.6    | 211.3    | 182.6  | 91                 | 152
A3  | 212.2    | 215.9    | 178.5  | 275                | 312
A4  | 210.7    | 211.6    | 183.0  | 173                | 264
A5  | 111.1    | 267.5    | 52.3   | 275                | 312
A6  | 120.1    | 291.4    | 34.9   | 141                | 219
A7  | 139.4    | 288.6    | 49.2   | 94                 | 143
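The arc machinery of Sections 2.3 and 2.4 reduces to a few numeric tests: a least-squares circle fit with a 1.5-pixel RMS-error threshold, a turn-angle window of (6°, 60°) with a consistent turn direction over at least three consecutive lines, the 25% radius and center-distance join constraints, and the 50%-of-circumference coverage check. The sketch below is ours, not the authors' code: the function names are our own, and the algebraic (Kåsa-style) fit is an assumed stand-in for the circle fit algorithm of ref. [64].

```python
import math

# Thresholds as stated in the paper (Secs. 2.1-2.4); constant names are ours.
FIT_ERR_MAX = 1.5                 # max RMS circle-fit error, in pixels
LOW_ANGLE = math.radians(6.0)     # below this, consecutive lines are collinear
HIGH_ANGLE = math.radians(60.0)   # above this, a sharp (non-arc) corner

def _solve3(A, b):
    """Solve a 3x3 linear system by Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(A)
    out = []
    for col in range(3):
        M = [row[:] for row in A]
        for row_i in range(3):
            M[row_i][col] = b[row_i]
        out.append(det(M) / d)
    return out

def fit_circle(points):
    """Algebraic least-squares circle fit; returns (cx, cy, r, rms)."""
    n = len(points)
    sx = sum(x for x, _ in points); sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points); syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sz = sxx + syy
    szx = sum((x * x + y * y) * x for x, y in points)
    szy = sum((x * x + y * y) * y for x, y in points)
    # Normal equations for x^2 + y^2 + D x + E y + F = 0
    D, E, F = _solve3([[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]],
                      [-szx, -szy, -sz])
    cx, cy = -D / 2.0, -E / 2.0
    r = math.sqrt(cx * cx + cy * cy - F)
    rms = math.sqrt(sum((math.hypot(x - cx, y - cy) - r) ** 2
                        for x, y in points) / n)
    return cx, cy, r, rms

def arc_candidate_runs(lines):
    """lines: [((x1, y1), (x2, y2)), ...] in traversal order.
    Returns (first, last) line-index pairs of >=3 consecutive lines that turn
    in the same direction with in-between angles in (LOW_ANGLE, HIGH_ANGLE)."""
    vecs = [(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in lines]
    pairs = []
    for u, v in zip(vecs, vecs[1:]):
        dot = u[0] * v[0] + u[1] * v[1]
        cross = u[0] * v[1] - u[1] * v[0]   # sign gives the turn direction
        ang = math.acos(max(-1.0, min(1.0, dot / (math.hypot(*u) * math.hypot(*v)))))
        pairs.append((LOW_ANGLE < ang < HIGH_ANGLE, 1 if cross > 0 else -1))
    runs, i = [], 0
    while i < len(pairs):
        if not pairs[i][0]:
            i += 1
            continue
        j = i
        while j + 1 < len(pairs) and pairs[j + 1][0] and pairs[j + 1][1] == pairs[i][1]:
            j += 1
        if j - i + 1 >= 2:          # >=2 angle pairs means >=3 lines
            runs.append((i, j + 1))
        i = j + 1
    return runs

def may_join(arc1, arc2):
    """Sec. 2.4 join test: radii within 25% of arc1's radius, and centers
    closer than 25% of arc1's radius. Arcs are (cx, cy, r) triples."""
    (cx1, cy1, r1), (cx2, cy2, r2) = arc1, arc2
    return (abs(r1 - r2) <= 0.25 * r1
            and math.hypot(cx1 - cx2, cy1 - cy2) <= 0.25 * r1)

def is_circle_candidate(total_pixels, r):
    """Sec. 2.4: the joined arcs must cover >=50% of the circumference."""
    return total_pixels >= 0.5 * (2.0 * math.pi * r)
```

Run against Table 1, `may_join` accepts A2 as a join candidate for A1 (radii 182.2 vs. 182.6, centers under a pixel apart) but rejects A5, matching the walkthrough above.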
Dynamic and distribution of ammonia-oxidizing bacteria communities during sludge granulation in an anaerobic-aerobic sequencing batch reactor

Zhang Bin a,b, Chen Zhe a,b, Qiu Zhigang a,b, Jin Min a,b, Chen Zhiqiang a,b, Chen Zhaoli a,b, Li Junwen a,b, Wang Xuan c,*, Wang Jingfeng a,b,**

a Institute of Hygiene and Environmental Medicine, Academy of Military Medical Sciences, Tianjin 300050, PR China
b Tianjin Key Laboratory of Risk Assessment and Control for Environment and Food Safety, Tianjin 300050, PR China
c Tianjin Key Laboratory of Hollow Fiber Membrane Material and Membrane Process, Institute of Biological and Chemical Engineering, Tianjin Polytechnical University, Tianjin 300160, PR China

Article history: Received 30 June 2011; received in revised form 10 September 2011; accepted 10 September 2011; available online xxx.

Keywords: Ammonia-oxidizing bacteria; Granular sludge; Community development; Granule size; Nitrifying bacteria distribution; Phylogenetic diversity

Abstract: The structure dynamics of the ammonia-oxidizing bacteria (AOB) community and the distribution of AOB and nitrite-oxidizing bacteria (NOB) in granular sludge from an anaerobic-aerobic sequencing batch reactor (SBR) were investigated. A combination of process studies, molecular biotechniques and microscale techniques was employed to identify and characterize these organisms. The AOB community structure in granules was substantially different from the initial pattern of the inoculant sludge. Along with granule formation, the AOB diversity declined due to the selection pressure imposed by process conditions. Denaturing gradient gel electrophoresis (DGGE) and sequencing results demonstrated that most of the Nitrosomonas in the inoculating sludge remained because of their ability to rapidly adapt to the settling-washing out action. Furthermore, DGGE analysis revealed that larger granules allow more AOB species to survive in the reactor. In the SBR, where granules of various sizes coexisted, granule diameter affected the distribution range of AOB and NOB. Small and medium granules (d < 0.6 mm) cannot restrict oxygen mass transfer in any space of the granule. Larger granules (d > 0.9 mm) can result in a smaller aerobic volume fraction and inhibition of NOB growth. All these observations provide support to future studies on the mechanisms responsible for the AOB in granule systems. © 2011 Elsevier Ltd. All rights reserved.

* Corresponding author. ** Corresponding author. Institute of Hygiene and Environmental Medicine, Academy of Military Medical Sciences, Tianjin 300050, PR China. Tel.: +86 22 84655498; fax: +86 22 23328809. E-mail addresses: wangxuan0116@ (W. Xuan), jingfengwang@ (W. Jingfeng). Water Research xxx (2011) 1–10. doi:10.1016/j.watres.2011.09.026

1. Introduction

At sufficiently high levels, ammonia in aquatic environments can be toxic to aquatic life and can contribute to eutrophication. Accordingly, biodegradation and elimination of ammonia in wastewater are primary functions of the wastewater treatment process. Nitrification, the conversion of ammonia to nitrate via nitrite, is an important way to remove ammonia nitrogen. It is a two-step process catalyzed by ammonia-oxidizing and nitrite-oxidizing bacteria (AOB and NOB). Aerobic ammonia oxidation is often the first, rate-limiting step of nitrification; however, it is essential for the removal of ammonia from the wastewater (Prosser and Nicol, 2008).

Comparative analyses of 16S rRNA sequences have revealed that most AOB in activated sludge are phylogenetically closely related to the clade of β-Proteobacteria (Kowalchuk and Stephen, 2001). However, a number of studies have suggested that there are physiological and ecological differences between different AOB genera and lineages, and that environmental factors such as process parameters, dissolved oxygen, salinity, pH, and concentrations of free ammonia can impact
certain species of AOB(Erguder et al., 2008;Kim et al.,2006;Koops and Pommerening-Ro¨ser,2001; Kowalchuk and Stephen,2001;Shi et al.,2010).Therefore, the physiological activity and abundance of AOB in waste-water processing is critical in the design and operation of waste treatment systems.For this reason,a better under-standing of the ecology and microbiology of AOB in waste-water treatment systems is necessary to enhance treatment performance.Recently,several developed techniques have served as valuable tools for the characterization of microbial diversity in biological wastewater treatment systems(Li et al., 2008;Yin and Xu,2009).Currently,the application of molec-ular biotechniques can provide clarification of the ammonia-oxidizing community in detail(Haseborg et al.,2010;Tawan et al.,2005;Vlaeminck et al.,2010).In recent years,the aerobic granular sludge process has become an attractive alternative to conventional processes for wastewater treatment mainly due to its cell immobilization strategy(de Bruin et al.,2004;Liu et al.,2009;Schwarzenbeck et al.,2005;Schwarzenbeck et al.,2004a,b;Xavier et al.,2007). 
Granules have a more tightly compact structure(Li et al.,2008; Liu and Tay,2008;Wang et al.,2004)and rapid settling velocity (Kong et al.,2009;Lemaire et al.,2008).Therefore,granular sludge systems have a higher mixed liquid suspended sludge (MLSS)concentration and longer solid retention times(SRT) than conventional activated sludge systems.Longer SRT can provide enough time for the growth of organisms that require a long generation time(e.g.,AOB).Some studies have indicated that nitrifying granules can be cultivated with ammonia-rich inorganic wastewater and the diameter of granules was small (Shi et al.,2010;Tsuneda et al.,2003).Other researchers reported that larger granules have been developed with the synthetic organic wastewater in sequencing batch reactors(SBRs)(Li et al., 2008;Liu and Tay,2008).The diverse populations of microor-ganisms that coexist in granules remove the chemical oxygen demand(COD),nitrogen and phosphate(de Kreuk et al.,2005). However,for larger granules with a particle diameter greater than0.6mm,an outer aerobic shell and an inner anaerobic zone coexist because of restricted oxygen diffusion to the granule core.These properties of granular sludge suggest that the inner environment of granules is unfavorable to AOB growth.Some research has shown that particle size and density induced the different distribution and dominance of AOB,NOB and anam-mox(Winkler et al.,2011b).Although a number of studies have been conducted to assess the ecology and microbiology of AOB in wastewater treatment systems,the information on the dynamics,distribution,and quantification of AOB communities during sludge granulation is still limited up to now.To address these concerns,the main objective of the present work was to investigate the population dynamics of AOB communities during the development of seedingflocs into granules,and the distribution of AOB and NOB in different size granules from an anaerobic e aerobic SBR.A combination of process studies,molecular 
biotechniques and microscale techniques were employed to identify and char-acterize these organisms.Based on these approaches,we demonstrate the differences in both AOB community evolu-tion and composition of theflocs and granules co-existing in the SBR and further elucidate the relationship between distribution of nitrifying bacteria and granule size.It is ex-pected that the work would be useful to better understand the mechanisms responsible for the AOB in granules and apply them for optimal control and management strategies of granulation systems.2.Material and methods2.1.Reactor set-up and operationThe granules were cultivated in a lab-scale SBR with an effective volume of4L.The effective diameter and height of the reactor was10cm and51cm,respectively.The hydraulic retention time was set at8h.Activated sludge from a full-scale sewage treat-ment plant(Jizhuangzi Sewage Treatment Works,Tianjin, China)was used as the seed sludge for the reactor at an initial sludge concentration of3876mg LÀ1in MLSS.The reactor was operated on6-h cycles,consisting of2-min influent feeding,90-min anaerobic phase(mixing),240-min aeration phase and5-min effluent discharge periods.The sludge settling time was reduced gradually from10to5min after80SBR cycles in20days, and only particles with a settling velocity higher than4.5m hÀ1 were retained in the reactor.The composition of the influent media were NaAc(450mg LÀ1),NH4Cl(100mg LÀ1),(NH4)2SO4 (10mg LÀ1),KH2PO4(20mg LÀ1),MgSO4$7H2O(50mg LÀ1),KCl (20mg LÀ1),CaCl2(20mg LÀ1),FeSO4$7H2O(1mg LÀ1),pH7.0e7.5, and0.1mL LÀ1trace element solution(Li et al.,2007).Analytical methods-The total organic carbon(TOC),NHþ4e N, NOÀ2e N,NOÀ3e N,total nitrogen(TN),total phosphate(TP) concentration,mixed liquid suspended solids(MLSS) concentration,and sludge volume index at10min(SVI10)were measured regularly according to the standard methods (APHA-AWWA-WEF,2005).Sludge size distribution was determined by the sieving method(Laguna et al.,1999).Screening was 
performed with four stainless steel sieves of5cm diameter having respective mesh openings of0.9,0.6,0.45,and0.2mm.A100mL volume of sludge from the reactor was sampled with a calibrated cylinder and then deposited on the0.9mm mesh sieve.The sample was subsequently washed with distilled water and particles less than0.9mm in diameter passed through this sieve to the sieves with smaller openings.The washing procedure was repeated several times to separate the gran-ules.The granules collected on the different screens were recovered by backwashing with distilled water.Each fraction was collected in a different beaker andfiltered on quantitative filter paper to determine the total suspended solid(TSS).Once the amount of total suspended solid(TSS)retained on each sieve was acquired,it was reasonable to determine for each class of size(<0.2,[0.2e0.45],[0.45e0.6],[0.6e0.9],>0.9mm) the percentage of the total weight that they represent.w a t e r r e s e a r c h x x x(2011)1e10 22.2.DNA extraction and nested PCR e DGGEThe sludge from approximately8mg of MLSS was transferred into a1.5-mL Eppendorf tube and then centrifuged at14,000g for10min.The supernatant was removed,and the pellet was added to1mL of sodium phosphate buffer solution and aseptically mixed with a sterilized pestle in order to detach granules.Genomic DNA was extracted from the pellets using E.Z.N.A.äSoil DNA kit(D5625-01,Omega Bio-tek Inc.,USA).To amplify ammonia-oxidizer specific16S rRNA for dena-turing gradient gel electrophoresis(DGGE),a nested PCR approach was performed as described previously(Zhang et al., 2010).30m l of nested PCR amplicons(with5m l6Âloading buffer)were loaded and separated by DGGE on polyacrylamide gels(8%,37.5:1acrylamide e bisacrylamide)with a linear gradient of35%e55%denaturant(100%denaturant¼7M urea plus40%formamide).The gel was run for6.5h at140V in 1ÂTAE buffer(40mM Tris-acetate,20mM sodium acetate, 1mM Na2EDTA,pH7.4)maintained at60 C(DCodeäUniversal Mutation Detection 
System,Bio-Rad,Hercules,CA, USA).After electrophoresis,silver-staining and development of the gels were performed as described by Sanguinetti et al. (1994).These were followed by air-drying and scanning with a gel imaging analysis system(Image Quant350,GE Inc.,USA). The gel images were analyzed with the software Quantity One,version4.31(Bio-rad).Dice index(Cs)of pair wise community similarity was calculated to evaluate the similarity of the AOB community among DGGE lanes(LaPara et al.,2002).This index ranges from0%(no common band)to100%(identical band patterns) with the assistance of Quantity One.The Shannon diversity index(H)was used to measure the microbial diversity that takes into account the richness and proportion of each species in a population.H was calculatedusing the following equation:H¼ÀPn iNlogn iN,where n i/Nis the proportion of community made up by species i(bright-ness of the band i/total brightness of all bands in the lane).Dendrograms relating band pattern similarities were automatically calculated without band weighting(consider-ation of band density)by the unweighted pair group method with arithmetic mean(UPGMA)algorithms in the Quantity One software.Prominent DGGE bands were excised and dissolved in30m L Milli-Q water overnight,at4 C.DNA was recovered from the gel by freeze e thawing thrice.Cloning and sequencing of the target DNA fragments were conducted following the estab-lished method(Zhang et al.,2010).2.3.Distribution of nitrifying bacteriaThree classes of size([0.2e0.45],[0.45e0.6],>0.9mm)were chosen on day180for FISH analysis in order to investigate the spatial distribution characteristics of AOB and NOB in granules.2mg sludge samples werefixed in4%para-formaldehyde solution for16e24h at4 C and then washed twice with sodium phosphate buffer;the samples were dehydrated in50%,80%and100%ethanol for10min each. 
Ethanol in the granules was then completely replaced by xylene by serial immersion in ethanol-xylene solutions of3:1, 1:1,and1:3by volume andfinally in100%xylene,for10min periods at room temperature.Subsequently,the granules were embedded in paraffin(m.p.56e58 C)by serial immer-sion in1:1xylene-paraffin for30min at60 C,followed by 100%paraffin.After solidification in paraffin,8-m m-thick sections were prepared and placed on gelatin-coated micro-scopic slides.Paraffin was removed by immersing the slide in xylene and ethanol for30min each,followed by air-drying of the slides.The three oligonucleotide probes were used for hybridiza-tion(Downing and Nerenberg,2008):FITC-labeled Nso190, which targets the majority of AOB;TRITC-labeled NIT3,which targets Nitrobacter sp.;TRITC-labeled NSR1156,which targets Nitrospira sp.All probe sequences,their hybridization condi-tions,and washing conditions are given in Table1.Oligonu-cleotides were synthesized andfluorescently labeled with fluorochomes by Takara,Inc.(Dalian,China).Hybridizations were performed at46 C for2h with a hybridization buffer(0.9M NaCl,formamide at the percentage shown in Table1,20mM Tris/HCl,pH8.0,0.01% SDS)containing each labeled probe(5ng m LÀ1).After hybrid-ization,unbound oligonucleotides were removed by a strin-gent washing step at48 C for15min in washing buffer containing the same components as the hybridization buffer except for the probes.For detection of all DNA,4,6-diamidino-2-phenylindole (DAPI)was diluted with methanol to afinal concentration of1ng m LÀ1.Cover the slides with DAPI e methanol and incubate for15min at37 C.The slides were subsequently washed once with methanol,rinsed briefly with ddH2O and immediately air-dried.Vectashield(Vector Laboratories)was used to prevent photo bleaching.The hybridization images were captured using a confocal laser scanning microscope (CLSM,Zeiss710).A total of10images were captured for each probe at each class of size.The representative images were selected 
andfinal image evaluation was done in Adobe PhotoShop.w a t e r r e s e a r c h x x x(2011)1e1033.Results3.1.SBR performance and granule characteristicsDuring the startup period,the reactor removed TOC and NH 4þ-N efficiently.98%of NH 4þ-N and 100%of TOC were removed from the influent by day 3and day 5respectively (Figs.S2,S3,Supporting information ).Removal of TN and TP were lower during this period (Figs.S3,S4,Supporting information ),though the removal of TP gradually improved to 100%removal by day 33(Fig.S4,Supporting information ).To determine the sludge volume index of granular sludge,a settling time of 10min was chosen instead of 30min,because granular sludge has a similar SVI after 60min and after 5min of settling (Schwarzenbeck et al.,2004b ).The SVI 10of the inoculating sludge was 108.2mL g À1.The changing patterns of MLSS and SVI 10in the continuous operation of the SBR are illustrated in Fig.1.The sludge settleability increased markedly during the set-up period.Fig.2reflects the slow andgradual process of sludge granulation,i.e.,from flocculentsludge to granules.3.2.DGGE analysis:AOB communities structure changes during sludge granulationThe results of nested PCR were shown in Fig.S1.The well-resolved DGGE bands were obtained at the representative points throughout the GSBR operation and the patterns revealed that the structure of the AOB communities was dynamic during sludge granulation and stabilization (Fig.3).The community structure at the end of experiment was different from that of the initial pattern of the seed sludge.The AOB communities on day 1showed 40%similarity only to that at the end of the GSBR operation (Table S1,Supporting information ),indicating the considerable difference of AOB communities structures between inoculated sludge and granular sludge.Biodiversity based on the DGGE patterns was analyzed by calculating the Shannon diversity index H as204060801001201401254159738494104115125135147160172188Time (d)S V I 10 (m L .g 
-1)10002000300040005000600070008000900010000M L S S (m g .L -1)Fig.1e Change in biomass content and SVI 10during whole operation.SVI,sludge volume index;MLSS,mixed liquid suspendedsolids.Fig.2e Variation in granule size distribution in the sludge during operation.d,particle diameter;TSS,total suspended solids.w a t e r r e s e a r c h x x x (2011)1e 104shown in Fig.S5.In the phase of sludge inoculation (before day 38),H decreased remarkably (from 0.94to 0.75)due to the absence of some species in the reactor.Though several dominant species (bands2,7,10,11)in the inoculating sludge were preserved,many bands disappeared or weakened (bands 3,4,6,8,13,14,15).After day 45,the diversity index tended to be stable and showed small fluctuation (from 0.72to 0.82).Banding pattern similarity was analyzed by applying UPGMA (Fig.4)algorithms.The UPGMA analysis showed three groups with intragroup similarity at approximately 67%e 78%and intergroup similarity at 44e 62%.Generally,the clustering followed the time course;and the algorithms showed a closer clustering of groups II and III.In the analysis,group I was associated with sludge inoculation and washout,group IIwithFig.3e DGGE profile of the AOB communities in the SBR during the sludge granulation process (lane labels along the top show the sampling time (days)from startup of the bioreactor).The major bands were labeled with the numbers (bands 1e15).Fig.4e UPGMA analysis dendrograms of AOB community DGGE banding patterns,showing schematics of banding patterns.Roman numerals indicate major clusters.w a t e r r e s e a r c h x x x (2011)1e 105startup sludge granulation and decreasing SVI 10,and group III with a stable system and excellent biomass settleability.In Fig.3,the locations of the predominant bands were excised from the gel.DNA in these bands were reamplified,cloned and sequenced.The comparative analysis of these partial 16S rRNA sequences (Table 2and Fig.S6)revealed the phylogenetic affiliation of 13sequences 
retrieved.The majority of the bacteria in seed sludge grouped with members of Nitrosomonas and Nitrosospira .Along with sludge granula-tion,most of Nitrosomonas (Bands 2,5,7,9,10,11)were remained or eventually became dominant in GSBR;however,all of Nitrosospira (Bands 6,13,15)were gradually eliminated from the reactor.3.3.Distribution of AOB and NOB in different sized granulesFISH was performed on the granule sections mainly to deter-mine the location of AOB and NOB within the different size classes of granules,and the images were not further analyzed for quantification of cell counts.As shown in Fig.6,in small granules (0.2mm <d <0.45mm),AOB located mainly in the outer part of granular space,whereas NOB were detected only in the core of granules.In medium granules (0.45mm <d <0.6mm),AOB distributed evenly throughout the whole granular space,whereas NOB still existed in the inner part.In the larger granules (d >0.9mm),AOB and NOB were mostly located in the surface area of the granules,and moreover,NOB became rare.4.Discussion4.1.Relationship between granule formation and reactor performanceAfter day 32,the SVI 10stabilized at 20e 35mL g À1,which is very low compared to the values measured for activated sludge (100e 150mL g À1).However,the size distribution of the granules measured on day 32(Fig.2)indicated that only 22%of the biomass was made of granular sludge with diameter largerthan 0.2mm.These results suggest that sludge settleability increased prior to granule formation and was not affected by different particle sizes in the sludge during the GSBR operation.It was observed,however,that the diameter of the granules fluctuated over longer durations.The large granules tended to destabilize due to endogenous respiration,and broke into smaller granules that could seed the formation of large granules again.Pochana and Keller reported that physically broken sludge flocs contribute to lower denitrification rates,due to their reduced anoxic zone (Pochana and 
Keller,1999).Therefore,TN removal efficiency raises fluctuantly throughout the experiment.Some previous research had demonstrated that bigger,more dense granules favored the enrichment of PAO (Winkler et al.,2011a ).Hence,after day 77,removal efficiency of TP was higher and relatively stable because the granules mass fraction was over 90%and more larger granules formed.4.2.Relationship between AOB communities dynamic and sludge granulationFor granule formation,a short settling time was set,and only particles with a settling velocity higher than 4.5m h À1were retained in the reactor.Moreover,as shown in Fig.1,the variation in SVI 10was greater before day 41(from 108.2mL g À1e 34.1mL g À1).During this phase,large amounts of biomass could not survive in the reactor.A clear shift in pop-ulations was evident,with 58%similarity between days 8and 18(Table S1).In the SBR system fed with acetate-based synthetic wastewater,heterotrophic bacteria can produce much larger amounts of extracellular polysaccharides than autotrophic bacteria (Tsuneda et al.,2003).Some researchers found that microorganisms in high shear environments adhered by extracellular polymeric substances (EPS)to resist the damage of suspended cells by environmental forces (Trinet et al.,1991).Additionally,it had been proved that the dominant heterotrophic species in the inoculating sludge were preserved throughout the process in our previous research (Zhang et al.,2011).It is well known that AOB are chemoau-totrophic and slow-growing;accordingly,numerous AOBw a t e r r e s e a r c h x x x (2011)1e 106populations that cannot become big and dense enough to settle fast were washed out from the system.As a result,the variation in AOB was remarkable in the period of sludge inoculation,and the diversity index of population decreased rapidly.After day 45,AOB communities’structure became stable due to the improvement of sludge settleability and the retention of more biomass.These results suggest that the short 
settling time (selection pressure)apparently stressed the biomass,leading to a violent dynamic of AOB communities.Further,these results suggest that certain populations may have been responsible for the operational success of the GSBR and were able to persist despite the large fluctuations in pop-ulation similarity.This bacterial population instability,coupled with a generally acceptable bioreactor performance,is congruent with the results obtained from a membrane biore-actor (MBR)for graywater treatment (Stamper et al.,2003).Nitrosomonas e like and Nitrosospira e like populations are the dominant AOB populations in wastewater treatment systems (Kowalchuk and Stephen,2001).A few previous studies revealed that the predominant populations in AOB communities are different in various wastewater treatment processes (Tawan et al.,2005;Thomas et al.,2010).Some researchers found that the community was dominated by AOB from the genus Nitrosospira in MBRs (Zhang et al.,2010),whereas Nitrosomonas sp.is the predominant population in biofilter sludge (Yin and Xu,2009).In the currentstudy,Fig.5e DGGE profile of the AOB communities in different size of granules (lane labels along the top show the range of particle diameter (d,mm)).Values along the bottom indicate the Shannon diversity index (H ).Bands labeled with the numbers were consistent with the bands in Fig.3.w a t e r r e s e a r c h x x x (2011)1e 107sequence analysis revealed that selection pressure evidently effect on the survival of Nitrosospira in granular sludge.Almost all of Nitrosospira were washed out initially and had no chance to evolve with the environmental changes.However,some members of Nitrosomonas sp.have been shown to produce more amounts of EPS than Nitrosospira ,especially under limited ammonia conditions (Stehr et al.,1995);and this feature has also been observed for other members of the same lineage.Accordingly,these EPS are helpful to communicate cells with each other and granulate sludge (Adav et 
al.,2008).Therefore,most of Nitrosomonas could adapt to this challenge (to become big and dense enough to settle fast)and were retained in the reactor.At the end of reactor operation (day 180),granules with different particle size were sieved.The effects of variation in granules size on the composition of the AOBcommunitiesFig.6e Micrographs of FISH performed on three size classes of granule sections.DAPI stain micrographs (A,D,G);AOB appear as green fluorescence (B,E,H),and NOB appear as red fluorescence (C,F,I).Bar [100m m in (A)e (C)and (G)e (I).d,particle diameter.(For interpretation of the references to colour in this figure legend,the reader is referred to the web version of this article.)w a t e r r e s e a r c h x x x (2011)1e 108were investigated.As shown in Fig.5,AOB communities structures in different size of granules were varied.Although several predominant bands(bands2,5,11)were present in all samples,only bands3and6appeared in the granules with diameters larger than0.6mm.Additionally,bands7and10 were intense in the granules larger than0.45mm.According to Table2,it can be clearly indicated that Nitrosospira could be retained merely in the granules larger than0.6mm.Therefore, Nitrosospira was not present at a high level in Fig.3due to the lower proportion of larger granules(d>0.6mm)in TSS along with reactor operation.DGGE analysis also revealed that larger granules had a greater microbial diversity than smaller ones. 
This result also demonstrates that more organisms can survive in larger granules as a result of more space,which can provide the suitable environment for the growth of microbes(Fig.6).4.3.Effect of variance in particle size on the distribution of AOB and NOB in granulesAlthough an influence of granule size has been observed in experiments and simulations for simultaneous N-and P-removal(de Kreuk et al.,2007),the effect of granule size on the distribution of different biomass species need be revealed further with the assistance of visible experimental results, especially in the same granular sludge reactors.Related studies on the diversity of bacterial communities in granular sludge often focus on the distribution of important functional bacteria populations in single-size granules(Matsumoto et al., 2010).In the present study,different size granules were sieved,and the distribution patterns of AOB and NOB were explored.In the nitrification processes considered,AOB and NOB compete for space and oxygen in the granules(Volcke et al.,2010).Since ammonium oxidizers have a higheroxygen affinity(K AOBO2<K NOBO2)and accumulate more rapidly inthe reactor than nitrite oxidizers(Volcke et al.,2010),NOB are located just below the layer of AOB,where still some oxygen is present and allows ready access to the nitrite produced.In smaller granules,the location boundaries of the both biomass species were distinct due to the limited existence space provided by granules for both microorganism’s growth.AOB exist outside of the granules where oxygen and ammonia are present.Medium granules can provide broader space for microbe multiplying;accordingly,AOB spread out in the whole granules.This result also confirms that oxygen could penetrate deep into the granule’s core without restriction when particle diameter is less than0.6mm.Some mathematic model also supposed that NOBs are favored to grow in smaller granules because of the higher fractional aerobic volume (Volcke et al.,2010).As 
shown in the results of the batch experiments(Zhang et al.,2011),nitrite accumulation temporarily occurred,accompanied by the more large gran-ules(d>0.9mm)forming.This phenomenon can be attrib-uted to the increased ammonium surface load associated with larger granules and smaller aerobic volume fraction,resulting in outcompetes of NOB.It also suggests that the core areas of large granules(d>0.9mm)could provide anoxic environment for the growth of anaerobic denitrificans(such as Tb.deni-trificans or Tb.thioparus in Fig.S7,Supporting information).As shown in Fig.2and Fig.S3,the removal efficiency of total nitrogen increased with formation of larger granules.5.ConclusionsThe variation in AOB communities’structure was remarkable during sludge inoculation,and the diversity index of pop-ulation decreased rapidly.Most of Nitrosomonas in the inocu-lating sludge were retained because of their capability to rapidly adapt to the settling e washing out action.DGGE anal-ysis also revealed that larger granules had greater AOB diversity than that of smaller ones.Oxygen penetration was not restricted in the granules of less than0.6mm particle diameter.However,the larger granules(d>0.9mm)can result in the smaller aerobic volume fraction and inhibition of NOB growth.Henceforth,further studies on controlling and opti-mizing distribution of granule size could be beneficial to the nitrogen removal and expansive application of granular sludge technology.AcknowledgmentsThis work was supported by grants from the National Natural Science Foundation of China(No.51108456,50908227)and the National High Technology Research and Development Program of China(No.2009AA06Z312).Appendix.Supplementary dataSupplementary data associated with this article can be found in online version at doi:10.1016/j.watres.2011.09.026.r e f e r e n c e sAdav,S.S.,Lee, D.J.,Show,K.Y.,2008.Aerobic granular sludge:recent advances.Biotechnology Advances26,411e423.APHA-AWWA-WEF,2005.Standard Methods for the Examination of 
Water and Wastewater,first ed.American Public Health Association/American Water Works Association/WaterEnvironment Federation,Washington,DC.de Bruin,L.M.,de Kreuk,M.,van der Roest,H.F.,Uijterlinde,C., van Loosdrecht,M.C.M.,2004.Aerobic granular sludgetechnology:an alternative to activated sludge?Water Science and Technology49,1e7.de Kreuk,M.,Heijnen,J.J.,van Loosdrecht,M.C.M.,2005.Simultaneous COD,nitrogen,and phosphate removal byaerobic granular sludge.Biotechnology and Bioengineering90, 761e769.de Kreuk,M.,Picioreanu,C.,Hosseini,M.,Xavier,J.B.,van Loosdrecht,M.C.M.,2007.Kinetic model of a granular sludge SBR:influences on nutrient removal.Biotechnology andBioengineering97,801e815.Downing,L.S.,Nerenberg,R.,2008.Total nitrogen removal ina hybrid,membrane-aerated activated sludge process.WaterResearch42,3697e3708.Erguder,T.H.,Boon,N.,Vlaeminck,S.E.,Verstraete,W.,2008.Partial nitrification achieved by pulse sulfide doses ina sequential batch reactor.Environmental Science andTechnology42,8715e8720.w a t e r r e s e a r c h x x x(2011)1e109。
ABSTRACT

Testing semiconductor memories is increasingly important today because of the high density of current memory chips. In this paper we investigate the various functional fault models for today's memory technology and discuss ways and means to detect these faults. Emphasis is laid on the classical Walking, Galloping and March pattern tests, which are widely used today for testing chip-level, array-level and board-level functional memory defects. Test strategies and pseudocode for the implementation of these techniques are also discussed.

INTRODUCTION

This paper discusses walking, marching and galloping pattern tests for RAM. Random access memory circuits are among the most dense VLSI circuits being fabricated today. Since the transistor lines are very close to each other, RAM circuits suffer from a very high average number of physical defects per unit chip area compared with other circuits. This fact has motivated researchers to develop efficient RAM test sequences that provide good fault coverage.

For testing today's high-density memories, traditional algorithms take too much test time. For instance, GALPAT and WALKING I/O [5][6] require test times of order n² and n^(3/2) (where n is the number of bits in the chip). At that rate, assuming a cycle time of 100 ns, testing a 16 Mbit chip would require 500 hours for an order-n² test and 860 seconds for an order-n^(3/2) test. Other, older tests such as Zero-One and Checkerboard are of order n, but they have poor fault coverage. Table 1 shows memory testing time as a function of memory size.

Table 1: Test time as a function of memory size [1]

This paper introduces the different fault models used for today's RAM technologies.
It then discusses the different memory testing approaches, namely the walking, marching and galloping pattern tests, and analyses the effectiveness and implementation of each of these tests.

Walking, marching and galloping patterns for memory tests
Term paper - ELEC 7250
Submitted by: Arvind Raghuraman

HISTORICAL NOTE & TODAY'S TECHNOLOGY

Through the years, different approaches have been investigated and proposed for testing memories. The most traditional approach is to simply apply a sequence of test patterns to the I/O pins and test the functionality of the memory; this approach is investigated in this paper in detail. Checkerboard, Zero-One (MSCAN) [Breuer & Friedman, 1976], Walking [2], Galloping [3], the MARCH tests (Marching 1/0 [Breuer & Friedman, 1976][5], MATS [Nair, Thatte & Abraham, 1979][8], MATS+ [Abadir & Reghbati, 1983], MATS++ [Goor, 1991], MARCH X [unpublished], MARCH C [Marinescu, 1982][10], MARCH C- [Goor, 1991], MARCH A [Suk & Reddy, 1981], MARCH Y [unpublished], MARCH B [Suk & Reddy, 1981][9]), butterfly, Moving Inversion (MOVI) [De Jonge & Smeulders, 1976], and surround disturb are some traditional pattern tests [2][3]. A more recent approach is to redesign and augment the peripheral circuitry surrounding the RAM to improve its testability, popularly referred to as design for testability. Other approaches propose adding even more extra hardware to the memory circuitry to realize built-in self-test (BIST). But the advantages of these memory design modifications are often offset by the overheads that they introduce. This overhead grows with memory size in terms of extra hardware, so it is easier to justify for large memories. The main goal behind these approaches is to reduce the memory testing time, which rises steeply with memory size.

MEMORY FAILURE MODES

Classical fault models are not sufficient to represent all important failure modes in a RAM; functional fault models should be employed.
Memory fault models can be classified under the categories shown below; brief descriptions of the models are given as follows.

Figure 1.0: Memory cell faults

Memory cell faults:
1. Stuck-at fault (SAF): cell or line stuck-at-0 or stuck-at-1 [1].
2. Stuck-open fault (SOF): open cell or broken line.
3. Transition fault (TF): cell fails to make a transition [1].
4. Data retention fault (DRF): cell fails to retain its logic value after some specified time due to, e.g., leakage, resistor opens, or feedback path opens [2].
5. Coupling fault (CF): coupling faults are of three types [1]:
   • Inversion coupling fault (CFin): a transition in one cell (the aggressor) inverts the content of another cell (the victim) [1,3].
   • Idempotent coupling fault (CFid): a transition in one cell forces a fixed logic value into another cell [1,3].
   • State coupling fault (CFst): a cell/line is forced to a fixed state only if the coupling cell/line is in a given state (a.k.a. pattern sensitivity fault, PSF) [1,3].
6. Bridging fault (BF): short between cells (can be AND type or OR type) [1].
7. Neighborhood pattern sensitive fault (NPSF) [1].
8. Active (dynamic) NPSF [1].
9. Passive NPSF [1].
10. Static NPSF [1].

Address decoder faults (AFs):
1. No cell accessed by a certain address [1,3].
2. Multiple cells accessed by a certain address [1,3].
3. A certain cell not accessed by any address [1,3].
4. A certain cell accessed by multiple addresses [1].

Dynamic faults:
1. Recovery faults: some part of the memory cannot recover fast enough from a previous state [2].
   • Sense amplifier recovery: sense amplifier saturation after reading/writing a long string of 0s or 1s.
   • Write recovery: a write followed by a read or write at a different location results in reading or writing at the same location, due to a slow address decoder.
2. Disturb faults: the victim cell is forced to 0 or 1 when we read or write the aggressor cell (which may be the same cell) [2].
3. Data retention faults: the memory loses its content spontaneously, not caused by a read or write [2].
   • DRAM refresh fault: refresh-line stuck-at fault.
   • DRAM leakage fault: "sleeping sickness" – data are lost in less than the specified hold time (typically hundreds of microseconds to tens of milliseconds); caused by charge leakage or environmental sensitivity; usually affects a row or a column.
   • Static data loss: a defective pull-up device induces excessive leakage currents which can change the state of a cell; a checkerboard pattern triggers maximum leakage.

ALGORITHMS AND ANALYSIS:

MARCH tests [2,3]:

A MARCH test consists of a finite sequence of march elements, while a march element is a finite sequence of operations applied to every cell in the memory array before proceeding to the next cell. An operation can consist of writing a 0 into a cell (w0), writing a 1 into a cell (w1), reading an expected 0 from a cell (r0), or reading an expected 1 from a cell (r1).

MARCH test notation:

The notation used for MARCH tests throughout this paper is as follows [1]:

    ⇑  : address sequence changes in ascending order
    ⇓  : address sequence changes in descending order
    ⇕  : address sequence can change either way
    r0 : read operation reading a 0 from a cell
    r1 : read operation reading a 1 from a cell
    w0 : write operation writing a 0 to a cell
    w1 : write operation writing a 1 to a cell

MARCHING 1/0 test [Breuer & Friedman, 1976][5]:

MARCHING 1/0 is a test of 14n complexity. It is a complete test for AFs, SAFs and TFs, but can detect only a part of the CFs [2].
The test sequence is given as follows:

MARCHING 1/0: {⇕(w0); ⇑(r0,w1,r1); ⇓(r1,w0,r0); ⇕(w1); ⇑(r1,w0,r0); ⇓(r0,w1,r1)}

Pseudocode for implementation:

    write background 0;
    for (i = 0; i < n; i++) {
        if (m(i) != 0) return fail;
        m(i) = 1;
        if (m(i) != 1) return fail;
    }
    for (i = n-1; i >= 0; i--) {
        if (m(i) != 1) return fail;
        m(i) = 0;
        if (m(i) != 0) return fail;
    }
    write background 1;
    repeat the same sequence with data reversed;
    return pass;

MATS Test [Nair, Thatte & Abraham, 1979][8]:

MATS stands for Modified Algorithmic Test Sequence. MATS is the shortest MARCH test for unlinked SAFs in the memory cell array and read/write logic circuitry [3]. The algorithm can detect all AFs for OR-type technology, since the result of reading multiple cells is the OR of the contents of those cells. For AND-type technology, the MATS-AND test sequence given below is used instead [2]. The MATS algorithm has a complexity of 4n, with better fault coverage than the equivalent zero-one and checkerboard tests [2].

MATS (OR type):  {⇕(w0); ⇕(r0,w1); ⇕(r1)}
MATS (AND type): {⇕(w1); ⇕(r1,w0); ⇕(r0)}

Pseudocode for implementation (OR type):

    for (i = 0; i < n; i++) m(i) = 0;
    for (i = 0; i < n; i++) { if (m(i) != 0) return fail; m(i) = 1; }
    for (i = 0; i < n; i++) { if (m(i) != 1) return fail; }
    return pass;

MATS+ Test [Abadir & Reghbati, 1983]:

The MATS+ test sequence detects all SAFs and AFs; it is often used instead of MATS when the technology under test is unknown. The MATS+ algorithm has a test complexity of 5n.

MATS+: {⇕(w0); ⇑(r0,w1); ⇓(r1,w0)}

Pseudocode for implementation:

    write background 0;
    for (i = 0; i < n; i++) { if (m(i) != 0) return fail; m(i) = 1; }
    for (i = n-1; i >= 0; i--) { if (m(i) != 1) return fail; m(i) = 0; }
    return pass;

MATS++ [Goor, 1991]:

The MATS++ test sequence is a complete, irredundant and optimized test sequence. It is similar to the MATS+ test but additionally covers TFs.
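The march sequences above lend themselves to a table-driven implementation: each test is just data fed to one generic loop. A minimal Python sketch (not part of the paper's pseudocode; `run_march` and the element encoding are illustrative) executing MATS, MATS+ and MARCHING 1/0 on a simulated fault-free memory:

```python
# Each march element is (order, ops): order +1 = ascending address
# sequence, -1 = descending, 0 = either way; ops is a list of the
# paper's r0/r1/w0/w1 operations applied to every cell in turn.

def run_march(memory, elements):
    n = len(memory)
    for order, ops in elements:
        addrs = range(n) if order >= 0 else range(n - 1, -1, -1)
        for i in addrs:
            for op in ops:
                kind, bit = op[0], int(op[1])
                if kind == 'w':
                    memory[i] = bit
                elif memory[i] != bit:   # 'r': read and compare
                    return False         # mismatch -> fault detected
    return True                          # test passes

# Encodings follow the sequences given in the text.
MATS_OR      = [(0, ['w0']), (0, ['r0', 'w1']), (0, ['r1'])]
MATS_PLUS    = [(0, ['w0']), (+1, ['r0', 'w1']), (-1, ['r1', 'w0'])]
MARCHING_1_0 = [(0, ['w0']),
                (+1, ['r0', 'w1', 'r1']), (-1, ['r1', 'w0', 'r0']),
                (0, ['w1']),
                (+1, ['r1', 'w0', 'r0']), (-1, ['r0', 'w1', 'r1'])]

mem = [0] * 16                   # fault-free memory: every test passes
assert run_march(mem, MATS_OR)
assert run_march(mem, MATS_PLUS)
assert run_march(mem, MARCHING_1_0)
```

A real tester would replace the plain list with a device model; the driver loop stays the same for every march test in this paper.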
It is the recommended test, of 6n test complexity, for unlinked SAFs and TFs.

MATS++: {⇕(w0); ⇑(r0,w1); ⇓(r1,w0,r0)}

Pseudocode for implementation:

    write background 0;
    for (i = 0; i < n; i++) { if (m(i) != 0) return fail; m(i) = 1; }
    for (i = n-1; i >= 0; i--) {
        if (m(i) != 1) return fail;
        m(i) = 0;
        if (m(i) != 0) return fail;
    }
    return pass;

MARCH X [unpublished]:

The MARCH X test is so called because it has been used without being published [3]. This test detects unlinked SAFs, AFs, TFs and CFins. The MARCH X test is a test of 6n complexity.

MARCH X: {⇕(w0); ⇑(r0,w1); ⇓(r1,w0); ⇕(r0)}

Pseudocode for implementation:

    write background 0;
    for (i = 0; i < n; i++) { if (m(i) != 0) return fail; m(i) = 1; }
    for (i = n-1; i >= 0; i--) { if (m(i) != 1) return fail; m(i) = 0; }
    for (i = 0; i < n; i++) { if (m(i) != 0) return fail; }
    return pass;

MARCH C [Marinescu, 1982][10]:

The MARCH C test is suited for AFs, SAFs, TFs and all CFs [3]. It is a test of 11n complexity.

MARCH C: {⇕(w0); ⇑(r0,w1); ⇑(r1,w0); ⇕(r0); ⇓(r0,w1); ⇓(r1,w0); ⇕(r0)}

Pseudocode for implementation:

    write background 0;
    for (i = 0; i < n; i++) { if (m(i) != 0) return fail; m(i) = 1; }
    for (i = 0; i < n; i++) { if (m(i) != 1) return fail; m(i) = 0; }
    for (i = 0; i < n; i++) { if (m(i) != 0) return fail; }
    for (i = n-1; i >= 0; i--) { if (m(i) != 0) return fail; m(i) = 1; }
    for (i = n-1; i >= 0; i--) { if (m(i) != 1) return fail; m(i) = 0; }
    for (i = 0; i < n; i++) { if (m(i) != 0) return fail; }
    return pass;

MARCH C- [Goor, 1991]:

This test sequence is a modification of the MARCH C test, implemented in order to remove the redundancy present in it. It detects unlinked AFs, SAFs, TFs and all CFs.
This test is of complexity 10n.

MARCH C-: {⇕(w0); ⇑(r0,w1); ⇑(r1,w0); ⇓(r0,w1); ⇓(r1,w0); ⇕(r0)}

Pseudocode for implementation:

    write background 0;
    for (i = 0; i < n; i++) { if (m(i) != 0) return fail; m(i) = 1; }
    for (i = 0; i < n; i++) { if (m(i) != 1) return fail; m(i) = 0; }
    for (i = n-1; i >= 0; i--) { if (m(i) != 0) return fail; m(i) = 1; }
    for (i = n-1; i >= 0; i--) { if (m(i) != 1) return fail; m(i) = 0; }
    for (i = 0; i < n; i++) { if (m(i) != 0) return fail; }
    return pass;

MARCH A [Suk & Reddy, 1981]:

The MARCH A test is the shortest test for AFs, SAFs, linked CFids, TFs not linked with CFids, and certain CFins linked with CFids [2]. It is a complete and irredundant test of complexity 15n.

MARCH A: {⇕(w0); ⇑(r0,w1,w0,w1); ⇑(r1,w0,w1); ⇓(r1,w0,w1,w0); ⇓(r0,w1,w0)}

Pseudocode for implementation:

    write background 0;
    for (i = 0; i < n; i++) { if (m(i) != 0) return fail; m(i) = 1; m(i) = 0; m(i) = 1; }
    for (i = 0; i < n; i++) { if (m(i) != 1) return fail; m(i) = 0; m(i) = 1; }
    for (i = n-1; i >= 0; i--) { if (m(i) != 1) return fail; m(i) = 0; m(i) = 1; m(i) = 0; }
    for (i = n-1; i >= 0; i--) { if (m(i) != 0) return fail; m(i) = 1; m(i) = 0; }
    return pass;

MARCH Y [unpublished]:

The MARCH Y test is an extension of MARCH X. This test is of complexity 8n and can detect all faults detectable by MARCH X.

MARCH Y: {⇕(w0); ⇑(r0,w1,r1); ⇓(r1,w0,r0); ⇕(r0)}

Pseudocode for implementation:

    write background 0;
    for (i = 0; i < n; i++) {
        if (m(i) != 0) return fail;
        m(i) = 1;
        if (m(i) != 1) return fail;
    }
    for (i = n-1; i >= 0; i--) {
        if (m(i) != 1) return fail;
        m(i) = 0;
        if (m(i) != 0) return fail;
    }
    for (i = 0; i < n; i++) { if (m(i) != 0) return fail; }
    return pass;

MARCH B [Suk & Reddy, 1981][9]:

The MARCH B test is an extension of the MARCH A test. It is a complete and irredundant test capable of detecting AFs, SAFs, and linked CFids or CFins.
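The difference in coupling-fault coverage between MATS+ and MARCH C- can be illustrated with a small simulation. The fault parameters below are chosen purely for illustration (they are not from the paper): an idempotent coupling fault (CFid) in which a 0→1 transition in aggressor cell 5 forces victim cell 2 to 1. Because the fault is re-triggered after the victim has already been set back to 0, MARCH C-'s descending (r0,w1) element catches it, while MATS+ happens to pass:

```python
# Hypothetical CFid: up-transition in cell 5 forces cell 2 to 1.
AGGRESSOR, VICTIM = 5, 2

def faulty_write(mem, i, v):
    if i == AGGRESSOR and mem[i] == 0 and v == 1:
        mem[VICTIM] = 1          # coupling fault fires on 0->1 transition
    mem[i] = v

def run(elements, n=8):
    mem = [0] * n
    for order, ops in elements:
        addrs = range(n) if order >= 0 else range(n - 1, -1, -1)
        for i in addrs:
            for op in ops:
                kind, bit = op[0], int(op[1])
                if kind == 'w':
                    faulty_write(mem, i, bit)
                elif mem[i] != bit:
                    return False          # fault detected
    return True

MATS_PLUS     = [(0, ['w0']), (+1, ['r0', 'w1']), (-1, ['r1', 'w0'])]
MARCH_C_MINUS = [(0, ['w0']),
                 (+1, ['r0', 'w1']), (+1, ['r1', 'w0']),
                 (-1, ['r0', 'w1']), (-1, ['r1', 'w0']),
                 (0, ['r0'])]

assert run(MATS_PLUS)            # MATS+ misses this particular CFid
assert not run(MARCH_C_MINUS)    # MARCH C- detects it
```

When MATS+'s ascending element writes 1 into the aggressor, the victim has already been legitimately written to 1, so the disturbance is invisible; MARCH C-'s descending (r0,w1) element writes the aggressor while the victim holds 0 and then reads the victim, exposing the fault.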
This test is of complexity 17n.

MARCH B: {⇕(w0); ⇑(r0,w1,r1,w0,r0,w1); ⇑(r1,w0,w1); ⇓(r1,w0,w1,w0); ⇓(r0,w1,w0)}

Pseudocode for implementation:

    write background 0;
    for (i = 0; i < n; i++) {
        if (m(i) != 0) return fail; m(i) = 1;
        if (m(i) != 1) return fail; m(i) = 0;
        if (m(i) != 0) return fail; m(i) = 1;
    }
    for (i = 0; i < n; i++) { if (m(i) != 1) return fail; m(i) = 0; m(i) = 1; }
    for (i = n-1; i >= 0; i--) { if (m(i) != 1) return fail; m(i) = 0; m(i) = 1; m(i) = 0; }
    for (i = n-1; i >= 0; i--) { if (m(i) != 0) return fail; m(i) = 1; m(i) = 0; }
    return pass;

GALPAT:

GALPAT, the galloping patterns test, is a test of 4n² complexity. The galloping patterns test is a strong test for most faults; it is a complete test that detects and locates all SAFs, TFs, AFs and CFs. The algorithm writes the complement of the background into a base cell and then verifies every other cell for data integrity, re-reading the base cell after each access; the whole procedure is then repeated with the data inverted.

Pseudocode for implementation:

    write background 0;
    for (BC = 0; BC < n; BC++) {
        m(BC) = ~m(BC);
        for (OC = 0; OC < n; OC++) {
            read and verify OC;
            read and verify BC;
        }
        m(BC) = ~m(BC);
    }
    write background 1;
    repeat the same procedure;

Sliding Galloping Row/Column/Diagonal:

This is a test sequence based on the galloping patterns test, but instead of shifting a single 1 through the memory, a diagonal of 1s is shifted and the whole memory is read after each shift. It has the same fault coverage as GALPAT except for some CFs that may be missed. This test is of complexity 4n^(3/2).

Pseudocode for implementation:

    write background 0;
    for (BC = 0; BC < n; BC++) {
        complement the diagonal BCs;
        for (OC = 0; OC < n; OC++) {
            read and verify the diagonal BCs;
            read and verify the other cells;
        }
        complement the diagonal BCs;
    }
    write background 1;
    repeat the same procedure;

WALPAT:

WALPAT, the walking patterns test, is a classical memory testing technique.
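The galloping pattern described above can also be sketched directly. This minimal Python version (illustrative only; it runs on a fault-free list rather than a device model) additionally counts read operations, confirming the roughly 4n² behavior that makes GALPAT expensive for large memories:

```python
def galpat(mem):
    """Galloping pattern over a list of one-bit cells.
    Returns (passed, number_of_reads)."""
    n = len(mem)
    reads = 0
    for background in (0, 1):            # repeat with data inverted
        for i in range(n):
            mem[i] = background
        for bc in range(n):
            mem[bc] = 1 - background     # complement the base cell
            for oc in range(n):
                if oc == bc:
                    continue
                reads += 1
                if mem[oc] != background:        # other cells hold background
                    return False, reads
                reads += 1
                if mem[bc] != 1 - background:    # base cell holds complement
                    return False, reads
            mem[bc] = background         # restore before moving on
    return True, reads

ok, reads = galpat([0] * 16)
assert ok
assert reads == 4 * 16 * (16 - 1)        # ~4n^2 reads for n = 16
```

The inner loop re-reads the base cell after every other-cell access, which is exactly what distinguishes GALPAT (4n²) from the cheaper WALPAT (2n²), where the base cell is read only once per walk.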
The walking patterns test is similar to the galloping patterns test, except that the base cell is read only once, after all the other cells have been read. The WALPAT test is a test of complexity 2n².

Pseudocode for implementation:

    write background 0;
    for (BC = 0; BC < n; BC++) {
        m(BC) = ~m(BC);
        for (OC = 0; OC < n; OC++)
            read and verify OC;
        read and verify BC;
        m(BC) = ~m(BC);
    }
    write background 1;
    repeat the same procedure;

ANALYSIS & RESULTS:

A complete summary of the different MARCH tests, their fault detection capabilities and test complexities is given in Table 2.0.

MARCH TEST DETECTION & COMPLEXITY

    Test          AF   SAF  TF   CF   Other             TC
    MARCHING 1/0  DS   D    N    N    -                 14n
    MATS          D    D    N    N    -                 4n
    MATS+         D    D    D    N    -                 5n
    MATS++        D    D    D    N    -                 6n
    MARCH X       D    D    D    D    Ul-CFin           6n
    MARCH C       D    D    D    D    Ul-CFin           11n
    MARCH C-      D    D    D    D    Ul-CF             10n
    MARCH A       D    D    D    D    l-TF              15n
    MARCH Y       D    D    D    D    l-CF              8n
    MARCH B       D    D    D    D    Read access time  17n

    N = 'no', L = 'locate', D = 'detect', LS = 'locate some', DS = 'detect some'.
    Table 2.0 [2]

A summary of the other pattern-based memory tests, their fault detection capabilities and test complexities is given in Table 3.0.

WALPAT & GALPAT SUMMARY

    Test                AF   SAF  TF   CF   Other           TC
    WALPAT              L    L    L    L    Refresh         2n²
    GALPAT              L    L    L    L    Sense amp rec.  4n²
    Galloping Diagonal  LS   L    L    N    Write rec.      4n^(3/2)

    N = 'no', L = 'locate', D = 'detect', LS = 'locate some', DS = 'detect some'.
    Table 3.0 [2]

Fault coverage for different MARCH tests [3][4]:

    Fault   MATS++  MARCH X  MARCH Y  MARCH C-
    SAFs    100%    100%     100%     100%
    TFs     100%    100%     100%     100%
    SOFs    100%    0.2%     100%     0.2%
    AFs     100%    100%     100%     100%
    CFins   75.0%   100%     100%     100%
    CFids   37.5%   50.0%    50.0%    100%
    CFsts   50.0%   62.5%    62.5%    100%

CONCLUSION:

MARCH tests are extensively used today for functional testing of SRAM and DRAM technologies. They are more efficient than the older classical pattern-based tests and offer better fault coverage.
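The complexity figures quoted in the tables can be sanity-checked by counting operations: a march element applies all of its operations to each of the n cells, so a test's operation count is simply the total element length times n. A small Python sketch (the encodings follow the sequences in the text; the names are illustrative):

```python
# Each march element is (order, ops); order is irrelevant to the count.

def op_count(elements, n):
    """Total read/write operations a march test performs on n cells."""
    return sum(len(ops) for _, ops in elements) * n

MATS_OR   = [(0, ['w0']), (0, ['r0', 'w1']), (0, ['r1'])]
MATS_PLUS = [(0, ['w0']), (+1, ['r0', 'w1']), (-1, ['r1', 'w0'])]
MATS_PP   = [(0, ['w0']), (+1, ['r0', 'w1']), (-1, ['r1', 'w0', 'r0'])]
MARCH_C_M = [(0, ['w0']),
             (+1, ['r0', 'w1']), (+1, ['r1', 'w0']),
             (-1, ['r0', 'w1']), (-1, ['r1', 'w0']),
             (0, ['r0'])]

n = 1024
assert op_count(MATS_OR, n)   == 4 * n    # MATS:     4n
assert op_count(MATS_PLUS, n) == 5 * n    # MATS+:    5n
assert op_count(MATS_PP, n)   == 6 * n    # MATS++:   6n
assert op_count(MARCH_C_M, n) == 10 * n   # MARCH C-: 10n

# For a 1 Mbit memory, the linear MARCH C- is vastly cheaper than the
# quadratic 2n^2 WALPAT -- the efficiency argument made above.
assert op_count(MARCH_C_M, 2**20) < 2 * (2**20) ** 2
```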
With the increasing density of semiconductor memories, research continues into better pattern sequences and alternative strategies such as DFT and BIST.

References:

1. M.L. Bushnell and V.D. Agrawal, Essentials of Electronic Testing.
2. Cheng-Wen Wu, Memory Testing, Laboratory for Reliable Computing (LaRC), EE, NTHU.
3. Cheng-Wen Wu, RAM Fault Models and Memory Testing, Laboratory for Reliable Computing (LaRC), NTHU.
4. A.J. van de Goor, "Using March Tests to Test SRAMs," IEEE Design & Test of Computers, Vol. 10, No. 1, March 1993, pp. 8-14.
5. M.A. Breuer and A.D. Friedman, Diagnosis and Reliable Design of Digital Systems, Computer Science Press, Woodland Hills, Calif., 1976.
6. A.J. van de Goor, Testing Semiconductor Memories: Theory and Practice, John Wiley & Sons, Chichester, UK, 1991.
7. S.M. Thatte and J.A. Abraham, "Testing of Semiconductor Random Access Memories," Proc. Fault-Tolerant Computing Symp., IEEE Computer Society Press, Los Alamitos, Calif., June 1977, pp. 81-87.
8. R. Nair et al., "Efficient Algorithms for Testing Semiconductor Random Access Memories," IEEE Trans. Computers, Vol. C-28, No. 3, 1978, pp. 572-576.
9. D.S. Suk and S.M. Reddy, "A March Test for Functional Faults in Semiconductor Random-Access Memories," IEEE Trans. Computers, Vol. C-30, No. 12, 1981, pp. 982-985.
10. M. Marinescu, "Simple and Efficient Algorithms for Functional RAM Testing," Proc. IEEE Int'l Test Conf., IEEE Computer Society Press, 1982, pp. 236-239.
11. K. Zarrineh, S.J. Upadhyaya, and S. Chakravarty, "Automatic Generation and Compaction of March Tests for Memory Arrays."