Spatial Big Data AAG_NIH_July2012_GeoFrontiers_Shekhar
## Big Data: A Comprehensive Review

### Introduction

Big data refers to massive, complex, and rapidly generated datasets that are difficult to process using traditional data management tools. The advent of big data has revolutionized various industries, from healthcare to finance, transportation, and agriculture. In this paper, we present a comprehensive review of big data, including its characteristics, challenges, opportunities, and applications.

### Characteristics of Big Data

Big data is often characterized by the following attributes:

- Volume: big data datasets are massive, typically ranging from terabytes to petabytes or even exabytes in size.
- Variety: big data comes in various formats, including structured, semi-structured, and unstructured data.
- Velocity: big data is generated rapidly and continuously, requiring real-time or near-real-time processing.
- Veracity: big data quality can vary, so data cleansing and validation are essential.

### Challenges in Big Data Analytics

Big data analytics presents several challenges:

- Data storage and management: storing and managing large, diverse datasets requires efficient and scalable storage solutions.
- Data processing: traditional data processing tools are often inadequate for big data, necessitating specialized processing techniques.
- Data analysis: extracting meaningful insights from big data requires advanced analytics techniques and machine learning algorithms.
- Data security and privacy: protecting big data from unauthorized access, breaches, and loss is a significant challenge.

### Opportunities of Big Data

Despite the challenges, big data presents numerous opportunities:

- Improved decision-making: big data analytics enables data-driven decisions, providing invaluable insight into customer behavior, market trends, and operational patterns.
- Predictive analytics: big data allows patterns to be identified and future events to be forecast.
- Real-time analytics: processing big data in near real time enables instant decision-making and rapid response to changing conditions.
- Innovation: big data analytics drives innovation by fostering new products, services, and business models.

### Applications of Big Data

Big data finds applications in numerous domains:

- Healthcare: improving patient diagnosis, treatment, and disease prevention.
- Finance: risk assessment, fraud detection, and personalized financial services.
- Transportation: optimizing traffic flow, improving safety, and enhancing the overall transportation system.
- Agriculture: precision farming, crop yield prediction, and sustainable agriculture practices.
- Retail: personalized recommendations, customer segmentation, and supply chain optimization.

### Conclusion

Big data has emerged as a transformative force in the modern world. Its vast volume, variety, velocity, and veracity present challenges but also offer unprecedented opportunities for data-driven decision-making, predictive analytics, real-time insights, and innovation. As the amount of data continues to grow exponentially, the role of big data analytics will only become more critical in shaping the future of various industries and sectors.
Journal of Jilin University (Medicine Edition), Vol. 50, No. 2, March 2024. DOI: 10.13481/j.1671-587X.20240214

Bioinformatics analysis of differentially expressed genes in severe bronchial asthma and screening of traditional Chinese medicines for its treatment

Chen Liping (1,2), Han Li (1), Bian Hua (1), Pang Liye (1)
(1. Henan Key Laboratory of Zhang Zhongjing Formulae and Herbs for Immunoregulation, Nanyang Institute of Technology, Nanyang, Henan 473004; 2. Collaborative Innovation Center for Chinese Medicine Prevention and Treatment of Respiratory Diseases, Henan University of Chinese Medicine, Zhengzhou, Henan 450046)

[Abstract] Objective: To identify differentially expressed genes in severe bronchial asthma (severe asthma, SA) using bioinformatics methods, analyze their mechanisms of action, and screen traditional Chinese medicines and active components with potential therapeutic effects. Methods: The GSE136587 and GSE158752 datasets were selected from the Gene Expression Omnibus (GEO) database. Differential expression analysis was performed in R to obtain the differentially expressed genes, followed by protein-protein interaction (PPI) network analysis to screen core genes and identify key pathways and hub genes. Finally, the core genes were submitted to the Coremine database to screen traditional Chinese medicines with potential therapeutic effects, and related herbal formulas were retrieved from the Zhonghua Yidian (Encyclopedia of Traditional Chinese Medicine). Results: A total of 466 differentially expressed genes were identified. A PPI network constructed on the STRING platform yielded 25 core targets, including synaptosomal-associated protein 25 kDa (SNAP25), glutamate ionotropic receptor AMPA type subunit 2 (GRIA2), neurexin 1 (NRXN1), potassium voltage-gated channel subfamily A member 1 (KCNA1), synaptotagmin 1 (SYT1), and chromogranin A (CHGA). Gene Ontology (GO) functional enrichment showed that the biological processes of SA are closely related to cell chemotaxis and leukocyte migration, while Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment mainly involved myeloid leukocyte migration, leukocyte chemotaxis, cell chemotaxis, leukocyte migration, positive regulation of response to external stimulus, and myeloid leukocyte activation. Based on the core targets, a network pharmacology approach identified 367 traditional Chinese medicines with potential therapeutic effects on SA; among them, ginseng (Renshen), buffalo horn (Shuiniujiao), scorpion (Quanxie), and astragalus (Huangqi) involve multiple core targets and are highly correlated with SA. Searching the Zhonghua Yidian for these highly correlated medicines yielded 17 herbal formulas with potential therapeutic effects.
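The abstract's screening step (differential analysis in R, then thresholding to obtain the differentially expressed genes) follows the usual pattern of filtering on fold change and adjusted p-value. As a minimal sketch of that filtering step — in Python rather than R, with an entirely illustrative toy table and hypothetical cutoffs (the paper does not state its thresholds):

```python
import pandas as pd

def screen_degs(results: pd.DataFrame,
                lfc_cutoff: float = 1.0,
                padj_cutoff: float = 0.05) -> pd.DataFrame:
    """Keep genes with |log2 fold change| >= lfc_cutoff and adjusted p < padj_cutoff."""
    mask = (results["log2FC"].abs() >= lfc_cutoff) & (results["padj"] < padj_cutoff)
    return results[mask]

# Toy differential-expression table; values are illustrative only.
table = pd.DataFrame({
    "gene":   ["SNAP25", "GRIA2", "NRXN1", "ACTB"],
    "log2FC": [2.3, -1.8, 1.1, 0.1],
    "padj":   [0.001, 0.004, 0.03, 0.9],
})
degs = screen_degs(table)
print(sorted(degs["gene"]))  # → ['GRIA2', 'NRXN1', 'SNAP25']
```

The surviving genes would then go on to PPI network construction (e.g., on STRING) and enrichment analysis, as described above.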
The 2014 IEEE GRSS Data Fusion Contest: Description of the Datasets
Presented to the Image Analysis and Data Fusion Technical Committee, IEEE Geoscience and Remote Sensing Society (GRSS), February 17th, 2014

Introduction

Few long-wave infrared (LWIR) hyperspectral datasets are currently publicly available. The interpretation of such measurements requires new approaches and is a very active research topic, even before fusion. In May 2013, Telops made airborne measurements over a small city close to an open-pit mining area. The following data are made available for the contest:
- an LWIR hyperspectral dataset of the city;
- high-resolution visible data of the same area, with sparse ground coverage;
- ground reference data (LWIR hyperspectral) at a ground sampling point.

Thetford Mines Area Description

The region included in the mosaic contains a variety of natural and man-made objects. It is located in the Black Lake area of Thetford Mines, province of Québec, Canada (46.047927 N, 71.366893 W). The image below illustrates the general region that was surveyed. The Thetford Mines region, a serpentine-rich rocky formation of oceanic origin, gives the region an international status in the extraction and export of chrysotile, or white asbestos. The ground topography of this physiographic region is a little more rugged than the foothills, with hills whose altitudes generally vary between 400 and 600 m. Till deposits, which dominate the Appalachian Plateau, are usually thick and undifferentiated. Maple trees account for about 50% of the wooded area. Other forests in this area are largely dominated by coniferous trees; the area is also covered by balsam fir and yellow birch. Throughout the territory, stations with poor drainage (water and organic) are populated by forests of fir, cedar, black ash, black spruce, spruce, or cedar grove. In Thetford Mines, asphalt is used to cover the roads. Concrete is the material most often used for the sidewalks.
Roofs are generally made of asphalt shingles, and flat roofs are often covered with an elastomeric membrane. In some cases, as for garages or sheds, roofs are made of sheet metal. People have above-ground and in-ground pools in their backyards, which can be seen in the broadband airborne images.

Datasets Overview

The contest involves five datasets: a group of three airborne datasets and a group of two ground reference datasets. Table 1 gives a summary of the datasets for the 2014 Data Fusion Contest.

Table 1. Summary of datasets

Airborne Datasets (#1a & #1b)

The two airborne datasets were acquired over a low-density urban area in the vicinity of an open-pit mine and forested areas. The first airborne dataset (TelopsDatasetCityLWIR.img) was acquired using the Telops Hyper-Cam, an airborne long-wave infrared hyperspectral imager based on a Fourier-transform spectrometer (FTS). The hyperspectral imager was integrated into a gyro-stabilized platform inside a fixed-wing aircraft. A digital color camera (2 megapixels), mounted on the same platform, was used to simultaneously acquire a visible imagery dataset.

The airborne LWIR hyperspectral imagery consists of 84 spectral bands in the 868 cm⁻¹ to 1280 cm⁻¹ region (7.8 µm to 11.5 µm), at a spectral resolution of 6 cm⁻¹ (full width at half maximum). It has been calibrated to at-sensor spectral radiance units, in W/(m² sr cm⁻¹), on a spectral grid in wavenumbers (cm⁻¹). Both axes are fully calibrated.

The airborne visible imagery (TelopsDatasetCityVisible.img) consists of uncalibrated, high-spatial-resolution digital data with sparse ground coverage over the same area as the LWIR hyperspectral imagery. Both airborne datasets are georeferenced images that are mutually registered. The average height of both the LWIR and visible sensors above ground was 2650 ft (807 m), resulting in an average ground sampling distance (GSD) of approximately 1 m for the LWIR hyperspectral imagery and 0.1 m for the visible imagery. The visible data used for the contest have been resampled to 0.2 m spatial resolution to partly reduce the resolution ratio between the two data sources.

The two airborne datasets were acquired simultaneously on May 21, 2013, between 22:27:36 and 23:46:01 UTC. A nearby meteorological station, located at geographic coordinates 46°02'57.002" N, 71°15'58.004" W (430.0 m elevation), recorded the data presented in Table 2.

Table 2. Environmental data during acquisition of datasets #1a and #1b

Figure 1 gives an overview of the content of dataset #1a (TelopsDatasetCityLWIR.img) by showing the spectrally averaged image of bands #18 to #38 (952 cm⁻¹ to 1052 cm⁻¹ spectral range). Figure 2 presents an overview of the content of dataset #1b (TelopsDatasetCityVisible.img).

Figure 1. Spectrally averaged mosaic of the airborne LWIR hyperspectral dataset (#1a)
Figure 2. Mosaic image of the airborne visible dataset (#1b)

Airborne Dataset (#1c)

In addition to the mosaic, we provide the 74 individual hypercubes, which are an orthorectified version of the original airborne hypercubes. The filenames are as follows: Ortho_MissionB_Lx_20130521_<timestamp>.img, where x is the flight line number (4 to 8) and <timestamp> is a numeric value. An ENVI header accompanies each hypercube in a separate file (extension .hdr).
These hypercubes were processed to produce the mosaic given as dataset #1a.

Ground Datasets (#2a & #2b)

The ground datasets (#2a & #2b) were acquired on October 3, 2013 at 21:15 UTC (17:15 local time). Dataset #2a (TelopsDataset2a_LWIRFTS.dat) was obtained using an LWIR hyperspectral imager similar to the one used for the airborne acquisition. The measurement was made at a higher spectral resolution and with averaging to produce a better SNR. The hyperspectral imager was installed on a tripod on the ground, with the line of sight of the instrument approximately parallel to the horizon. A digital visible camera took visible images of the same scene. These imagers were installed at a geographic position (46.049053 N, 71.363289 W) located inside the ground area covered by the airborne datasets #1a and #1b.

The ground-based LWIR hyperspectral imagery (#2a) consists of 177 spectral bands in the 877 cm⁻¹ to 1317 cm⁻¹ region (7.6 µm to 11.4 µm), at a spectral resolution of 3 cm⁻¹ (full width at half maximum). It has been calibrated to at-sensor spectral radiance units, in W/(m² sr cm⁻¹). The ground-based visible imagery (#2b) consists of an uncalibrated digital picture with wider coverage of the same scene as the LWIR hyperspectral imagery (context camera).

Figure 3 shows the broadband thermal image of dataset #2a. The image shown in Figure 4 is an overview of the content of the ground visible dataset #2b (TelopsDataset2b_Visible.dat).

Figure 3. Broadband thermal image of the ground LWIR hyperspectral dataset (#2a)
Figure 4. Image of the ground visible dataset (#2b)

Notes

All files of the datasets are ENVI-readable. Equivalently, they can be read in common programming languages as conventional binary files (see Table 1 and the .hdr files for information on image sizes, data types, and interleaving).
For instance, the data package includes a MATLAB® script file, "demo_script_telops.m", showing an example of how to open the data within MATLAB®.

For the sake of completeness, the data subset released on January 22nd, 2014, which included subsets of the #1a and #1b airborne datasets, is also enclosed (IEEE_GRSS_IADFTC_Contest2014_subset.zip). The training map for the Classification Contest is included along with the data subset. The formats of the LWIR and visible subsets are analogous to those of the corresponding full datasets #1a and #1b, respectively. The training map is registered to the visible subset and is provided in both ENVI binary and GeoTIFF formats. Details on the formats of the data subset and of the training map can be found in the "ReadMe.txt" file included in "IEEE_GRSS_IADFTC_Contest2014_subset.zip".

Further information can be found on the IEEE GRSS website:
/community/technical-committees/data-fusion/data-fusion-contest/
and the Contest website:
http://cucciolo.dibe.unige.it/IPRS/IEEE_GRSS_IADFTC_2014_Data_Fusion_Contest.htm
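Two small computations recur in the dataset description above: converting the quoted wavenumber limits to wavelengths (λ in µm = 10⁴ / ν in cm⁻¹, so 868–1280 cm⁻¹ corresponds to 7.8–11.5 µm) and spectrally averaging a band range, as done for the Figure 1 overview of bands #18 to #38. A minimal sketch of both (the hypercube layout is assumed to be (bands, lines, samples); band numbering follows the 1-based convention used in the text):

```python
import numpy as np

def wavenumber_to_um(nu_cm1):
    """Convert wavenumber (cm^-1) to wavelength (micrometres): lambda = 1e4 / nu."""
    return 1e4 / nu_cm1

def band_average(cube, first, last):
    """Spectrally average bands first..last (1-based, inclusive) of a
    (bands, lines, samples) hypercube, as for the Figure 1 overview image."""
    return cube[first - 1:last, :, :].mean(axis=0)

# The 868-1280 cm^-1 range quoted for dataset #1a maps to 7.8-11.5 um:
print(round(wavenumber_to_um(1280), 1), round(wavenumber_to_um(868), 1))  # → 7.8 11.5
```

An overview image like Figure 1 would then be `band_average(cube, 18, 38)` on the loaded LWIR cube.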
Title: Essay Format for Academic Articles in the Field of Big Data (bilingual)

In the realm of big data, academic articles should adhere to a structured format that effectively communicates complex ideas and findings. To facilitate clarity and precision, the following essay format is proposed, which combines English and Chinese paragraphs to cater to a diverse readership.

Paragraph 1: An academic article in the big data domain should open with a concise and engaging statement introducing the topic. For instance, one might write: "The advent of big data has revolutionized various industries, enabling organizations to gain valuable insights and make data-driven decisions." A Chinese paragraph can then elaborate on the significance of big data in modern society. (Chinese version of Paragraph 1: When writing an academic paper in the big data field, the opening should introduce the topic with a concise and engaging statement.)
Review

A review of large area monitoring of land cover change using Landsat data

Matthew C. Hansen (a,*), Thomas R. Loveland (b)
(a) South Dakota State University, Brookings, SD, United States
(b) United States Geological Survey, Sioux Falls, SD, United States
* Corresponding author. Tel.: +1 301 314 2585. E-mail address: mhansen@ (M.C. Hansen).

Remote Sensing of Environment 122 (2012) 66–74. doi:10.1016/j.rse.2011.08.024
Article history: Received 3 May 2011; received in revised form 18 August 2011; accepted 31 August 2011; available online 23 February 2012.
Keywords: Change detection; Monitoring; Large Area; Landsat data; Land cover

Abstract

Landsat data constitute the longest record of global-scale medium spatial resolution earth observation data. As a result, the current methods for large area monitoring of land cover change using medium spatial resolution imagery (10–50 m) typically employ Landsat data. Most large area products quantify forest cover change. Forests are a comparatively easy cover type to map as well as a current focus of environmental monitoring concerning the global carbon cycle and biodiversity loss. Among existing change products, supervised or knowledge-based characterization methods predominate. Radiometric correction methods vary significantly, largely as a function of geographic/algorithmic scale. For instance, products created by mosaicking per scene characterizations do not require radiometric normalization. On the other hand, methods that employ a single index or classification model over an entire study area do require radiometric normalization. Temporal updating of cover change varies between existing products as a function of regional acquisition frequency, cloud cover and seasonality. With the Landsat archive opened for free access to terrain-corrected data, future product generation will be more data intensive. Per scene, interactive analyses will no longer be viable. Coupling free and open access to large data volumes with improved processing power will result in automated image pre-processing and land cover characterization methods. Such methods will need to leverage high-performance computing capabilities in advancing the land cover monitoring discipline. Robust validation efforts will be required to quantify product accuracies in determining the optimal change characterization methodologies. © 2012 Elsevier Inc. All rights reserved.

Contents

1. Introduction
2. Product intercomparison
   2.1. Operational versus experimental
   2.2. Pre-processing
      2.2.1. Geometric corrections
      2.2.2. Radiometric corrections
      2.2.3. Geographic/algorithmic scale
      2.2.4. Temporal scale
      2.2.5. Algorithms
      2.2.6. Change detection methods
3. Discussion and recommendations
   3.1. Improving data policies
   3.2. The end of per scene analysis
   3.3. Improving data processing and characterization
   3.4. Advancing thematic outputs
   3.5. Completing the Landsat archive
4. Conclusion
References

1. Introduction

With the opening of the United States Geological Survey's (USGS) Landsat data archive (Woodcock et al., 2008) and future plans for open access policies for follow-on missions, there is the possibility of documenting global land surface change both retrospectively and prospectively. With this increased data availability, there will be a corresponding demand for improving our capacity to mass-process Landsat data in support of large area monitoring objectives. Fortunately, coincident with the new Landsat data policy are new high-performance computing capabilities that enable mining of the archive in ways heretofore not realized. The need for timely, accurate observations documenting global land change is more pressing than ever, given the changing state of global climate, biodiversity, food and fiber demand, and other critical environmental/ecosystem services. The improved data availability, advanced processing methods, and pressing need for information on environmental change will dictate a commensurate increase in our quantification of global land change in the coming months and years. While each of these three aspects is necessary for realizing this improved monitoring capability, none is more critical than data availability. The expansion of land monitoring methods and systems to other regions of the world and to other themes of interest will be maximized best via a free and easy access data policy for global monitoring systems. There is an additional need to improve historical and annual global coverage of Landsat-class data, either through coordinated acquisitions to improve contemporary coverage, or through the consolidation of unique Landsat data held in international archives.

This paper will review a number of large area medium spatial resolution land cover monitoring products, all of which employ Landsat data as inputs. To date, forest change products dominate due to the topicality of forest change regarding carbon accounting, biodiversity monitoring, and other issues concerning forested landscapes. Another reason forests are the most common large area monitoring target using remotely sensed data is that forests are one of the most easily distinguished vegetation cover types when compared to other monitoring targets, such as croplands or urbanized landscapes. The forest monitoring methods included in this review all employ digital image processing algorithms to quantify forest extent and change. As such, products that use heritage photointerpretative mapping methods, such as the Global Forest Survey of India (2008) and Indonesia's forest land use mapping efforts (Government of Indonesia/World Bank, 2000), are not reviewed. The objective of the paper is to assess the state of the practice in large area Landsat monitoring of land cover change, which most commonly centers on forest monitoring products. A product on crop type mapping, produced annually by the U.S. Department of Agriculture, National Agricultural Statistics Service (NASS), is also included as an example of algorithm-driven monitoring of agriculture using medium spatial resolution observations (National Agricultural Statistics Service, 2011).

The review is divided into a series of methodological intercomparisons which include pre-processing, geographic/algorithmic and temporal scales, and the change detection algorithms themselves.
Validation practices are not reviewed, but validation is recognized as the key to future determination of not only the quality of individual products, but as a way to assess the aforementioned methodological considerations in generating large area change products. Validation of static large area land cover products is difficult, but change products are even more of a challenge (Khorram, 1999; Strahler et al., 2006). With the advent of large area land cover change products, the development of rigorous validation protocols is essential. The following review illustrates the wide variety of methods and the lack of consensus in large area monitoring approaches. Validation will be required to quantify the various strengths and weaknesses as the land cover and land cover change mapping community moves forward.

Studies reviewed in this paper include land cover change products that employ digital image processing-based algorithms to quantify cover conversion at national scales or larger, typically greater than 1 Mkm² in area. As will be shown, there is no consensus approach to large area monitoring using medium resolution input data sets. A wide range of methods exist and are outlined in Table 1 for reference.
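The validation the authors call for ultimately rests on a few standard accuracy statistics derived from a confusion matrix. As a reminder of those quantities (a textbook sketch, not code from any product reviewed here), overall, user's, and producer's accuracies for a change/no-change map can be computed as:

```python
import numpy as np

def accuracy_metrics(cm):
    """Accuracy statistics from a confusion matrix where
    cm[i, j] = pixels mapped as class i whose reference label is class j."""
    cm = np.asarray(cm, dtype=float)
    overall = np.trace(cm) / cm.sum()
    users = np.diag(cm) / cm.sum(axis=1)      # per mapped class (commission view)
    producers = np.diag(cm) / cm.sum(axis=0)  # per reference class (omission view)
    return overall, users, producers

# Toy change/no-change matrix: rows = mapped, columns = reference (illustrative).
cm = [[80, 20],
      [10, 90]]
overall, users, producers = accuracy_metrics(cm)
print(overall)  # → 0.85
```

In a probability-based validation design these cell counts would additionally be weighted by the area of each map stratum before computing the estimates.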
Some commonalities are worth mentioning at the outset. Landsat is the primary data source and is used in all studies; Indian Remote Sensing Advanced Wide Field Sensor (AWiFS) data are used as one of the inputs along with Landsat data in the NASS Cropland Data Layer (CDL) crop type maps. Forest cover and change is the primary thematic variable, excepting the NASS CDL crop type map (National Agricultural Statistics Service, 2011). Methods are often employed interactively or iteratively in deriving map outputs. Supervised learning algorithms are favored in most examples. Exceptions include the PRODES product of Brazil's National Institute for Space Research (INPE), which uses multiple algorithms, including an analyst-driven linear mixture model, a segmentation model and an unsupervised clustering model (Shimabukuro et al., 1998). The North American disturbance product employed the Landsat Ecosystem Disturbance Adaptive Processing System (LEDAPS), a knowledge-based approach that used indices to mask non-forest and to map forest disturbance (Masek et al., 2008). In addition to the North America disturbance product, the Australia National Carbon Accounting System (NCAS) method is an index-based forest/non-forest mapping approach (Caccetta et al., 2007), and INPE's PRODES relies primarily on a shade fraction index to map deforestation (Shimabukuro et al., 1998). Despite these general commonalities, there is a remarkable amount of methodological variation between the existing large area change products.

Table 1. General processing characteristics of products reviewed in this paper. LEDAPS — NASA Goddard Space Flight Center, University of Maryland and Oregon State University (Masek et al., 2008); SDSU — South Dakota State University (Broich, Hansen, Potapov, et al., 2011; Potapov et al., 2011, this issue); USDA NASS CDL — United States Department of Agriculture National Agricultural Statistics Service Cropland Data Layer (Johnson & Mueller, 2010; NASS Cropland Data Layer, 2010); CSIRO NCAS — Commonwealth Scientific and Industrial Research Organisation National Carbon Accounting System (Caccetta et al., 2007); INPE PRODES — National Institute for Space Research Amazon Deforestation Monitoring Project (Instituto Nacional de Pesquisas Espaciais, INPE, 2010; Shimabukuro et al., 1998); USGS NLCD — United States Geological Survey National Land Cover Dataset (Xian et al., 2009; Xian & Homer, 2010); CI — Conservation International (Harper et al., 2007; Killeen et al., 2007; Leimgruber et al., 2005).

Product | Geographic/algorithmic scale | Radiometric processing | Image inputs per characterization | Change detection approach | Land cover theme(s)
LEDAPS — North America forest disturbance index | Continental | Normalized data | Single-date | Post-characterization comparison of disturbance index | Forest
SDSU — Congo/Indonesia/European Russia | National/regional | Normalized data | Multi-date | Classification of forest cover loss | Forest
USDA NASS CDL — Continental U.S. | State | Non-normalized data | Multi-date | Annual classification of crop types | Crop type
CSIRO NCAS — Australia | Ecozone | Normalized data | Single-date | Time-series of forest classifications using joint probabilities | Forest
INPE PRODES — Legal Amazon | Per scene | Non-normalized data | Single-date | Classification of deforestation | Forest
USGS NLCD — Continental U.S. | Per scene | Normalized data | Single-date | Change vector thresholds | Forest and impervious surface
CI — Bolivia, Burma, Madagascar | Per scene | Non-normalized data | Single-date | Classification of deforestation | Forest

2. Product intercomparison

2.1. Operational versus experimental

The most important distinction between existing large area medium spatial resolution land cover change products is whether they are made in an operational or experimental setting. By definition, standards for product deliverables are more demanding for operational products than experimental ones. Operational products exist as decision support systems for broader policy objectives, and this requires close attention to product consistency and accuracy, and a practical approach to mapping in meeting product delivery dates. Research-based products have a higher tolerance for uncertainty, whereas operational ones need to meet quality and timeliness standards dictated by the administrative setting in which they operate. The PRODES product from INPE, the NCAS product from CSIRO, the NLCD from USGS, and the CDL from NASS represent operational products. The CI products are made in collaboration with national partners as monitoring priorities and opportunities are identified. The SDSU and LEDAPS methods are experimental in nature.

2.2. Pre-processing

2.2.1. Geometric corrections

In order to perform time-series analyses of land cover change, a consistent geometric image set is required. Historically, geometric rectification has been a limiting factor in mass-processing of Landsat data, and costly in-house rectification procedures have been employed to ready imagery for analysis. For example, the operational PRODES and NCAS data products use a reference set of base images to match new image sequences to. The NASS CDL project has included the purchase of rectified imagery, including the commercial provision of geometrically corrected AWiFS data. As the discipline moves forward, the realization that large area monitoring is dependent on consistent geometric corrections has led to the development of standard orthorectified products, beginning with the Global Land Survey (GLS) (Tucker et al., 2004) data set. The GLS data sets of orthorectified Landsat imagery were produced with global coverage for the 1990, 2000 and 2005 epochs. The North American LEDAPS disturbance mapping project of Masek et al. (2008) employed the GLS data sets for 1990 and 2000. Consistent image geometry of Landsat scenes across epochs is achieved through the use of an improved digital elevation model and ground control network. The resulting orthorectified, terrain-corrected image base is now used for all USGS Landsat processing. The opening of the 2.6-million-image USGS Landsat archive for no-cost access to Landsat data (Woodcock et al., 2008) in late 2008 included automatically generated orthorectified data as the product deliverable, referred to as Level 1 Terrain-corrected data (L1T). The SDSU products employed this data product. The NASS CDL project has also integrated Landsat L1T data more recently into their data stream.

Standard orthorectified imagery produced by data providers will be the wave of the future. For example, all data from the forthcoming Landsat Data Continuity Mission (LDCM) will be generated as standard orthorectified imagery processed with the same inputs currently used to generate all other USGS Landsat products, reducing the amount of effort needed by separate projects to create rectified time-series datasets and ensuring geometric consistency. The automatic rectification of imagery is relatively new and will enhance the implementation of more data-intensive multi-temporal studies.

2.2.2. Radiometric corrections

A host of radiometric corrections are possible for pre-processing Landsat data. Table 2 outlines the corrections used by methods reviewed in this study. While other standard products such as albedo or net primary productivity require surface reflectance inputs, radiometric corrections are not required for land cover characterization. Radiometric corrections for land cover are only required when a given land cover characterization model is to be extrapolated beyond a single image, whether in space or time. This leads to an interesting mix of approaches, as shown in Table 2. The radiometric corrections applied in these studies include top of atmosphere reflectance, surface reflectance, bi-directional reflectance distribution/view angle normalization, and terrain normalization.

Top of atmosphere (TOA) reflectance calculations are typically bulk corrections applied to all of the pixels in a given image to adjust primarily for sun angle and earth–sun distance. Since Landsat solar geometry is not provided per pixel, TOA adjustment within a given scene does not typically vary per pixel. In these cases, TOA corrections do not enhance within-scene fidelity. TOA calibrations are principally employed as a step towards inter-scene normalization. From Table 2, only the CDL, PRODES and CI products do not employ TOA corrections. For PRODES, analysts work at the per-scene scale, and no algorithm is applied across scenes whether in space or time. For the CDL maps, near-exhaustive training data overcomes the need for any radiometric correction. The NLCD change product employs a TOA-adjusted input, but this correction is not necessary for the map characterizations (Xian, personal communication).

Surface reflectance is a robust and complicated correction that includes per pixel adjustments with the goal of creating consistent radiometric response both within and between images. The most mature approach is that of LEDAPS (Masek et al., 2006), which estimates per pixel surface reflectance. LEDAPS accounts for scene spectral variations caused by atmospheric effects including ozone concentration, column water vapor, elevation and surface pressure, and aerosol optical thickness. The LEDAPS North American forest disturbance map employs a full surface correction application based on the MODIS 6S radiative transfer approach (Masek et al., 2006; Vermote et al., 1997).
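The bulk TOA correction described above reduces, per band, to the textbook conversion ρ = π · L · d² / (ESUN · cos θs), with one solar zenith angle θs and one earth–sun distance d applied to the whole scene. A minimal sketch (illustrative values only; this is the generic formula, not code from any of the reviewed products, which would read these parameters from scene metadata):

```python
import math

def toa_reflectance(radiance, esun, sun_elev_deg, d_au):
    """Bulk top-of-atmosphere reflectance for one band:
    rho = pi * L * d^2 / (ESUN * cos(solar zenith)).
    One sun angle per scene, so the correction is constant across pixels."""
    zenith = math.radians(90.0 - sun_elev_deg)
    return math.pi * radiance * d_au**2 / (esun * math.cos(zenith))

# Illustrative numbers only (not from any specific Landsat scene):
rho = toa_reflectance(radiance=80.0, esun=1536.0, sun_elev_deg=45.0, d_au=1.0)
print(round(rho, 3))  # → 0.231
```

Because every pixel in the scene shares the same θs and d, the adjustment rescales the image uniformly, which is why, as noted above, it helps inter-scene normalization but not within-scene fidelity.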
The benefits of removing atmospheric effects from input imagery are obvious,as consistent surface reflectances enable the application of standard models to all scenes,as in the LEDAPS product.Surface-related spectral variations that may require additional correction include bi-directional reflectance distribution function (BRDF)effects(Schaaf et al.,2002).For wide-view angle sensors, these effects can be pronounced and confound analyses.But,even narrower swath medium spatial resolution sensors such as Landsat evidence BRDF effects,largely a function of solar zenith and sensor view angles(Danaher et al.,2001;Toivonen et al.,2006).While BRDF effects are target-dependent,bulk empirical adjustments have been shown to improve radiometric response and cover characteriza-tions(Hansen et al.,2008).Fig.1shows an example of this for a scene in the Congo Basin where a coarse scale forest mask is used along with a modeled view angle parameter to adjust spectral response Table2Radiometric corrections applied per change product.Product Top ofatmospherereflectanceSurfacereflectanceBRDF/viewanglenormalizationTerrainnormalizationLEDAPS—NorthAmerica ForestdisturbanceindexSDSU—Congo/Indonesia/EuropeanRussiaUSDA NASS CDL—Continental U.S.CSIRO NCAS—AustraliaINPE PRODES—Legal AmazonUSGS NLCD—ContinentalU.S.CI—Bolivia,Burma,Madagascar68M.C.Hansen,T.R.Loveland/Remote Sensing of Environment122(2012)66–74per pixel.The SDSU and NCAS methods employ a per scene BRDF adjustment.Topographic effects on radiometric response result from differences in illumination of and re flectance from sloped surfaces (Dymond &Shepherd,1999).Removing this source of spectral variation can improve discrimination of land cover types and their change over time.However,a generic and robust approach to correcting such effects is not yet widely available (Lu et al.,2008).Models range from simple ratios to complex physical modeling (Civco,1989;Holben &Justice,1980;Sandmeier &Itten,1997).The NCAS product employs a per 
scene correction using solar geometry, a digital elevation model, a C-factor correction (Teillet et al., 1982) and a ray-tracing algorithm to model and remove illumination variation due to terrain shading.

Absolute radiometric conversion of imagery to at least top of atmosphere reflectance values is preferred by a majority of applications. As methods increase their use of multi-temporal imagery, and extend studies to multi-decadal time scales, radiometric consistency will only become more important regarding land cover monitoring (Wulder et al., 2008). For methods using Landsat data from different missions (Landsats 1–7) and instruments (MSS, TM, and ETM+), it is necessary to cross-calibrate sensor radiometries (Chander et al., 2009). As of late 2010, the USGS has cross-calibrated all Landsat data to a consistent radiometric standard, enabling multi-decadal studies using the Landsat platform. Moving forward, the interoperability of other medium spatial resolution sensors with Landsat will largely be a function of consistent radiometric calibration.

2.2.3. Geographic/algorithmic scale

Algorithm implementation can be performed per scene, per stratified sub-unit, or over the entire study area. The ability to apply the same rule-base over the entire study area has advantages in ensuring consistency in the characterization across space. However, signature extrapolation limitations can result, deleteriously impacting the characterization (Cihlar, 2000). Additionally, spectral confusion between dissimilar cover types can result and similarly impact the quality of the characterization. A solution is to stratify the study area and apply per stratum algorithm runs. The drawback to this is potential inconsistency in the individual models and the creation of seams/inconsistencies at strata boundaries. Operational methods typically employ a stratified approach, whether per scene as with PRODES or per ecozone as with NCAS. The NLCD and CI change products were derived per scene. The CDL product employs a
state-based stratification that allows for a sub-stratification based on the richness of the calibration data. The LEDAPS product performed a per scene characterization of non-forest before applying the disturbance index algorithm within forests.

The more experimental products (LEDAPS and SDSU) employ a single model/index over the entire study area in quantifying change. This variance reflects the fundamental difference between the operational and research-based products. Working within smaller geographic subsets provides greater control of outputs by the analysts, ensuring quality characterizations. From a research perspective, the ability to generically process data over large areas without analyst interference represents the ultimate evolution of processing methods in support of monitoring. Such a goal, if realized, may eventually translate to the operational domain and will be essential for global land change monitoring. Given that no operational systems function in this manner, the research community has additional work to do in demonstrating that such methods are ready for operational implementation.

The geographic/algorithmic scale is often tied to radiometric processing. Radiometric normalization is required to extrapolate signatures across multiple images, either through space or time. Given a stable radiometric feature space, for example an NDVI time series, a standard model for mapping land cover and its change over time is enabled. An example of this type of application is found with the LEDAPS product, where a standardized disturbance algorithm was applied to a continental mosaic of surface reflectance-corrected Landsat data. Another example of this is from the SDSU projects, where a standard radiometric correction is applied to Landsat data in determining per pixel quality assessment and the subsequent development of a study area-wide multi-temporal feature space. The PRODES, CDL and CI products do not have any radiometric corrections performed. In the case of
PRODES and CI, analysts use digital numbers for annually collected cloud-free imagery; results are subsequently mosaicked together.

Fig. 1. Example per-scene BRDF normalization and cross-track radiometric responses for mature forests within the given Landsat scene from the Congo Basin (path/row 179/060, March 5, 2000; r-g-b of raw digital numbers for bands 4-5-7 on left and normalized bands 4-5-7 on right). A mature forest mask is used to adjust per band radiometric response as a function of cross-track pixel location, resulting in a spectrally consistent depiction of forest cover on the right that improves forest characterization (Hansen et al., 2008).

For the Cropland Data Layer, an annual crop type map for the lower 48 United States, NASS analysts rely on extensive training data consisting of per-field labels of crop type. These labels are submitted by farmers to the Farm Service Agency near the beginning of the growing season. For some states, the participation rate of farmers in labeling their fields is nearly 100%. Field extents are held in a geographic information system database and the labels assigned per polygon and used as training data. In this case, the analysts create image stacks of non-normalized Indian Remote Sensing AWiFS and Landsat data where each image is held in a different layer without mosaicking. In other words, a virtual stratification is made where the number of image inputs varies greatly from place to place. In this case, the extensive training data map the particular local context and no standard state-level model is applied or even needed. This is a rare case of nearly exhaustive training data. Without such training information, radiometric normalization is required for large area mapping.

Summarizing geographic/algorithmic scale, models either operate at the study area scale, or per a population of sub-units. In the first case, a reliance on standard radiometric responses is required. These signals
are characterized using a fixed model applied to all pixels over space and through time. The second case relies on training data to model local to state-level radiometric responses in mapping large area land cover. The model tunes to local conditions and no standard rule relating spectral response to land cover exists; there are a host of models.

A method in between these two extremes is that of the Australian NCAS national land cover data set produced by CSIRO. In their approach, radiometric corrections, including surface reflectance, BRDF and topographic adjustments, are made in a standardized manner. Thereafter, characterizations are performed in a stratified way per pre-defined ecozone with models tuned to each setting. The rationale is that the stratification identifies areas where land cover types within a zone have similar spectral properties (Caccetta et al., 2007). The same method is applied in the USGS NLCD tree cover base map of 2000, where ecoregions were the scale of algorithm implementation.
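The stratified alternative summarized above, one tuned rule per pre-defined zone rather than a single study-area-wide model, can be sketched as follows. The midpoint-threshold "model" and the zone names are simplifications invented for illustration, not the actual NCAS canonical-variate procedure.

```python
from statistics import mean

def fit_stratified_thresholds(training):
    """Fit one forest/non-forest index threshold per stratum.

    `training` maps a stratum id to (index_value, is_forest) samples.
    Each stratum's threshold is the midpoint between its class means,
    a stand-in for a model tuned to that zone's spectral properties.
    """
    thresholds = {}
    for zone, samples in training.items():
        forest = [v for v, is_f in samples if is_f]
        nonforest = [v for v, is_f in samples if not is_f]
        thresholds[zone] = (mean(forest) + mean(nonforest)) / 2.0
    return thresholds

def classify(index_value, zone, thresholds):
    # The rule applied to a pixel depends on its stratum, so the same
    # index value can receive different labels in different zones.
    return "forest" if index_value >= thresholds[zone] else "non-forest"

# Invented training samples for two hypothetical ecozones.
training = {
    "humid": [(0.8, True), (0.7, True), (0.3, False)],
    "arid": [(0.5, True), (0.2, False), (0.1, False)],
}
t = fit_stratified_thresholds(training)
print(classify(0.4, "arid", t), classify(0.4, "humid", t))
```

A single study-area model would give the index value 0.4 one label everywhere; the stratified version labels it forest in the arid zone and non-forest in the humid zone, illustrating both the per-zone tuning and the seam risk at strata boundaries discussed earlier.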
However, change was quantified per path/row.

2.2.4. Temporal scale

To date, annual or multi-year epochal studies have been performed with medium spatial resolution imagery. Sub-annual change quantification has typically not been a reporting requirement for land cover change products. INPE's DETER product is an exception, employing MODIS data to provide a near-real time alarm product in monitoring Legal Amazon deforestation. No Landsat products to date operate in such a manner. For most monitoring objectives, annual change quantification is sufficient, and is in fact the goal of a number of global and national monitoring objectives (GCOS; U.S. Climate Change Science Program, 2003). However, for many other applications, such as pastureland or natural hazard monitoring, sub-annual monitoring is required and will likely be tested given the open Landsat archive.

Large area updating of land cover using medium spatial resolution data sets is largely limited by the frequency of temporal repeat visits (Wulder et al., 2009). Compared to coarse resolution data sources, such as MODIS that have nominally daily image capture globally, Landsat has a repeat cycle of 16 days. However, only the United States is acquired automatically each overpass. This lack of coverage impacts the ability to quantify annual land change for many regions. In addition to limited acquisitions, seasonality and cloud cover also reduce the viability of annual land cover updates. For example, persistent cloud cover over Indonesia has led to the use of MODIS to disaggregate 5-year aggregate Landsat-scale change to annual estimates (Broich, Hansen, Stolle, et al., 2011b). For the PRODES and NCAS data sets, a seasonal cloud-free window each dry season enables annual updating. This is not the case for many other countries and regions that require more data intensive methods to remove clouds, and that may face likely fundamental data limitations to synoptic annual monitoring. Improved temporal updating at medium spatial resolutions will be solved
only by more frequent image acquisitions. Ideas for improved acquisitions over tropical forests with persistent cloud cover include constellations of satellites with both polar and equatorial orbits to ensure that sufficient data are collected in tropical areas along the equator (Hansen et al., 2006). The Disaster Monitoring Constellation (da Silva Curiel et al., 2002) has such a capability, but does not operate within the context of a continuing global acquisition strategy comparable to that of Landsat. Regardless, improving the temporal revisit rate of the land surface at medium spatial resolution is the most pressing need regarding large area land cover and change monitoring.

2.2.5. Algorithms

Physically-based modeling of land cover has not been demonstrated, and empirical methods are the rule. Traditionally, land cover characterization algorithms have been divided into supervised and unsupervised methods. In terms of the approaches used in large area land cover mapping using medium spatial resolution inputs, supervised methods, or slight variations of supervised approaches, are favored. All example products reviewed here, except for the LEDAPS and NLCD change products, directly employ training data for model development. In the case of PRODES, an analyst defines the spectral endmembers principally for creating a shade fraction layer that is subsequently input to segmentation and unsupervised clustering algorithms. For NCAS, training data are used with a spectral index, derived by reducing the feature space using a canonical variate analysis. A thresholding approach is applied to discriminate forest from non-forest. The method used for the SDSU and CI products employs training data and a decision tree algorithm.
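The decision tree algorithms used by several of these products grow full multi-band trees; the core operation such trees repeat, choosing the split that best separates the training labels, can be sketched as a single Gini-impurity split on one spectral feature. This one-split stump is only illustrative of the general technique, not any product's actual classifier, and the training pixels below are invented.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(values, labels):
    """Return the (threshold, weighted_impurity) pair that best splits
    one feature. A CART-style tree applies this search recursively,
    and across all input bands, to grow the full classifier."""
    best_t, best_score = None, float("inf")
    for t in sorted(set(values)):
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Invented training pixels: a spectral index value and a 0/1 change label.
index_values = [0.10, 0.20, 0.30, 0.80, 0.90]
change_labels = [0, 0, 0, 1, 1]
print(best_split(index_values, change_labels))
```

Here the search recovers the threshold 0.3 that separates the two classes perfectly (weighted impurity 0.0); with noisier training data the split with the lowest impurity is chosen instead.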
The NASS CDL uses one of the richest training datasets, labeled agricultural field polygons, also with a decision tree algorithm. The North America Disturbance Index method employed a knowledge-based disturbance index empirically tuned to be a relative measure between mature forest and bare soil. An index for creating a non-forest mask was also developed.

2.2.6. Change detection methods

The most common approach to change detection is bi-temporal change analysis, the direct comparison of pairs of images or characterizations (Coppin et al., 2004). Change is quantified via spectral (image) or thematic (characterization) contrast. Such comparisons can often employ a thematic base map to better isolate or target the change dynamic of interest. Most of the large area mapping products reviewed here use some kind of variation of bi-temporal change detection. Another approach is multi-temporal change detection that employs more imagery to enhance the feature space beyond a time 1 to time 2 construct. Again, a variety of approaches and mixes of these basic constructs exist in current products. All methods rely on the use of a base map within which change is identified.

The direct comparison of individual characterizations through time, or post-characterization comparison, is used in the NCAS and LEDAPS products. Post-characterization methods are susceptible to the conflation of errors in the individual characterizations. The NCAS approach ameliorates this problem through the use of a time-series and spatial modeling of joint probabilities (Caccetta et al., 2007). The LEDAPS forest disturbance and NLCD impervious surface products use a continuous index as a basis of comparison. Such an approach reduces overall error when compared to a discrete post-characterization comparison of, for example, forest/non-forest classes.

The SDSU and CI methods directly characterize forest loss using all spectral inputs. In the case of CI and the SDSU European Russian product (Potapov et al., 2011), a combined time 1 and time 2 image
feature space is used to map forest cover loss. The CI methods employ optimal single date imagery for both time 1 and time 2 periods. The SDSU European Russia product derived composite time 1 and time 2 imagery from over 7000 image inputs. For the SDSU Congo and Indonesia products, a multi-temporal feature space derived from thousands of
Geomatics World, Vol. 28, No. 2, April 2021

Exploration of Teaching Reform for the Spatial Analysis Course in the Context of Big Geodata

Citation format: CHEN Yimin, LIU Xiaoping, QI Zhixin, et al. Exploration of teaching reform for the spatial analysis course in the context of big geodata [J]. Geomatics World, 2021, 28(2): 12-15.

[Abstract] The rise of big geodata has brought both opportunities and challenges to the development of the geographic information science discipline and the cultivation of its talent. Teaching the spatial analysis course is an important link in cultivating geographic information science talent, but the existing teaching mode has limitations and struggles to meet the talent cultivation goals and demands of the big data era. This paper focuses on proposed reforms of the spatial analysis course in the context of big geodata, including strengthening the teaching of fundamental disciplinary knowledge, incorporating frontier developments, and enhancing the cultivation of students' scientific thinking and technical implementation skills; it also introduces teaching reform measures that have already been preliminarily put in place. The practice of these measures should help seize the new opportunities brought by the development of big data and cultivate, for the field of geographic information science, more well-rounded talent equipped with both geographic literacy and big data analysis skills.

[Keywords] big geodata; spatial analysis teaching; teaching reform
[CLC number] G4  [Document code] A  [Article ID] 1672-1586(2021)02-0012-04

Exploration and Practice of Geographic Information Science Based on MOOCs and Flipped Classroom Teaching Model

Abstract: The emergence of big geodata has provided new opportunities and challenges to the development of geographical information science and the cultivation of talents. The spatial analysis course is fundamental to the cultivation of GIScience talents. However, the current mode of spatial analysis teaching has some limitations for the achievement of cultivation objectives. This article discusses potential designs of teaching reform for the spatial analysis course in the context of big geodata. These may include improving basic knowledge teaching, introducing more frontline information, and enhancing scientific thinking and practicing. Additionally, this article recommends some relevant actions of reform that have already been conducted in spatial analysis teaching. It is expected that these actions can effectively improve the teaching of spatial analysis, and hence achieve the objectives of talent cultivation in the context of big geodata.

Key words: big geodata; spatial analysis teaching; teaching reform

CHEN Yimin, LIU Xiaoping, QI Zhixin, LIU Kai, SHI Qian
(School of Geography and Remote Sensing, Guangzhou University, Guangzhou 510275, China)

Funding: Sun Yat-sen University teaching reform project (SYSU Academic Affairs [2020] No. 72); National Natural Science Foundation of China (41871306); Fundamental Research Funds for the Central Universities (20lgzd09).

About the author: CHEN Yimin (1985– ), male, from Chao'an, Guangdong; associate professor, Ph.D.; research interests include urban computing and scenario simulation.