Abstract Spatio-temporal information coding in the cuneate nucleus
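Chapter 5: Semantics
Semantics is the branch of linguistics that studies the meaning of linguistic units, especially words and sentences.
1. The meanings of "meaning". G. Leech distinguishes seven types of meaning: conceptual, connotative, social, affective, reflected, collocative, and thematic.
Leech's conceptual meaning has two sides: sense and reference.
The distinction between sense and reference resembles that between intension and extension: the former is the set of abstract properties of an entity, the latter the concrete entities that have those properties.
Every word has a sense, that is, a conceptual meaning; otherwise it could not be used or understood. Not every word has a reference, however.
2. The referential theory (naming theory) relates the meaning of a word to the thing it refers to or stands for.
The theory works well for proper nouns and for nouns that have a referent in the real world.
It cannot, however, account for abstract concepts.
In addition, the same thing can sometimes be expressed by different words.
3. The conceptual theory (conceptualism).
Its best-known version is the semantic triangle.
The theory holds that there is no direct relation between a word and the thing it refers to; the two are linked through an abstract concept.
4. Contextualism holds that word meaning should be studied in concrete contexts. Context comprises the situational context and the linguistic context (co-text).
5. Behaviorism holds that the meaning of a word is the situation in which the speaker utters it together with the response it calls forth in the hearer.
6. Sense relations. The main sense relations between words are sameness, oppositeness, and inclusion.
a. Synonymy. Complete synonymy is rare; so-called synonyms depend on context and always differ in some respect (dialect, connotation, style, and so on).
b. Antonymy comprises gradable antonymy, complementary antonymy, and converse (relational) antonymy.
1) Gradable antonymy. First, denying one member does not necessarily assert the other; intermediate states exist. Second, there is no absolute criterion; the standard varies with the object described.
Third, the member that expresses the higher degree is usually used to cover the whole scale.
The cover term is called "unmarked", that is, general; the covered term is called "marked", that is, specific.
The cover term is the one normally used.
Using the covered term signals that something special or unusual is involved.
Fourth, gradable antonyms can be modified by "very" and have comparative and superlative forms.
2) Complementary antonymy. First, asserting one member means denying the other, and vice versa.
Second, complementary antonyms cannot be modified by "very" and have no comparative or superlative forms.
Third, the criterion is absolute, and there is no cover term.
3) Converse (relational) antonymy expresses a reversed relationship between two entities rather than a positive/negative opposition.
One member presupposes the existence of the other.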
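South-Central University for Nationalities, Master's Thesis
The Flash-Lag Effect and Its Cognitive Mechanism
Name: ***  Degree applied for: Master's  Major: Biomedical Engineering  Supervisor: ***  Date: 2007-05-20

Abstract
In a visual environment, some objects remain unchanged while others change continuously.
It is usually assumed that the human visual system can judge the relative spatial positions of objects with high precision even when the retinal image of the stimulus moves very fast.
Surprisingly, however, a growing body of evidence shows that the perceived position of an object is affected by the object's motion.
For example, when a flashed object is spatially aligned with a moving object, observers tend to perceive the flash as lagging behind the moving object.
The size of the lag can be measured by shifting the flash in the direction opposite to the lag until it is perceptually aligned with the moving object.
This phenomenon is called the flash-lag effect.
Over recent decades several hypotheses have been proposed to explain its mechanism: the visual-persistence difference hypothesis, the motion extrapolation hypothesis, the differential perceptual latency hypothesis, and the postdiction hypothesis.
These explanations can be divided into two classes according to the assumptions they rest on.
The first class invokes motion-based neural processing mechanisms that have not yet been confirmed.
Both the motion extrapolation model and the perceptual latency model point to the biological value of detecting a fast-moving object and judging its position accurately, which would raise the success rate of interceptive actions.
In this class of hypotheses, the flash is merely a temporal or spatial marker with no properties other than neural delay and visual persistence.
The second class invokes flash-based neural processing mechanisms that have not yet been identified.
The postdiction hypothesis holds that the occurrence of the flash resets the integration of the moving object.
Recent psychological experiments, however, indicate that none of these explanations is fully tenable.
This thesis argues that the reason is that earlier studies framed their hypotheses in terms of either space or time alone, instead of examining the flash-lag effect with temporal and spatial information processing treated as a whole.
The question of how the initial sensory representation of the stimulus is encoded and stored has been largely neglected.
It is well known that when an image stimulus acting on the visual sense organ is removed abruptly, the image is registered in the visual sensory channel and retained for a brief moment.
This persistence is called visual sensory memory.
Visual sensory memory is generally characterized by a large capacity and a very short retention time, and the information it holds has not undergone any mental processing.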
Journal of Geo-information Science, Vol. 23, No. 1, Jan. 2021

Citation: Zhang Z F, Yan Z J, Wang Z J, et al. Modeling of geographical process evolution of spatio-temporal objects of multi-granularity based on Bayesian network: A case study of the Xin'anjiang model[J]. Journal of Geo-information Science, 2021, 23(1): 124-133. DOI: 10.12082/dqxxkx.2021.200426

Modeling of Geographical Process Evolution of Spatio-temporal Objects of Multi-granularity based on Bayesian Network: A Case Study of the Xin'anjiang Model

ZHANG Zhengfang, YAN Zhenjun, WANG Zengjie, FU Rong, LUO Wen, YU Zhaoyuan
1. Key Laboratory of Virtual Geographic Environment of the Ministry of Education (Nanjing Normal University), Nanjing 210023, China; 2. Cultivation Base of State Key Laboratory of Geographical Environment Evolution, Jiangsu Province, Nanjing 210023, China; 3. Jiangsu Provincial Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China

Abstract: Spatio-temporal objects of multi-granularity have the characteristics of multi-granularity, multi-type, multi-form, multi-reference system, multi-relation, multi-dimensional dynamics, and multi-energy autonomy. They can be used to describe the real world directly, from the micro to the macro scale. Based on spatio-temporal object modeling theory, constructing an integrated expression of the coupled evolution of multi-scale geographic objects is the key to supporting geographic analysis and modeling with the multi-granularity spatio-temporal object model. Building on that theory, this paper develops a Bayesian-network-based method for expressing and modeling the evolution of geographic processes, using probability graphs and conditional probability tables. The method uses spatio-temporal objects of multi-granularity as Bayesian network nodes and constructs the network according to the association relationships between those objects. It uses Bayesian probability to express the strength of the relationships between the objects, and it describes the dynamic changes of the feature states of the elements through an update operator and the probability graph model. Based on this method, the Xin'anjiang model is selected to conduct a modeling and simulation experiment on the geographic process of spatio-temporal objects of multi-granularity. This paper uses the hydrological data of Chengcun Village from 1989 to 1995 as training data and the hydrological data of 1996 as simulation data. Using the precipitation surface, evaporation surface, runoff surface and confluence surface to construct the Bayesian network

Received: 2020-07-31; revised: 2020-12-25. Funding: National Key R&D Program of China (2016YFB0502301); National Natural Science Foundation of China (41976186).
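The abstract above describes nodes linked by conditional probability tables. As a minimal sketch of that idea only, the snippet below builds a hypothetical two-node network (precipitation to runoff) and evaluates it. The node names, state names and every probability are illustrative assumptions; none of them come from the paper or from the Xin'anjiang experiment.

```python
# Hypothetical two-node Bayesian network: Precipitation -> Runoff.
# All states and probabilities are illustrative assumptions, not data from the paper.

# Prior over the parent node.
P_PRECIP = {"low": 0.6, "high": 0.4}

# Conditional probability table P(runoff | precipitation).
CPT_RUNOFF = {
    "low": {"low": 0.9, "high": 0.1},
    "high": {"low": 0.3, "high": 0.7},
}


def runoff_marginal(p_precip, cpt):
    """Marginalize the parent out: P(runoff) = sum_p P(runoff | p) * P(p)."""
    out = {}
    for precip_state, p in p_precip.items():
        for runoff_state, cond in cpt[precip_state].items():
            out[runoff_state] = out.get(runoff_state, 0.0) + p * cond
    return out


def precip_posterior(p_precip, cpt, observed_runoff):
    """Bayes' rule: P(precipitation | runoff = observed_runoff)."""
    joint = {s: p_precip[s] * cpt[s][observed_runoff] for s in p_precip}
    total = sum(joint.values())
    return {s: v / total for s, v in joint.items()}


if __name__ == "__main__":
    print(runoff_marginal(P_PRECIP, CPT_RUNOFF))
    print(precip_posterior(P_PRECIP, CPT_RUNOFF, "high"))
```

In a network of the kind the abstract describes, each table entry would be estimated from training data rather than written by hand, and the graph would have many more nodes.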
Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition

Jun Liu1, Amir Shahroudy1, Dong Xu2, and Gang Wang1(B)
1 School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore {jliu029,amir3,wanggang}@.sg
2 School of Electrical and Information Engineering, University of Sydney, Sydney, Australia ******************.au

© Springer International Publishing AG 2016. B. Leibe et al. (Eds.): ECCV 2016, Part III, LNCS 9907, pp. 816–833, 2016. DOI: 10.1007/978-3-319-46487-9_50

Abstract. 3D action recognition (the analysis of human actions based on 3D skeleton data) has recently become popular due to its succinctness, robustness, and view-invariant representation. Recent attempts on this problem suggested developing RNN-based learning methods to model the contextual dependency in the temporal domain. In this paper, we extend this idea to spatio-temporal domains to analyze the hidden sources of action-related information within the input data over both domains concurrently. Inspired by the graphical structure of the human skeleton, we further propose a more powerful tree-structure based traversal method. To handle the noise and occlusion in 3D skeleton data, we introduce a new gating mechanism within LSTM to learn the reliability of the sequential input data and accordingly adjust its effect on updating the long-term context information stored in the memory cell. Our method achieves state-of-the-art performance on 4 challenging benchmark datasets for 3D human action analysis.

Keywords: 3D action recognition · Recurrent neural networks · Long short-term memory · Trust gate · Spatio-temporal analysis

1 Introduction

In recent years, action recognition based on the locations of major joints of the body in 3D space has attracted a lot of attention. Different feature extraction and classifier learning approaches have been studied for 3D action recognition [1–3]. For example, Yang and Tian [4] represented the static postures and the dynamics of the motion patterns via eigenjoints and utilized a Naïve-Bayes-Nearest-Neighbor classifier. An HMM was applied by [5] for modeling the temporal dynamics of the actions over a histogram-based representation of 3D joint locations. Evangelidis et al. [6] learned a GMM over the Fisher kernel representation of a succinct skeletal feature, called skeletal quads. Vemulapalli et al. [7] represented the skeleton configurations and actions as points and curves in a Lie group, respectively, and utilized an SVM classifier to classify the actions. A skeleton-based dictionary learning method utilizing group sparsity and a geometry constraint was also proposed by [8]. An angular skeletal representation over the tree-structured set of joints was introduced in [9], which calculated the similarity of these features over the temporal dimension to build the global representation of the action samples and fed them to an SVM for final classification.

Recurrent neural networks (RNNs), which are a variant of neural nets for handling sequential data with variable length, have been successfully applied to language modeling [10–12], image captioning [13,14], video analysis [15–24], human re-identification [25,26], and RGB-based action recognition [27–29]. They have also achieved promising performance in 3D action recognition [30–32].

Existing RNN-based 3D action recognition methods mainly model the long-term contextual information in the temporal domain to represent motion-based dynamics. However, there is also a strong dependency between joints in the spatial domain, and the spatial configuration of joints in video frames can be highly discriminative for the 3D action recognition task.

In this paper, we propose a spatio-temporal long short-term memory (ST-LSTM) network which extends the traditional LSTM-based learning to two concurrent domains (the temporal and spatial domains). Each joint receives contextual information from neighboring joints and also from previous frames to encode the spatio-temporal context. Human body joints are not naturally arranged in a chain, so feeding a simple chain of joints to a sequence learner cannot perform well. Instead, a tree-like graph can better represent the adjacency properties between the joints in the skeletal data. Hence, we also propose a tree-structure based skeleton traversal method to explore the kinematic relationship between the joints for better spatial dependency modeling.

In addition, since the acquisition of depth sensors is not always accurate, we further improve the design of the ST-LSTM by adding a new gating function, the so-called "trust gate", to analyze the reliability of the input data at each spatio-temporal step and give the network better insight about when to update, forget, or remember the contents of the internal memory cell as the representation of long-term context information.

The contributions of this paper are: (1) a spatio-temporal design of LSTM networks for 3D action recognition, (2) a skeleton-based tree traversal technique to feed the structure of the skeleton data into a sequential LSTM, (3) improving the design of the ST-LSTM by adding the trust gate, and (4) achieving state-of-the-art performance on all the evaluated datasets.

2 Related Work

Human action recognition using 3D skeleton information has been explored from different aspects in recent years [33–50]. In this section, we limit our review to more recent RNN-based and LSTM-based approaches. HBRNN [30] applied bidirectional RNNs in a novel hierarchical fashion. They divided the entire skeleton into five major groups of joints and each group was fed
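The introduction of the ST-LSTM paper argues that a tree describes joint adjacency better than a chain, and that the tree can still be fed to a sequential learner through a traversal. The excerpt does not spell out the authors' ordering, so the sketch below is only one plausible reading: a depth-first walk that steps back through each parent, so that consecutive joints in the output are always physically connected. The skeleton, its joint names and the function names are hypothetical.

```python
# Hypothetical, heavily simplified skeleton given as parent -> children.
# Joint names are illustrative assumptions, not the layout of any dataset.
SKELETON = {
    "spine": ["neck", "left_hip", "right_hip"],
    "neck": ["head", "left_shoulder", "right_shoulder"],
    "left_shoulder": ["left_hand"],
    "right_shoulder": ["right_hand"],
    "left_hip": ["left_foot"],
    "right_hip": ["right_foot"],
}


def euler_tour(tree, root):
    """Depth-first walk that records the parent again after each child subtree.

    Every edge is crossed twice (down and back up), so any two consecutive
    joints in the result are adjacent in the tree.
    """
    order = [root]
    for child in tree.get(root, []):
        order.extend(euler_tour(tree, child))
        order.append(root)  # step back to the parent before the next branch
    return order


def is_adjacent(tree, a, b):
    """True if a and b are joined by an edge of the tree."""
    return b in tree.get(a, []) or a in tree.get(b, [])


if __name__ == "__main__":
    print(" -> ".join(euler_tour(SKELETON, "spine")))
```

A plain chain over the same joints would place, for example, a hand next to a hip in the sequence even though they are not connected; the walk above avoids that at the cost of visiting inner joints more than once.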
Is What You See, What You Get? Geospatial Visualizations Address Scale and Usability

Aashish Chaudhary and Jeff Baumes

Unlimited geospatial information is now at everyone’s fingertips with the proliferation of GPS-embedded mobile devices and large online geospatial databases. To fully understand these data and make wise decisions, more people are turning to informatics and geospatial visualization, which are used to solve many real-world problems.

To effectively gather information from data, it’s critical to address scalability and intuitive user interactions and visualizations. New geospatial analysis and visualization techniques are being used in fields such as video analysis for national defense, urban planning and hydrology.

Why Having Data Isn’t Good Enough Anymore

People are realizing that data are only useful if they can find the relevant pieces of data to make better decisions. This has broad applicability, from finding a movie to watch to elected officials deciding how much funding to allocate for an aging bridge. Information can easily be obtained, but how can it be sorted, organized, made sense of and acted on? The field of informatics solves this challenge by taking large amounts of data and processing them into meaningful, truthful insights.

In informatics, two main challenges arise when computers try to condense information down to meaningful concepts: disorganization and size. Some information is available in neat, organized tables, ready for users to pull out the needed pieces, but most is scattered across and hidden in news articles, blog posts and poorly organized lists.

Researchers are feverishly working on new ways to retrieve key ideas and facts from these types of messy data sources. For example, services such as Google News use computers that constantly "read" news articles and posts worldwide, and then automatically rank them by popularity, group them by topic, or organize them based on what the computer thinks is important to viewers.
Researchers at places such as the University of California, Irvine, and Sandia National Laboratories are investigating the next approaches to sort through large amounts of documents using powerful supercomputers.

The other obstacle is the sheer volume of data. It’s difficult to use informatics techniques that only work on data of limited size. Facebook, Google and Twitter have data centers that constantly process huge quantities of information to deliver timely and relevant information and advertisements to each person currently logged on.

Figure 1. A collection of videos is displayed without overlap (top). The outline color represents how closely each video matches a query. An alternate view (bottom) places the videos on top of each other in a stack, showing only the strongest match result.

Informatics is a key tool, but it’s not enough to simply find these insights that explain the data. Geospatial visualization bridges the gap from computer number-crunching to human understanding. If informatics is compared to finding the paths in a forest, visualization is like creating a visual map of those paths so a person can navigate through the forest with ease.

Most people today are familiar with basic geospatial visualizations such as weather maps and Web sites for driving directions. The news media are starting to test more-complex geospatial visualizations such as online interactive maps to help navigate politicians’ stances on issues, exit polls and precinct reports during election times. People are just beginning to see the impact that well-designed geospatial visualizations have on their understanding of the world.

Geospatial Visualization in the Real World

People have been looking at data for decades, but the relevant information that accompanies the data has changed in recent years. In late 1999, Esri released a new software suite, ArcGIS, that could use data from various sources.
ArcGIS provides an easy-to-use interface for visualizing 2-D and 3-D data in a geospatial context. In 2005, Google Earth launched and made geospatial visualization available to the general public.

Geospatial visualization is becoming more significant and will continue to grow as it allows people to look at the totality of the data, not just one aspect. This enables better understanding and comprehension, because it puts the data in context with their surroundings. The following three cases demonstrate geospatial visualization use in real-world scenarios:

1. Urban Planning

Planners use geomodeling and geovisualization tools to explore possible scenarios and communicate their design decisions to team members or the general public. For example, urban planners may look at the presence of underground water and the terrain’s surrounding topology before deciding to build a new suburb. This is relevant for areas around Phoenix, for example, where underground water presence and proximity to a knoll or hill can determine the suitability of a location for construction.

Figure 2. Videos from the same location are partially visible, resembling a stack of cards. Each video is outlined by the color representing the degree to which it matches the query.

Looking at a 3-D model of a house with its surroundings gives a completely different perspective than just looking at the model of a house by itself. This also can help provide clear solutions to problems, such as changing the elevation of a building’s base to make it stand better.

Urban planning is one of the emerging applications of computer-generated simulation. Cities’ rapid growth places a strain on natural resources that sustain growth. Water management, in particular, becomes a critical issue.

The East Valley Water Forum is a regional cooperative of water providers east of Phoenix, and it’s designing a water-management plan for the next 100 years.
Water resources in this region come from the Colorado River, the Salt River Project, groundwater, and other local and regional water resources. These resources are affected directly and indirectly by local and global factors such as population, weather and topography.

To best understand the relationship among water resources and various factors, the Arizona Department of Water Resources analyzes hydrologic data in the region using U.S. Geological Survey MODFLOW software, which simulates the status of underground water resources in the region. For better decision making and effective water management, a comprehensive scientific understanding of the inputs, outputs and uncertainties is needed. These uncertainties include local factors such as drought and urban growth.

Looking at numbers or 2-D graphs to understand the complex relationship between input, output and other factors is insufficient in most cases. Integrating geospatial visualizations with MODFLOW simulations, for example, creates visuals that accurately represent the model inputs and outputs in ways that haven’t been previously presented.

For such visualizations, two water surfaces from two different simulations are positioned side by side, with contour lines drawn on top. In this early prototype, a simple solution (a geospatial plane that can be moved vertically) brings the dataset into a geospatial context. This plane includes a multi-resolution map with transparency. Because these water layers are drawn in geospatial coordinates, they match exactly with the geospatial plane. This enables researchers to quickly see the water supplies of various locations.

2. Image and Video Analysis

The Defense Advanced Research Projects Agency launched a program, Video and Image Retrieval and Analysis Tool (VIRAT), for understanding large video collections.
The project’s core requirement is to add video-analysis capabilities that perform the following:
• Filter and prioritize massive amounts of archived and streaming video based on events.
• Present high-value intelligence content clearly and intuitively to video analysts.
• Reduce analyst workload while increasing quality and accuracy of intelligence yield.

Visualization is an integral component of the VIRAT system, which uses geospatial metadata and video descriptors to display results retrieved from a database.

Analysts may want to look at retrieval result sets from a specific location or during a specific time range. The results are short clips containing the object of interest and its recent trajectory. By embedding these results in a larger spatiotemporal context, analysts can determine whether a retrieved result is important.

3. Scientific Visualization

The U.S. Army Corps of Engineers’ research organization, the Engineer Research and Development Center, is working to extend the functionality of the Computational Model Builder (CMB) environment in the area of simulation models for coastal systems, with an emphasis on the Chesapeake and Delaware bays.

The CMB environment consists of a suite of applications that provide the capabilities necessary to define a model (consisting of geometry and attribute information) that’s suitable for hydrological simulation. These simulations are used to determine the impact that environmental conditions, such as human activities, have on bodies of water.

Figure 3. Google Earth was used to display Chesapeake Bay’s relative salt (top) and oxygen (bottom) content (higher concentrations in red).

One goal is to visualize simulation data post-processed by CMB tools. Spatiotemporal information, for example, is included in oxygen content and salinity data. Drawing data in geospatial context lets users or analysts see which locations are near certain features, giving the data orientation and scale that can easily be understood.
Figure 3 shows the oxygen and salt content of Chesapeake Bay, where red shows higher concentrations and blue shows lower concentrations.

Moving Forward

Visualizations that can be understood at all levels will be key in politics, economics, national security, urban planning and countless other fields. As information becomes increasingly complex, it will be harder for computers to extract and display those insights in ways people can understand.

More research must be done in new geospatial analysis and visualization capabilities before we drown in our own data. And it’s even more important to educate people in how to use and interpret the wealth of analysis tools already available, extending beyond the basic road map.

High schools, colleges and the media should push the envelope with new types of visuals and animations that show data in richer ways. The price of explaining these new views will be repaid when audiences gain deeper insights into the real issues otherwise hidden by simple summaries. Progress isn’t limited by the volume of available information, but by the ability to consume it.
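Intelligent Computer and Applications, Vol. 13, No. 12, Dec. 2023
Article ID: 2095-2163(2023)12-0120-05  CLC number: TP183  Document code: A

Recommend personalized tourist attractions based on spatial-temporal information
(融合时空信息个性化旅游兴趣点推荐算法)

PAN Lan, WEI Jiayin, LU Youjun, GAN Xia
(School of Data Science and Information Engineering, Guizhou Minzu University, Guiyang 550025, China)

Abstract: Personalized tourism point-of-interest recommendation algorithms tend to neglect the spatio-temporal information between nodes in sequence graphs and to make inadequate use of spatial correlation. To address these issues, this paper introduces a personalized tourist-attraction recommendation algorithm that fuses spatial-temporal information and is founded on the self-attention mechanism. The algorithm uses self-attention to capture the dynamic information of users, treats it as the spatio-temporal features of both users and points of interest within the graph neural network, and lets it take part in the aggregation of neighborhood information. Experimental results demonstrate that the method is viable and can significantly enhance recommendation performance.

Keywords: self-attention mechanism; graph neural network; personalized tourism point-of-interest recommendation

Funding: Natural Science Research Project of the Guizhou Provincial Department of Education (Qianjiaoji [2022] No. 015); Guizhou Provincial Science and Technology Program (Qiankehe Jichu [2018]1082, Qiankehe Jichu [2019]1159); Guizhou Science and Technology Program (QKHJCZK2022YB195, QKHJCZK2023YB143, QKHPTRCZCKJ2021007).
About the authors: PAN Lan (1997-), female, master's student, research interests: massive-data statistics and analysis, recommendation algorithms; LU Youjun (1987-), male, PhD, associate professor, research interests: complex systems and big data analysis; GAN Xia (1997-), female, master's student, research interests: massive-data statistics and analysis.
Corresponding author: WEI Jiayin (1986-), male, PhD, associate professor, research interests: big data analysis and processing, design and analysis of recommendation algorithms. Email: weijiayin05@sina.com
Received: 2023-10-18

0 Introduction

With the spread of location-based services, sharing check-ins and reviews of tourism points of interest (POIs) on social platforms has become a popular trend [1]. Rich check-in data have driven the development of POI recommender systems, which model users' visiting preferences and predict the most plausible next POI. Historical check-in data give service providers valuable information and reveal users' behavior patterns, and such systems help users choose their next destination and plan their trips. Sequential effects are crucial in tourism POI recommendation. Earlier work focused mainly on sequential transitions and has gradually been superseded by neural-network-based methods. Wang et al. [2] proposed a global spatio-temporal aware graph neural network that captures global spatio-temporal relations; Liu et al. [3] considered the dynamic timeliness of POIs and proposed an interaction-enhanced, time-aware graph convolutional network for successive POI recommendation; Capanema et al. [4] combined recurrent neural networks (RNNs) and graph neural networks (GNNs) to predict the category of the next POI; Wang et al. [5] used GNNs and the complex correlations between users and POIs for recommendation; Zhang et al. [6] proposed a location network with deep convolution and multi-head self-attention for intelligent location recommendation; Tsai et al. [7] used user-generated content to recommend visiting sequences.

Although POI recommendation based on spatio-temporal information has been studied widely, spatial correlation is still under-used. Cao et al. [8] proposed a trajectory-aware dynamic graph convolutional network that captures local spatial correlation; Lai et al. [9] proposed a multi-view spatio-temporal enhanced hypergraph network for next-POI recommendation; Ou J et al. [10] used an enhanced temporal convolutional network to learn sequential transition correlations for next-POI recommendation; Li Q et al. [11] proposed an attention-based spatio-temporal gated graph neural network for sequential recommendation; Li H et al. [12] proposed a spatio-temporal intention learning self-attention network that captures users' long-term preferences and identifies the intention to revisit a particular POI at a particular time.

1 Personalized tourism POI recommendation model

1.1 Spatio-temporal feature capture layer

To take better account of the different spatial distances and time intervals between two visits in a trajectory, the spatio-temporal feature capture layer aggregates related POIs and updates the visit representations. It introduces a self-attention layer to capture long-range dependencies and assign a weight to each visit. Bringing the spatio-temporal context into sequence modeling sharpens the model's focus on local POIs and makes the recommendations easier to interpret.

The sets of users, tourism POIs and timestamps are written $U=\{u_1,u_2,\dots,u_{|U|}\}$, $P=\{p_1,p_2,\dots,p_{|P|}\}$ and $T=\{t_1,t_2,\dots,t_{|T|}\}$. The sequence of user $u$ is $S_u=\{c^u_1,c^u_2,\dots,c^u_{|S_u|}\}$, where $c^u_j$ is the $j$-th check-in record of user $u$. Transforming with separate parameter matrices $W^Q, W^K, W^V \in \mathbb{R}^{d\times d}$ gives the new sequence $S_u$, Eq. (1):

$$S_u=\mathrm{attention}\left(E_u W^Q,\; E_u W^K,\; E_u W^V,\; E(\Delta),\; M\right) \tag{1}$$

The self-attention function is defined in Eq. (2):

$$h_{T_u}=\mathrm{softmax}\left(M \ast \left(\frac{\theta K^{\top}}{\sqrt{d_0}}+E_{\Delta}\right)\right)V \tag{2}$$

Here $h_{T_u}$ is the attention output embedding matrix; $d_0=d'/h$, where $h$ is the number of attention heads; $d_0$ acts as a scale factor that avoids the vanishing gradients caused by overly large dot products; $\theta, K, V\in\mathbb{R}^{l\times d'}$ are the query, key and value of the sequence, with $\theta=K=V$ in self-attention; and $E_{\Delta}$ is the output of the interpolated embedding of the spatio-temporal context matrix $\Delta$. Following Wang et al. [13], $M\in\mathbb{R}^{l\times l}$ is a mask whose upper-triangular elements are filled with $-\infty$; it is multiplied element by element with the score matrix, where $\theta K^{\top}/\sqrt{d_0}\in\mathbb{R}^{l\times l}$. The softmax function normalizes these scores into attention weights. The final spatial input embedding $T_p$ is computed in the same way as $T_u$. Multi-head attention captures spatio-temporal information from different latent perspectives and feeds it to a feed-forward network; the final spatio-temporal outputs for users and POIs are given by Eqs. (3) and (4):

$$h_u=\mathrm{FFN}\left(h_{T_u}^{1}\,\Vert\,\cdots\,\Vert\,h_{T_u}^{i}\,\Vert\,\cdots\,\Vert\,h_{T_u}^{k}\right) \tag{3}$$

$$h_p=\mathrm{FFN}\left(h_{T_p}^{1}\,\Vert\,\cdots\,\Vert\,h_{T_p}^{i}\,\Vert\,\cdots\,\Vert\,h_{T_p}^{k}\right) \tag{4}$$

where $k$ is the number of attention functions.

1.2 Regional subgraph module

To refine the recommendation algorithm, this paper adds a regional subgraph module that strengthens the influence between similar users and weakens the influence between dissimilar users. Each user is represented by a feature vector that includes graph-space features and geographic-space features. Normalized latitude and longitude are used to determine the geographic features, and the feature vectors are used to group users with similar interests into the same subgraph. During subgraph construction, connections between dissimilar users are weakened or cut to reduce their negative influence. The module thus combines graph-space and geographic-space information. The user feature vector is given by Eq. (5):

$$\mathrm{Feature}_u=\sigma\left(W_1\left(e_u^{(1)}+e_{up}\right)+b_1\right) \tag{5}$$

where $e_u^{(1)}$ is the user embedding after the first graph-convolution layer, that is, the graph-space structure obtained by aggregating first-order neighboring POIs; $e_{up}$ is the geographic location of the POI the user visits most often; $\sigma(\cdot)$ is the activation function; and $W_1$ and $b_1$ are the weight matrix and bias vector. Once the user feature vector is available, a three-layer neural network projects it into user features; $U$ denotes the classification prediction vector obtained from this projection, Eq. (6):

$$U=W_4\left(W_3\,\mathrm{Feature}_u+b_3\right)+b_4 \tag{6}$$

where $W_3, W_4$ and $b_3, b_4$ are weight matrices and bias vectors. Similar users are assigned to the same regional subgraph. Once the number of subgraphs is fixed, a user collects neighborhood information only within its own subgraph, which reduces the influence between dissimilar users.

1.3 Spatio-temporal graph neural network module

From the user data we construct the check-in sequence bipartite graph $G=(Q,E)$, where $Q=\{q^u_i\}_{i=1}^{|Q|}$ is the set of check-ins and $E$ is the set of edges between adjacent nodes in the sequence graph; an edge indicates that the POI visited after $p^u_t$ is $p^u_{t+1}$. To exploit the spatial proximity of structured POIs further, this paper adopts a spectral graph convolutional network (GCN), which can mine the unstructured information hidden in the topology of the graph. To capture the correlation between POIs and the dynamic space, the normalized Laplacian matrix $L$ is built from the adjacency matrix, Eq. (7):

$$L=(D+I)^{-1}(A+I) \tag{7}$$

where $D$, $A$ and $I$ are the degree matrix, adjacency matrix and identity matrix. Each convolution layer handles first-order neighborhood information only, including the self-loops added to the degree matrix and the normalization of the adjacency matrix. The layer-wise propagation rule of the GCN is defined by Eq. (8):

$$H^{(l)}=\sigma\left(L\,H^{(l-1)}\,W^{(l)}\right) \tag{8}$$

where $H^{(l-1)}$ is the node output of layer $l-1$, $W^{(l)}$ is a linear transformation matrix, and $\sigma$ is a nonlinear activation function. The GCN learns node representations by aggregating information from adjacent nodes into an intermediate representation and then updating all nodes through a linear projection and a nonlinear activation.

Regional subgraphs are built so that users with similar interests are grouped using spatial structure and geographic features. The regional subgraphs are written $R=\{r_1,r_2,\dots,r_i\}$, where $i$ indexes the subgraph. Users of the same class go into the same subgraph, and the POIs directly connected to them go into that subgraph as well. The same POI may appear in several subgraphs, but each user belongs to exactly one. The initial embeddings of all users and POIs are merged and passed through a first-order graph convolution to obtain the check-in relations between users and POIs, Eqs. (9) and (10):

$$e_u^{(1)}=\sum_{p\in N_u}\frac{1}{\sqrt{|N_u|}\sqrt{|N_p|}}\,e_p^{(0)} \tag{9}$$

$$e_p^{(1)}=\sum_{u\in N_p}\frac{1}{\sqrt{|N_u|}\sqrt{|N_p|}}\,e_u^{(0)} \tag{10}$$

where $N_u$ is the set of POIs visited by user $u$, $N_p$ is the set of users who have visited POI $p$, and the factor $1/(\sqrt{|N_u|}\sqrt{|N_p|})$ implements symmetric normalization. In the graph convolution a user node belongs to one subgraph, whereas a POI is spread over all subgraphs related to it, and the POI embedding is the sum of its embeddings in every subgraph that contains it. After $l-1$ layers of graph-convolution propagation we obtain Eqs. (11) to (13):

$$e_u^{(l)}=\sum_{p_i\in N_u}\frac{1}{\sqrt{|N_u|}\sqrt{|N_p|}}\,e_{p_i}^{(l-1)} \tag{11}$$

$$e_{p_i}^{(l)}=\sum_{u\in N_p^{i}}\frac{1}{\sqrt{|N_p|}\sqrt{|N_u|}}\,e_u^{(l-1)} \tag{12}$$

$$e_p^{(l)}=\sum_{r_i\in R}e_{p_i}^{(l)} \tag{13}$$

where $R$ is the set of regional subgraphs that contain the POI and $e_{p_i}$ is the embedding of the POI in regional subgraph $r_i$.

2 Experimental results and analysis

2.1 Datasets

Location-based social networks (LBSNs) hold large numbers of user digital footprints, which users create by checking in to share their location. To verify the proposed algorithm, this paper uses Foursquare and Gowalla, two public and widely used LBSN datasets. They contain users, POIs, timestamps, longitude, latitude and related information. During preprocessing, records with fewer than 5 visits or check-ins were removed, and the data were split at random into training and test sets at a ratio of 7:3. The datasets are summarized in Table 1.

Table 1  Description of the datasets
Dataset      Users, POIs, check-ins
Gowalla      19541329813117914
Foursquare   17642284832796232
(The three counts of each row are run together in the extracted text and are left unsplit here.)

2.2 Evaluation metrics

To evaluate the generalization ability of the model, this paper uses recall (Recall@K) to measure the proportion of correctly recommended tourism POIs and normalized discounted cumulative gain (NDCG@K) to measure ranking quality, and uses both metrics to tune the proposed algorithm step by step. Recall is computed by Eq. (14):

$$\mathrm{Recall@}K=\frac{S_u^{K}}{K} \tag{14}$$

where $S_u^{K}$ is the number of POIs the user is interested in. NDCG takes the actual relevance of each tourism POI into account, Eq. (15):

$$\mathrm{DCG@}K=\sum_{i=1}^{K}\frac{2^{rel_i}-1}{\log(i+1)} \tag{15}$$

IDCG is the DCG of the best possible recommendation list the system could return to a user, with the most relevant results (the target tourism POIs) placed first, Eq. (16):

$$\mathrm{IDCG@}K=\sum_{i=1}^{|REL|_K}\frac{2^{rel_i}-1}{\log(i+1)} \tag{16}$$

where $rel_i$ is the relevance of the recommendation at position $i$. Tourism POIs that received positive feedback from the user are usually given the value 1 and all others the value 0. The ratio of each user's DCG to IDCG is that user's normalized score, NDCG, which makes the values comparable across users, Eq. (17):

$$\mathrm{NDCG@}K=\frac{\mathrm{DCG@}K}{\mathrm{IDCG@}K} \tag{17}$$

2.3 Performance analysis

To verify the proposed algorithm, it is compared with ST-GGNN [14], SR-GNN [15], ST-LSTM [16] and related algorithms on the two public datasets; the results are shown in Table 2. The proposed model clearly outperforms the other baselines. Relative to the best baseline, ST-GGNN, it improves Recall@5, Recall@10 and Recall@20 by 6.5%, 31.9% and 21.4%, and NDCG@5, NDCG@10 and NDCG@20 by 24.1%, 24.1% and 40.8%, which shows that the method raises recommendation performance effectively.

To evaluate the model further, ablation experiments were run in which the spatio-temporal feature capture layer (Ours-TD) and the regional subgraph module (Ours-R) were removed in turn, to assess how much each core component contributes. The results are shown in Figures 1 and 2. Removing either component weakens performance, which indicates that both designs help to capture user preferences.

Table 2  Comparison of model accuracy (Gowalla dataset)
Algorithm   Recall@5   Recall@10   Recall@20   nDCG@5   nDCG@10   nDCG@20
SR-GNN      0.0443     0.0498      0.0531      0.0447   0.0501    0.0568
ST-LSTM     0.0552     0.0642      0.0783      0.0569   0.0594    0.0667
ST-GGNN     0.0699     0.0891      0.1335      0.0643   0.0726    0.0773
Ours        0.0745     0.1176      0.1621      0.0798   0.0901    0.1089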
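The metrics of Eqs. (14) to (17) can be sketched in a few lines for binary relevance. This is a minimal illustration, not the authors' evaluation code: the function names and the sample POI identifiers are hypothetical, Recall@K follows Eq. (14) exactly as the text states it (hits in the top K divided by K), and a base-2 logarithm is an assumption, since the text writes only "log". The base does not affect NDCG because it cancels in the ratio of Eq. (17).

```python
import math


def recall_at_k(ranked, relevant, k):
    """Eq. (14) as written in the text: relevant POIs in the top K, divided by K."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def dcg_at_k(rels, k):
    """Eq. (15): sum of (2^rel_i - 1) / log(i + 1) over positions i = 1..K."""
    return sum(
        (2 ** rel - 1) / math.log2(i + 1)
        for i, rel in enumerate(rels[:k], start=1)
    )


def ndcg_at_k(ranked, relevant, k):
    """Eq. (17): DCG of the actual ranking divided by DCG of the ideal ranking."""
    rels = [1 if item in relevant else 0 for item in ranked[:k]]
    # Ideal list (Eq. 16): all relevant POIs first, truncated at K.
    ideal = [1] * min(len(relevant), k)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(rels, k) / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    ranked = ["p3", "p1", "p7", "p2"]  # hypothetical recommendation list
    relevant = {"p1", "p2"}            # hypothetical POIs the user actually visited
    print(recall_at_k(ranked, relevant, 3))
    print(ndcg_at_k(ranked, relevant, 3))
```

Per-user scores computed this way would then be averaged over all test users to give figures comparable to those reported in Table 2.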
Figure 1  Experimental results on the Gowalla data: (a) Recall; (b) NDCG
Figure 2  Experimental results on the Foursquare data: (a) Recall; (b) NDCG

3 Conclusion

Existing algorithms ignore spatio-temporal information and spatial correlation, even though users' spatial preferences can be inferred from their check-in sequences. This paper proposes a personalized tourism POI recommendation algorithm that fuses spatio-temporal information: it captures users' dynamic information, takes part in the aggregation of neighborhood information, captures spatio-temporal correlation, updates the nodes of the sequence graph dynamically, and returns a list of the next tourism POIs a user may be interested in. Experiments on data from two public social networking sites show that the method improves recommendation performance and offers a useful reference for personalized tourism recommendation.

References
[1] LIU X, LIU Y, ABERER K, et al. Personalized point-of-interest recommendation by mining users' preference transition[C]//Proceedings of the 22nd ACM International Conference on Information & Knowledge Management. 2013: 733-738.
[2] VEGETABILE B G, STOUT-OSWALD S A, DAVIS E P, et al. Estimating the entropy rate of finite Markov chains with application to behavior studies[J]. Journal of Educational and Behavioral Statistics, 2019, 44(3): 282-308.
[3] QI L, LIU Y, ZHANG Y, et al. Privacy-aware point-of-interest category recommendation in internet of things[J]. IEEE Internet of Things Journal, 2022, 9(21): 21398-21408.
[4] RAHMANI H A, ALIANNEJADI M, AHMADIAN S, et al. LGLMF: local geographical based logistic matrix factorization model for POI recommendation[C]//Proceedings of the 15th Asia Information Retrieval Societies Conference on Information Retrieval Technology, AIRS 2019. Hong Kong, China: Springer International Publishing, 2020: 66-78.
[5] XU Z, HU Z, ZHENG X, et al. A matrix factorization recommendation model for tourism points of interest based on interest shift and differential privacy[J]. Journal of Intelligent & Fuzzy Systems, 2023(Preprint): 1-15.
[6] WANG Q, YIN H, CHEN T, et al. Next point-of-interest recommendation on resource-constrained mobile devices[C]//Proceedings of the Web Conference 2020. 2020: 906-916.
[7] WANG J, YANG B, LIU H, et al. Global spatio-temporal aware graph neural network for next point-of-interest recommendation[J]. Applied Intelligence, 2023, 53(13): 16762-16775.
[8] LIU Y, WU H, REZAEE K, et al. Interaction-enhanced and time-aware graph convolutional network for successive point-of-interest recommendation in traveling enterprises[J]. IEEE Transactions on Industrial Informatics, 2022, 19(1): 635-643.
[9] CAPANEMA C G S, SILVA F A, SILVA T R M B, et al. POI-RGNN: Using recurrent and graph neural networks to predict the category of the next point of interest[C]//Proceedings of the 18th ACM Symposium on Performance Evaluation of Wireless Ad Hoc, Sensor, & Ubiquitous Networks. 2021: 49-56.
[10] WANG D, WANG X, XIANG Z, et al. Attentive sequential model based on graph neural network for next poi recommendation[J]. World Wide Web, 2021, 24(6): 2161-2184.
[11] XU S, HUANG Q, ZOU Z. Spatio-Temporal Transformer Recommender: Next Location Recommendation with Attention Mechanism by Mining the Spatio-Temporal Relationship between Visited Locations[J]. ISPRS International Journal of Geo-Information, 2023, 12(2): 64-79.
[12] TSAI C Y, CHEN Y J, PEÑA A S, et al. A visiting sequence recommendation framework: enhanced by dynamic landmark and stay time[J]. Expert Systems with Applications, 2023, 230: 120649-120662.
[13] WANG E, JIANG Y, XU Y, et al. Spatial-temporal interval aware sequential POI recommendation[C]//Proceedings of 2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 2022: 2086-2098.
[14] OU J, JIN H, WANG X, et al. STA-TCN: Spatial-temporal Attention over Temporal Convolutional Network for Next Point-of-interest Recommendation[J]. ACM Transactions on Knowledge Discovery from Data, 2023, 17(9): 1-19.
[15] LI Q, XU X, LIU X, et al. An Attention-Based Spatiotemporal GGNN for Next POI Recommendation[J]. IEEE Access, 2022, 10: 26471-26480.
[16] LI H, YUE P, LI S, et al. Spatio-temporal intention learning for recommendation of next point-of-interest[J]. Geo-spatial Information Science, 2023: 1-14. DOI: 10.1080/10095020.2023.2179428

(Continuation of a separate article on wheat rust detection from the same issue; the fragment begins mid-sentence.)

recognition performance. Adding the attention mechanism is therefore effective for recognizing wheat rust and judging its symptoms. Because the AT-ResNet100 network has more layers, it needs correspondingly more model parameters, and its recognition accuracy is also better than that of the other models. Weighing network performance, number of training iterations and other factors, this paper selects AT-ResNet100 as the final network model for wheat rust recognition and symptom judgment.

3 Conclusion

For the problem of wheat rust detection, this paper proposes a detection method based on an attention ResNet network model. To verify the effectiveness and robustness of the model, experiments were run on the custom dataset Wheat-data. First, an analysis of the recognition accuracy and F1-score of different algorithms shows that the proposed model achieves good results on the custom dataset. Second, a comparison between the ResNet model and the attention-based ResNet model shows that the average recognition accuracy and F1-score of the proposed model are both higher than those of the non-attention ResNet model, which demonstrates its effectiveness and robustness for wheat rust detection. Finally, a comparison of model parameters and recognition accuracy shows that the AT-ResNet100 model performs well.
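Author: 阿布都瓦斯提·吾拉木
Dissertation title: Remote sensing monitoring of farmland drought based on an n-dimensional spectral feature space
About the author: 阿布都瓦斯提·吾拉木, male, born in February 1975, received a Doctor of Science degree from Peking University in July 2006. Since December 2006 he has been Geospatial Analyst/Research Professor at the Center for Environmental Sciences, Saint Louis University, USA.

Abstract

The farmland ecosystem is a complex system in which water, soil, vegetation, atmosphere and many other factors are coupled (SPAC, Soil-Plant-Atmosphere Continuum). In the water cycle of the farmland ecosystem, an accumulating water deficit leaves the farmland water supply unable to meet crop water demand over a certain period, which leads to farmland drought. Farmland drought directly and indirectly affects human survival, social stability, agricultural production, and the sustainable development of resources and the environment. Correctly assessing or preventing farmland drought is of practical importance for promoting agricultural production and regional sustainable development.

Remote sensing can objectively monitor the spatial and temporal variation of farmland moisture. Studies of remote sensing drought monitoring in China and abroad show that, under complex land-surface conditions, visible, near-infrared, thermal-infrared or microwave bands used alone cannot reflect farmland moisture information fully and accurately. These methods have exposed many problems in farmland moisture monitoring, such as lagged response, model complexity, parameter uncertainty, and excessive dependence on field and meteorological observations, and they cannot meet the urgent need for comprehensive, dynamic drought monitoring and farmland moisture information extraction. Extracting accurate farmland drought information with quantitative remote sensing has long been one of the important open scientific problems in remote sensing applications.

Drought information extraction based on a multi-dimensional spectral feature space can combine the strengths of multi-source remote sensing and provide richer, higher-resolution farmland moisture information for drought monitoring. It is expected to remove the lagged monitoring response, model complexity and parameter uncertainty of earlier remote sensing drought models, and so to form a new method for remote sensing monitoring of farmland drought.

Taking drought modelling in the two-dimensional visible/near-infrared spectral space as its starting point, this dissertation adds the shortwave infrared to further broaden the bands and the land-surface eco-physical parameters used in drought monitoring. It constructs remote sensing models for retrieving soil moisture, leaf/canopy water content (EWT) and leaf/canopy relative water content (FMC). For the two most critical indicators of farmland drought, soil moisture and leaf/canopy water content, it builds several drought monitoring models, forming a new method for remote sensing monitoring of farmland drought based on the n-dimensional spectral feature space.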
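Computer Applications and Software, Vol. 38, No. 6, June 2021

LSTM algorithm based on spatio-temporal correlation and its application to PM2.5 concentration prediction

Zhao Yanming (School of Mathematics and Computer Science, Hebei Normal University for Nationalities, Chengde 067000, Hebei, China)

Received: 2019-09-12. Supported by the Hebei Provincial Social Science Fund (HB18TJ004). Zhao Yanming, associate professor; main research areas: brain computing theory, image analysis, biometric recognition.

Abstract: Current algorithms for simulating and predicting the evolution of air pollutant particle concentration ignore the spatial correlation of particle concentration and do not fuse its temporal dependence with its spatial correlation. To address this, an LSTM algorithm based on spatio-temporal correlation (TS_LSTM) is proposed and applied to PM2.5 concentration prediction. The algorithm defines spatial correlation and a method for computing its correlation factors. It fuses the local-region correlation factor with the forget gate and memory gate of the LSTM algorithm to build an LSTM algorithm based on local geographic information (LTS_LSTM), and it fuses the learning results of LTS_LSTM with the global spatial correlation factor to construct a spatio-temporally correlated LSTM algorithm based on global geographic information (GTS_LSTM). The algorithm simulates the global and local evolution of air pollutant particle concentration and predicts particle concentration. On global and local datasets, it is compared with regression, support vector machine, fuzzy neural network, LSTM, GC-LSTM and DL-LSTM neural networks. The results show that, for air particle concentration prediction, the algorithm outperforms the various traditional prediction algorithms and comes close to deep LSTM algorithms.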
Keywords: long short-term memory network; air pollutant concentration prediction; recurrent neural network; spatio-temporal correlation; PM2.5
CLC number: TP3. Document code: A. DOI: 10.3969/j.issn.1000-386x.2021.06.040

LSTM ALGORITHM BASED ON SPATIO-TEMPORAL CORRELATION AND ITS APPLICATION OF PM2.5 CONCENTRATION PREDICTION

Zhao Yanming (School of Mathematics and Computer Science, Hebei Normal University for Nationalities, Chengde 067000, Hebei, China)

Abstract: At present, the simulation and prediction algorithms for the evolution process of air pollutant particle concentration ignore the spatial correlation of particle concentration, and do not realize the fusion of the time dependence and spatial correlation of particle concentration. Based on this, an LSTM algorithm based on spatio-temporal correlation (TS_LSTM) and its application to PM2.5 concentration prediction is proposed. It proposes the calculation method of the correlation factor; the factor of local spatial information correlation is fused with the forgetting gate and memory gate of the LSTM algorithm to establish the LSTM algorithm of the local geographic information (LTS_LSTM); the algorithm fuses the learning result of the LTS_LSTM algorithm with the global spatial correlation factor to establish a global spatio-temporal correlation LSTM algorithm (GTS_LSTM). It simulates the global and local evolution process of air pollutant particle concentration, and realizes particle concentration prediction. On the local and global datasets, this algorithm is compared with regression algorithm, support vector machine, fuzzy neural network, LSTM neural network, GC-LSTM neural network and DL-LSTM neural network. The results show that, in air particle concentration prediction, the prediction performance of this algorithm is better than that of various traditional prediction algorithms and close to that of the deep LSTM algorithm.

Keywords: LSTM; prediction of air pollutant concentration; recurrent neural network; spatio-temporal correlation; PM2.5

0 Introduction

The threat that air pollutants pose to human health is growing by the day.
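Shanghai Jiao Tong University, Master's thesis
Research on blotch damage restoration techniques for film
Name: ***
Degree applied for: Master
Major: Communication and Information Systems
Supervisor: ***
20081201

Research on blotch damage restoration techniques for film

Abstract

Because of long-term storage and repeated playback, film stock suffers from dust, dirt, mould spots, fading, image unsteadiness, scratches, flicker, noise, colour variation, blur and other problems. This thesis first describes in detail the kinds of damage found in old film, and introduces their causes and how they appear in video sequences. It then concentrates on blotch damage.

An analysis of traditional blotch detection algorithms shows that, because of noise in the video, locating blotches, refining blotch edges and rejecting false detections are all difficult. This thesis proposes a blotch detection algorithm based on an improved SROD. The traditional SROD algorithm is first improved so that it is much less sensitive to the threshold setting, which raises its noise robustness and yields a preliminary detection result. An EM estimation framework is then proposed to post-process the preliminary result: candidate blotches are classified first by grey level and then by area, which further removes the influence of noise and determines blotch positions and edge information.

Traditional blotch restoration methods usually reconstruct data using only temporal or only spatial information, and the algorithms that do combine the two have high computational complexity. This thesis proposes a blotch restoration algorithm based on spatio-temporal image inpainting. Existing image inpainting algorithms are two-dimensional; we add time-domain information to the model and so make full use of both the spatial and the temporal information in the video sequence. On this basis we propose spatial bi-directional superposition inpainting and temporal multi-candidate-area superposition inpainting, which achieve good blotch restoration results.

The experimental material consists of video sequences obtained by film-to-tape transfer of real film, and high-definition digital film sequences with artificially added blotches. Experiments and simulations were run on both the traditional algorithms and the proposed ones. A comparison of the results shows that the proposed algorithms detect and restore blotches effectively with low model complexity, and so have practical value.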
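The abstract above builds on the SROD detector without restating it. As a point of reference only, the following is a minimal sketch of the plain (unimproved) SROD test as it is commonly described: a pixel is flagged when its intensity lies outside the range spanned by six reference pixels (three vertically adjacent pixels in the previous frame and three in the next frame) by more than a threshold. It assumes the neighbouring frames are already motion-compensated or static, and the function name `srod_detect` is hypothetical. It is not the improved detector or the EM post-processing proposed in the thesis.

```python
import numpy as np


def srod_detect(prev_frame, cur_frame, next_frame, threshold):
    """Plain SROD blotch test on grey-level frames (no motion compensation).

    Returns a boolean mask that is True where the current pixel lies more
    than `threshold` outside the [min, max] range of its six reference pixels.
    """
    prev_f = np.asarray(prev_frame, dtype=np.float64)
    cur = np.asarray(cur_frame, dtype=np.float64)
    next_f = np.asarray(next_frame, dtype=np.float64)

    def vertical_neighbours(frame):
        # Pixels at rows y-1, y, y+1 in the same column; edges are replicated.
        padded = np.pad(frame, ((1, 1), (0, 0)), mode="edge")
        return [padded[:-2], padded[1:-1], padded[2:]]

    # Six reference images: three from the previous frame, three from the next.
    refs = np.stack(vertical_neighbours(prev_f) + vertical_neighbours(next_f))
    lowest = refs.min(axis=0)
    highest = refs.max(axis=0)

    # Distance by which the current pixel falls outside the reference range.
    outlier_distance = np.maximum(np.maximum(lowest - cur, cur - highest), 0.0)
    return outlier_distance > threshold


if __name__ == "__main__":
    background = np.full((5, 5), 100.0)
    damaged = background.copy()
    damaged[2, 2] = 200.0  # a single bright blotch pixel
    mask = srod_detect(background, damaged, background, threshold=20.0)
    print(np.argwhere(mask).tolist())  # [[2, 2]]
```

With a single threshold the result depends strongly on how that threshold is chosen, which is the sensitivity the thesis says its improved version reduces.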
Keywords: film restoration, blotch detection, blotch restoration, image inpainting

RESEARCH ON BLOTCH REMOVAL IN OLD FILMS

ABSTRACT

Digital film archives are usually damaged by aging and frequent playback, which cause different artifacts on the films, such as dust spots, dirt, blotches, film unsteadiness, line scratches, flicker, noise, color variations and blur. In this thesis, the causes and characteristics of several typical artifacts are first described in detail, as well as the existing systems for artifact removal. The thesis then focuses on blotch detection and blotch removal techniques, trying to find more automated and simpler ways of recovering from blotches.

A study of the traditional detection techniques shows that, because of noise, it is very difficult to determine the location of blotches, thin their edges and reduce false alarms. This thesis proposes a blotch detection method based on an improved SROD algorithm. First, the SROD algorithm is improved by taking into account the influence of noise, thus enhancing the anti-noise performance and reducing the sensitivity to the threshold setting. Then an EM post-processing framework, composed of gray classification and area classification, is introduced to determine the location of blotches and thin their edges. According to the experiments, the proposed method gives satisfactory results, and blotch detection is significantly improved compared with traditional techniques.

Some traditional blotch removal techniques use only spatial or only temporal information for data interpolation. The others, which make use of both, usually imply a high computational complexity. To take advantage of both spatial and temporal information with lower complexity, we extend the image inpainting model from 2-D to 3-D. On this basis, an inpainting-based spatio-temporal algorithm for removing blotches is proposed.
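This algorithm is realized by introducing the bi-directional method in the current frame and the multi-candidate areas method in different frames. The results show that the method is effective for removing blotches.

KEY WORDS: Digital film restoration, blotch detection, blotch removal, inpainting

List of figures

Figure 1-1 Distortions that may arise during video recording, storage, conversion and digitization [1]
Figure 1-2 Three consecutive defective frames (a, c, e) from a Charlie Chaplin film; the right-hand images are local enlargements of the left-hand ones, showing (b) noise, (d) dust and large blotches, (f) vertical scratches [1]
Figure 1-3 Digital film restoration system [5]
Figure 1-4 Flow diagram 1 of the digital restoration of old film [3]
Figure 1-5 Flow diagram 2 of the digital restoration of old film [4]
Figure 1-6 Overall disk-to-disk restoration system [4]
Figure 2-1 (a) Method with simultaneous blotch detection and restoration (b) Modular method
Figure 2-2 (a) ROD detector pixel selection (b) ROD detector pixel ordering
Figure 2-3 MMF sub-filter templates
Figure 2-4 Principle of interpolation
Figure 3-1 Post-detection processing steps (postprocessing)
Figure 3-2 Blotch removal system with a post-processing module
Figure 3-3 Pixel selection method of the improved SROD detector
Figure 3-4 EM-based post-detection processing framework
Figure 3-5 Detection performance of different detection algorithms [1] (a) Western sequence (b) MobCal sequence (c) Manege sequence (d) Tunnel sequence
Figure 3-6 Sequences to be restored
Figure 3-7 Blotch locations in the second frame of sequence (a)
Figure 3-8 Detection results of SROD and ISROD with different thresholds
Figure 3-9 Blotch detection results for sequence (a)
Figure 3-10 Blotch detection results for sequence (b)
Figure 4-1 Illustration of image inpainting [31]
Figure 4-2 Illustration of the region to be restored
Figure 4-3 Implementation of the blotch restoration algorithm based on spatio-temporal image inpainting
Figure 4-4 Eight restoration matching templates
Figure 4-5 Scanning order of bi-directional superposition restoration
Figure 4-6 Spatial relationship of the regions to be restored
Figure 4-7 Subjective comparison of blotch restoration results for the sequence in Figure 3-6(a)
Figure 4-8 Subjective comparison of blotch restoration results for the sequence in Figure 3-6(b)

List of tables

Table 4-1 Comparison of objective restoration results

Shanghai Jiao Tong University: declaration of originality of the thesis

I solemnly declare that the thesis submitted is the result of research I carried out independently under the guidance of my supervisor.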
Spatio-temporal information coding in the cuneate nucleus

J. Navarro1, A. Canedo1 and E. Sánchez2

1 Departamento de Fisioloxía, Facultade de Medicina, Universidade de Santiago de Compostela, 15782 Santiago de Compostela, Spain
fsjna@usc.es, fsancala@usc.es
2 Grupo de Sistemas Intelixentes (GSI), Departamento de Electrónica e Ciencias da Computación, Facultade de Física, Universidade de Santiago de Compostela, 15782 Santiago de Compostela, Spain
eduardos@usc.es

Abstract

The dorsal column nuclei, cuneatus and gracilis, receive somesthetic information impinging on projection cells and local inhibitory interneurons. The presence of these interneurons allows a progressive spatio-temporal coding of information that can be modelled (Sánchez et al., 2004) using their known synaptic connections with projection cells (Mariño et al., 1999; Aguilar et al., 2002, 2003). Here we explore how the processing time required to complete the progressive coding depends on the size and contrast of cutaneous stimuli.

Keywords: Dorsal Column nuclei, Somatosensory System, Computational Models, Information Coding

1. Introduction

The dorsal middle region of the dorsal column nuclei (DCN) is made up of two classes of neurons: glutamatergic cells projecting into the contralateral medial lemniscus, and local interneurons releasing GABA, glycine or both neurotransmitters (Popratiloff et al., 1996). The cat's DCN receive cortical input from the primary somatosensory cortex (Chambers and Liu, 1957; Walberg, 1957; Rustioni and Hayes, 1981; Martinez et al., 1995) and topographically aligned primary glutamatergic afferents (Berkley et al., 1986; Conti et al., 1989; Rustioni and Weinberg, 1989; Kharazia et al., 1996).

Recent studies using intracellular as well as extracellular recording combined with microiontophoresis have revealed that: i) the cuneate neurons projecting to the medial lemniscus present a center-surround antagonism (Canedo and Aguilar, 2000); ii) the internal circuitry of the cutaneous sector of the cat's cuneate nucleus is such that projecting cells with matched receptive fields (RFs) monosynaptically activate each other through recurrent collaterals re-entering the nucleus, while inhibiting other projection neurons with different RFs (Aguilar et al., 2002); and iii) the cortico-cuneate cells (Aguilar et al., 2003) and primary afferents (Soto et al., 2004) with matched RFs activate and disinhibit aligned cuneo-lemniscal neurons and inhibit neighbouring projection neurons with unmatched RFs. The activation at the centre of the RF is produced through NMDA and non-NMDA glutamate receptors, the lateral inhibition is produced through GABAergic interneurons, and the disinhibition is mediated by serial glycinergic-GABAergic-projection cell interactions (Aguilar et al., 2002, 2003; Soto et al., 2004).

The above results are the basis for determining the influences on each projecting neuron and were used to develop a computational model of the cuneate nucleus (Sánchez et al., 2004). Both projection neurons and interneurons are represented as McCulloch-Pitts processing units. Specifically, the activity of the processing units representing the projection neurons is modulated by the primary afferent, recurrent collateral and corticocuneate inputs that affect these cells as described above.
The different weight values w_ji model the synaptic interactions among the distinct classes of neurons and are grouped into matrices whose values allow the contribution of each neuronal class to the network representing the cuneate nucleus to be adjusted.

2. Methods

In this work we explore the behaviour of the computational model proposed by Sánchez et al. (2004). The model consists of 40,000 units distributed over three main layers representing: (1) projection or cuneolemniscal (CL) neurons, (2) GABAergic recurrent interneurons, and (3) glycinergic interneurons. CL units show an excitatory centre - inhibitory surround afferent 3x3 RF, as well as recurrent inhibition mediated through GABAergic interneurons. These units have a 7x7 ring-shaped RF derived from CL cells that are second-order neighbours. Finally, the glycinergic interneurons present a fully excitatory 9x9 RF deriving from CL cells with overlapped RFs. The RF sizes were selected such that their combination gives the most stable results. In addition, the interneurons produce shunting inhibition on CL neurons, thus achieving edge detection that is robust against stimulus intensity. The computational simulations initially update units in layer 1 and 2, then units in layer 3, and finally those located in layer 4. Each stage in the update process is called an iteration.

Experiments were performed with stimuli of different forms, sizes and textures over a white background. Both stimulus and output intensity are represented in grey scale, thus taking values from 0 to 255. Stimulus textures are obtained from .bmp files produced with GIMP, a Linux image-processing application. Quasi-random frames with a repetitive pattern of hexagonal tiles were generated, and then combined to build a mosaic. The GIMP function gaussian blur was used to modify the degree of stimulus contrast.
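To make the afferent stage of the description above concrete, the following is a minimal sketch of a single feed-forward pass through units with an excitatory centre - inhibitory surround 3x3 RF. The weight values (+8 for the centre, -1 for each surround position), the zero threshold, the threshold-linear output and the zero-valued background are all assumptions chosen for illustration; the function name `centre_surround_response` is hypothetical. The sketch omits the recurrent GABAergic layer, the glycinergic layer and the shunting inhibition, so it is not the model of Sánchez et al. (2004).

```python
import numpy as np


def centre_surround_response(stimulus, centre_weight=8.0,
                             surround_weight=-1.0, threshold=0.0):
    """One afferent pass through units with a 3x3 centre-surround RF.

    Each unit sums its centre input times `centre_weight` and its eight
    surround inputs times `surround_weight`. Units whose net input does not
    exceed `threshold` stay silent (output 0).
    """
    s = np.asarray(stimulus, dtype=np.float64)
    height, width = s.shape
    # Inputs outside the stimulus array are treated as zero (assumption).
    padded = np.pad(s, 1, mode="constant", constant_values=0.0)

    net = centre_weight * s
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # the centre has already been added
            shifted = padded[1 + dy:1 + dy + height, 1 + dx:1 + dx + width]
            net += surround_weight * shifted

    return np.where(net > threshold, net, 0.0)


if __name__ == "__main__":
    stimulus = np.zeros((9, 9))
    stimulus[2:7, 2:7] = 100.0  # uniform 5x5 square
    edges = centre_surround_response(stimulus)
    # Only the 16 border pixels of the square respond; its interior is silent.
    print(int(np.count_nonzero(edges)))  # 16
```

With these balanced weights a uniform region cancels itself out, so only the border of the square responds. This corresponds to the edge-detection element of the output; the oscillations and the fill-in effect depend on the interneuron layers that this sketch leaves out.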
Network responses were characterized based on two main features: robustness of the edge detection process and processing time required to reach a stationary state.

3. Results

In general, when a stimulus is presented to the network, three main elements are clearly observed in the output: (1) stimulus edge detection through the excitatory centre - inhibitory surround generated by primary afferents, (2) an oscillatory response that reaches a stable state and is determined by recurrent inhibition, and (3) a progressive coding that starts from higher contrast regions and finishes with lower contrast ones, and that is induced by the inhibitory action of glycinergic interneurons over GABAergic interneurons. This last element could be viewed as a type of fill-in effect.

We initially tested the model with a non-blurred stimulus composed of hexagonal tiles. Figure 1 shows the stimulus (first image) as well as the network output corresponding to iterations 1, 3, 6 and 9. The edges, the oscillatory response and the fill-in progressive coding of some tiles can be observed. The fill-in process is fast and the stationary state is reached at iteration number 6.

Figure 1. Fill-in effect for non-blurred stimulus. The stimulus (first image) is made up of hexagonal tiles of size 20. The rest of the images show the network output for iterations 1, 3, 6 and 9.

In order to study the relationship between stimulus contrast and processing time of the fill-in effect, we repeated the previous experiments with the same input but different degrees of gaussian blur. When this parameter was set to 5 (moderate blur), the stationary state was reached later, at iteration number 18. Figure 2 shows the stimulus (first image) and the network responses at iterations 1, 3, 6, 9, 12, 15, 18 and 21. Due to the blur transformation, the fill-in effect affects a larger area, thus probably demanding more computational power, i.e. processing time, to complete the effect. This trend is stressed when the gaussian blur parameter is set to 16 (high blur), as shown in Figure 3. The stimulus (first image) now requires 30 iterations to reach the stationary state. The fill-in effect now covers the whole area of the stimulus and progressive coding is much slower than in previous cases. The last example is presented in Figure 4, where the stimulus is a square with uniform texture and the same size as before. The first image illustrates the stimulus while the other ones represent the output at iterations 1, 3, 6, 9, 12, 15, 21 and 25. The stationary state is reached after iteration 30. Although the stimulus is far simpler than those used in previous figures, the main output elements are again present: edge detection, oscillatory response and progressive fill-in coding.

Figure 2. Fill-in effect for moderately blurred stimulus (blur parameter = 5). Stimulus (first image) and network output (following images) for iterations 1, 3, 6, 9, 12, 15, 18 and 21 are shown.

Figure 3. Fill-in effect for highly blurred stimulus (blur parameter = 16). Stimulus (first image) and network output (following images) for iterations 1, 3, 6, 9, 12, 15, 21 and 24 are shown.

Figure 4. Fill-in effect for non-blurred stimulus with uniform texture. Stimulus (first image) and network output (following images) for iterations 1, 3, 6, 9, 12, 15, 21 and 25 are shown.

A comparison between the previous experiments is shown in Figure 5. Two parameters, named "number of zeros" and "global output", are introduced to characterize the progressive coding on each iteration. The first one represents the number of neurons that are excited by the stimulus but that do not reach threshold and hence do not fire. The second one is the sum of the activation function at those units receiving afferent excitation. Both parameters show an oscillatory pattern that decreases in amplitude over time, meaning that the stationary state is reached in all cases. The degree of stimulus contrast determines the evolution of the parameters over time. Lower contrast implies: (1) longer duration of the fill-in effect before reaching the stationary state, (2) larger amplitudes of the oscillatory patterns during the fill-in effect, and (3) larger amplitudes of the residual oscillations when the stationary state is reached. The opposite applies for higher contrast. In the Discussion section, we provide a possible interpretation of these findings.

Figure 5. Parameters "number of zeros" and "global output" represented over time. In the top plot, the first trace describes the network response to the stimulus shown in Figure 1. The rest of the traces correspond to Figures 2, 3 and 4, respectively. In the bottom plot, the ordering is just the inverse. In both plots, the degree of stimulus contrast determines the oscillatory duration, the oscillatory amplitude and the residual oscillations at the stationary state. All stimuli have the same size and form.

Additional experiments were performed with stimuli whose textures were made up of different tile sizes. Figure 6 confirms that the relationship between stimulus contrast and processing time is maintained. In all cases, the required processing time increases when contrast decreases (gaussian blur increases).

Figure 6. Relationship between processing time, degree of blur and stimulus tile size. For each tile size, the relationship between processing time and the degree of blur is shown. A sigmoid-like function describes the dependency of processing time on the degree of blur.

To complete this section, we analyzed how the processing time of the fill-in effect depends on stimulus size. Results are shown in Figure 7 for round-shaped stimuli with uniform textures. Again, edge detection and fill-in progressive coding are observed. The plot illustrates that the processing time increases linearly with the stimulus size, expressed in terms of the stimulus diameter.

Figure 7. Relationship between stimulus size and processing time. Network output at iteration 21 after the presentation of a round-shaped stimulus with uniform texture (top). The relationship between processing time, until reaching the stationary state, and stimulus diameter follows a linear function (bottom).

4. Discussion

Based on the results, the network behaviour is robust, as it performs edge detection and fill-in progressive coding under a variety of presented stimuli. However, the network processing varies depending on the stimulus contrast and size. According to the results, this processing seems highly predictable and in some cases can be easily quantified. The explanation of this complex behaviour lies in the network architecture, which was constructed based on experimental data obtained from projection neurons of the cat's cuneate nucleus. The interplay between excitatory and inhibitory influences is the key to explaining the oscillatory response and the fill-in effect.

On the other hand, the model is consistent with the behaviour expected for a structure, like the cuneate nucleus, where the first processing of somatosensory information is performed. The function of tactile and pressure receptors in the skin is to get all possible information from the outside world. Such information is determined by the texture and size of objects around us. When the skin contacts surfaces that are small or highly contrasted, the information is rapidly processed and compactly transmitted over time to higher processing areas, like the thalamus, which also receives an "end of transmission" signal as the network reaches its stationary state. Furthermore, bigger or blurred stimuli, i.e. low-contrast surfaces, induce either longer duration of the fill-in effect (longer oscillatory patterns) or residual oscillations when the stationary state is reached. Such coding can be understood as signalling the need to carry out further exploratory motor actions.
The detection and classification of these encoded signals would require specific decoders, like the local oscillators proposed by Ahissar and Vaadia (1990), at higher cognitive structures.

The combination of the fill-in progressive coding discussed in this paper with appropriate decoders would allow the nervous system to evaluate the result of an exploratory action, to choose the best perception strategy, and broadly speaking, to manage its computational resources in a more efficient way.

References

Aguilar, J., Rivadulla, C., Soto, C. & Canedo, A. (2003) New corticocuneate cellular mechanisms underlying the modulation of cutaneous ascending transmission in anesthetized cats. J. Neurophysiol., 89, 3328-3339.
Aguilar, J., Soto, C., Rivadulla, C. & Canedo, A. (2002) The lemniscal-cuneate recurrent excitation is suppressed by strychnine and enhanced by GABA_A antagonists in the anesthetized cat. Eur. J. Neurosci., 16, 1697-1704.
Ahissar, E. & Vaadia, E. (1990) Oscillatory activity of single units in a somatosensory cortex of an awake monkey and their possible role in texture analysis. Proc. Natl. Acad. Sci., 87, 8935-8939.
Berkley, K.J., Budell, R.J., Blomqvist, A. & Bull, M. (1986) Output systems of the dorsal column nuclei in the cat. Brain Res. Rev., 11, 199-225.
Canedo, A. & Aguilar, J. (2000) Spatial and cortical influences exerted on cuneothalamic and thalamocortical neurons of the cat. Eur. J. Neurosci., 12, 2515-2533.
Chambers, W.W. & Liu, C.N. (1957) Cortico-spinal tract of the cat. An attempt to correlate the pattern of degeneration with deficits in reflex activity following neocortical lesions. J. Comp. Neurol., 108, 23-55.
Conti, F., De Felipe, J., Fariñas, I. & Manzoni, T. (1989) Glutamate-positive neurons and axon terminals in cat sensory cortex: a correlative light and electron microscopic study. J. Comp. Neurol., 290, 141-153.
Kharazia, V.N., Phend, K.D., Weinberg, R.J. & Rustioni, A. (1996) Excitatory amino acids in corticofugal projections: microscopic evidence. In: Conti, F. & Hicks, T.P. (eds), Excitatory amino acids and the cerebral cortex. MIT Press/Bradford Books, Cambridge, MA, pp. 127-135.
Mariño, J., Martinez, L. & Canedo, A. (1999) Sensorimotor integration at the dorsal column nuclei. NIPS 14, 231-237.
Martinez, L., Lamas, J.A. & Canedo, A. (1995) Pyramidal tract and corticospinal neurons with branching axons to the dorsal column nuclei of the cat. Neuroscience, 68, 195-206.
Popratiloff, A., Valtschanoff, J.G., Rustioni, A. & Weinberg, R.J. (1996) Colocalization of GABA and glycine in the rat dorsal column nuclei. Brain Res., 706, 308-312.
Rustioni, A. & Hayes, N.L. (1981) Corticospinal tract collaterals to the dorsal column nuclei of cats. Exp. Brain Res., 43, 237-245.
Rustioni, A. & Weinberg, R.J. (1989) The somatosensory system. In: Björklund, A., Hökfelt, T. & Swanson, L.W. (eds), Handbook of Chemical Neuroanatomy: Integrated Systems of the CNS. Elsevier, Amsterdam, pp. 219-321.
Sánchez, E., Aguilar, J., Rivadulla, C. & Canedo, A. (2004) The role of glycinergic interneurons in the dorsal column nuclei. Neurocomputing (to appear in June 2004).
Soto, C., Aguilar, J., Martín-Cora, F., Rivadulla, C. & Canedo, A. (2004) Intracuneate mechanisms underlying primary afferent cutaneous processing in anesthetized cats. Eur. J. Neurosci., 19 (in press).
Walberg, F. (1957) Corticofugal fibres to the nuclei of the dorsal columns. An experimental study in the cat. Brain, 80, 273-287.