Decentralizing UNIX abstractions in the exokernel architecture
System Analyst review outline

I. Comprehensive knowledge: (1) object-oriented technology; (2) networks and information security; (3) intellectual property and standardization; (4) computer systems and configuration; (5) software engineering; (6) informatization fundamentals; (7) database systems; (8) operating systems; (9) economics, management, and mathematics; (10) multimedia technology; (11) computer network technology.

II. Case analysis: (1) common systems-analysis tools; (2) systems analysis and modeling (requirements analysis, system modeling, development methods); (3) system design and maintenance (system testing, system operation); (4) development project management (quality, cost, schedule, organization management); (5) networking and informatization (network planning, e-government, e-commerce); (6) database systems and their administration (backup, recovery and disaster tolerance, performance analysis); (7) middleware; (8) data warehouses; (9) data mining; (10) RUP (the unified development process); (11) agile methods; (12) O/R mapping (Object/Relation); (13) software architecture; (14) service-oriented architecture (SOA); (15) Struts + Spring + Hibernate open-source frameworks; (16) software Capability Maturity Model (CMM); (17) software product lines; (18) RIA (rich internet applications); (19) AJAX; (20) Mashup; (21) data federation; (22) cloud computing, P2P peer-to-peer computing, grid computing, pervasive computing; (23) e-government information sharing and integration; (24) partitioning techniques; (25) Internet of Things.
I. Comprehensive knowledge
(1) Object-oriented technology
1. Jackson, Booch, and UML.
2. Class: a description of a set of objects that share the same attributes, operations, relationships, and semantics (a Java sketch of these elements follows this list).
Interface: describes a set of operations that constitute a service of a class or component.
Component: a physical, replaceable software module that conforms to and realizes a set of interfaces.
Package: used to organize elements into groups.
Node: a physical object that exists at run time and represents a computational resource, usually having at least some memory and execution capability.
3.
4. UML
5. The difference between a traditional program flowchart and a UML activity diagram is that a flowchart explicitly specifies the order of every activity, whereas an activity diagram describes only the activities and the necessary ordering of work.
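As a rough, hypothetical illustration of these modeling elements in Java terms (the names Drawable and Box are invented for the example, and a Java package or jar only loosely corresponds to a UML package or component):

```java
// interface: describes operations that form a service of a class or component
interface Drawable {
    void draw();
}

// class: a set of objects sharing the same attributes, operations, relationships, semantics
class Box implements Drawable {
    double width, height;                 // attributes
    public void draw() {                  // operation
        System.out.println("box " + width + " x " + height);
    }
}
// a declaration such as `package graphics;` groups such elements into a package,
// and a component would be a replaceable module exposing them through Drawable.
```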
The lambda calculus

The lambda calculus is a formal system grounded in mathematical logic that is used to describe and study computation. It was introduced by the logician Alonzo Church in the 1930s and is regarded as one of the foundations of computer science. Its concise yet powerful expressiveness has made it widely used in functional programming languages, type theory, and theoretical computer science.

1. What is the lambda calculus?

The lambda calculus is a formal system that describes an abstract model of computation. Unlike a Turing machine, it has no notion of memory or state; it computes solely by transforming expressions. This makes it an exceptionally simple and elegant model of computation. In the lambda calculus, all computation is carried out through function application. The calculus is built from three basic elements: variables, abstractions, and applications. A variable denotes a symbol or name; an abstraction defines a function; an application denotes calling a function on an argument.

2. Basic syntax

Expressions in the lambda calculus are built from variables, abstractions, and applications. The basic syntactic rules are:
• Variable: a variable is written as a letter or an identifier, for example x, y, or foo.
• Abstraction: an abstraction defines a function. It consists of the symbol λ followed by a variable and a dot, and then the function body. For example, λx.x is the function that takes an argument x and returns x.
• Application: an application denotes calling a function on an argument. It is written as two expressions side by side, with the function on the left and the argument on the right. For example, (λx.x) y passes y as the argument to the function λx.x.
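To make the three forms concrete, here is a rough analogy in Java (a sketch with invented names; Java's typed lambdas are not the untyped lambda calculus, but abstraction and application line up with lambda expressions and method calls):

```java
import java.util.function.Function;

public class LambdaSyntaxDemo {
    public static void main(String[] args) {
        // abstraction: λx.x, a function that returns its argument unchanged
        Function<Integer, Integer> identity = x -> x;
        // application: (λx.x) y, applying the function to an argument
        int y = 42;
        System.out.println(identity.apply(y)); // prints 42
        // the variables are simply the names x and y used above
    }
}
```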
3. Basic operations

The lambda calculus computes and transforms expressions through a series of rules. These rules define how expressions are evaluated and simplified.

3.1 β-reduction

β-reduction is one of the most important operations in the lambda calculus. It describes how applying a function to an argument is simplified to a result. The β-reduction rule is:

(λx.E) V → E[x:=V]

where (λx.E) is an abstraction, V is a value (a variable or any other expression), and E[x:=V] denotes replacing every free occurrence of the variable x in E with V. In other words, when a function is applied to an argument, the bound variable in its body is replaced by the argument, yielding a new expression.
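As an illustration of β-reduction as substitution, here is a minimal, hypothetical sketch over a tiny term representation (the names Term, Var, Abs, App and the reduction logic are invented for this example; α-renaming is omitted, so variable names are assumed to be distinct, and the record patterns require Java 21 or later):

```java
// A minimal sketch of one beta-reduction step on a tiny lambda-calculus AST.
sealed interface Term permits Var, Abs, App {}
record Var(String name) implements Term {}
record Abs(String param, Term body) implements Term {}   // λparam.body
record App(Term fun, Term arg) implements Term {}         // fun arg

final class Beta {
    // E[x := v]: replace free occurrences of x in e with v (no alpha-renaming).
    static Term subst(Term e, String x, Term v) {
        return switch (e) {
            case Var(String n)          -> n.equals(x) ? v : e;
            case Abs(String p, Term b)  -> p.equals(x) ? e : new Abs(p, subst(b, x, v));
            case App(Term f, Term a)    -> new App(subst(f, x, v), subst(a, x, v));
        };
    }

    // One beta step: (λx.E) V  →  E[x := V]; anything else is left as-is here.
    static Term step(Term e) {
        if (e instanceof App(Abs(String p, Term b), Term a)) return subst(b, p, a);
        return e;
    }

    public static void main(String[] args) {
        Term id = new Abs("x", new Var("x"));      // λx.x
        Term expr = new App(id, new Var("y"));     // (λx.x) y
        System.out.println(step(expr));            // prints Var[name=y]
    }
}
```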
Management information systems

Initially, in businesses and other organizations, internal reporting was produced manually and only periodically, as a by-product of the accounting system and with some additional statistics, and it gave limited and delayed information on management performance. Data was organized manually according to the requirements and necessity of the organization. As computational technology developed, information began to be distinguished from data, and systems were developed to produce and organize abstractions, summaries, relationships and generalizations based on the data.

Early business computers were used for simple operations such as tracking sales or payroll data, with little detail or structure. Over time, these computer applications became more complex, hardware storage capacities grew, and technologies improved for connecting previously isolated applications. As more and more data was stored and linked, managers sought greater detail as well as greater abstraction, with the aim of creating entire management reports from the raw, stored data. The term "MIS" arose to describe such applications, which provide managers with information about sales, inventories, and other data that helps in managing the enterprise. Today, the term is used broadly in a number of contexts and includes (but is not limited to): decision support systems, resource and people management applications, enterprise resource planning (ERP), enterprise performance management (EPM), supply chain management (SCM), customer relationship management (CRM), project management, and database retrieval applications.

A successful MIS supports a business's long-range plans, providing reports based on performance analysis in areas critical to those plans, with feedback loops that allow for fine-tuning of every aspect of the enterprise, including recruitment and training regimens. MIS reports not only indicate how things are going, but why and where performance is failing to meet the plan. These reports include near-real-time performance of cost centers and projects, with detail sufficient for individual accountability.

Kenneth and Jane Laudon identify five eras of MIS evolution corresponding to five phases in the development of computing technology: 1) mainframe and minicomputer computing, 2) personal computers, 3) client/server networks, 4) enterprise computing, and 5) cloud computing.[3]

The first (mainframe and minicomputer) era was ruled by IBM and their mainframe computers; these computers would often take up whole rooms and require teams to run them, and IBM supplied both the hardware and the software. As technology advanced, these computers were able to handle greater capacities and therefore reduce their cost. Smaller, more affordable minicomputers allowed larger businesses to run their own computing centers in-house.

The second (personal computer) era began in 1965 as microprocessors started to compete with mainframes and minicomputers and accelerated the process of decentralizing computing power from large data centers to smaller offices. In the late 1970s minicomputer technology gave way to personal computers, and relatively low-cost computers were becoming mass-market commodities, allowing businesses to provide their employees access to computing power that ten years before would have cost tens of thousands of dollars.
This proliferation of computers created a ready market for interconnecting networks and the popularization of the Internet.As the complexity of the technology increased and the costs decreased, the need to share information within an enterprise also grew, giving rise to the third (client/server) era in which computers on a common network were able to access shared information on a server. This allowed for large amounts of data to be accessed by thousands and even millions of people simultaneously. The fourth (enterprise) era enabled by high speed networks, tied all aspects of the business enterprise together offering rich information access encompassing the complete management structure.The fifth and latest (cloud computing) era of information systems employs networking technology to deliver applications as well as data storage independent of the configuration, location or nature of the hardware. This, along with high speed cellphone and wifi networks, led to new levels of mobility in which managers access the MIS from most anywhere with laptops, tablet pcs, and smartphones.Most management information systems specialize in particular commercial and industrial sectors, aspects of the enterprise, or management substructure.▪Management information systems (MIS), per se, produce fixed, regularly scheduled reports based on data extracted and summarized from the firm’s underlying transactionprocessing systems[4] to middle and operational level managers to identify and informstructured and semi-structured decision problems.▪Decision support systems (DSS) are computer program applications used by middle management to compile information from a wide range of sources to support problem solving and decision making.▪Executive information systems (EIS) is a reporting tool that provides quick access to summarized reports coming from all company levels and departments such as accounting, human resources and operations.▪Marketing information systems are MIS designed specifically for managing the marketing aspects of the business.▪Office automation systems (OAS) support communication and productivity in the enterprise by automating work flow and eliminating bottlenecks. OAS may be implemented at any and all levels of management.AdvantagesThe following are some of the benefits that can be attained for different types of management information systems.[5]▪The company is able to highlight their strength and weaknesses due to the presence of revenue reports, employee performance records etc. The identification of these aspects can help the company to improve their business processes and operations.▪Giving an overall picture of the company and acting as a communication and planning tool. ▪The availability of the customer data and feedback can help the company to align their business processes according to the needs of the customers. The effective management of customer data can help the company to perform direct marketing and promotion activities.▪Information is considered to be an important asset for any company in the modern competitive world. 
The consumer buying trends and behaviors can be predicted by the analysis of sales and revenue reports from each operating region of the company.Enterprise applications▪Enterprise systems, also known as enterprise resource planning (ERP) systems provide an organization with integrated software modules and a unified database which enable efficientplanning, managing, and controlling of all core business processes across multiple locations.Modules of ERP systems may include finance, accounting, marketing, human resources,production, inventory management and distribution.▪Supply chain management (SCM) systems enable more efficient management of the supply chain by integrating the links in a supply chain. This may include suppliers,manufacturer, wholesalers, retailers and final customers.▪Customer relationship management (CRM) systems help businesses manage relationships with potential and current customers and business partners across marketing, sales, and service.▪Knowledge management system (KMS) helps organizations facilitate the collection, recording, organization, retrieval, and dissemination of knowledge. This may includedocuments, accounting records, and unrecorded procedures, practices and skills. Developing Information Systems"The actions that are taken to create an information system that solves an organizational problem are called system development (Laudon & Laudon, 2010)". These include system analysis, system design, programming, testing, conversion, production and finally maintenance. These actions usually take place in that specified order but some may need to repeat or be accomplished concurrently.System analysis is accomplished on the problem the company is facing and is trying to solve with the information system. Whoever accomplishes this step will identify the problem areas and outlines a solution through achievable objectives. This analysis will include a feasibility study, which determines the solutions feasibility based on money, time and technology. Essentially the feasibility study determines whether this solution is a good investment. This process also lays out what the information requirement will be for the new system.System design shows how the system will fulfill the requirements and objectives laid out in the system analysis phase. The designer will address all the managerial, organizational and technological components the system will address and need. It is important to note that user information requirements drive the building effort. The user of the system must be involved in the design process to ensure the system meets the users need and operations.Programming entails taking the design stage and translating that into software code. This is usually out sourced to another company to write the required software or company’s buy existing software that meets the systems needs. The key is to make sure the software is user friendly and compatible with current systems.Testing can take on many different forms but is essential to the successful implementation of the new system. You can conduct unit testing, which tests each program in the system separately or system testing which tests the system as a whole. Either way there should also be acceptance testing, which provides a certification that the system is ready to use. Also, regardless of the test a comprehensive test plan should be developed that identifies what is to be tested and what the expected outcome should be.Conversion is the process of changing or converting the old system into the new. 
This can be done in four ways:

Parallel strategy – Both the old and new systems are run together until the new one functions correctly (this is the safest approach, since you do not lose the old system until the new one is "bug" free).
Direct cutover – The new system replaces the old at an appointed time.
Pilot study – The new system is introduced to a small portion of the operation to see how it fares. If the results are good, the new system expands to the rest of the company.
Phased approach – The new system is introduced in stages.

Whichever way you implement the conversion, you must document the good and the bad during the process to identify benchmarks and fix problems. Conversion also includes the training of all personnel who are required to use the system to perform their jobs.

Production is when the new system is officially the system of record for the operation, and maintenance is just that: maintaining the system as it performs the function it was intended to meet.
Draft:Deep Learning in Neural Networks:An OverviewTechnical Report IDSIA-03-14/arXiv:1404.7828(v1.5)[cs.NE]J¨u rgen SchmidhuberThe Swiss AI Lab IDSIAIstituto Dalle Molle di Studi sull’Intelligenza ArtificialeUniversity of Lugano&SUPSIGalleria2,6928Manno-LuganoSwitzerland15May2014AbstractIn recent years,deep artificial neural networks(including recurrent ones)have won numerous con-tests in pattern recognition and machine learning.This historical survey compactly summarises relevantwork,much of it from the previous millennium.Shallow and deep learners are distinguished by thedepth of their credit assignment paths,which are chains of possibly learnable,causal links between ac-tions and effects.I review deep supervised learning(also recapitulating the history of backpropagation),unsupervised learning,reinforcement learning&evolutionary computation,and indirect search for shortprograms encoding deep and large networks.PDF of earlier draft(v1):http://www.idsia.ch/∼juergen/DeepLearning30April2014.pdfLATEX source:http://www.idsia.ch/∼juergen/DeepLearning30April2014.texComplete BIBTEXfile:http://www.idsia.ch/∼juergen/bib.bibPrefaceThis is the draft of an invited Deep Learning(DL)overview.One of its goals is to assign credit to those who contributed to the present state of the art.I acknowledge the limitations of attempting to achieve this goal.The DL research community itself may be viewed as a continually evolving,deep network of scientists who have influenced each other in complex ways.Starting from recent DL results,I tried to trace back the origins of relevant ideas through the past half century and beyond,sometimes using“local search”to follow citations of citations backwards in time.Since not all DL publications properly acknowledge earlier relevant work,additional global search strategies were employed,aided by consulting numerous neural network experts.As a result,the present draft mostly consists of references(about800entries so far).Nevertheless,through an expert selection bias I may have missed important work.A related bias was surely introduced by my special familiarity with the work of my own DL research group in the past quarter-century.For these reasons,the present draft should be viewed as merely a snapshot of an ongoing credit assignment process.To help improve it,please do not hesitate to send corrections and suggestions to juergen@idsia.ch.Contents1Introduction to Deep Learning(DL)in Neural Networks(NNs)3 2Event-Oriented Notation for Activation Spreading in FNNs/RNNs3 3Depth of Credit Assignment Paths(CAPs)and of Problems4 4Recurring Themes of Deep Learning54.1Dynamic Programming(DP)for DL (5)4.2Unsupervised Learning(UL)Facilitating Supervised Learning(SL)and RL (6)4.3Occam’s Razor:Compression and Minimum Description Length(MDL) (6)4.4Learning Hierarchical Representations Through Deep SL,UL,RL (6)4.5Fast Graphics Processing Units(GPUs)for DL in NNs (6)5Supervised NNs,Some Helped by Unsupervised NNs75.11940s and Earlier (7)5.2Around1960:More Neurobiological Inspiration for DL (7)5.31965:Deep Networks Based on the Group Method of Data Handling(GMDH) (8)5.41979:Convolution+Weight Replication+Winner-Take-All(WTA) (8)5.51960-1981and Beyond:Development of Backpropagation(BP)for NNs (8)5.5.1BP for Weight-Sharing Feedforward NNs(FNNs)and Recurrent NNs(RNNs)..95.6Late1980s-2000:Numerous Improvements of NNs (9)5.6.1Ideas for Dealing with Long Time Lags and Deep CAPs (10)5.6.2Better BP Through Advanced Gradient Descent (10)5.6.3Discovering Low-Complexity,Problem-Solving NNs 
(11)5.6.4Potential Benefits of UL for SL (11)5.71987:UL Through Autoencoder(AE)Hierarchies (12)5.81989:BP for Convolutional NNs(CNNs) (13)5.91991:Fundamental Deep Learning Problem of Gradient Descent (13)5.101991:UL-Based History Compression Through a Deep Hierarchy of RNNs (14)5.111992:Max-Pooling(MP):Towards MPCNNs (14)5.121994:Contest-Winning Not So Deep NNs (15)5.131995:Supervised Recurrent Very Deep Learner(LSTM RNN) (15)5.142003:More Contest-Winning/Record-Setting,Often Not So Deep NNs (16)5.152006/7:Deep Belief Networks(DBNs)&AE Stacks Fine-Tuned by BP (17)5.162006/7:Improved CNNs/GPU-CNNs/BP-Trained MPCNNs (17)5.172009:First Official Competitions Won by RNNs,and with MPCNNs (18)5.182010:Plain Backprop(+Distortions)on GPU Yields Excellent Results (18)5.192011:MPCNNs on GPU Achieve Superhuman Vision Performance (18)5.202011:Hessian-Free Optimization for RNNs (19)5.212012:First Contests Won on ImageNet&Object Detection&Segmentation (19)5.222013-:More Contests and Benchmark Records (20)5.22.1Currently Successful Supervised Techniques:LSTM RNNs/GPU-MPCNNs (21)5.23Recent Tricks for Improving SL Deep NNs(Compare Sec.5.6.2,5.6.3) (21)5.24Consequences for Neuroscience (22)5.25DL with Spiking Neurons? (22)6DL in FNNs and RNNs for Reinforcement Learning(RL)236.1RL Through NN World Models Yields RNNs With Deep CAPs (23)6.2Deep FNNs for Traditional RL and Markov Decision Processes(MDPs) (24)6.3Deep RL RNNs for Partially Observable MDPs(POMDPs) (24)6.4RL Facilitated by Deep UL in FNNs and RNNs (25)6.5Deep Hierarchical RL(HRL)and Subgoal Learning with FNNs and RNNs (25)6.6Deep RL by Direct NN Search/Policy Gradients/Evolution (25)6.7Deep RL by Indirect Policy Search/Compressed NN Search (26)6.8Universal RL (27)7Conclusion271Introduction to Deep Learning(DL)in Neural Networks(NNs) Which modifiable components of a learning system are responsible for its success or failure?What changes to them improve performance?This has been called the fundamental credit assignment problem(Minsky, 1963).There are general credit assignment methods for universal problem solvers that are time-optimal in various theoretical senses(Sec.6.8).The present survey,however,will focus on the narrower,but now commercially important,subfield of Deep Learning(DL)in Artificial Neural Networks(NNs).We are interested in accurate credit assignment across possibly many,often nonlinear,computational stages of NNs.Shallow NN-like models have been around for many decades if not centuries(Sec.5.1).Models with several successive nonlinear layers of neurons date back at least to the1960s(Sec.5.3)and1970s(Sec.5.5). 
An efficient gradient descent method for teacher-based Supervised Learning(SL)in discrete,differentiable networks of arbitrary depth called backpropagation(BP)was developed in the1960s and1970s,and ap-plied to NNs in1981(Sec.5.5).BP-based training of deep NNs with many layers,however,had been found to be difficult in practice by the late1980s(Sec.5.6),and had become an explicit research subject by the early1990s(Sec.5.9).DL became practically feasible to some extent through the help of Unsupervised Learning(UL)(e.g.,Sec.5.10,5.15).The1990s and2000s also saw many improvements of purely super-vised DL(Sec.5).In the new millennium,deep NNs havefinally attracted wide-spread attention,mainly by outperforming alternative machine learning methods such as kernel machines(Vapnik,1995;Sch¨o lkopf et al.,1998)in numerous important applications.In fact,supervised deep NNs have won numerous of-ficial international pattern recognition competitions(e.g.,Sec.5.17,5.19,5.21,5.22),achieving thefirst superhuman visual pattern recognition results in limited domains(Sec.5.19).Deep NNs also have become relevant for the more generalfield of Reinforcement Learning(RL)where there is no supervising teacher (Sec.6).Both feedforward(acyclic)NNs(FNNs)and recurrent(cyclic)NNs(RNNs)have won contests(Sec.5.12,5.14,5.17,5.19,5.21,5.22).In a sense,RNNs are the deepest of all NNs(Sec.3)—they are general computers more powerful than FNNs,and can in principle create and process memories of ar-bitrary sequences of input patterns(e.g.,Siegelmann and Sontag,1991;Schmidhuber,1990a).Unlike traditional methods for automatic sequential program synthesis(e.g.,Waldinger and Lee,1969;Balzer, 1985;Soloway,1986;Deville and Lau,1994),RNNs can learn programs that mix sequential and parallel information processing in a natural and efficient way,exploiting the massive parallelism viewed as crucial for sustaining the rapid decline of computation cost observed over the past75years.The rest of this paper is structured as follows.Sec.2introduces a compact,event-oriented notation that is simple yet general enough to accommodate both FNNs and RNNs.Sec.3introduces the concept of Credit Assignment Paths(CAPs)to measure whether learning in a given NN application is of the deep or shallow type.Sec.4lists recurring themes of DL in SL,UL,and RL.Sec.5focuses on SL and UL,and on how UL can facilitate SL,although pure SL has become dominant in recent competitions(Sec.5.17-5.22). 
Sec.5is arranged in a historical timeline format with subsections on important inspirations and technical contributions.Sec.6on deep RL discusses traditional Dynamic Programming(DP)-based RL combined with gradient-based search techniques for SL or UL in deep NNs,as well as general methods for direct and indirect search in the weight space of deep FNNs and RNNs,including successful policy gradient and evolutionary methods.2Event-Oriented Notation for Activation Spreading in FNNs/RNNs Throughout this paper,let i,j,k,t,p,q,r denote positive integer variables assuming ranges implicit in the given contexts.Let n,m,T denote positive integer constants.An NN’s topology may change over time(e.g.,Fahlman,1991;Ring,1991;Weng et al.,1992;Fritzke, 1994).At any given moment,it can be described as afinite subset of units(or nodes or neurons)N= {u1,u2,...,}and afinite set H⊆N×N of directed edges or connections between nodes.FNNs are acyclic graphs,RNNs cyclic.Thefirst(input)layer is the set of input units,a subset of N.In FNNs,the k-th layer(k>1)is the set of all nodes u∈N such that there is an edge path of length k−1(but no longer path)between some input unit and u.There may be shortcut connections between distant layers.The NN’s behavior or program is determined by a set of real-valued,possibly modifiable,parameters or weights w i(i=1,...,n).We now focus on a singlefinite episode or epoch of information processing and activation spreading,without learning through weight changes.The following slightly unconventional notation is designed to compactly describe what is happening during the runtime of the system.During an episode,there is a partially causal sequence x t(t=1,...,T)of real values that I call events.Each x t is either an input set by the environment,or the activation of a unit that may directly depend on other x k(k<t)through a current NN topology-dependent set in t of indices k representing incoming causal connections or links.Let the function v encode topology information and map such event index pairs(k,t)to weight indices.For example,in the non-input case we may have x t=f t(net t)with real-valued net t= k∈in t x k w v(k,t)(additive case)or net t= k∈in t x k w v(k,t)(multiplicative case), where f t is a typically nonlinear real-valued activation function such as tanh.In many recent competition-winning NNs(Sec.5.19,5.21,5.22)there also are events of the type x t=max k∈int (x k);some networktypes may also use complex polynomial activation functions(Sec.5.3).x t may directly affect certain x k(k>t)through outgoing connections or links represented through a current set out t of indices k with t∈in k.Some non-input events are called output events.Note that many of the x t may refer to different,time-varying activations of the same unit in sequence-processing RNNs(e.g.,Williams,1989,“unfolding in time”),or also in FNNs sequentially exposed to time-varying input patterns of a large training set encoded as input events.During an episode,the same weight may get reused over and over again in topology-dependent ways,e.g.,in RNNs,or in convolutional NNs(Sec.5.4,5.8).I call this weight sharing across space and/or time.Weight sharing may greatly reduce the NN’s descriptive complexity,which is the number of bits of information required to describe the NN (Sec.4.3).In Supervised Learning(SL),certain NN output events x t may be associated with teacher-given,real-valued labels or targets d t yielding errors e t,e.g.,e t=1/2(x t−d t)2.A typical goal of supervised NN training is tofind weights that yield 
episodes with small total error E,the sum of all such e t.The hope is that the NN will generalize well in later episodes,causing only small errors on previously unseen sequences of input events.Many alternative error functions for SL and UL are possible.SL assumes that input events are independent of earlier output events(which may affect the environ-ment through actions causing subsequent perceptions).This assumption does not hold in the broaderfields of Sequential Decision Making and Reinforcement Learning(RL)(Kaelbling et al.,1996;Sutton and Barto, 1998;Hutter,2005)(Sec.6).In RL,some of the input events may encode real-valued reward signals given by the environment,and a typical goal is tofind weights that yield episodes with a high sum of reward signals,through sequences of appropriate output actions.Sec.5.5will use the notation above to compactly describe a central algorithm of DL,namely,back-propagation(BP)for supervised weight-sharing FNNs and RNNs.(FNNs may be viewed as RNNs with certainfixed zero weights.)Sec.6will address the more general RL case.3Depth of Credit Assignment Paths(CAPs)and of ProblemsTo measure whether credit assignment in a given NN application is of the deep or shallow type,I introduce the concept of Credit Assignment Paths or CAPs,which are chains of possibly causal links between events.Let usfirst focus on SL.Consider two events x p and x q(1≤p<q≤T).Depending on the appli-cation,they may have a Potential Direct Causal Connection(PDCC)expressed by the Boolean predicate pdcc(p,q),which is true if and only if p∈in q.Then the2-element list(p,q)is defined to be a CAP from p to q(a minimal one).A learning algorithm may be allowed to change w v(p,q)to improve performance in future episodes.More general,possibly indirect,Potential Causal Connections(PCC)are expressed by the recursively defined Boolean predicate pcc(p,q),which in the SL case is true only if pdcc(p,q),or if pcc(p,k)for some k and pdcc(k,q).In the latter case,appending q to any CAP from p to k yields a CAP from p to q(this is a recursive definition,too).The set of such CAPs may be large but isfinite.Note that the same weight may affect many different PDCCs between successive events listed by a given CAP,e.g.,in the case of RNNs, or weight-sharing FNNs.Suppose a CAP has the form(...,k,t,...,q),where k and t(possibly t=q)are thefirst successive elements with modifiable w v(k,t).Then the length of the suffix list(t,...,q)is called the CAP’s depth (which is0if there are no modifiable links at all).This depth limits how far backwards credit assignment can move down the causal chain tofind a modifiable weight.1Suppose an episode and its event sequence x1,...,x T satisfy a computable criterion used to decide whether a given problem has been solved(e.g.,total error E below some threshold).Then the set of used weights is called a solution to the problem,and the depth of the deepest CAP within the sequence is called the solution’s depth.There may be other solutions(yielding different event sequences)with different depths.Given somefixed NN topology,the smallest depth of any solution is called the problem’s depth.Sometimes we also speak of the depth of an architecture:SL FNNs withfixed topology imply a problem-independent maximal problem depth bounded by the number of non-input layers.Certain SL RNNs withfixed weights for all connections except those to output units(Jaeger,2001;Maass et al.,2002; Jaeger,2004;Schrauwen et al.,2007)have a maximal problem depth of1,because only thefinal links in the corresponding CAPs 
are modifiable.In general,however,RNNs may learn to solve problems of potentially unlimited depth.Note that the definitions above are solely based on the depths of causal chains,and agnostic of the temporal distance between events.For example,shallow FNNs perceiving large“time windows”of in-put events may correctly classify long input sequences through appropriate output events,and thus solve shallow problems involving long time lags between relevant events.At which problem depth does Shallow Learning end,and Deep Learning begin?Discussions with DL experts have not yet yielded a conclusive response to this question.Instead of committing myself to a precise answer,let me just define for the purposes of this overview:problems of depth>10require Very Deep Learning.The difficulty of a problem may have little to do with its depth.Some NNs can quickly learn to solve certain deep problems,e.g.,through random weight guessing(Sec.5.9)or other types of direct search (Sec.6.6)or indirect search(Sec.6.7)in weight space,or through training an NNfirst on shallow problems whose solutions may then generalize to deep problems,or through collapsing sequences of(non)linear operations into a single(non)linear operation—but see an analysis of non-trivial aspects of deep linear networks(Baldi and Hornik,1994,Section B).In general,however,finding an NN that precisely models a given training set is an NP-complete problem(Judd,1990;Blum and Rivest,1992),also in the case of deep NNs(S´ıma,1994;de Souto et al.,1999;Windisch,2005);compare a survey of negative results(S´ıma, 2002,Section1).Above we have focused on SL.In the more general case of RL in unknown environments,pcc(p,q) is also true if x p is an output event and x q any later input event—any action may affect the environment and thus any later perception.(In the real world,the environment may even influence non-input events computed on a physical hardware entangled with the entire universe,but this is ignored here.)It is possible to model and replace such unmodifiable environmental PCCs through a part of the NN that has already learned to predict(through some of its units)input events(including reward signals)from former input events and actions(Sec.6.1).Its weights are frozen,but can help to assign credit to other,still modifiable weights used to compute actions(Sec.6.1).This approach may lead to very deep CAPs though.Some DL research is about automatically rephrasing problems such that their depth is reduced(Sec.4). 
In particular,sometimes UL is used to make SL problems less deep,e.g.,Sec.5.10.Often Dynamic Programming(Sec.4.1)is used to facilitate certain traditional RL problems,e.g.,Sec.6.2.Sec.5focuses on CAPs for SL,Sec.6on the more complex case of RL.4Recurring Themes of Deep Learning4.1Dynamic Programming(DP)for DLOne recurring theme of DL is Dynamic Programming(DP)(Bellman,1957),which can help to facili-tate credit assignment under certain assumptions.For example,in SL NNs,backpropagation itself can 1An alternative would be to count only modifiable links when measuring depth.In many typical NN applications this would not make a difference,but in some it would,e.g.,Sec.6.1.be viewed as a DP-derived method(Sec.5.5).In traditional RL based on strong Markovian assumptions, DP-derived methods can help to greatly reduce problem depth(Sec.6.2).DP algorithms are also essen-tial for systems that combine concepts of NNs and graphical models,such as Hidden Markov Models (HMMs)(Stratonovich,1960;Baum and Petrie,1966)and Expectation Maximization(EM)(Dempster et al.,1977),e.g.,(Bottou,1991;Bengio,1991;Bourlard and Morgan,1994;Baldi and Chauvin,1996; Jordan and Sejnowski,2001;Bishop,2006;Poon and Domingos,2011;Dahl et al.,2012;Hinton et al., 2012a).4.2Unsupervised Learning(UL)Facilitating Supervised Learning(SL)and RL Another recurring theme is how UL can facilitate both SL(Sec.5)and RL(Sec.6).UL(Sec.5.6.4) is normally used to encode raw incoming data such as video or speech streams in a form that is more convenient for subsequent goal-directed learning.In particular,codes that describe the original data in a less redundant or more compact way can be fed into SL(Sec.5.10,5.15)or RL machines(Sec.6.4),whose search spaces may thus become smaller(and whose CAPs shallower)than those necessary for dealing with the raw data.UL is closely connected to the topics of regularization and compression(Sec.4.3,5.6.3). 
4.3Occam’s Razor:Compression and Minimum Description Length(MDL) Occam’s razor favors simple solutions over complex ones.Given some programming language,the prin-ciple of Minimum Description Length(MDL)can be used to measure the complexity of a solution candi-date by the length of the shortest program that computes it(e.g.,Solomonoff,1964;Kolmogorov,1965b; Chaitin,1966;Wallace and Boulton,1968;Levin,1973a;Rissanen,1986;Blumer et al.,1987;Li and Vit´a nyi,1997;Gr¨u nwald et al.,2005).Some methods explicitly take into account program runtime(Al-lender,1992;Watanabe,1992;Schmidhuber,2002,1995);many consider only programs with constant runtime,written in non-universal programming languages(e.g.,Rissanen,1986;Hinton and van Camp, 1993).In the NN case,the MDL principle suggests that low NN weight complexity corresponds to high NN probability in the Bayesian view(e.g.,MacKay,1992;Buntine and Weigend,1991;De Freitas,2003), and to high generalization performance(e.g.,Baum and Haussler,1989),without overfitting the training data.Many methods have been proposed for regularizing NNs,that is,searching for solution-computing, low-complexity SL NNs(Sec.5.6.3)and RL NNs(Sec.6.7).This is closely related to certain UL methods (Sec.4.2,5.6.4).4.4Learning Hierarchical Representations Through Deep SL,UL,RLMany methods of Good Old-Fashioned Artificial Intelligence(GOFAI)(Nilsson,1980)as well as more recent approaches to AI(Russell et al.,1995)and Machine Learning(Mitchell,1997)learn hierarchies of more and more abstract data representations.For example,certain methods of syntactic pattern recog-nition(Fu,1977)such as grammar induction discover hierarchies of formal rules to model observations. The partially(un)supervised Automated Mathematician/EURISKO(Lenat,1983;Lenat and Brown,1984) continually learns concepts by combining previously learnt concepts.Such hierarchical representation learning(Ring,1994;Bengio et al.,2013;Deng and Yu,2014)is also a recurring theme of DL NNs for SL (Sec.5),UL-aided SL(Sec.5.7,5.10,5.15),and hierarchical RL(Sec.6.5).Often,abstract hierarchical representations are natural by-products of data compression(Sec.4.3),e.g.,Sec.5.10.4.5Fast Graphics Processing Units(GPUs)for DL in NNsWhile the previous millennium saw several attempts at creating fast NN-specific hardware(e.g.,Jackel et al.,1990;Faggin,1992;Ramacher et al.,1993;Widrow et al.,1994;Heemskerk,1995;Korkin et al., 1997;Urlbe,1999),and at exploiting standard hardware(e.g.,Anguita et al.,1994;Muller et al.,1995; Anguita and Gomes,1996),the new millennium brought a DL breakthrough in form of cheap,multi-processor graphics cards or GPUs.GPUs are widely used for video games,a huge and competitive market that has driven down hardware prices.GPUs excel at fast matrix and vector multiplications required not only for convincing virtual realities but also for NN training,where they can speed up learning by a factorof50and more.Some of the GPU-based FNN implementations(Sec.5.16-5.19)have greatly contributed to recent successes in contests for pattern recognition(Sec.5.19-5.22),image segmentation(Sec.5.21), and object detection(Sec.5.21-5.22).5Supervised NNs,Some Helped by Unsupervised NNsThe main focus of current practical applications is on Supervised Learning(SL),which has dominated re-cent pattern recognition contests(Sec.5.17-5.22).Several methods,however,use additional Unsupervised Learning(UL)to facilitate SL(Sec.5.7,5.10,5.15).It does make sense to treat SL and UL in the same section:often gradient-based methods,such as 
BP(Sec.5.5.1),are used to optimize objective functions of both UL and SL,and the boundary between SL and UL may blur,for example,when it comes to time series prediction and sequence classification,e.g.,Sec.5.10,5.12.A historical timeline format will help to arrange subsections on important inspirations and techni-cal contributions(although such a subsection may span a time interval of many years).Sec.5.1briefly mentions early,shallow NN models since the1940s,Sec.5.2additional early neurobiological inspiration relevant for modern Deep Learning(DL).Sec.5.3is about GMDH networks(since1965),perhaps thefirst (feedforward)DL systems.Sec.5.4is about the relatively deep Neocognitron NN(1979)which is similar to certain modern deep FNN architectures,as it combines convolutional NNs(CNNs),weight pattern repli-cation,and winner-take-all(WTA)mechanisms.Sec.5.5uses the notation of Sec.2to compactly describe a central algorithm of DL,namely,backpropagation(BP)for supervised weight-sharing FNNs and RNNs. It also summarizes the history of BP1960-1981and beyond.Sec.5.6describes problems encountered in the late1980s with BP for deep NNs,and mentions several ideas from the previous millennium to overcome them.Sec.5.7discusses afirst hierarchical stack of coupled UL-based Autoencoders(AEs)—this concept resurfaced in the new millennium(Sec.5.15).Sec.5.8is about applying BP to CNNs,which is important for today’s DL applications.Sec.5.9explains BP’s Fundamental DL Problem(of vanishing/exploding gradients)discovered in1991.Sec.5.10explains how a deep RNN stack of1991(the History Compressor) pre-trained by UL helped to solve previously unlearnable DL benchmarks requiring Credit Assignment Paths(CAPs,Sec.3)of depth1000and more.Sec.5.11discusses a particular WTA method called Max-Pooling(MP)important in today’s DL FNNs.Sec.5.12mentions afirst important contest won by SL NNs in1994.Sec.5.13describes a purely supervised DL RNN(Long Short-Term Memory,LSTM)for problems of depth1000and more.Sec.5.14mentions an early contest of2003won by an ensemble of shallow NNs, as well as good pattern recognition results with CNNs and LSTM RNNs(2003).Sec.5.15is mostly about Deep Belief Networks(DBNs,2006)and related stacks of Autoencoders(AEs,Sec.5.7)pre-trained by UL to facilitate BP-based SL.Sec.5.16mentions thefirst BP-trained MPCNNs(2007)and GPU-CNNs(2006). Sec.5.17-5.22focus on official competitions with secret test sets won by(mostly purely supervised)DL NNs since2009,in sequence recognition,image classification,image segmentation,and object detection. 
Many RNN results depended on LSTM(Sec.5.13);many FNN results depended on GPU-based FNN code developed since2004(Sec.5.16,5.17,5.18,5.19),in particular,GPU-MPCNNs(Sec.5.19).5.11940s and EarlierNN research started in the1940s(e.g.,McCulloch and Pitts,1943;Hebb,1949);compare also later work on learning NNs(Rosenblatt,1958,1962;Widrow and Hoff,1962;Grossberg,1969;Kohonen,1972; von der Malsburg,1973;Narendra and Thathatchar,1974;Willshaw and von der Malsburg,1976;Palm, 1980;Hopfield,1982).In a sense NNs have been around even longer,since early supervised NNs were essentially variants of linear regression methods going back at least to the early1800s(e.g.,Legendre, 1805;Gauss,1809,1821).Early NNs had a maximal CAP depth of1(Sec.3).5.2Around1960:More Neurobiological Inspiration for DLSimple cells and complex cells were found in the cat’s visual cortex(e.g.,Hubel and Wiesel,1962;Wiesel and Hubel,1959).These cellsfire in response to certain properties of visual sensory inputs,such as theorientation of plex cells exhibit more spatial invariance than simple cells.This inspired later deep NN architectures(Sec.5.4)used in certain modern award-winning Deep Learners(Sec.5.19-5.22).5.31965:Deep Networks Based on the Group Method of Data Handling(GMDH) Networks trained by the Group Method of Data Handling(GMDH)(Ivakhnenko and Lapa,1965; Ivakhnenko et al.,1967;Ivakhnenko,1968,1971)were perhaps thefirst DL systems of the Feedforward Multilayer Perceptron type.The units of GMDH nets may have polynomial activation functions imple-menting Kolmogorov-Gabor polynomials(more general than traditional NN activation functions).Given a training set,layers are incrementally grown and trained by regression analysis,then pruned with the help of a separate validation set(using today’s terminology),where Decision Regularisation is used to weed out superfluous units.The numbers of layers and units per layer can be learned in problem-dependent fashion. 
This is a good example of hierarchical representation learning(Sec.4.4).There have been numerous ap-plications of GMDH-style networks,e.g.(Ikeda et al.,1976;Farlow,1984;Madala and Ivakhnenko,1994; Ivakhnenko,1995;Kondo,1998;Kord´ık et al.,2003;Witczak et al.,2006;Kondo and Ueno,2008).5.41979:Convolution+Weight Replication+Winner-Take-All(WTA)Apart from deep GMDH networks(Sec.5.3),the Neocognitron(Fukushima,1979,1980,2013a)was per-haps thefirst artificial NN that deserved the attribute deep,and thefirst to incorporate the neurophysiolog-ical insights of Sec.5.2.It introduced convolutional NNs(today often called CNNs or convnets),where the(typically rectangular)receptivefield of a convolutional unit with given weight vector is shifted step by step across a2-dimensional array of input values,such as the pixels of an image.The resulting2D array of subsequent activation events of this unit can then provide inputs to higher-level units,and so on.Due to massive weight replication(Sec.2),relatively few parameters may be necessary to describe the behavior of such a convolutional layer.Competition layers have WTA subsets whose maximally active units are the only ones to adopt non-zero activation values.They essentially“down-sample”the competition layer’s input.This helps to create units whose responses are insensitive to small image shifts(compare Sec.5.2).The Neocognitron is very similar to the architecture of modern,contest-winning,purely super-vised,feedforward,gradient-based Deep Learners with alternating convolutional and competition lay-ers(e.g.,Sec.5.19-5.22).Fukushima,however,did not set the weights by supervised backpropagation (Sec.5.5,5.8),but by local un supervised learning rules(e.g.,Fukushima,2013b),or by pre-wiring.In that sense he did not care for the DL problem(Sec.5.9),although his architecture was comparatively deep indeed.He also used Spatial Averaging(Fukushima,1980,2011)instead of Max-Pooling(MP,Sec.5.11), currently a particularly convenient and popular WTA mechanism.Today’s CNN-based DL machines profita lot from later CNN work(e.g.,LeCun et al.,1989;Ranzato et al.,2007)(Sec.5.8,5.16,5.19).5.51960-1981and Beyond:Development of Backpropagation(BP)for NNsThe minimisation of errors through gradient descent(Hadamard,1908)in the parameter space of com-plex,nonlinear,differentiable,multi-stage,NN-related systems has been discussed at least since the early 1960s(e.g.,Kelley,1960;Bryson,1961;Bryson and Denham,1961;Pontryagin et al.,1961;Dreyfus,1962; Wilkinson,1965;Amari,1967;Bryson and Ho,1969;Director and Rohrer,1969;Griewank,2012),ini-tially within the framework of Euler-LaGrange equations in the Calculus of Variations(e.g.,Euler,1744). Steepest descent in such systems can be performed(Bryson,1961;Kelley,1960;Bryson and Ho,1969)by iterating the ancient chain rule(Leibniz,1676;L’Hˆo pital,1696)in Dynamic Programming(DP)style(Bell-man,1957).A simplified derivation of the method uses the chain rule only(Dreyfus,1962).The methods of the1960s were already efficient in the DP sense.However,they backpropagated derivative information through standard Jacobian matrix calculations from one“layer”to the previous one, explicitly addressing neither direct links across several layers nor potential additional efficiency gains due to network sparsity(but perhaps such enhancements seemed obvious to the authors).。
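As a toy illustration of the idea just described, steepest descent driven by repeated application of the chain rule, the following sketch is not from the survey; the two-stage system, the target value, and all names are invented for the example:

```java
// Toy gradient descent on a single weight w for the two-stage system y = tanh(w*x).
public class ChainRuleDemo {
    public static void main(String[] args) {
        double w = 0.5, x = 1.0, d = 0.8, lr = 0.1;   // weight, input, target, step size
        for (int i = 0; i < 100; i++) {
            double a = w * x;                  // stage 1: a = w*x
            double y = Math.tanh(a);           // stage 2: y = tanh(a)
            // chain rule: dE/dw = dE/dy * dy/da * da/dw for E = 1/2 (y - d)^2
            double dEdy = (y - d);
            double dyda = 1.0 - y * y;         // derivative of tanh
            double dadw = x;
            double grad = dEdy * dyda * dadw;
            w -= lr * grad;                    // one steepest-descent step
        }
        System.out.println("trained w = " + w + ", output = " + Math.tanh(w * x));
    }
}
```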
Decentralizing UNIX Abstractions in the Exokernel Architecture

by Héctor Manuel Briceño Pulido

Submitted to the Department of Electrical Engineering and Computer Science on February 7, 1997, in partial fulfillment of the requirements for the degrees of Bachelor of Science and Master of Engineering in Computer Science and Engineering at the Massachusetts Institute of Technology.

© Massachusetts Institute of Technology 1997. All rights reserved.

Author: Department of Electrical Engineering and Computer Science, February 7, 1997
Certified by: M. Frans Kaashoek, Associate Professor, Thesis Supervisor
Certified by: Gregory R. Ganger, Postdoctoral Associate, Thesis Supervisor
Certified by: Dawson R. Engler, Ph.D. Candidate, Thesis Supervisor
Accepted by: Arthur C. Smith, Chairman, Departmental Committee on Graduate Students

Abstract

Traditional operating systems (OSs) provide a fixed interface to hardware abstractions. This interface and its implementation hurt application performance and flexibility. What is needed is a flexible and high-performance interface to OS abstractions that can be customized to each application's needs.

To provide more flexibility and performance to applications, the exokernel architecture decentralizes OS abstractions and places them in libraries. This thesis investigates how to decentralize OS abstractions while maintaining proper semantics. It also describes the implementation of a prototype library operating system, ExOS 1.0, that has enough functionality to run a wide variety of applications, from editors to compilers. ExOS 1.0 serves as an excellent tool for OS research and as a step toward a full understanding of how to design, build, and use library operating systems.
Thesis Supervisor: M. Frans Kaashoek, Associate Professor
Thesis Supervisor: Gregory R. Ganger, Postdoctoral Associate
Thesis Supervisor: Dawson R. Engler, Ph.D. Candidate

Acknowledgments

The work presented in this thesis is joint work with Dawson Engler and Frans Kaashoek. It goes without saying that the discussions with Tom Pinckney and Greg Ganger greatly increased the breadth and depth of this work. I thank the members of the Parallel and Distributed Operating Systems group for withstanding me (Costa and Rusty, why are you smiling?) and providing a joyful research environment.

The painstaking task of reading the whole thesis was endured several times by Dawson Engler, Frans Kaashoek and Greg Ganger. Their invaluable feedback is greatly appreciated.

Dawson Engler is thanked specially for questioning many of the issues covered in this thesis, and poking my brain with his questions and suggestions. Without his feedback, this would have been a truly different thesis.

Tom Pinckney is thanked for answering my numerous questions and listening to all my comments (including the random ones).

My parents are thanked for all their encouragement throughout life and their example of hard work and excellence.

A special thanks goes to the Fundación Gran Mariscal de Ayacucho and the Venezuelan Government for making it possible to enrich my education at MIT.

This research was supported in part by ARPA contract N00014-94-1-0985, by a NSF National Young Investigator Award to Prof. Frans Kaashoek, and by an Intel equipment donation.

Contents

1 Introduction
  1.1 Decentralization Advantages
  1.2 Decentralization Challenges
  1.3 Solution and Contribution
  1.4 Thesis Overview
2 Decentralizing OS Abstractions
  2.1 OS Abstractions
  2.2 OS Abstractions Semantics
  2.3 Centralized OSs Services and Features
  2.4 Decentralizing OS Abstractions
3 ExOS 1.0
  3.1 Design
  3.2 Experimental Framework: XOK
  3.3 Implementation of Major Abstractions and Mechanisms
    3.3.1 Bootstrapping, Fork and Exec
    3.3.2 Process Management
    3.3.3 Signals
    3.3.4 File Descriptors
    3.3.5 Files
    3.3.6 Sockets
    3.3.7 Pipes
    3.3.8 Pseudo-Terminals
  3.4 Performance Evaluation
  3.5 Discussion
4 Future Work: ExOS 2.0
  4.1 Design Goals
  4.2 Design Description
    4.2.1 Local Directories and Files
    4.2.2 Sockets
    4.2.3 Pseudo-Terminals
    4.2.4 Signals
5 Related Work
6 Conclusion

List of Tables

3-1 Supported File Descriptor Types
3-2 Application Benchmark Results

Chapter 1

Introduction

It has been widely recognized that traditional operating systems (OSs) should provide much more flexibility and performance to applications [2, 3, 4, 14]. The exokernel architecture is intended to solve the performance and flexibility problems associated with traditional OSs by giving applications protected and efficient control of hardware and software resources. Exokernels simply protect resources, allowing applications to manage them. Libraries implementing OS abstractions can be linked with applications to provide programmers the same interfaces as traditional OSs under this architecture. Two major questions of this thesis are "can library operating systems provide the same abstractions as traditional OSs?" and "if so, how?" This thesis explores mechanisms for providing common OS interfaces while simultaneously maintaining the flexibility and performance advantages of the exokernel architecture.

Traditional OSs enforce abstractions on hardware resources and the software structures used for resource management. They enforce these abstractions by using a well-defined system call interface and keeping all system state centralized in a privileged address space.
Centralization simplifies the sharing of system state because all system state is available to all processes when a system call is being serviced. The system call guarantees that the state will not be corrupted or maliciously modified. Unfortunately, these advantages come at the cost of flexibility and performance, since applications are forced to use specific abstractions with specific policies to access hardware resources. For example, disk blocks in UNIX systems are cached, and the cache uses a least-recently-used policy for cache block replacement. Applications that benefit from either eliminating caching or using a different policy cannot replace this policy under traditional UNIX systems.

1.1 Decentralization Advantages

The exokernel OS architecture gives applications more control over resources by decentralizing resource management from the OS into unprivileged libraries (called library operating systems). In contrast to conventional OSs, in which a central authority both protects and abstracts hardware resources, exokernels only protect resources, leaving their management to applications. For example, exokernel "processes" manage their own virtual memory. Processes handle page faults, allocate memory, and map pages as necessary. This control allows for the implementation of mechanisms such as copy-on-write and memory-mapped files entirely at user level. Thus, any of these traditional abstractions can be optimized and customized on a per-process basis.

With the exokernel's powerful low-level primitives, traditional OS abstractions can be implemented in library operating systems. Applications can use a specific library OS depending on their access patterns, usage of abstractions, and desired policies. Additionally, if none of the available abstractions are suited for a particular application, unprivileged programmers can safely implement their own abstractions. In this way, libraries provide maximum flexibility and performance to applications.
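To make the virtual-memory example above concrete, the following C sketch shows one way a library operating system might implement copy-on-write entirely at user level when the application takes a write fault on a shared page. The primitives sys_page_alloc and sys_page_map, and the fault-handler hook itself, are hypothetical placeholders for whatever protected page-allocation and mapping operations the underlying exokernel exports; they are not the interface of any particular system described in this thesis.

```c
/*
 * Sketch: user-level copy-on-write in a library OS.
 * sys_page_alloc() and sys_page_map() stand in for hypothetical exokernel
 * primitives that allocate a physical page and change a virtual mapping;
 * the real calls and their protection checks are exokernel-specific.
 */
#include <string.h>

#define PAGE_SIZE  4096
#define PROT_READ  0x1
#define PROT_WRITE 0x2

extern void *sys_page_alloc(void);                        /* new page, mapped for this process */
extern int   sys_page_map(void *va, void *page, int prot); /* remap va onto page with prot     */

/* Invoked by the library OS when the exokernel reflects a write fault to it. */
int cow_fault(void *fault_va)
{
    /* Round the faulting address down to the start of its page. */
    void *page_va = (void *)((unsigned long)fault_va & ~(unsigned long)(PAGE_SIZE - 1));

    /* Allocate a private page and copy the shared, read-only contents into it. */
    void *copy = sys_page_alloc();
    if (copy == NULL)
        return -1;
    memcpy(copy, page_va, PAGE_SIZE);

    /* Replace the read-only mapping with a writable mapping of the private copy. */
    return sys_page_map(page_va, copy, PROT_READ | PROT_WRITE);
}
```

Because the handler lives in an unprivileged library, an application that prefers a different policy, for example eager copying or no sharing at all, can link against a library that implements that policy instead.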
1.2 Decentralization Challenges

To decentralize OS abstractions, many problems must be addressed: state that was previously persistent across process invocations may no longer be; shared state can be corrupted by other applications if not properly protected; resources can be accessed concurrently, so there must be mechanisms to guarantee a minimum amount of concurrency control; and finally, since all names manipulated by an exokernel are physical names, such as a physical page number or a physical disk block, there must be mechanisms to map these names to the logical names commonly used by applications. The main question then becomes how to implement library operating systems on exokernels.

To answer this question, this thesis will explore an implementation of the application programming interface (API) provided by OpenBSD, a BSD-like UNIX OS. This interface provides a widely used set of OS abstractions that, in addition to allowing us to explore the issues in decentralizing OS abstractions, will provide the exokernel prototype with a large application base to test and evaluate the exokernel and library OS architectures. The main challenge of this implementation is conforming to the API while at the same time maintaining the flexibility and performance of the exokernel OS architecture.

1.3 Solution and Contribution

This thesis addresses the challenges of decentralizing OS abstractions by first exploring what semantics they have and how their state is shared. With this information as a foundation, mechanisms for decentralizing OS abstractions are described. Two main approaches are identified: duplicating the mechanisms used under centralized schemes, and using mechanisms inherent to decentralized schemes. The insights gained are applied to OpenBSD to create a library operating system called ExOS 1.0 that provides the same OS abstractions as OpenBSD. ExOS 1.0 provides enough functionality to run a variety of applications, from compilers to editors.

1.4 Thesis Overview

The remainder of this thesis is organized as follows. Chapter 2 describes the issues and mechanisms used to decentralize OS abstractions. Chapter 3 describes the design, implementation, and evaluation of ExOS 1.0. Chapter 4 discusses planned future work with library operating systems, including a partial design of ExOS 2.0. Chapter 5 discusses work related to library operating systems and the decentralization of UNIX abstractions. Finally, Chapter 6 concludes.

Chapter 2

Decentralizing OS Abstractions

This chapter discusses the issues that arise when decentralizing OS abstractions. In order to provide better insight, centralized and decentralized approaches will be contrasted. Although this thesis does not cover all possible OS abstractions, this chapter gives a high-level overview of common OS abstractions in order to make the issues concrete. Semantics define an important part of OS abstractions; therefore, the most important semantic issues will be enumerated. In order to contrast centralized and decentralized approaches, the features or abilities of centralized OSs will be described, along with examples of how the major abstractions are implemented. At this point, the issues and methods to decentralize OS abstractions can be better understood.

This chapter is divided into four sections. Section 2.1 describes the general OS abstractions. Section 2.2 describes the major semantic characteristics of such abstractions, including protection, atomicity, and concurrency. Section 2.3 presents some of the features and abilities of centralized OSs and how they relate to the semantics of abstractions. Section 2.4 concludes with the mechanisms and issues related to decentralizing OS abstractions.

2.1 OS Abstractions

To focus the discussion of decentralizing OS abstractions, it is useful to first describe some common OS abstractions. This section describes OS abstractions, including some invariants that are enforced in most systems. The abstractions described are divided into four categories: processes, virtual memory, file systems, and communication.

Processes: A process is basically a program in execution. It consists of the executable program's data and stack, its program counter, stack pointer, and other registers, and all other information needed to run the program [17]. Processes can be suspended and resumed to allow multiple processes to timeshare a machine. There is usually a set of credentials associated with each process that defines the resources it is allowed to access. Typically a process can create another process, creating a family tree of processes.

Virtual Memory: The abstraction of virtual memory simulates access to larger memory spaces and isolates programs from each other. This is usually done by moving memory pages back and forth between physical memory and a larger backing store (e.g., a disk), and using hardware support to remap every memory access appropriately. For example, when two different processes access memory location 5, that location will correspond to a different physical memory address for each process, which provides isolation of memory between the processes. The only way processes can share regions of their address space is via explicit sharing. Most OSs provide ways to explicitly set up regions of memory that are shared between processes. The credentials held by the processes can be used to validate the sharing of these memory regions.

File System: File system abstractions permit data to be made persistent across reboots. Information is stored in entities called files. Files are usually hierarchically named to simplify and organize file access. These names are unique and persistent across reboots. Additional information is stored about each file, like access times and ownership, to allow accounting and to restrict access to the file. File system abstractions usually involve strong semantics and invariants, because of the many requirements needed to guarantee persistence. For example, all meta-data (information about files) must be carefully written to stable storage to guard against inconsistency in case of a power failure.

Communication: Communication abstractions allow processes to exchange information with other processes on the same machine and on other machines in the network. The communication abstractions may have varying semantics, such as guaranteed delivery, best-effort, record-based, etc. Most current OSs do not protect against spoofing of communication channels that go over a network, but they do protect communication channels on the same machine. For example, they generally prevent the interception of information sent between two programs by a third program on the same machine. Also, most OSs restrict access to incoming data from the network to only the processes holding that connection.

2.2 OS Abstractions Semantics

Now that some common OS abstractions have been laid out, their semantics can be understood. This section explores seven of the main semantic characteristics that OS abstractions deal with in varying degrees: protection, access control, concurrency, naming, coherence, atomicity, and persistence. These semantic issues have to be considered in order to properly implement these abstractions under any scheme, be it centralized or decentralized. In some cases, the actual semantics are more strict than necessary, and relaxing them will not change the behavior or the correctness of programs.

Protection: Protection prevents unwanted changes to state. Protection can lead to fault isolation and to an abstraction barrier. Fault isolation is the containment of faults within processes. For example, the process abstraction has fault isolation, in that the failure of one process does not directly affect any unrelated process. The abstraction barrier protects the state of an abstraction from changes due to methods outside the abstraction. For example, the internal state of most, if not all, UNIX abstractions is protected from processes. Processes are not able to see or directly change this state; they can only call the procedures exported by the abstractions.

Access Control: Access control defines the kinds of actions and the entities that are allowed to perform these actions. For example, UNIX associates a <uid, gid> pair with each process. When a process wants to write a file, the OS compares the <uid, gid> pair that is allowed to write with that of the process to determine if the operation is allowed.

Concurrency: Concurrency defines the behavior of abstractions when they are simultaneously accessed or acted upon by programs. There are various degrees of concurrency, from none to complete serializability. For example, if files had no concurrency semantics, it would be acceptable for two simultaneous writes to the same region of a file to produce the effect of writing any mixture of the two regions written. In contrast, most UNIX OSs use absolute time semantics, specifying that the later write is what would be seen by the next read of the same region of the file (if both readers and writers are on the same machine). This absolute time concurrency can be said to be serializable: its outcome can be recreated from some ordering of all file writes. Concurrency semantics provide invariants about abstractions that make it easier to reason about their possible states.

Naming: Naming permits unrelated processes to access an object or to communicate with each other. By agreeing on a name, two processes can access the same object, either simultaneously or at different times. Naming is used under UNIX for files, pipes, and sockets. For example, by agreeing on a name beforehand, one process can write to a file any mysterious events that it detects, and another process later in time can read the same file and display the results to a user. This behavior is possible because the name associated with the file does not change through time.

Coherence: Coherence defines when and how the abstraction state is well-defined. For example, lack of coherence could imply that a change to a file by one process will not be seen by another process unless the other process explicitly synchronizes the file. UNIX OSs typically provide strong file coherence. If one process writes to a file, the change can immediately be observed by other processes on the same system.

Atomicity: Atomicity describes the behavior of abstractions under interruptions. Strong atomicity guarantees make it easier to reason about and to restore the state of an abstraction after an unexpected interruption (e.g., a power failure). For example, most UNIX abstractions are atomic with respect to explicit interruptions such as signals. Additionally, UNIX abstractions strive to be atomic with respect to unexpected interruptions such as power failures (although they do not always succeed). For example, file deletion is atomic with respect to unexpected interruptions: the file is either removed or not. On the other hand, long file writes are not atomic; if an unexpected interruption occurs, only part of the write may be completed.

Persistence: Persistence refers to the lifetime of an object. Objects may have longer lifetimes than their creators. For example, System V shared memory segments persist until explicitly removed. Even if these memory segments are not mapped by any process, the data in them will be available to any process that subsequently maps them. This organization allows processes to share information in memory across invocations. Another, more useful example is the exit status of processes. Even after a process has terminated, its exit status will be available until the parent process is ready to use it. In contrast, the data memory of a process is not persistent; it vanishes when the process exits.

2.3 Centralized OSs Services and Features

Centralized OSs have certain abilities and features that help in implementing OS abstractions and in guaranteeing their semantics. This section summarizes the relevant characteristics of centralized OSs and describes how they are used to implement the semantics discussed in the previous section. The relevant features include:

Controlled Entry Points: Calls into centralized OSs can only occur at well-defined entry points (i.e., system calls), thus guaranteeing that all the guards and proper checks have been executed before the state of any abstraction is changed. Processes in general do not have this characteristic; their procedures can be entered at any point.

Different Protection Domains: By executing in a different protection domain, abstraction state can be protected against wild reads and writes. Preventing wild reads ensures that no extra information is revealed about the internal state. Preventing wild writes guarantees that the state is not modified by methods external to the abstraction. This organization provides fault isolation, because faults outside the abstraction are not propagated into the abstraction except (possibly) through the exported interfaces, which can guard against inconsistent data.

Different protection domains combined with controlled entry points allow one to implement protection by separating the abstraction state from its client and by controlling the methods that modify the abstraction state. Processes cannot intentionally or unintentionally modify the state of an abstraction, except through the well-defined methods. If a process has a fault, it will not affect the abstraction, except for the fact that the abstraction has to detect the termination of the process and properly clean up its state.

Well-Formed Updates: By having controlled entry points and different protection domains, centralized abstractions can enforce well-formed updates. The abstraction methods are the only ones that can modify the object, and they can only be called at specific starting points. Thus, if they are correct, all updates to the abstraction state will be well-formed.

State Unification: All of an abstraction's state can be unified in a single location. Preventing multiple copies eases the task of maintaining coherence. For example, in most centralized OSs, only the file system caches file blocks, so that no two processes see a different view of the file (unless they explicitly request this). Abstractions can control what state processes see and when they see it. The state can be made inaccessible while it is incoherent. For example, file blocks are usually cached in an OS to avoid repeated access to disk. If one process writes to a cached file block, other processes that want to read the block will wait for the ongoing write to complete and then read the copy cached by the file system abstraction.

Atomicity can be implemented in a similar way to coherence. State is not allowed to be seen if an atomic operation affecting that state is in progress. Additionally, UNIX systems provide the notion of signals, a way to interrupt processes. Certain system calls should appear atomic even if interrupted by signals. Abstraction implementations usually wait until all resources for a given operation are available, then they check for any signals. If any signals are pending, the operation is aborted and no state is changed. Otherwise, the operation is carried through until fully completed. For example, if a large file write request is made, the file abstraction first checks for the available disk space. If there is enough and there are no signals pending, it will start the large write operation. Any process trying to look at the file at that point will be blocked until the write completes. Additionally, if any signals are posted after the write has started, they are ignored, because undoing the writes is difficult on traditional UNIX file systems. This "guard-action approach" guarantees that the write will appear atomic to all processes. Even the midway point is atomic (unless the system crashes), because no other process will be allowed to see the file.

Single Point of Serialization: With unification, the centralized OS is a single point for serialization. Using standard mechanisms such as locks and critical regions, a centralized OS enforces the concurrency semantics of abstractions. Because centralized OSs live longer than most processes, they can release locks even if the relevant process terminates. Concurrency control can be easily implemented with a single point of serialization. Processes accessing an abstraction do so through one or more well-defined entry points. The abstraction can block requests if there are requests in progress. For example, if two write operations to the same region of a file are requested, the file system abstraction blocks the second until the first is finished; thus, the behavior of concurrent write accesses to files is well-defined.

Global Information: With unification of all abstractions in a centralized location, abstractions have access to information about all processes. This includes access to information across time, which eases the job of naming and persistence. In addition, all name resolution takes place at one location: the centralized OS. Together with access to global information, names can be quickly translated to the underlying object. For example, if a process wants to access a shared memory segment, it will query its virtual memory manager for the segment. This manager in turn can communicate with the shared memory abstraction to locate the segment. Once located, the virtual memory manager for the process can map the segment into the process's address space.

Expanded Authority: Centralized OSs generally have an expanded authority that regular processes do not have. This authority allows protected and controlled access to powerful resources. Security is enabled by the availability of global information and strict checks for access. The centralized OS is the holder of all resources and process credentials. It can use these credentials to validate access to resources. The drawback is that improper or missing checks will allow unauthorized access to a resource. This is in contrast to a scenario where the resources and credentials are separated and checks are always done.

Single Point of Update: The centralized OS provides a single point of modification to improve the system and add abstractions. Once this modification has been made, all processes benefit from the change.

Unfortunately, while centralization can simplify the implementation of OS abstractions, it limits flexibility and performance. Flexibility is limited by providing only one fixed set of abstractions to resources. Thus, any program that could benefit from a different abstraction for a resource is unable to do so. Performance is limited by inflexibility and strong invariants. For example, if two writes are requested to two different regions of the same file, the UNIX file system abstraction will not allow the two writes to take place at the same time, in order to guarantee coherence and atomicity. If the processes doing the writes do not require these semantics (e.g., if it is a temporary file), they cannot take advantage of this fact to improve their performance.

This problem can be solved by decentralizing OS abstractions. With appropriately decentralized control, abstractions can be better customized and specialized by applications according to their needs.

2.4 Decentralizing OS Abstractions

There are two ways to decentralize OS abstractions: partially duplicate the features and abilities of centralized schemes, or use the inherent abilities of decentralized schemes to implement the OS abstractions and their semantics in libraries. In some cases, the nature of the semantics required by an abstraction may limit the choices for implementation to centralized schemes. For example, in order to provide absolute coherence, it may be necessary to locate the state in a centralized location to avoid any communication with other holders of the same state. To better explain how semantics and abstractions can be implemented in a decentralized manner, this section presents mechanisms that can be used in decentralized settings and discusses possible implementations of the semantic properties described in Section 2.2. The relevant abilities of decentralized OSs are:

Separation: In contrast to centralized operating systems' unification, abstraction state is maintained separately for each process. This organization allows each process to hold and protect its own abstraction state. As in microkernels, a practical consequence is that errors in the OS code no longer crash the system, only the application they are associated with.

Minimized Authority: Minimized authority follows the principle of least privilege. Each abstraction has control over only its own internal state. Any modification to other abstraction state must be done through explicit interfaces with that abstraction.

Decoupled Changes: In contrast to centralized operating systems' single point of update, decentralized OSs decouple changes. This means that changes in the implementation of an abstraction for one process do not directly affect other processes. This structure gives room for flexibility and performance improvements, since the implementation of an abstraction can be optimized and customized for its specific usage.

Local Information: If implemented properly, decentralized abstractions work mostly with local information. This organization has the advantages of reliability and scalability. Abstractions rely less on, and have less contention for, a central repository of information.
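As a simplified illustration of the separation and minimized-authority properties described above, the following C sketch keeps an abstraction's state, here a per-process file descriptor table, entirely inside the library operating system's portion of the process's own address space, so that only the exported routines can read or modify it. The structure layout, sizes, and function names are illustrative assumptions for this sketch, not the data structures of ExOS or any other particular library OS.

```c
/*
 * Sketch: abstraction state held privately by a library OS.
 * The descriptor table lives in the process's own address space; only these
 * exported routines touch it (minimized authority), and a fault in another
 * process cannot reach it (separation). Names and layout are illustrative.
 */
#define MAX_FD 64

struct fd_entry {
    int   in_use;   /* slot allocated?                        */
    int   type;     /* e.g., file, pipe, or socket            */
    long  offset;   /* current offset, kept locally           */
    void *obj;      /* pointer to the underlying object state */
};

/* Per-process state: no other process can name or modify this table directly. */
static struct fd_entry fd_table[MAX_FD];

int fd_alloc(int type, void *obj)
{
    for (int fd = 0; fd < MAX_FD; fd++) {
        if (!fd_table[fd].in_use) {
            fd_table[fd].in_use = 1;
            fd_table[fd].type   = type;
            fd_table[fd].offset = 0;
            fd_table[fd].obj    = obj;
            return fd;
        }
    }
    return -1;  /* table full */
}

void fd_free(int fd)
{
    if (fd >= 0 && fd < MAX_FD)
        fd_table[fd].in_use = 0;
}
```

State that must be visible to related processes, for example descriptors shared across a fork, would instead have to live in explicitly shared memory and be guarded against the concurrency and protection problems discussed earlier in this chapter.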