A Machine Learning Approach to Automatic Production of Compiler Heuristics
- 格式:pdf
- 大小:374.20 KB
- 文档页数:10
高三英语人工智能单选题40题1. Artificial intelligence is a branch of computer science that aims to create intelligent _____.A.machinesB.devicesC.equipmentsD.instruments答案解析:A。
“machines”通常指机器,在人工智能领域常指具有一定智能的机器;“devices”一般指设备、装置;“equipments”是“equipment”的复数形式,指装备、器材;“instruments”指仪器、工具。
2. In the field of artificial intelligence, algorithms are used to train _____.A.modelsB.patternsC.shapesD.forms答案解析:A。
“models”在人工智能中常指模型,可以通过算法进行训练;“patterns”指模式、图案;“shapes”指形状;“forms”指形式。
3. Deep learning is a powerful technique in artificial intelligence that uses neural _____.worksB.systemsC.structuresanizations答案解析:A。
“networks”指网络,在深度学习中常指神经网络;“systems”指系统;“structures”指结构;“organizations”指组织。
4. Artificial intelligence can process large amounts of data to make accurate _____.A.predictionsB.forecastsC.projectionsD.expectations答案解析:A。
人工智能生态相关的专业名词解释下载温馨提示:该文档是我店铺精心编制而成,希望大家下载以后,能够帮助大家解决实际的问题。
文档下载后可定制随意修改,请根据实际需要进行相应的调整和使用,谢谢!并且,本店铺为大家提供各种各样类型的实用资料,如教育随笔、日记赏析、句子摘抄、古诗大全、经典美文、话题作文、工作总结、词语解析、文案摘录、其他资料等等,如想了解不同资料格式和写法,敬请关注!Download tips: This document is carefully compiled by the editor. I hope that after you download them, they can help yousolve practical problems. The document can be customized and modified after downloading, please adjust and use it according to actual needs, thank you!In addition, our shop provides you with various types of practical materials, such as educational essays, diary appreciation, sentence excerpts, ancient poems, classic articles, topic composition, work summary, word parsing, copy excerpts,other materials and so on, want to know different data formats and writing methods, please pay attention!人工智能技术的快速发展,推动着人工智能生态系统的形成和完善。
中科院自动化所的中英文新闻语料库中科院自动化所(Institute of Automation, Chinese Academy of Sciences)是中国科学院下属的一家研究机构,致力于开展自动化科学及其应用的研究。
该所的研究涵盖了从理论基础到技术创新的广泛领域,包括人工智能、机器人技术、自动控制、模式识别等。
下面将分别从中文和英文角度介绍该所的相关新闻语料。
[中文新闻语料]1. 中国科学院自动化所在人脸识别领域取得重大突破中国科学院自动化所的研究团队在人脸识别技术方面取得了重大突破。
通过深度学习算法和大规模数据集的训练,该研究团队成功地提高了人脸识别的准确性和稳定性,使其在安防、金融等领域得到广泛应用。
2. 中科院自动化所发布最新研究成果:基于机器学习的智能交通系统中科院自动化所发布了一项基于机器学习的智能交通系统研究成果。
通过对交通数据的收集和分析,研究团队开发了智能交通控制算法,能够优化交通流量,减少交通拥堵和时间浪费,提高交通效率。
3. 中国科学院自动化所举办国际学术研讨会中国科学院自动化所举办了一场国际学术研讨会,邀请了来自不同国家的自动化领域专家参加。
研讨会涵盖了人工智能、机器人技术、自动化控制等多个研究方向,旨在促进国际间的学术交流和合作。
4. 中科院自动化所签署合作协议,推动机器人技术的产业化发展中科院自动化所与一家著名机器人企业签署了合作协议,共同推动机器人技术的产业化发展。
合作内容包括技术研发、人才培养、市场推广等方面,旨在加强学界与工业界的合作,加速机器人技术的应用和推广。
5. 中国科学院自动化所获得国家科技进步一等奖中国科学院自动化所凭借在人工智能领域的重要研究成果荣获国家科技进步一等奖。
该研究成果在自动驾驶、物联网等领域具有重要应用价值,并对相关行业的创新和发展起到了积极推动作用。
[英文新闻语料]1. Institute of Automation, Chinese Academy of Sciences achievesa major breakthrough in face recognitionThe research team at the Institute of Automation, Chinese Academy of Sciences has made a major breakthrough in face recognition technology. Through training with deep learning algorithms and large-scale datasets, the research team has successfully improved the accuracy and stability of face recognition, which has been widely applied in areas such as security and finance.2. Institute of Automation, Chinese Academy of Sciences releases latest research on machine learning-based intelligent transportationsystemThe Institute of Automation, Chinese Academy of Sciences has released a research paper on a machine learning-based intelligent transportation system. By collecting and analyzing traffic data, the research team has developed intelligent traffic control algorithms that optimize traffic flow, reduce congestion, and minimize time wastage, thereby enhancing overall traffic efficiency.3. Institute of Automation, Chinese Academy of Sciences hosts international academic symposiumThe Institute of Automation, Chinese Academy of Sciences recently held an international academic symposium, inviting automation experts from different countries to participate. The symposium covered various research areas, including artificial intelligence, robotics, and automatic control, aiming to facilitate academic exchanges and collaborations on an international level.4. Institute of Automation, Chinese Academy of Sciences signs cooperation agreement to promote the industrialization of robotics technologyThe Institute of Automation, Chinese Academy of Sciences has signed a cooperation agreement with a renowned robotics company to jointly promote the industrialization of robotics technology. The cooperation includes areas such as technology research and development, talent cultivation, and market promotion, aiming to strengthen the collaboration between academia and industry and accelerate the application and popularization of robotics technology.5. Institute of Automation, Chinese Academy of Sciences receivesNational Science and Technology Progress Award (First Class) The Institute of Automation, Chinese Academy of Sciences has been awarded the National Science and Technology Progress Award (First Class) for its important research achievements in the field of artificial intelligence. The research outcomes have significant application value in areas such as autonomous driving and the Internet of Things, playing a proactive role in promoting innovation and development in related industries.。
中考英语生物工程的前沿技术单选题40题1. Gene editing technology in medicine can ____ many incurable diseases.A. treatB. preventC. causeD. ignore答案解析:A。
本题考查动词的词义辨析。
选项A“treat”有治疗的意思,基因编辑技术在医学上可以用来治疗很多不治之症,符合语境。
选项B“prevent”是预防,基因编辑技术主要是对已有的疾病进行处理而不是预防,所以该选项错误。
选项C“cause”是导致,与基因编辑技术在医疗中的积极作用相悖,错误。
选项D“ignore”是忽视,不符合基因编辑技术在医疗中的功能,这里主要考查词汇的理解。
2. In agricultural gene editing, which of the following is a possible benefit?A. Reducing the yieldB. Increasing the resistance to pestsC. Making plants more sensitive to diseasesD. Decreasing the nutritional value答案解析:B。
本题考查农业方面基因编辑的好处。
选项B“增加对害虫的抵抗力”是基因编辑在农业方面可能带来的好处。
选项A“减少产量”不是好处,不符合题意。
选项C“使植物对疾病更敏感”是负面的,不是好处。
选项D“降低营养价值”也是负面的,不符合基因编辑在农业中的积极意义,这里主要考查词汇和对基因编辑在农业应用的理解。
3. The gene - editing tool CRISPR is known for its ____.A. complexityB. inaccuracyC. high costD. simplicity and efficiency答案解析:D。
热点话题一:人工智能与科技标题:The Impact of Artificial Intelligence on the Job Market内容概要:随着人工智能的不断发展,许多工作领域正在经历巨大的变革。
自动化技术和机器学习的应用对就业市场产生了深远的影响。
一些工作可能会被自动化取代,但同时也会出现新的就业机会,需要人们具备更高级的技能,如创意思维、解决复杂问题的能力和人际沟通技巧。
这需要教育体系和培训机构相应地调整,以培养适应未来工作环境的人才。
标题:Ethical Considerations in Artificial Intelligence Development内容概要:人工智能的快速发展引发了一系列伦理问题。
例如,自主驾驶汽车在道路上的行为决策,医疗诊断的准确性,以及个人隐私在数据分析中的使用等。
这些问题需要严肃思考和监管,以确保人工智能的应用不会损害人类的安全和权益。
同时,也需要制定一系列的法规和规范,来指导人工智能技术的开发和使用。
标题:The Role of Artificial Intelligence in Healthcare内容概要:人工智能在医疗领域的应用日益增多。
从辅助医生进行更精准的诊断,到药物研发和基因编辑,人工智能技术正在改变医疗行业的面貌。
这些技术可以加速疾病诊断和治疗过程,提高医疗效率。
然而,同时也需要注意数据安全和患者隐私的保护,以及确保医疗决策仍然具有人类医生的判断和道德。
标题:Artificial Intelligence in Education: Personalized Learning内容概要:人工智能在教育领域的应用也具有巨大潜力。
个性化学习平台利用人工智能分析学生的学习习惯和表现,为每个学生定制教学内容和进度。
这可以提高学生的学习效果,激发学习兴趣。
然而,这也带来了一些挑战,如如何平衡人工智能辅助和教师指导,以及如何确保学生在个性化学习中获得全面的知识和技能。
正确答案:A、B 你选对了Quizzes for Chapter 11 单选(1 分)图灵测试旨在给予哪一种令人满意的操作定义得分/ 5 多选(1 分)选择下列计算机系统中属于人工智能的实例得分/总分总分A. W eb搜索引擎A. 人类思考B.超市条形码扫描器B. 人工智能C.声控电话菜单该题无法得分/1.00C.机器智能 1.00/1.00D.智能个人助理该题无法得分/1.00正确答案:A、D 你错选为C、DD.机器动作正确答案: C 你选对了6 多选(1 分)选择下列哪些是人工智能的研究领域得分/总分2 多选(1 分)选择以下关于人工智能概念的正确表述得分/总分A.人脸识别0.33/1.00A. 人工智能旨在创造智能机器该题无法得分/1.00B.专家系统0.33/1.00B. 人工智能是研究和构建在给定环境下表现良好的智能体程序该题无法得分/1.00C.图像理解C.人工智能将其定义为人类智能体的研究该题无法D.分布式计算得分/1.00正确答案:A、B、C 你错选为A、BD.人工智能是为了开发一类计算机使之能够完成通7 多选(1 分)考察人工智能(AI) 的一些应用,去发现目前下列哪些任务可以通过AI 来解决得分/总分常由人类所能做的事该题无法得分/1.00正确答案:A、B、D 你错选为A、B、C、DA.以竞技水平玩德州扑克游戏0.33/1.003 多选(1 分)如下学科哪些是人工智能的基础?得分/总分B.打一场像样的乒乓球比赛A. 经济学0.25/1.00C.在Web 上购买一周的食品杂货0.33/1.00B. 哲学0.25/1.00D.在市场上购买一周的食品杂货C.心理学0.25/1.00正确答案:A、B、C 你错选为A、CD.数学0.25/1.008 填空(1 分)理性指的是一个系统的属性,即在_________的环境下正确答案:A、B、C、D 你选对了做正确的事。
得分/总分正确答案:已知4 多选(1 分)下列陈述中哪些是描述强AI (通用AI )的正确答案?得1 单选(1 分)图灵测试旨在给予哪一种令人满意的操作定义得分/ 分/总分总分A. 指的是一种机器,具有将智能应用于任何问题的A.人类思考能力0.50/1.00B.人工智能B. 是经过适当编程的具有正确输入和输出的计算机,因此有与人类同样判断力的头脑0.50/1.00C.机器智能 1.00/1.00C.指的是一种机器,仅针对一个具体问题D.机器动作正确答案: C 你选对了D.其定义为无知觉的计算机智能,或专注于一个狭2 多选(1 分)选择以下关于人工智能概念的正确表述得分/总分窄任务的AIA. 人工智能旨在创造智能机器该题无法得分/1.00B.专家系统0.33/1.00B. 人工智能是研究和构建在给定环境下表现良好的C.图像理解智能体程序该题无法得分/1.00D.分布式计算C.人工智能将其定义为人类智能体的研究该题无法正确答案:A、B、C 你错选为A、B得分/1.00 7 多选(1 分)考察人工智能(AI) 的一些应用,去发现目前下列哪些任务可以通过AI 来解决得分/总分D.人工智能是为了开发一类计算机使之能够完成通A.以竞技水平玩德州扑克游戏0.33/1.00常由人类所能做的事该题无法得分/1.00正确答案:A、B、D 你错选为A、B、C、DB.打一场像样的乒乓球比赛3 多选(1 分)如下学科哪些是人工智能的基础?得分/总分C.在Web 上购买一周的食品杂货0.33/1.00A. 经济学0.25/1.00D.在市场上购买一周的食品杂货B. 哲学0.25/1.00正确答案:A、B、C 你错选为A、CC.心理学0.25/1.008 填空(1 分)理性指的是一个系统的属性,即在_________的环境下D.数学0.25/1.00 做正确的事。
高一人工智能英语阅读理解25题1<背景文章>Artificial intelligence (AI) has become one of the most talked - about topics in recent years. AI can be defined as the simulation of human intelligence processes by machines, especially computer systems. These processes include learning, reasoning, problem - solving, perception, and language understanding.The development of AI has a long history. It started in the 1950s when the concept was first introduced. In the early days, AI research focused on simple tasks like playing games and solving basic mathematical problems. However, with the development of computer technology and the increase in data availability, AI has made great strides.AI has found applications in various fields. In the medical field, AI can assist doctors in diagnosing diseases. For example, it can analyze medical images such as X - rays and MRIs to detect early signs of diseases that might be missed by human eyes. In education, AI - powered tutoring systems can provide personalized learning experiences for students. They can adapt to the individual learning pace and style of each student, helping them to better understand difficult concepts. In the transportation industry, self - driving cars, which are a significant application of AI, are expectedto revolutionize the way we travel. They can potentially reduce traffic accidents caused by human error and improve traffic efficiency.However, AI also brings some potential negative impacts. One concern is the impact on employment. As AI systems can perform many tasks that were previously done by humans, there is a fear that many jobs will be lost. For example, jobs in manufacturing, customer service, and some administrative tasks may be at risk. Another issue is the ethical considerations. For instance, how should AI make decisions in life - or - death situations? And there are also concerns about data privacy as AI systems rely on large amounts of data.1. <问题1>What is the main idea of this passage?A. To introduce the development of computer technology.B. To discuss the applications and impacts of artificial intelligence.C. To explain how to solve problems in different fields.D. To show the importance of data in AI systems.答案:B。
pythonautomat库的用法Automat是一个Python库,用于简化有限状态机的创建和使用。
它提供了一种方便的方式来描述状态机的行为和状态转换。
本文将详细介绍Automat库的用法。
Automat库的安装:Automat库可以通过pip安装:`pip install Automat`有限状态机(Finite State Machine):有限状态机是一种抽象的计算模型,它由一系列状态、状态之间的转换以及对输入的响应组成。
每当有输入时,状态机将根据当前状态和输入执行特定的行为,并将状态转换到下一个状态。
有限状态机可以用于各种应用程序中,如自动化控制、协议通信、游戏等。
Automat的用法:要使用Automat库,首先需要定义状态和转换。
状态是有限状态机的基本单元,用于表示系统的特定状态。
转换是从一个状态到另一个状态的行为。
```pythonfrom twisted.internet.defer import inlineCallbacks``````pythonfrom automat import MethodicalMachine, NoDefaultclass MyStateMachine:pass```定义行为:```pythonfrom automat import MethodicalMachine, NoDefault class MyStateMachine:def action1(self, arg1):print("Action 1 executed with argument:", arg1) def action2(self):print("Action 2 executed")```定义转换:```pythonfrom automat import MethodicalMachine, NoDefault class MyStateMachine:def action1(self, arg1):print("Action 1 executed with argument:", arg1) def action2(self):print("Action 2 executed")def event1(self, arg):passdef event2(self, arg):pass```在状态之间定义转换:要定义状态之间的转换,我们可以使用`MY_STATE.to(MY_STATE2)`语法。
A Machine Learning Approach to AutomaticProduction of Compiler HeuristicsAntoine Monsifrot,Fran¸c ois Bodin,and Ren´e QuiniouIRISA-University of Rennes France{amonsifr,bodin,quiniou}@irisa.frAbstract.Achieving high performance on modern processors heavilyrelies on the compiler optimizations to exploit the microprocessor archi-tecture.The efficiency of optimization directly depends on the compilerheuristics.These heuristics must be target-specific and each new proces-sor generation requires heuristics reengineering.In this paper,we address the automatic generation of optimizationheuristics for a target processor by machine learning.We evaluate thepotential of this method on an always legal and simple transformation:loop unrolling.Though simple to implement,this transformation mayhave strong effects on program execution(good or bad).However decid-ing to perform the transformation or not is difficult since many inter-acting parameters must be taken into account.So we propose a machinelearning approach.We try to answer the following questions:is it possible to devise alearning process that captures the relevant parameters involved in loopunrolling performance?Does the Machine Learning Based Heuristicsachieve better performance than existing ones?Keywords:decision tree,boosting,compiler heuristics,loop unrolling.1IntroductionAchieving high performance on modern processors heavily relies on the abil-ity of the compiler to exploit the underlying architecture.Numerous program transformations have been implemented in order to produce efficient programs that exploit the potential of the processor architecture.These transformations interact in a complex way.As a consequence,an optimizing compiler relies on internal heuristics to choose an optimization and whether or not to apply it.De-signing these heuristics is generally difficult.The heuristics must be specific to each implementation of the instruction set architecture.They are also dependent on changes made to the compiler.In this paper,we address the problem of automatically generating such heuristics by a machine learning approach.To our knowledge this is the first study of machine learning to build these heuristics.The usual approach consists in running a set of benchmarks to setup heuristics parameters.Very fewD.Scott(Ed.):AIMSA2002,LNAI2443,pp.41–50,2002.c Springer-Verlag Berlin Heidelberg200242 A.Monsifrot,F.Bodin,and R.Quinioupapers have specifically addressed the issue of building such heuristics.Never-theless approximate heuristics have been proposed[8,11]for unroll and jam a transformation that is like unrolling(our example)but that behaves differently.Our study aims to simplify compiler construction while better exploiting optimizations.To evaluate the potential of this approach we have chosen a simple transformation:loop unrolling[6].Loop unrolling is always legal and is easy to implement,but because it has many side effects,it is difficult to devise a decision rule that will be correct in most situations.In this novel study we try to answer the following questions:is it possible to learn a decision rule that selects the parameters involved in loop unrolling effi-ciency?Does the Machine Learning Based Heuristics(denoted MLBH)achieve better performance than existing ones?Does the learning process really take into account the target architecture?To answer thefirst question we build on previous studies[9]that defined an abstract representation of loops in order to capture the parameters influencing performance.To answer the second question we compare the performance of our Machine Learning Based Heuristics and the GNU Fortran compiler[3]on a set of applications.To answer the last question we have used two target machines, an UltraSPARC machine[12]and an IA-64machine[7],and used on each the MLBH computed on the other.The paper is organized as follows.Section2gives an overview of the loop unrolling transformation.Section3shows how machine learning techniques can be used to automatically build loop unrolling heuristics.Section4illustrates an implementation of the technique based on the OC1decision tree software[10].2Loop Unrolling as a CaseThe performance of superscalar processors relies on a very high frequency1and on the parallel execution of multiple instructions(this is also called Instruction Level Parallelism–ILP).To achieve this,the internal architecture of superscalar microprocessors is based on the following features:Memory hierarchy:the main memory access time is typically hundreds of times greater than the CPU cycle time.To limit the slowdown due to memory accesses,a set of intermediate levels are added between the CPU unit and the main memory;the level the closest to the CPU is the fastest,but also the small-est.The data or instructions are loaded by blocks(sets of contiguous bytes in memory)from one memory level of the hierarchy to the next level to exploit the following fact:when a program accesses some memory element,the next contiguous one is usually also accessed in the very near future.In a classical configuration there are2levels,L1and L2of cache memories as shown on the figure1.The penalty to load data from main memory tends to be equivalent to executing1000processor cycles.If the data is already in L2,it is one order of magnitude less.If the data is already in L1,the access can be done in only a few CPU cycles.1typically2gigahertz corresponding to a processor cycle time of0.5nanosecondA Machine Learning Approach43MemoryMain L2CacheCacheL1CPUx1000x100x10x1Fig.1.Memory hierarchyMultiple Pipelined Functional Units:the processor has multiple functional units that can run in parallel to execute several instructions per cycle (typically an integer operation can be executed in parallel with a memory access and a floating point computation).Furthermore these functional units are pipelined.This divides the operation in a sequence of steps that can be performed in parallel.Scheduling instructions in the functional units is performed in an out-of-order or in-order mode.Contrary to the in-order ,in the out-of-order mode instructions are not always executed in the order specified by the program.When a processor runs at maximum speed,each pipelined functional unit executes one instruction per cycle.This requires that all operands and branch addresses are available at the beginning of the cycle.Otherwise,functional units must wait during some delays.The processor performance depends on these waiting delays.The efficiency of the memory hierarchy and ILP are directly related to the structure and behavior of the code.Many program transformations reduce the number of waiting delays in program execution.Loop unrolling [6]is a simple program transformation where the loop body is replicated many times.It may be applied at source code level to benefit from all compiler optimizations.It improves the exploitation of ILP:increasing the size of the body augments the number of instructions eligible to out-of-order scheduling.Loop unrolling also reduces loop management overhead but it has also some beneficial side effects from later compiler steps such as common sub-expression elimination.However it also has many negative side effects that can cancel the benefits of the transformation:–the instruction cache behavior may be degraded (if the loop body becomes too big to fit in the cache),–the register allocation phase may generate spill code (additional load and store instructions),–it may prevent other optimization techniques.As a consequence,it is difficult to fully exploit loop pilers are usually very conservative.Their heuristics are generally based on the loop body size:under a specific threshold,if there is no control flow statement,the loop is unrolled.This traditional approach under-exploits loop unrolling [5]and must be adapted when changes are made to the compiler or to the target architecture.The usual approach to build loop unrolling heuristics for a given target com-puter consists in running a set of benchmarks to setup the heuristics parameters.This approach is intrinsically limited because in most optimizations such as loop unrolling,too many parameters are involved.Microarchitecture characteristics44 A.Monsifrot,F.Bodin,and R.Quiniou(for instance the size of instruction cache,...)as well as the other optimizations (for instance instruction scheduling,...)that follow loop unrolling during the compilation process should be considered in the decision procedure.The main parameters,but not all(for instance number instruction cache misses),which influence loop unrolling efficiency directly depend on the loop body statements. This is because loop unrolling mainly impacts code generation and instruction scheduling.As a consequence,it is realistic to base the unrolling decision on the properties of the loop code while ignoring its execution context.3Machine Learning for Building HeuristicsMachine learning techniques offer an automatic,flexible and adaptive framework for dealing with the many parameters involved in deciding the effectiveness of program optimizations.Classically a decision rule is learnt from feature vectors describing positive and negative applications of the transformation.However, it is possible to use this framework only if the parameters can be abstracted statically from the loop code and if their number remains limited.Reducing the number of parameters involved in the process is important as the performance of machine learning techniques is poor when the number of dimensions of the learning space is high[10].Furthermore learning from complex spaces requires more data.To summarize the approach,the steps involved in using a machine learning technique for building heuristics for program transformation are:1.finding a loop abstraction that captures the“performance”features involvedin an optimization,in order to build the learning set,2.choosing an automatic learning process to compute a rule in order to decidewhether loop unrolling should be applied,3.setting up the result of the learning process as heuristics for the compiler. In the remainder of this section,we present the representation used for abstract-ing loop properties.The next section shows how to sort the loop into winning and loosing classes according to unrolling.Finally,the learning process based on learning decision trees is overviewed.3.1Loop AbstractionThe loop abstraction must capture the main loop characteristics that influence the execution efficiency on a modern processor.They are represented by integer features which are relevant static loop properties according to unrolling.We have selected5classes of integer features:Memory access:number of memory accesses,number of array element reuses from one iteration to another.Arithmetic operations count:number of additions,multiplications or divi-sions excepting those in array index computations.Size of the loop body:number of statements in the loop.A Machine Learning Approach45 Control statements in the loop:number of if statements,goto,etc.in the loop body.Number of iterations:if it can be computed at compile time.In order to reduce the learning complexity,only a subset of these features are used for a given compiler and target machine.The chosen subset was determined experimentally by cross validation(see Section4).The quality of the predictions achieved by an artificial neural network based on20indices was equivalent to the predictive quality of the6chosen features.Figure2gives the features that were selected and an example of loop ab-straction.do i=2,100a(i)=a(i)+a(i-1)*a(i+1) enddo Number of statements1 Number of arithmetic operations2 Minimum number of iterations99 Number of array accesses4 Number of array element reuses3 Number of if statements0Fig.2.Example of features for a loop.3.2Unrolling Beneficial LoopsA learning example refers to a loop in a particular context(represented by the loop features).To determine if unrolling is beneficial,each loop is executed twice.Then,the execution times of the original loop and of the unrolled loop are compared.Four cases can be considered:not significant:the loop execution time is too small and therefore the timing is not significant.The loop is discarded from the learning set.equal:the execution times of the original and of the unrolled loop are close.A threshold is used to take into account the timer inaccuracy.Thus,the loop performance is considered as invariant by unrolling if the benefit is less than 10%.improved:the speedup is above10%.The loop is considered as being benefi-cial by unrolling.degraded:there is a speed-down.The loop is considered as a degraded loop by unrolling.The loop set is then partitioned into equivalence classes(denoted loop classes in the remainder).Two loops are in the same loop class if their respective ab-stractions are equal.The next step is to decide if a loop class is to be considered as a positive or a negative example.Note that there can be beneficial and degraded loops in the same class as non exhaustive descriptions are used to represent the loops.This is a natural situation as the loop execution or compilation context may greatly influence its execution time,for instance due to instruction cache memory effects. The following criterion has been used to decide whether a class will represent a positive or a negative example:46 A.Monsifrot,F.Bodin,and R.Quiniou1.In a particular class,a loop whose performance degrades by less than 5%is counted once,a loop that degrades performance by 10%is counted twice.A loop that degrades performance more than 20%is counted three times.2.if the number of unrolling beneficial loops is greater than the number of degraded loops (using the weights above),then the class represents a positive example,else the class represents a negative example.3.3A Learning Method Based on Decision Trees and BoostingWe have chosen to represent unrolling decision rules as decision trees.Decision trees can be learnt efficiently from feature based vectors.Each node of the de-cision tree represents a test checking the value(s)of one (or several)feature(s)which are easy to read by an expert.This is not the case for statistical meth-ods like Nearest Neighbor or Artificial Neural Network for instance,which have comparable or slightly better performance.We used the OC1[10]software.OC1is a classification tool that induces oblique decision trees.Oblique decision trees produce polygonal partitionings of the feature space.OC1recursively splits a set of objects in a hyperspace by finding optimal hyperplanes until every object in a subspace belongs to the same class.nnyyxynBA 6x+y > 60 ?3x −2y > 6 ?B −x+2y > 8 ?A Fig.3.The left side of the figure shows an oblique decision tree that uses two attributes.The right side shows the partitioning that this tree creates in the attribute space.A decision tree example is shown in Figure 3,together with its 2-D related space.Each node of the tree tests a linear combination of some indices (equiva-lent to an hyperplane)and each leaf of the tree corresponds to a class.The main advantages of OC1is that it finds smaller decision trees than classical tree learn-ing methods.The major drawback is that they are less readable than classical ones.The classification of a new loop is equivalent to finding a leaf loop class.Once induced,a decision tree can be used as a classification process.An object represented by its feature vector is classified by following the branches of the tree indicated by node tests until a leaf is reached.To improve the accuracy obtained with OC1we have used boosting [13].Boosting is a general method for improving the accuracy of any given algorithm.A Machine Learning Approach 47Boosting consists in learning a set of classifiers for more and more difficult prob-lems:the weights of examples that are misclassified by the classifier learnt at step n are augmented (by a factor proportional to the global error)and at step n +1a new classifier is learnt on this weighted examples.Finally,the global classification is obtained by a weighted vote of the individual classifier according to their proper accuracy.In our case 9trees were computed.4ExperimentsThe learning set used in the experiments is Number of loops 1036Discarded loops 177Unrolling beneficial loops 233Unrolling invariant loops 505Unrolling degraded loops 121Loop classes 572Positive examples 139Negative examples 433Table 1.IA-64learning set.made of loops extracted from programs inFortran 77.Most of them were chosen inavailable benchmarks [4,1].We have studiedtwo types of programs:real applications (themajority comes from SPEC [4])and compu-tational kernels.Table 1presents some char-acteristics of a loop set (cf section 3.2).The accuracy of the learning method wasassessed by a 10-fold cross-validation.We have experiment with pruning.We have ob-tained smaller trees but the resulting quality was degraded.The results without pruning are presented in Table 2.Two factors can explain the fact that the overall accuracy cannot be better than 85%:1.since unrolling beneficial and degraded loops can appear in the same class (cf section 3.2)a significant proportion of examples may be noisy,2.the classification of positive examples is far worse than the classification of negative ones.Maybe the learning set does not contain enough beneficial loops.To go beyond cross validation another set of experiments has been performed on two target machines,an UltraSPARC and an IA-64.They aim at showing the technique does catch the most significant loops of the programs.The g77[3]compiler was used.With this compiler,loop unrolling can be globally turned on and off.To assess our method we have implemented loop unrolling at the sourceTable 2.Cross validation accuracyUltraSPARC IA-64normal boosting normal boosting Accuracy of overallexample classification 79.4%85.2%82.6%85.2%Accuracy of positive example classification 62.4%61.7%73.9%69.6%Accuracy of negative example classification85.1%92.0%86.3%92.3%48 A.Monsifrot,F.Bodin,and R.Quinioucode level using TSF[2].This is not the most efficient scheme because in some cases this inhibits some of the compiler optimizations(contrary to unrolling performed by the compiler itself).We have performed experiments to check whether the MLB heuristics are at least as good as compiler heuristics and whether the specificities of a target architecture can be taken into account.A set of benchmark programs were selected in the learning set and for each one we have:1.run the code compiled by g77with-O3option,2.run the code compiled with-O3-funroll options:the compiler uses its ownunrolling strategy,3.unroll the loops according to the result of the MLB heuristics and run thecompiled code with-O3option.The heuristics was learned for the target machine from learning set where the test program was removed.4.unroll the loops according to the result of the MLB heuristics learnt for theother target machine and run the compiled code with-O3option.Fig.4.IA-64:-O3is the reference execution time.The performance results are given in Figure4and Figure5respectively for the IA-64and UltraSPARC targets.The average execution time of the optimized programs for the IA-64is93.8% of the reference execution time(no unrolling)using the MLB heuristics and 96.8%using the g77unrolling strategy.On the UltraSPARC we have respectively 96%and98.7%showing that our unrolling strategy performs better.Indeed,A Machine Learning Approach49Fig.5.UltraSPARC:-O3is the reference execution time.gaining a few percent on average execution time with one transformation is significant because each transformation is not often beneficial.For example,only 22%of the loops are beneficial by unrolling on IA-64and17%on UltraSPARC.In the last experiment we exchanged the decision trees learnt for the two target machines.On the UltraSPARC,the speedup is degraded from96%to 97.9%and on the IA-64it is degraded from93.8%to96.8%.This shows that the heuristics are effectively tuned to a target architecture.5ConclusionCompilers implement a lot of optimization algorithms for improving perfor-mance.The choice of using a particular sequence of optimizations and their parameters is done through a set of heuristics hard coded in the compiler.At each major compiler revision,but also at new implementations of the target Instruction Set Architecture,a new set of heuristics must be reengineered.In this paper,we have presented a new method for addressing such reengi-neering in the case of loop unrolling.Our method is based on a learning process which adapts to new target architectures or new compiler ing an abstract loop representation we showed that decision trees that provide target specific heuristics for loop unrolling can be learnt.While our study is limited to the simple case of loop unrolling it opens a new direction for the design of compiler heuristics.Even for loop unrolling, there are still many issues to consider to go beyond thisfirst result.Are there better abstractions that can capture loop characteristics?Can hardware counters50 A.Monsifrot,F.Bodin,and R.Quiniou(for instance cache miss counters)provide better insight on loop unrolling?How large should the learning set be?Can other machine learning techniques be more efficient than decision trees?More fundamentally our study raises the question whether it could be pos-sible or not to quasi automatically reengineer the implementation of a set of optimization heuristics for new processor target implementations.Acknowledgments.We would like to gratefully thank I.C.Lerman and L. Miclet for their insightful advice on machine learning techniques.References1.David Bailey.Nas kernel benchmark program,June1988./benchmark/nas.2.F.Bodin,Y.M´e vel,and R.Quiniou.A User Level Program Transformation Tool.In Proceedings of the International Conference on Supercomputing,pages180–187, July1998,Melbourne,Australia.3.GNU Fortran Compiler./.4.Standard Performance Evaluation Corporation./.5.Jack W.Davidson and Sanjay Jinturkar.Aggressive Loop Unrolling in a Retar-getable,Optimizing Compiler.In Compiler Construction,volume1060of Lecture Notes in Computer Science,pages59–73.Springer,April1996.6.J.J.Dongarra and A.R.Hinds.Unrolling loops in FORTRAN.Software Practiceand Experience,9(3):219–226,March1979.7.IA-64./design/Itanium/idfisa/index.htm.8.A.Koseki,H.Komastu,and Y.Fukazawa.A Method for Estimating OptimalUnrolling Times for Nested Loops.In Proceedings of the International Symposium on Parallel Architectures,Algorithms and Networks,1997.9.A.Monsifrot and puter Aided Hand Tuning(CAHT):“ApplyingCase-Based Reasoning to Performance Tuning”.In Proceedings of the15th ACM International Conference on Supercomputing(ICS-01),pages196–203.ACM Press, June17–212001,Sorrento,Italy.10.Sreerama K.Murthy,Simon Kasif,and Steven Salzberg.A System for Inductionof Oblique Decision Trees.Journal of Artificial Intelligence Research,2:1–32,1994.11.Vivek Sarkar.Optimized Unrolling of Nested Loops.In Proceedings of the14thACM International Conference onSupercomputing(ICS-00),pages153–166.ACM Press,May2000.12.UltraSPARC./processors/UltraSPARC-II/.13.C.Yu and D.B.Skillicorn.Parallelizing Boosting and Bagging.Technical report,Queen’s University,Kingston,Ontario,Canada K7L3N6,February2001.。