Mining Business Topics in Source Code using Latent Dirichlet Allocation ABSTRACT
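The title refers to mining business topics from source code with Latent Dirichlet Allocation (LDA); the abstract itself is not reproduced in this excerpt. As a rough illustration of the general technique (not the paper's actual tooling or corpus), the sketch below fits a small LDA model over made-up identifier and comment tokens extracted from source files, using the gensim library.

```python
from gensim import corpora, models

# Made-up "documents": tokens (identifiers, comment words) from three source files.
docs = [
    ["account", "balance", "deposit", "withdraw", "transaction"],
    ["render", "widget", "layout", "button", "click"],
    ["account", "transaction", "audit", "ledger", "balance"],
]

# Build a vocabulary and bag-of-words corpus, then fit a 2-topic LDA model.
dictionary = corpora.Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary,
                      random_state=0, passes=10)

# Inspect the discovered "business topics" as weighted word lists.
for topic_id, words in lda.print_topics(num_words=4):
    print(topic_id, words)
```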
Business English Translation, Unit 2

1. Red tape and other examples of government bureaucracy hinder a company's entry into a market.
2. In these days of increasing global integration, the task many international marketers face is not so much market entry as managing the marketing mix in different national markets.
3. Some companies, however, do develop the same product for all markets regardless of existing local preferences.
4. Firms sometimes customize a product to every market; at other times they offer one standardized product everywhere; and sometimes they compromise and settle in the middle.
5. The advent of the Internet and intranets has the potential to accelerate the process of mining all markets for relevant information and for features that can be included in new products.
6. When Citibank introduced its credit card in the Asia-Pacific region, it launched it sequentially and tailored the product features for each country while maintaining its premium positioning.
English for the Computer Information Course

Computer information is an essential subject that covers various aspects of computing technology, data management, and information systems. This course provides students with a comprehensive understanding of the fundamental concepts and practical skills required in the field of computer information.

The course curriculum typically consists of the following topics:

1. Introduction to Computer Systems: This module introduces the basic components and functions of computer systems, including hardware, operating systems, and software.
2. Data Representation: Students learn how data is represented and stored in computer systems using binary numbers, ASCII codes, and various data structures (see the short example after this overview).
3. Computer Networks: This module covers the basics of computer networking, including network protocols, network topologies, and internet connectivity.
4. Database Management Systems: Students learn the principles and practices of database design, implementation, and management. They also explore topics like query languages and data security.
5. Information Systems and Business Applications: This part of the course focuses on the applications of computer information in business environments. Topics covered include enterprise resource planning (ERP), customer relationship management (CRM), and supply chain management (SCM) systems.
6. Web Development: Students acquire skills in web development, including HTML, CSS, and JavaScript. They also learn about website design, usability, and the principles of responsive web design.
7. Data Analytics: This module introduces students to the field of data analytics and its applications in decision-making and problem-solving. Topics covered include data visualization, statistical analysis, and data mining techniques.
8. Information Security: Students learn about the importance of information security and the measures needed to protect computer systems and data. Topics covered include encryption techniques, access control, and security policies.

Throughout the course, students engage in practical assignments and projects to apply their learning and develop essential skills in areas such as data manipulation, programming, and problem-solving. By the end of the course, students should have a solid understanding of computer information and be equipped with the necessary skills to pursue careers in areas such as information technology, data analysis, or system administration.
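As a small illustration of the data-representation topic listed above, the following Python sketch shows how characters map to their ASCII codes and an 8-bit binary form; the values shown are simply the standard ASCII encoding, not material taken from the course itself.

```python
# Minimal illustration of data representation: characters -> ASCII codes -> binary.
text = "Hi"

for ch in text:
    code = ord(ch)              # ASCII/Unicode code point of the character
    bits = format(code, "08b")  # the same value as an 8-bit binary string
    print(f"{ch!r} -> decimal {code}, binary {bits}")

# Expected output:
# 'H' -> decimal 72, binary 01001000
# 'i' -> decimal 105, binary 01101001
```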
An English Essay on Mineral Resources

Mining resources are the backbone of our modern world. They're like the hidden treasures beneath our feet, waiting to be unearthed and put to use. From shiny gold to humble gravel, these resources shape our lives in ways we often don't even realize.

Think about it – every gadget you own, every building you see, they all started as a glint in a miner's eye. The metals and minerals dug up from the earth form the foundation of our technology and infrastructure. Without them, we'd still be living in caves, banging rocks together.

But mining isn't all about shiny riches. It's a gritty, dirty business, with its fair share of challenges. Environmental concerns loom large, as we grapple with the impact of extracting resources from the earth. We're constantly walking the tightrope between progress and preservation, trying to strike a balance that benefits both humanity and the planet.

Then there's the human side of mining – the sweat and toil of the workers who brave the depths of the earth to bring these resources to light. It's a tough job, filled with danger and uncertainty. Yet, for many, it's a way of life, passed down through generations like a torch in the darkness.

And let's not forget the geopolitical minefield that mining often becomes. Countries jostle for control of valuable resources, sometimes sparking conflicts that rage for years. It's a high-stakes game of power and influence, played out on the global stage.

But amidst all the chaos and controversy, one thing remains clear: mining is essential to our way of life. Without it, we'd be lost – adrift in a sea of want and need. So, as we delve deeper into the earth in search of riches, let's not forget the responsibility that comes with it. Let's mine wisely, with one eye on the future and the other on the past.
An Introduction to Books on the Big Data Era

In the era of big data, the importance of understanding and harnessing the power of data has become increasingly crucial. As a result, there has been a surge in the number of books that delve into the subject of big data, offering insights, strategies, and practical tips for navigating this data-driven world. These books cover a wide range of topics, including data analysis, machine learning, data visualization, and data management.

One popular book in this field is "Big Data: A Revolution That Will Transform How We Live, Work, and Think" by Viktor Mayer-Schönberger and Kenneth Cukier. This book explores the potential of big data to revolutionize various aspects of our lives, from business and healthcare to politics and education. It provides real-life examples and case studies to illustrate how big data is already being used to make informed decisions and drive innovation.

Another notable book is "Data Science for Business" by Foster Provost and Tom Fawcett. This book focuses on the practical applications of data science in the business world. It covers topics such as data mining, predictive modeling, and data-driven decision making. The authors provide clear explanations and use real-world examples to demonstrate how businesses can leverage data to gain a competitive edge.

For those interested in the technical aspects of big data, "Hadoop: The Definitive Guide" by Tom White is a comprehensive resource. This book offers a deep dive into Hadoop, an open-source framework used for processing and analyzing large datasets. It covers topics such as data storage, data processing, and data analysis using Hadoop. The author provides code examples and practical tips for setting up and using Hadoop in a real-world environment.
Workflow mining:A survey of issues and approachesW.M.P.van der Aalst a,*,B.F.van Dongen a ,J.Herbst b ,L.Maruster a ,G.Schimm c ,A.J.M.M.Weijters aaDepartment of Technology Management,Eindhoven University of Technology,P.O.Box 513,NL-5600MB,Eindhoven,The NetherlandsbDaimlerChrysler AG,Research and Technology,P.O.Box 2360,D-89013Ulm,GermanycOFFIS,Escherweg 2,D-26121Oldenburg,GermanyReceived 2October 2002;received in revised form 29January 2003;accepted 26February 2003AbstractMany of today Õs information systems are driven by explicit process models.Workflow management systems,but also ERP,CRM,SCM,and B2B,are configured on the basis of a workflow model specifying the order in which tasks need to be executed.Creating a workflow design is a complicated time-consuming process and typically there are discrepancies between the actual workflow processes and the processes as perceived by the management.To support the design of workflows,we propose the use of workflow mining.Starting point for workflow mining is a so-called ‘‘workflow log’’containing information about the workflow process as it is actually being executed.In this paper,we introduce the concept of workflow mining and present a common format for workflow logs.Then we discuss the most challenging problems and present some of the workflow mining approaches available today.Ó2003Elsevier B.V.All rights reserved.Keywords:Workflow mining;Workflow management;Data mining;Petri nets1.IntroductionDuring the last decade workflow management technology [2,4,21,35,41]has become readily available.Workflow management systems such as Staffware,IBM MQSeries,COSA,etc.offer*Corresponding author.Tel.:+31-40-247-4295;fax:+31-40-243-2612.E-mail addresses:w.m.p.v.d.aalst@tm.tue.nl (W.M.P.van der Aalst),joachim.j.herbst@ (J.Herbst),schimm@offis.de (G.Schimm).0169-023X/$-see front matter Ó2003Elsevier B.V.All rights reserved.doi:10.1016/S0169-023X(03)00066-1/locate/datakData &Knowledge Engineering 47(2003)237–267238W.M.P.van der Aalst et al./Data&Knowledge Engineering47(2003)237–267generic modeling and enactment capabilities for structured business processes.By making process definitions,i.e.,models describing the life-cycle of a typical case(workflow instance)in isolation, one can configure these systems to support business processes.These process definitions need to be executable and are typically graphical.Besides pure workflow management systems many other software systems have adopted workflow technology.Consider for example Enterprise Resource Planning(ERP)systems such as SAP,PeopleSoft,Baan and Oracle,Customer Relationship Management(CRM)software,Supply Chain Management(SCM)systems,Business to Business (B2B)applications,etc.which embed workflow technology.Despite its promise,many problems are encountered when applying workflow technology.One of the problems is that these systems require a workflow design,i.e.,a designer has to construct a detailed model accurately describing the routing of work.Modeling a workflow is far from trivial:It requires deep knowledge of the business process at hand(i.e.,lengthy discussions with the workers and management are needed) and the workflow language being used.To compare workflow mining with the traditional approach towards workflow design and enactment,consider the workflow life cycle shown in Fig.1.The workflow life cycle consists of four phases:(A)workflow design,(B)workflow configuration,(C)workflow enactment,and(D) workflow diagnosis.In the traditional approach the design phase is used for constructing a workflow 
model.This is typically done by a business consultant and is driven by ideas of man-agement on improving the business processes at hand.If the design isfinished,the workflow system(or any other system that is‘‘process aware’’)is configured as specified in the design phase. In the configuration phases one has to deal with limitation and particularities of the workflow management system being used(cf.[5,65]).In the enactment phase,cases(i.e.,workflow instances) are handled by the workflow system as specified in the design phase and realized in the configu-ration phase.Based on a running workflow,it is possible to collect diagnostic information which is analyzed in the diagnosis phase.The diagnosis phase can again provide input for the design phase thus completing the workflow life cycle.In the traditional approach the focus is on the design and configuration phases.Less attention is paid to the enactment phase and few organi-zations systematically collect runtime data which is analyzed as input for redesign(i.e.,the diag-nosis phase is typically missing).The goal of workflow mining is to reverse the process and collect data at runtime to support workflow design and analysis.Note that in most cases,prior to the deployment of a workflowW.M.P.van der Aalst et al./Data&Knowledge Engineering47(2003)237–267239 system,the workflow was already there.Also note that in most information systems transactional data is registered(consider for example the transaction logs of ERP systems like SAP).The in-formation collected at run-time can be used to derive a model explaining the events recorded. Such a model can be used in both the diagnosis phase and the(re)design phase.Modeling an existing process is influenced by perceptions,e.g.,models are often normative in the sense that they state what‘‘should’’be done rather than describing the actual process.As a result models tend to be rather subjective.A more objective way of modeling is to use data related to the actual events that took place.Note that workflow mining is not biased by perceptions or normative behavior.However,if people bypass the system doing things differently,the log can still deviate from the actual work being done.Nevertheless,it is useful to confront man-made models with models discovered through workflow mining.Closely monitoring the events taking place at runtime also enables Delta analysis,i.e.,detecting discrepancies between the design constructed in the design phase and the actual execution regis-tered in the enactment phase.Workflow mining results in an‘‘a posteriori’’process model which can be compared with the‘‘a priori’’model.Workflow technology is moving into the direction of more operationalflexibility to deal with workflow evolution and workflow exception handling [2,7,10,13,20,30,39,40,64].As a result workers can deviate from the prespecified workflow design. 
Clearly one wants to monitor these deviations.For example,a deviation may become common practice rather than being a rare exception.In such a case,the added value of a workflow system becomes questionable and an adaptation is required.Clearly,workflow mining techniques can be used to create a feedback loop to adapt the workflow model to changing circumstances and detect imperfections of the design.The topic of workflow mining is related to management trends such as Business Process Re-engineering(BPR),Business Intelligence(BI),Business Process Analysis(BPA),Continuous Process Improvement(CPI),and Knowledge Management(KM).Workflow mining can be seen as part of the BI,BPA,and KM trends.Moreover,workflow mining can be used as input for BPR and CPI activities.Note that workflow mining seems to be more appropriate for BPR than for CPI.Recall that one of the basic elements of BPR is that it is radical and should not be restricted by the existing situation[23].Also note that workflow mining is not a tool to(re)design processes. The goal is to understand what is really going on as indicated in Fig.1.Despite the fact that workflow mining is not a tool for designing processes,it is evident that a good understanding of the existing processes is vital for any redesign effort.This paper is a joint effort of a number of researchers using different approaches to workflow mining and is a spin-offof the‘‘Workflow Mining Workshop’’.1The goal of this paper is to introduce the concept of workflow mining,to identify scientific and practical problems,to present a common format to store workflow logs,to provide an overview of existing approaches,and to present a number of mining techniques in more detail.The remainder of this paper is organized as follows.First,we summarize related work.In Section3we define workflow mining and present some of the challenging problems.In Section 4we propose a common XML-based format for storing and exchanging workflow logs.This format is used by the mining tools developed by the authors and interfaces with some of the 1This workshop took place on May22nd and23rd2002in Eindhoven,The Netherlands.240W.M.P.van der Aalst et al./Data&Knowledge Engineering47(2003)237–267leading workflow management systems(Staffware,MQSeries Workflow,and InConcert).Sec-tions5–9introducefive approaches to workflow mining focusing on different aspects.These sections give an overview of some of the ongoing work on workflow mining.Section10 compares the various approaches and list a number of open problems.Section11concludes the paper.2.Related workThe idea of process mining is not new[8,11,15–17,24–29,42–44,53–57,61–63].Cook and Wolf have investigated similar issues in the context of software engineering processes.In[15]they describe three methods for process discovery:one using neural networks,one using a purely algorithmic approach,and one Markovian approach.The authors consider the latter two the most promising approaches.The purely algorithmic approach builds afinite state machine (FSM)where states are fused if their futures(in terms of possible behavior in the next k steps) are identical.The Markovian approach uses a mixture of algorithmic and statistical methods and is able to deal with noise.Note that the results presented in[6]are limited to sequential behavior.Cook and Wolf extend their work to concurrent processes in[16].They propose specific metrics(entropy,event type counts,periodicity,and causality)and use these metrics to discover models out of event streams.However,they do not provide an 
approach to generate explicit process models.Recall that thefinal goal of the approach presented in this paper is to find explicit representations for a broad range of process models,i.e.,we want to be able to generate a concrete Petri net rather than a set of dependency relations between events.In[17] Cook and Wolf provide a measure to quantify discrepancies between a process model and the actual behavior as registered using event-based data.The idea of applying process mining in the context of workflow management wasfirst introduced in[11].This work is based on workflow graphs,which are inspired by workflow products such as IBM MQSeries workflow(formerly known as Flowmark)and InConcert.In this paper,two problems are defined.Thefirst problem is tofind a workflow graph generating events appearing in a given workflow log.The second problem is tofind the definitions of edge conditions.A concrete algorithm is given for tackling thefirst problem.The approach is quite different from other approaches:Because the nature of workflow graphs there is no need to identify the nature(AND or OR)of joins and splits.As shown in[37],workflow graphs use true and false tokens which do not allow for cyclic graphs. Nevertheless,[11]partially deals with iteration by enumerating all occurrences of a given task and then folding the graph.However,the resulting conformal graph is not a complete model.In [44],a tool based on these algorithms is presented.Schimm[53,54,57]has developed a mining tool suitable for discovering hierarchically structured workflow processes.This requires all splits and joins to be balanced.Herbst and Karagiannis also address the issue of process mining in the context of workflow management[24–29]using an inductive approach.The work presented in[27,29]is limited to sequential models.The approach described in[24–26,28]also allows for concurrency.It uses stochastic task graphs as an intermediate representation and it generates a workflow model described in the ADONIS modeling language.In the induction step task nodes are merged and split in order to discover the underlying process.A notable difference with other approaches is that the same task can appear multiple times in the workflow model.The graphW.M.P.van der Aalst et al./Data&Knowledge Engineering47(2003)237–267241 generation technique is similar to the approach of[11,44].The nature of splits and joins(i.e., AND or OR)is discovered in the transformation step,where the stochastic task graph is transformed into an ADONIS workflow model with block-structured splits and joins.In con-trast to the previous papers,the work in[8,42,43,61,62]is characterized by the focus on workflow processes with concurrent behavior(rather than adding ad hoc mechanisms to capture parallelism).In[61,62]a heuristic approach using rather simple metrics is used to construct so-called‘‘dependency/frequency tables’’and‘‘dependency/frequency graphs’’.In[42]another variant of this technique is presented using examples from the health-care domain.The pre-liminary results presented in[42,61,62]only provide heuristics and focus on issues such as noise. The approach described in[8]differs from these approaches in the sense that for the a algorithm it is proven that for certain subclasses it is possible tofind the right workflow model.In[3]the a algorithm is extended to incorporate timing information.Process mining can be seen as a tool in the context of Business(Process)Intelligence(BPI). 
In[22,52]a BPI toolset on top of HPs Process Manager is described.The BPI tools set includes a so-called‘‘BPI Process Mining Engine’’.However,this engine does not provide any techniques as discussed before.Instead it uses generic mining tools such as SAS Enterprise Miner for the generation of decision trees relating attributes of cases to information about execution paths(e.g., duration).In order to do workflow mining it is convenient to have a so-called‘‘process data warehouse’’to store audit trails.Such as data warehouse simplifies and speeds up the queries needed to derive causal relations.In[19,46–48]the design of such warehouse and related issues are discussed in the context of workflow logs.Moreover,[48]describes the PISA tool which can be used to extract performance metrics from workflow logs.Similar diagnostics are provided by the ARIS Process Performance Manager(PPM)[34].The later tool is commercially available and a customized version of PPM is the Staffware Process Monitor(SPM)[59]which is tailored towards mining Staffware logs.Note that none of the latter tools is extracting the process model.The main focus is on clustering and performance analysis rather than causal relations as in[8,11,15–17,24–29,42–44,53–57,61–63].Much of the work mentioned above will be discussed in more detail in Sections5–9.Before doing so,wefirst look at workflow mining in general and introduce a common XML-based format for storing and exchanging workflow logs.3.Workflow miningThe goal of workflow mining is to extract information about processes from transaction logs. Instead of starting with a workflow design,we start by gathering information about the workflow processes as they take place.We assume that it is possible to record events such that(i)each event refers to a task(i.e.,a well-defined step in the workflow),(ii)each event refers to a case(i.e.,a workflow instance),and(iii)events are totally ordered.Any information system using transac-tional systems such as ERP,CRM,or workflow management systems will offer this information in some form.Note that we do not assume the presence of a workflow management system.The only assumption we make,is that it is possible to collect workflow logs with event data.These workflow logs are used to construct a process specification which adequately models the behavior registered. 
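The assumptions above are that each logged event refers to a task and to a case, and that events are totally ordered. The sketch below shows one minimal way to hold such a workflow log in memory and regroup it per case; the event tuples and task names are made-up illustrations in the spirit of the paper's examples, not data taken from the paper.

```python
from collections import defaultdict

# A minimal workflow log: each event refers to a case and a task,
# listed in the (total) order in which the events were recorded.
events = [
    ("case1", "A"), ("case2", "A"), ("case1", "B"),
    ("case1", "C"), ("case2", "C"), ("case2", "B"),
    ("case1", "D"), ("case3", "A"), ("case2", "D"),
    ("case3", "E"), ("case3", "D"),
]

# Regroup the log per case, preserving the recorded order within each case.
traces = defaultdict(list)
for case_id, task in events:
    traces[case_id].append(task)

for case_id, trace in traces.items():
    print(case_id, "->", " ".join(trace))
# case1 -> A B C D
# case2 -> A C B D
# case3 -> A E D
```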
The term process mining refers to methods for distilling a structured process description from a set242W.M.P.van der Aalst et al./Data&Knowledge Engineering47(2003)237–267of real executions.Because these methods focus on so-called case-driven process that are sup-ported by contemporary workflow management systems,we also use the term workflow mining.Table1shows a fragment of a workflow log generated by the Staffware system.In Staffware events are grouped on a case-by-case basis.Thefirst column refers to the task(description),the second to the type of event,the third to the user generating the event(if any),and the last column shows a time stamp.The corresponding Staffware model is shown in Fig.2.Case10shown in Table1follows the scenario wherefirst task Register is executed followed Send questionnaire, Receive questionnaire,and Evaluate.Based on the Evaluation,the decision is made to directly archive(task Archive)the case without further processing.For Case9further processing is nee-ded,while Case8involves a timeout and the repeated execution of some tasks.Someone familiar with Staffware will be able to decide that the three cases indeed follow a scenario possible in the Staffware model shown in Fig.2.However,three cases are not sufficient to automatically derive the model of Fig.2.Note that there are more Staffware models enabling the three scenarios shown in Table1.The challenge of workflow mining is to derive‘‘good’’workflow models with as little information as possible.To illustrate the principle of process mining in more detail,we consider the workflow log shown in Table2.This log abstracts from the time,date,and event type,and limits the information to the order in which tasks are being executed.The log shown in Table2contains information aboutfive cases(i.e.,workflow instances).The log shows that for four cases(1,2,3,and4)the tasks A,B,C, and D have been executed.For thefifth case only three tasks are executed:tasks A,E,and D. Each case starts with the execution of A and ends with the execution of D.If task B is executed, then also task C is executed.However,for some cases task C is executed before task B.Based on the information shown in Table2and by making some assumptions about the completeness of the log(i.e.,assuming that the cases are representative and a sufficient large subset of possible be-haviors is observed),we can deduce for example the process model shown in Fig.3.The model is represented in terms of a Petri net[50].The Petri net starts with task A andfinishes with task D. 
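The deduction just described, where B and C are treated as parallel because they occur in either order while A always opens and D always closes a case, can be made concrete. The sketch below uses per-case traces like those of Table 2 and derives the direct-succession, causal, and parallel relations that discovery algorithms such as the α algorithm mentioned in this survey build on; it is an illustration of the idea, not the authors' implementation.

```python
from itertools import product

# Per-case traces as in Table 2 (the order of tasks within each case).
traces = ["ABCD", "ACBD", "ABCD", "ACBD", "AED"]

# x > y: x is directly followed by y in at least one trace.
directly_follows = {(t[i], t[i + 1]) for t in traces for i in range(len(t) - 1)}

tasks = sorted({task for t in traces for task in t})
causal, parallel = set(), set()
for x, y in product(tasks, tasks):
    if (x, y) in directly_follows and (y, x) not in directly_follows:
        causal.add((x, y))          # x -> y: succession observed in one direction only
    if (x, y) in directly_follows and (y, x) in directly_follows:
        parallel.add((x, y))        # x || y: succession observed in both orders

print("causal:  ", sorted(causal))    # ('A','B'), ('A','C'), ('A','E'), ('B','D'), ('C','D'), ('E','D')
print("parallel:", sorted(parallel))  # ('B','C') and ('C','B')
```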
These tasks are represented by transitions.After executing A there is a choice between either executing B and C in parallel or just executing task E.To execute B and C in parallel two non-observable tasks(AND-split and AND-join)have been added.These tasks have been added for routing purposes only and are not present in the workflow log.Note that for this example we assume that two tasks are in parallel if they appear in any order.By distinguishing between start events and complete events for tasks it is possible to explicitly detect parallelism(cf.Section4).Table2contains the minimal information we assume to be present.In many applications,the workflow log contains a time stamp for each event and this information can be used to extract additional causality information.In addition,a typical log also contains information about the type of event,e.g.,a start event(a person selecting an task from a worklist),a complete event(the completion of a task),a withdraw event(a scheduled task is removed),etc.Moreover,we are also interested in the relation between attributes of the case and the actual route taken by a particular case.For example,when handling traffic violations:Is the make of a car relevant for the routing of the corresponding traffic violation?(For example,People driving a Ferrari always pay theirfines in time.)For this simple example(i.e.,Table2),it is quite easy to construct a process model that is able to regenerate the workflow log(e.g.,Fig.3).For more realistic situations there are however a number of complicating factors:W.M.P.van der Aalst et al./Data&Knowledge Engineering47(2003)237–267243 Table1A Staffware logDirective description Event User yyyy/mm/dd hh:mmCase10Start bvdongen@staffw_e2002/06/1912:58Register Processed to bvdongen@staffw_e2002/06/1912:58Register Released by bvdongen@staffw_e2002/06/1912:58Send questionnaire Processed to bvdongen@staffw_e2002/06/1912:58Evaluate Processed to bvdongen@staffw_e2002/06/1912:58Send questionnaire Released by bvdongen@staffw_e2002/06/1913:00Receive questionnaire Processed to bvdongen@staffw_e2002/06/1913:00Receive questionnaire Released by bvdongen@staffw_e2002/06/1913:00Evaluate Released by bvdongen@staffw_e2002/06/1913:00Archive Processed to bvdongen@staffw_e2002/06/1913:00Archive Released by bvdongen@staffw_e2002/06/1913:00Terminated2002/06/1913:00Case9Start bvdongen@staffw_e2002/06/1912:36Register Processed to bvdongen@staffw_e2002/06/1912:36Register Released by bvdongen@staffw_e2002/06/1912:35Send questionnaire Processed to bvdongen@staffw_e2002/06/1912:36Evaluate Processed to bvdongen@staffw_e2002/06/1912:36Send questionnaire Released by bvdongen@staffw_e2002/06/1912:36Receive questionnaire Processed to bvdongen@staffw_e2002/06/1912:36Receive questionnaire Released by bvdongen@staffw_e2002/06/1912:36Evaluate Released by bvdongen@staffw_e2002/06/1912:37Process complaint Processed to bvdongen@staffw_e2002/06/1912:37Process complaint Released by bvdongen@staffw_e2002/06/1912:37Check processing Processed to bvdongen@staffw_e2002/06/1912:37Check processing Released by bvdongen@staffw_e2002/06/1912:38Archive Processed to bvdongen@staffw_e2002/06/1912:38Archive Released by bvdongen@staffw_e2002/06/1912:38Terminated2002/06/1912:38Case8Start bvdongen@staffw_e2002/06/1912:36Register Processed to bvdongen@staffw_e2002/06/1912:36Register Released by bvdongen@staffw_e2002/06/1912:35Send questionnaire Processed to bvdongen@staffw_e2002/06/1912:36Evaluate Processed to bvdongen@staffw_e2002/06/1912:36Send questionnaire Released by 
bvdongen@staffw_e2002/06/1912:36Receive questionnaire Processed to bvdongen@staffw_e2002/06/1912:36Receive questionnaire Expired bvdongen@staffw_e2002/06/1912:37Receive questionnaire Withdrawn bvdongen@staffw_e2002/06/1912:37Receive timeout Processed to bvdongen@staffw_e2002/06/1912:37Receive timeout Released by bvdongen@staffw_e2002/06/1912:37Evaluate Released by bvdongen@staffw_e2002/06/1912:37Process complaint Processed to bvdongen@staffw_e2002/06/1912:37Process complaint Released by bvdongen@staffw_e2002/06/1912:37Check processing Processed to bvdongen@staffw_e2002/06/1912:37(continued on next page)Table 2A workflow log Case identifier Task identifier Case 1Task A Case 2Task A Case 3Task A Case 3TaskB Case 1Task B Case 1TaskC Case 2Task C Case 4Task A Case 2Task B Case 2TaskD Case 5Task A Case 4Task C Case 1Task D Case 3Task C Case 3Task D Case 4Task B Case 5TaskE Case 5Task D Case4TaskDFig.2.The staffware model.Table 1(continued )Directive description Event Useryyyy/mm/dd hh:mm Check processing Released by bvdongen@staffw_e 2002/06/1912:38Process complaint Processed to bvdongen@staffw_e 2002/06/1912:37Process complaint Released by bvdongen@staffw_e 2002/06/1912:37Check processing Processed to bvdongen@staffw_e 2002/06/1912:37Check processing Released by bvdongen@staffw_e 2002/06/1912:38Archive Processed to bvdongen@staffw_e 2002/06/1912:38ArchiveReleased by bvdongen@staffw_e2002/06/1912:38Terminated2002/06/1912:38244W.M.P.van der Aalst et al./Data &Knowledge Engineering 47(2003)237–267W.M.P.van der Aalst et al./Data&Knowledge Engineering47(2003)237–267245•For larger workflow models mining is much more difficult.For example,if the model exhibits alternative and parallel routing,then the workflow log will typically not contain all possible combinations.Consider10tasks which can be executed in parallel.The total number of inter-leavings is10!¼3628800.It is not realistic that each interleaving is present in the log.More-over,certain paths through the process model may have a low probability and therefore remain undetected.•Workflow logs will typically contain noise,i.e.,parts of the log can be incorrect,incomplete,or refer to exceptions.Events can be logged incorrectly because of human or technical errors.Events can be missing in the log if some of the tasks are manual or handled by another sys-tem/organizational unit.Events can also refer to rare or undesired events.Consider for example the workflow in a hospital.If due to time pressure the order of two events(e.g.,make X-ray and remove drain)is reversed,this does not imply that this would be part of the regular medical protocol and should be supported by the hospitalÕs workflow system.Also two causally unre-lated events(e.g.,take blood sample and death of patient)may happen next to each other with-out implying a causal relation(i.e.,taking a sample did not result in the death of the patient;it was sheer coincidence).Clearly,exceptions which are recorded only once should not automat-ically become part of the regular workflow.•Table2only shows the order of events without giving information about the type of event,the time of the event,and attributes of the event(i.e.,data about the case and/or task).Clearly,it isa challenge to exploit such additional information.Sections5–9will present different approaches to some of these problems.To conclude this section,we point out legal issues relevant when mining(timed)workflow logs. 
Clearly,workflow logs can be used to systematically measure the performance of employees.The legislation with respect to issues such as privacy and protection of personal data differs from country to country.For example,Dutch companies are bound by the Personal Data Protection Act (Wet Bescherming Persoonsgegeven)which is based on a directive from the European Union.The practical implications of this for the Dutch situation are described in[14,31,51].Workflow logs are not restricted by these laws as long as the information in the log cannot be traced back to indi-viduals.If information in the log can be traced back to a specific employee,it is important that the employee is aware of the fact that her/his activities are logged and the fact that this logging is used to monitor her/his performance.Note that in a timed workflow log we can abstract from information about the workers executing tasks and still mine the process.Therefore,it is possible to avoid collecting information on the productivity of individual workers and legislation such as the Personal246W.M.P.van der Aalst et al./Data&Knowledge Engineering47(2003)237–267Data Protection Act does not apply.Nevertheless,the logs of most workflow systems contain information about individual workers,and therefore,this issue should be considered carefully.4.Workflow logs:A common XML formatIn this section we focus on the syntax and semantics of the information stored in the workflow log.We will do this by presenting a tool independent XML format that is used by each of the mining approaches/tools described in the remainder.Fig.4shows that this XML format connects transactional systems such as workflow management systems,ERP systems,CRM systems,and case handling systems.In principle,any system that registers events related to the execution of tasks for cases can use this tool independent format to store and exchange logs.The XML format is used as input for the analysis tools presented in Sections5–9.The goal of using a single format is to reduce the implementation effort and to promote the use of these mining techniques in multiple contexts.Table3shows the Document Type Definition(DTD)[12]for workflow logs.This DTD specifies the syntax of a workflow log.A workflow log is a consistent XML document,i.e.,a well-formed and valid XMLfile with top element WorkFlow_log(see Table3).As shown,a workflow log consists of(optional)information about the source program and information about one or more workflow processes.Each workflow process(element process)consists of a sequence of cases (element case)and each case consists of a sequence of log lines(element log_line).Both processes and cases have an id and a description.Each line in the log contains the name of a task(element task_name).In addition,the line may contain information about the task instance(element task_instance),the type of event(element event),the date(element date),and the time of the event (element time).It is advised to make sure that the process description and the case description are unique for each process or case respectively.The task name should be a unique identifier for a task within a process.If there are two or more tasks in a process with the same task name,they are as-。
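Section 4 above describes a common XML format whose DTD uses the elements WorkFlow_log, source, process, case, log_line, task_name, event, date, and time. The snippet below parses a tiny hand-written log instance built from those element names; whether id and description are attributes or nested elements is not spelled out in the excerpt, so attributes are assumed here, and the event values simply echo the Staffware event types shown in Table 1.

```python
import xml.etree.ElementTree as ET

# A tiny log instance using the element names described for the common XML format.
log_xml = """
<WorkFlow_log>
  <source program="staffware"/>
  <process id="p1" description="complaint handling">
    <case id="case10" description="case 10">
      <log_line>
        <task_name>Register</task_name>
        <event>Released by</event>
        <date>2002/06/19</date>
        <time>12:58</time>
      </log_line>
      <log_line>
        <task_name>Evaluate</task_name>
        <event>Released by</event>
        <date>2002/06/19</date>
        <time>13:00</time>
      </log_line>
    </case>
  </process>
</WorkFlow_log>
"""

root = ET.fromstring(log_xml)
for case in root.iter("case"):
    for line in case.findall("log_line"):
        print(case.get("id"), line.findtext("task_name"), line.findtext("event"))
# case10 Register Released by
# case10 Evaluate Released by
```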
Foreign-trade development letters: when writing e-mails to customers, you have to set aside Chinese habits of writing and thinking and instead think and write the way European and American readers do; only then will customers treat you as one of their own.

When the day comes that you can dash off an e-mail and most readers cannot tell it was written by a Chinese speaker, you will have completed your apprenticeship. Let me first list some mistakes most people commonly make when writing e-mails; have a look, compare them with your own habits, and see how many apply to you, and then we will discuss how to write the development letter itself.

1) The e-mail is too long.

Customers' time is precious; they receive hundreds of e-mails every day. Think about it: if a stranger sent you a long, rambling e-mail in poor English with several megabytes of attachments, would you read it carefully? Many foreign buyers also manage their time strictly, setting aside a few fixed blocks each day to deal with e-mail. Long-winded messages that do not come from someone they know are usually deleted outright, or your address gets flagged as spam.

I have asked many Western European customers about this: they typically spend 2-3 seconds on each e-mail, just a quick scan. Important messages are read carefully and answered right away; less important ones are marked in Outlook with a specific time to handle them and then dragged from the inbox into the appropriate subfolder.

In other words, even if the address is correct and the recipient really is the right person you were looking for, your development letter gets only those 2-3 seconds in front of them, and that is the moment that decides its fate.

Given that, do you really dare write a long e-mail? 2) No clear subject line.

A vague subject line gives the customer no reason at all to open an e-mail from a stranger.

This takes experience: the subject must be brief and to the point, and if it alone draws the customer into opening the message, it has done its job.

Whether the customer responds after reading depends on the circumstances and on how well the body of the message is written.

Some people set subject lines such as "we are the manufacturer of lights", "need cooperation", "Guangdong *** trading company ltd", or "price list for lights-Guangdong *** trading company ltd"; one glance tells the reader it is a sales pitch.
Global Sources: Connecting Businesses around the World

Introduction

Globalization has revolutionized the way businesses operate. In this interconnected world, companies are no longer confined to local markets; they now have the opportunity to expand their reach and tap into global markets. However, with increasing global trade comes the challenge of finding reliable suppliers, manufacturers, or distributors. This is where Global Sources come into play. This document explores the significance of Global Sources and how it connects businesses around the world.

What is Global Sources?

Global Sources is a leading business-to-business (B2B) media company that specializes in facilitating trade between buyers and suppliers. Established in 1970, Global Sources offers a wide range of services to help businesses connect and build relationships. These services include online marketplaces, physical trade shows, sourcing reports, and trade magazines.

Online Marketplaces

One of the flagship services offered by Global Sources is its online marketplaces. These digital platforms allow buyers to search for suppliers, manufacturers, or products from all over the world. The online marketplaces cover various industries, including electronics, fashion, home and kitchen, and more. With the extensive database of verified suppliers, buyers can easily find the right business partners to meet their specific needs.

Physical Trade Shows

In addition to its online presence, Global Sources organizes trade shows in major cities around the world. These trade shows attract thousands of buyers and suppliers, creating a platform for face-to-face interactions and business negotiations. These events are advantageous as they allow businesses to showcase their products or services, establish brand recognition, and network with potential partners. Furthermore, trade shows organized by Global Sources often provide sourcing seminars and industry-specific conferences to keep attendees updated with the latest market trends and insights.

Sourcing Reports

Global Sources produces comprehensive sourcing reports that provide valuable market intelligence and analysis. These reports help businesses make informed decisions when sourcing products or suppliers. The reports cover a wide range of industries and include information on market trends, competitive analysis, supplier profiles, and pricing data. By leveraging these reports, businesses can gain a competitive edge and better understand their target markets, allowing for more effective sourcing strategies.

Trade Magazines

Global Sources publishes several trade magazines that serve as valuable resources for businesses. These magazines cover various industries and provide readers with industry news, product updates, and expert insights. The magazines also include advertisements and product listings from suppliers, enabling buyers to discover new products and suppliers conveniently.

The Advantages of Global Sources

Global Sources offers numerous advantages for both buyers and suppliers.

For buyers, Global Sources provides a one-stop platform for sourcing quality products, while also connecting them with reliable and verified suppliers. The online marketplaces and sourcing reports simplify the sourcing process, facilitating efficient decision-making. The physical trade shows enable buyers to establish personal connections, assess product quality, and negotiate prices directly.

For suppliers, Global Sources offers a global platform to showcase their products and services. The online marketplaces and trade shows provide opportunities to connect with international buyers, expand customer base, and increase sales. Furthermore, the sourcing reports help suppliers understand market demand, allowing them to tailor their offerings to meet customer needs effectively.

Conclusion

Global Sources plays a vital role in connecting businesses around the world. Its online marketplaces, physical trade shows, sourcing reports, and trade magazines are invaluable resources that enable buyers and suppliers to establish fruitful relationships. In an increasingly globalized world, businesses can rely on Global Sources to find trusted partners, explore new markets, and drive growth. With its comprehensive services and extensive network, Global Sources continues to contribute to the success of businesses in a rapidly evolving global marketplace.
Sample Introduction to the Specialty of Coal Mine Intelligent Mining Technology (English version)

In the rapidly advancing technological era, the coal mining industry is also undergoing significant transformations. One such transformation is the emergence of the Coal Mine Intelligent Mining Technology specialty, which aims to equip students with the knowledge and skills necessary to efficiently and safely extract coal using modern technology.

This specialty covers a wide range of topics, including the principles of coal mining, the latest advancements in mining equipment and technology, as well as the environmental and safety considerations in mining operations. Students are introduced to the concept of intelligent mining, which involves the integration of robotics, automation, and artificial intelligence to improve mining efficiency and reduce human intervention.

The curriculum of this specialty also focuses on data analysis and management, as intelligent mining generates a large amount of data that needs to be processed and analyzed. Students learn how to use advanced software and tools to monitor and control mining operations, predict potential hazards, and optimize resource utilization.

In addition to technical skills, students are also trained in project management, teamwork, and communication skills, which are crucial for working in a multidisciplinary team in the mining industry. The program prepares students to become professionals who can contribute to the sustainable development of the coal mining industry by applying innovative technology solutions.

The Coal Mine Intelligent Mining Technology specialty is designed to meet the growing demand for skilled professionals in the coal mining industry. With the continuous evolution of technology, this specialty will continue to play a crucial role in enhancing the efficiency, safety, and sustainability of coal mining operations.

Chinese version (translated): Introduction to the Specialty of Coal Mine Intelligent Mining Technology. In an era of rapid technological development, the coal mining industry is undergoing enormous changes.
Application of Data Mining in Cross-Border E-Commerce Customer Characteristics Analysis

SUN Hai-bo (Guangdong University of Technology, Guangzhou 510000, China)

Abstract: In recent years, with the deepening of economic globalization and the rapid development of Internet information technology, cross-border e-commerce has become a new growth point of trade between China and other countries. At the same time, ever more goods are sold on cross-border e-commerce platforms, and it is a common difficulty for consumers to pick the products that meet their needs out of such a large assortment. Through statistical analysis and mining of attribute characteristics of cross-border e-commerce customers and of the goods they purchase, this paper provides a reference for the marketing strategies and logistics layout of cross-border e-commerce enterprises and helps consumers quickly select satisfactory goods.

Keywords: data mining; cross-border e-commerce; characteristic analysis; Hadoop; FP-Growth algorithm
CLC number: F272.3   Document code: A   Article ID: 1009-3044(2021)15-0239-03   Open Science (Resource Service) Identification Code (OSID)
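The keywords name FP-Growth as the frequent-pattern algorithm. As a rough illustration of the kind of analysis the abstract describes (not the author's actual Hadoop pipeline), the sketch below mines frequent attribute combinations from a handful of made-up customer/order records using the FP-Growth implementation in the mlxtend library; the transaction items and the 0.5 support threshold are illustrative assumptions.

```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth

# Made-up customer/order attribute "transactions": each row lists attribute
# values observed together for one order (customer region, category, shipping).
transactions = [
    ["region=EU", "category=electronics", "shipping=express"],
    ["region=EU", "category=electronics", "shipping=standard"],
    ["region=US", "category=apparel", "shipping=express"],
    ["region=EU", "category=electronics", "shipping=express"],
    ["region=US", "category=electronics", "shipping=standard"],
]

# One-hot encode the transactions, then mine frequent itemsets with FP-Growth.
encoder = TransactionEncoder()
onehot = pd.DataFrame(encoder.fit(transactions).transform(transactions),
                      columns=encoder.columns_)
frequent = fpgrowth(onehot, min_support=0.5, use_colnames=True)
print(frequent.sort_values("support", ascending=False))
# e.g. {'category=electronics'} and {'region=EU', 'category=electronics'}
# come out as frequent combinations at this threshold.
```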
School of Software, Shanghai Jiao Tong University: Software Engineering in the Cloud Era
Associate Professor SHEN Beijun, High-Reliability Software Laboratory, /~bjshen

Cloud computing brings new challenges and new opportunities to software engineering:

1) Development and evolution technologies for cloud service software. Cloud computing is commonly divided into five layers, of which software engineering focuses on the top two: SaaS (Software as a Service) and PaaS (Platform as a Service).

At the SaaS layer, the main research topics include (but are not limited to):

● Online customization technology for SaaS applications: software architectures, customization methods, and techniques suited to personalized customization by large numbers of tenants, including online customization of business processes, business rules, functionality, data, and user interfaces (a minimal sketch of tenant-level customization follows this list).
● Methods and techniques for migrating Web applications to SaaS: approaches for moving legacy Web applications to the SaaS model, techniques for migrating single-tenant architectures to multi-tenant architectures, online customization frameworks based on tenant-space virtualization, and techniques for strengthening online customization capabilities.
● Online software upgrade technology: upgrading software and databases online, without downtime, so that users are unaffected or affected as little as possible.
● Dynamic software evolution mechanisms: software that automatically adjusts and optimizes itself in response to its environment and runtime feedback, which falls under research on autonomic computing.
● Quality evaluation methods for SaaS applications: quality models and evaluation methods for SaaS software that emphasize the quality requirements of SaaS applications in security, customizability, performance, and scalability; and, for the performance characteristics in particular, performance optimization methods based on maximizing revenue under SLAs.
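The online-customization topic above centers on letting each tenant override shared defaults without redeploying the application. The snippet below is a minimal sketch of that idea, assuming a simple in-memory layered lookup (tenant-specific settings falling back to application defaults); it is illustrative only and not tied to any particular SaaS platform discussed here.

```python
# Minimal sketch of tenant-level online customization: each tenant's settings
# override shared defaults, and changes take effect without redeployment.
DEFAULTS = {
    "approval_steps": ["submit", "review", "approve"],   # default business process
    "currency": "USD",
    "theme": "light",
}

tenant_overrides = {
    "tenant_a": {"currency": "EUR", "theme": "dark"},
    "tenant_b": {"approval_steps": ["submit", "approve"]},  # shortened process
}

def setting(tenant_id: str, key: str):
    """Resolve a setting: the tenant-specific value if present, else the default."""
    return tenant_overrides.get(tenant_id, {}).get(key, DEFAULTS[key])

def customize(tenant_id: str, key: str, value) -> None:
    """'Online' customization: update one tenant's settings at runtime."""
    tenant_overrides.setdefault(tenant_id, {})[key] = value

customize("tenant_a", "approval_steps", ["submit", "review", "legal", "approve"])
print(setting("tenant_a", "approval_steps"))  # tenant_a's customized process
print(setting("tenant_b", "currency"))        # falls back to the default: USD
```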
At the PaaS layer, the main research topics include (but are not limited to):

● Programming models for cloud computing: cloud computing is, at bottom, a parallel and distributed computing technology for processing large-scale, data-intensive workloads, so research into new programming models suited to the cloud is indispensable if developers are to take full advantage of its convenience and availability. A cloud programming model should be suited to parallel computation over large data sets and to task scheduling across many virtual machines, should support building new kinds of cloud applications, and should deliver richer user experiences over the network. Typical representatives are MapReduce (Google) and Dryad (Microsoft); see the sketch after this list.
● Adaptive resource scheduling and management for cloud computing is another important research topic at the PaaS layer.
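MapReduce is named above as a representative cloud programming model. The snippet below is a minimal, single-process sketch of the map/shuffle/reduce structure using a word-count example; real MapReduce frameworks distribute these phases across many machines, which this illustration does not attempt.

```python
from collections import defaultdict

documents = ["cloud computing scales out", "computing in the cloud", "cloud cloud"]

# Map phase: each document is turned into (key, value) pairs independently,
# which is what lets a framework run mappers in parallel on many machines.
def map_phase(doc):
    return [(word, 1) for word in doc.split()]

# Shuffle phase: group all intermediate values by key.
grouped = defaultdict(list)
for doc in documents:
    for key, value in map_phase(doc):
        grouped[key].append(value)

# Reduce phase: combine the values for each key into a final result.
def reduce_phase(key, values):
    return key, sum(values)

counts = dict(reduce_phase(k, v) for k, v in grouped.items())
print(counts)  # e.g. {'cloud': 4, 'computing': 2, ...}
```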
大学英语考试专业英语八级TEM8模拟题2020年(6)(总分100, 做题时间90分钟)READING COMPREHENSION. SECTION A MULTIPLE-CHOICE QUESTIONSPASSAGE ONE(1)People who live in Taylorstown have made their choices: scenery over shopping, deer over drive-throughs The historic enclave, although not untouched by the building boom that exploded in Loudoun County before so dramatically going bust, remains largely rural, with all the benefits and inconveniences that entails(2)"It's far from everything," Tara Linhardt, president of the Taylorstown Community Association, said with a smile. The bluegrass musician has lived in Taylorstown since she was a child in the 1970s. Clearly, she views its remoteness as an asset.(3)Taylorstown wasn't always out of the way. In the 19th century, it was one of the busiest and most heavily populated areas of Loudoun, thanks to milling, mining and agriculture. Its population dwindled, however, when mining and milling became history. Taylorstown now is unincorporated, with the county divvying up its residents among the surrounding jurisdictions of Lovettsville, Waterford, Lucketts and Leesburg. Officialdom aside, the locals consider themselves residents of Taylorstown if they live within about a three-mile radius of an old store at the junction of Taylorstown and Loyalty roads.(4)The store, shuttered in 1998, is a passionate cause in Taylorstown. A nonprofit group with grassroots backing is spearheading its reopening as a "very green" business and recently installed a new system. But the day that the store will again be able to sell bread and local produce "won't come anytime soon," said Anne Larson, an artist and long-time Taylorstown resident.(5)1t's a matter of money, of course, and the store's boosters are pursuing grants. Meanwhile, the store hosts **munity gatherings, such as craft fairs and lectures on topics of area interest such as Lyme disease. Lyme disease, carried by deer ticks, is perhaps Taylorstown's No. 1 problem. "Deer are so comfortable here," Linhardt said, "that most people have had it twice."(6)Richard Brown, a Quaker, founded Taylorstown in the 1730s when he built a mill on the banks of Catoctin Creek near where the store is now. Although Brown's mill is long gone, the surrounding area has been designated a historic district and has been on the National Register of Historic Places since 1976. It is the site of two of the oldest stone houses in the county, Hunting Hill and FoxtonCottage, as well as a mill later built by town namesake Thomas Taylor.(7)The three-mile radius that extends from the store now encompasses about 1,500 households, said Tami Carlow, vice president of the Taylorstown Community Association. These households sit on land that is alternately rolling and open or steep and wooded.(8)A good number of the oldest structures got their start as "patent houses," explained historian and Taylorstown resident Rich Gillespie. In colonial times, construction of a 16-by-20-foot cabin was a requirement for obtaining a patent or land grant.(9)Taylorstown owes much of its pastoral beauty to its still-abundant farms. On a summer day along Loyalty Road—named in honor of Taylorstown's Unionist sympathies during the Civil War—fields are dense with green corn or punctuated by round bales of hay waiting to be collected. Placidly grazing cattle and horses are everywhere.(10)These days, Taylorstown's **e in both the working and gentleman's varieties, and Ken Loewinger's 175-acre Glenwood is both. 
Loewinger runs a small horse-boarding operation, but he also works full time in Washington as a real estate lawyer.(11)"I couldn't afford to have this in Great Falls," Loewinger said, gesturing toward a rambling red bam, rolling pastures and a large stone house that grew out of a 1750s log cabin, possibly a patent house.(12)The D.C.-born Loewinger initially worried about whether the country would be a good fit. Loudoun County didn't even have a synagogue when he and his wife, Margaret Krol, moved to Taylorstownin 1991. They attended religious services in a bingo parlor. Now, the county has two synagogues, Loewinger said, and he has found the country to be "a richer environment than the city."(13)Luda and John Eichelberger would agree. The husband-and-wife volcanologists moved into a log house a five-minute walk from the Taylorstown Store late last year. Eichelberger, who works at the U.S. Geological Survey in Reston met his Russian wife while doing fieldwork on the Kamchatka Peninsula.(14)A profession that involves close observation of remote mountains requires a love of the outdoors, and in Taylorstown the Eichelbergers indulge that love with running and biking. One of their favorite trails loops along Catoctin Creek, where deer browse nonchalantly beneath pawpaw trees and a great blue heron feels at home enough to stand on a boulder and do its dead-on impression of a statue.(15)Luda Eichelberger, who volunteers at the Smithsonian's Museum of Natural History, said she has "never Lived so close with animals." The frogs are her particular favorite. They "sing likebirds," she said.(16)Five years ago, Leslie McElroy and Richard Jones relocated from a less-distant locale—elsewhere in the Wasbington suburbs—but they profess equal enthusiasm about life on their 10-acre Taylorstown farm. Like Eichelberger, McElroy works for the Geological Survey in Reston; Jones, an estimator for a commercial **pany, commutes 120 miles round trip to Prince George's County. The two have spent much of their free time during the past three years adding 1,600 square feet to their 800-square-foot house.(17)The project is about done—"finally," McElroy said. With**pletion, she and Jones, who own five horses, plan to make better use of the bridle trails that **b their area. They also will have more time to harness Kallai, their 2,000-pound Percheron mare, to a cart and wheel along neighboring roads. That kind of outing isn't possible in many places that are within driving distance of the District.(18)"This is the last little bit of truly rural, affordable Loudoun County," McElroy said.PASSAGE TWO(1)We keep an eye out for wonders, my daughter and I, every morning as we walk down our farm lane to meet the school bus. And wherever we find them, they reflect the magic of water: a spider web drooping with dew like a necklace. A rain-colored heron rising from the creek bank. One astonishing morning, we had a visitation of frogs. Dozens of them hurtled up from the grass ahead of our feet, launching themselves, white-bellied, in bouncing arcs, as if we'd been caught in a downpour of amphibians. It seemed to mark the dawning of some new watery age. On another day we met a snapping turtle in his olive drab armor. Normally this is a pond-locked creature, but some ambition had moved him onto our gravel lane, using the rainy week as a passport from our farm to somewhere else.(2)The little, nameless creek tumbling through our hollow holds us in bondage. 
Before we came to southern Appalachia, we lived for years in Arizona, where a permanent brook of that size would merit a nature preserve. In the Grand Canyon State, every license plate reminded us that water changes the face of the land, splitting open rock desert like a peach, leaving mile-deep gashes of infinite hue. Cities there function like space stations, importing every ounce of fresh water from distant rivers. But such is the human inclination to take water as a birthright that public fountains still may bubble in Arizona's town squares and farmers there raise thirsty crops. Retirees from rainier climates irrigate green lawns that impersonate the grasslands they left behind. The truth encroaches on all the fantasies, though, when desert residents wait months between rains,watching cacti tighten their belts and roadrunners skirmish over precious beads from a dripping garden faucet. Water is life. It's the salty stock of our origins, the pounding circulatory system of the world. It makes up two-thirds of our bodies, just like the map of the world; our vital fluids are saline, like the ocean. The apple doesn't fall far from the tree.(3)Even while we take Mother Water for granted, humans understand in our bones that she is the boss. We stake our civilizations on the coasts and mighty rivers. Our deepest dread is the threat of having too little moisture—or too much. We've lately raised the Earth's average temperature by 0.74℃, a number that sounds inconsequential. But these words do not: flood, drought, hurricane, rising sea levels, bursting levees. Water is the visible face of climate and, therefore, climate change. Shifting rain patterns flood some regions and dry up others as nature demonstratesa grave physics lesson: Hot air holds more water molecules than cold.(4)The results are in plain sight along beaten coasts from Louisiana to the Philippines as super-warmed air above the ocean brews superstorms, the likes of which we have never known. In arid places the same physics amplify evaporation and drought, visible in the dust-dry farms of the Murray-Darling River Basin in Australia. On top of the Himalaya, glaciers whose melt water sustains vast populations are dwindling. The snapping turtle I met on my lane may have been looking for higher ground. Last summer brought us a string of floods that left tomatoes spoilt on the vine and our farmers needing disaster relief for the third consecutive year. The past decade has brought us more extreme storms than ever before, of the kind that dump many inches in a day, laying down crops and utility poles and great sodden oaks whose roots cannot find purchase in the saturated ground. The word "disaster" seems to mock us. After enough repetitions of shocking weather, we can't remain indefinitely shocked.(5)How can the world shift beneath our feet? All we know is founded on its rhythms: Water will flow from the snowcapped mountains, rain and sun will arrive in their proper seasons. Humans first formed our tongues around language, surely, for the purpose of explaining these constants to our children. What should we tell them now? That "reliable" has been rained out, or died of thirst? When the Earth seems to raise its own voice to the pitch of a gale, have we the ears to listen?(6)A world away from my damp hollow, the Bajo Piura Valley is a great bowl of the driest Holocene sands I've ever gotten in my shoes. 
Stretching from coastal, northwestern Peru into southern Ecuador, the 14,OOO-square-mile Piura Desert is home to many endemic forms ofthorny life. Profiles of this eco-region describe it as dry to drier, and Bajo Piura on its southern edge is what anyone would call driest. Between January and March it might get close to an inch of rain, depending on the whims of El /, my driver explained as we bumped over the dry bed of the /Piura, "but in some years, nothing at all." For hours we passed through white-crusted fields and then into eye-burning valleys beyond the limits of endurance for anything but sparse stands of the deep-rooted Prosopis pallida, arguably nature's most arid-adapted tree. And remarkably, some scattered families of Homo sapiens.(7)They are economic refugees, looking for land that costs nothing. In Bajo Piura they find it, although living there has other costs, and fragile drylands pay their own price too, as people'exacerbate desertification by cutting anything living for firewood. What brought me there, as a journalist, was an innovative reforestation project. Peruvian conservationists, partnered with the NGO Heifer International, were guiding the population into herding goats, which eat the protein-rich pods of the native mesquite and disperse its seeds over the desert. In the shade of a stick shelter, a young mother set her dented pot on a dung-fed fire and showed how she curdles goat's milk into white cheese. But milking goats is hard to work into her schedule when she, and every other woman she knows, must walk about eight hours a day to collect water.PASSAGE THREE(1)/ Vargas had been in Hatch, New Mexico, only six months, since March, and already he owned his own business to compete with Netflix, delivering DVDs and video games to ranchers and people who lived within 20 miles of town. He had worked out a deal with Senora Gaspar, who owned the video store, to pay him 90 percent of the delivery fee, and if he took out more than 50 videos in a week, a premium on the extras.(2)/ had a lightweight motorized bicycle, which made it feasible. Gas prices were high, and delivery and pickup saved customers money. Also, it was convenient—they didn't have to waittill they had an errand in town. Most of the customers were Mexican families who worked the land for Anglos, or Anglos who owned cattle or pecan groves. /organized his schedule to avoid random trips. It was a lot of riding, but he liked the terrain—the low hills, the bare mountains, pale blue in the day and shadowed in the evenings, the vast sky. He liked seeing the fields of onions and chiles, the pecan trees, the alfalfa growing, the cattle grazing. He saw hawks, antelope, badgers, and deer, and learned their habits.(3)In a few weeks he knew most of his customers—the Gallegos family out on Castaneda Road, who grew green chiles, the Brubakersfarther on, the widow woman, Senora Obregon, who still ran the Bar SW ranch. The Michaels family was a mile east, the Garcias were on the other side of Interstate 25—they owned the bakery—and Tom Martinez lived in the trailer a mile past. Many of the families grew chiles—that's what Hatch was famous for—and marketed them to the co-op in Albuquerque or along the town highway, pickled or fresh, in jellies or as ristras. Everyone knew /, too, the chico loco on his moped.(4)The more people knew him, of course, the more people knew about his business. He was strong, had a good smile, and was anatural salesman. 
He talked to the Mexican families in Spanish, asked where their relatives came from, who was left in Hermosillo or Juarez or Oaxaca. He talked to the Anglos to improve his English and to show he was a serious businessman. He expected great things of himself one day.(5)/ English was passable, because he'd worked almost a year in Dem ing before he came to Hatch. He'd washed dishes at Sí /from six to two, and at four he mopped floors at the elementary school. In between he spent his off hours at the Broken Spoke, where he met people, even some women, like Brenda, who was a hairdresser, then unemployed. At eleven one night /was walking home to his trailer, and Brenda stopped in her Trans Am with the muffler dragging. She gave him a ride, and one thing led to another. After a month, Brenda wanted to get married—she was pregnant, she said—and /said why not. Two weeks after the wedding, he found out there was no baby, and Brenda ran off to California with a wine salesman.(6)To pay off Brenda's debts, /used his meager savings and tooka third job unloading freight at the train yard, though he still wasn't making enough money, or sleeping enough, either. One evening, after Ultimo was threatened with eviction from Brenda's apartment, his boss at the school found him dozing at a teacher's desk, and he was finished in Deming. He walked north with his thumb out, but no one picked him up. In two days, 46 miles later, with nothing but the clothes he wore and a blanket he'd brought from home, he staggered past Las Uvas Dairy and a few broken-down houses and into Hatch, where he saw a HELP WANTED sign in the window of the Frontera Mercado. He went in and got a job stocking groceries.(7)Hatch was in the fertile cottonwood corridor along the banks of the Rio Grande River, with the interstate to the east and open country in every other direction—ranches, pasture, rangeland. The days were getting warmer by then, and he slept in the brush along the river, shaved and washed himself there, and ate for breakfast whatever he had scavenged from the mercado the day before. If he wasn't working, he spent sunny mornings in the park and rainy ones in the library. Then /Gaspar hired him to work the morning shift at thevideo store, checking in rentals, cleaning, replenishing the stock of candy bars and popcorn. He established a more efficient check-in, organized a better window display, and built a new sign from construction waste: Gaspar's Movies.(8) "What delivery?" /Gaspar asked.(9) "Our delivery," Ultimo said. "I have bought a moped from Tom Martínez."(10)Some of his customers ordered movies for **pany /gave them. /Obregón, 55 years old, had lost her husband and wanted someone to talk to. She reminded /of his grandmother in Mexico, and he often made the ranch his last stop of the evening so he had time to sit on her porch and listen to her stories. Her husband had been killed two winters before, when, as he was feeding the cattle in a blizzard, a 1,500-pound bull slipped on a patch of ice and crushed him. They'd lost 100 head in that storm. Her children were in Wichita, Denver, and Salt Lake City, two sons and a daughter, and none of them wanted anything to do with the ranch. When he visited, /Obregón dressed well, as if /presence meant something, and she offered him steak and potatoes, and always leftovers to take with him afterward.SSS_SINGLE_SEL1. 
According to the passage, the rural land of Taylorstown______PASSAGE ONE•** both advantages and disadvantages to its residents.•** once dominated by urban buildings and noises.•** the process of economy prosperity to recession.** home to many famous historical sites and people.A AB BC CD DSSS_SINGLE_SEL2. Which of the following is INCORRECT about the old store?______PASSAGE ONE•** has shifted to other aspect than selling groceries.•** now works as a representative of Taylorstown.•** operators need more fund to run it.** has switched in accordance with the development of the time.A AB BD DSSS_SINGLE_SEL3. Which of the following is NOT to indicate the beautiful scenery of Taylorstown?______PASSAGE ONE• A....the surrounding area has been designated a historic district and... (Para. 6)•** households sit on land that is alternately rolling and open or steep and wooded (Para. 7)C....fields are dense with ** or punctuated by round bales of hay waiting to be collected. (Para. 9)** grazing cattle and horses are everywhere. (Para. 9)A AB BC CD DSSS_SINGLE_SEL4. When Loewinger said "he has found the country to be 'a richer environment than the city'" (Para. 12), he meant ______PASSAGE ONE•** in the village was **fortable than in the city.** was more entertainment in the village than in the city.•** economic in the country better developed than in the city. ** people enjoyed richer spiritual life than people in the city.A AB BC CD DSSS_SINGLE_SEL5. The sentence in Para. 1 "... using the rainy week as a passport from our farm to somewhere else" implies that ______PASSAGE TWO•** would be a flood nearby.•** has put the turtle onto the road.•** would be lots of amphibians.** author showed great interest in country life.B BC CD DSSS_SINGLE_SEL6. Which of the following adjectives can best describe Arizona?______PASSAGE TWO•**.•**.•**.**.A AB BC CD DSSS_SINGLE_SEL7. Which of the following does NOT show the immediate consequences of climate change? ______PASSAGE TWO•** storms arise above the ocean.•** on the peak of mountains are becoming less and less.•** people in barren areas are suffering from little rainfall. ** are ruined and crops are laid down.A AB BC CD DSSS_SINGLE_SEL8. Which of the following statements does NOT contain a personification? ______PASSAGE TWO• A."..., humans understand in our bones that she is the boss."(Para. 3)• B."Hot air holds more water molecules than cold." (Para. 3) • C."The word 'disaster' seems to mock us." (Para. 4)• D."When the Earth seems to raise its own voice to the pitch ofa gale..." (Para. 5)B BC CD DSSS_SINGLE_SEL9. A suitable title for the passage would be ______PASSAGE TWO•** Causes of Disasters.•** Evolution of Creatures.•** Resource Is Limited.** Is Life.A AB BC CD DSSS_SINGLE_SEL10. The relationship between the second and third paragraphs is that ______PASSAGE THREEA. each presents one side of the picture.B. the former generalizes and the latter gives examples.C. the latter is the logical result of the former.D. both present working experience.A AB BC CD DSSS_SINGLE_SEL11. Which of the following phrases is NOT suitable to describe the character of ? ______PASSAGE THREE•** good-tempered but cunning man.•** vigorous guy of sincerity.•** talented and hard-working businessman.** salesman of good character.A AB BD DSSS_SINGLE_SEL12. From the passage we can infer that the relationship betweenand Brenda was ______PASSAGE THREE•**.•**.•**.**.A AB BC CD DSSS_SINGLE_SEL13. 
What happened to after Brenda's disappearance?______PASSAGE THREE•** wanted to divorce his wife.•** was heavily involved in debts.•** was recovering from drinking.** had been working in a grocery.A AB BC CD DSSS_SINGLE_SEL14. Which of the following writing techniques is used to develop the passage? ______PASSAGE THREE•**.•**.•**.**.A AB BD DSSS_TEXT_QUSTI15. SECTION B SHORT ANSWER QUESTIONSWhat did Richard Brown do for the town?PASSAGE ONESSS_TEXT_QUSTI16. What does Ken Loewinger do in Washington?PASSAGE ONESSS_TEXT_QUSTI17. Why are frogs Luda's particular favorite?PASSAGE ONESSS_TEXT_QUSTI18. According to the passage, what is the fundamental role of water?PASSAGE TWOSSS_TEXT_QUSTI19. What're the eye-burning valleys like?PASSAGE TWOSSS_TEXT_QUSTI20. What benefits could customers gain from his delivery and pickup service?PASSAGE THREESSS_TEXT_QUSTI21. What was life like before he worked for Gaspar?PASSAGE THREESSS_TEXT_QUSTI22. Why was willing to listen to Senora Obregón's stories?PASSAGE THREE1。
信息科学与电子工程专业英语作文全文共6篇示例,供读者参考篇1Information Science and Electronic Engineering: A Kid's ViewHi there! My name is Emma and I'm 10 years old. Today, I want to tell you all about this really cool thing called Information Science and Electronic Engineering. It's a big fancy name, but it's actually really awesome and fun to learn about!So, what is Information Science and Electronic Engineering? Well, it's basically the study of how we use computers, electronics, and technology to process, store, and share information. It's kind of like how we use our phones, tablets, and computers to send messages, watch videos, play games, and do all sorts of other cool stuff!One of the main parts of Information Science and Electronic Engineering is something called computer science. This is all about how computers work and how we can make them do amazing things. Computer scientists create the software and programs that make our devices work. They also develop things like video games, apps, and websites. How cool is that?Another important part is electronic engineering. This is where people design and build the actual hardware and physical components that make up our computers and other electronic devices. Electronic engineers create things like microchips, circuit boards, and even robots! They have to understand a lot about electricity, physics, and materials to make all these awesome gadgets.Now, you might be thinking, "That sounds really complicated and boring!" But trust me, it's actually super interesting and fun to learn about. There are so many incredible things that Information Science and Electronic Engineering has brought us.For example, did you know that your favorite video games and apps were all created using computer science and electronic engineering? The people who make those games have to know a lot about programming, graphics, and how to make the game run smoothly on different devices. It's like they're modern-day wizards creating these amazing virtual worlds for us to explore!And what about the internet? The internet is basically one huge network of computers and devices all connected together. It's thanks to Information Science and Electronic Engineeringthat we can send messages, watch videos, and access information from anywhere in the world with just a few clicks!But that's not all! Information Science and Electronic Engineering is also helping to make our world a better place. Scientists and engineers are using these technologies to develop things like:Renewable energy sources like solar panels and wind turbinesMedical devices that can help diagnose and treat diseasesRobots that can assist people with disabilities or perform dangerous tasksSystems that can help predict and prepare for natural disastersSee? It's not just about making cool gadgets and games (although those are pretty awesome too). Information Science and Electronic Engineering is also helping to solve real-world problems and make people's lives better.Maybe one day, you could even become an Information Scientist or Electronic Engineer yourself! Just think about how cool it would be to create the next big video game, design a new type of robot, or help build a better, more sustainable world.Well, that's about all I have to say on this topic for now. I hope I was able to give you a better understanding of what Information Science and Electronic Engineering is all about. It's a fascinating field that combines creativity, problem-solving, and a passion for technology. 
Who knows, maybe you'll be the one to make the next big breakthrough!Thanks for listening, and happy exploring the amazing world of Information Science and Electronic Engineering!篇2Information Science and Electronic Engineering: A World of Wonder!Hi there! My name is Tommy, and I'm an 8-year-old kid who loves learning about science and technology. Today, I want to share with you my thoughts on something called Information Science and Electronic Engineering. It's a big name, but it's all about using computers and electronics to make our lives better!Let me start by telling you about my favorite thing in the world – video games! I love playing them on my console or tablet. Did you know that all those amazing games we enjoy are created by people who study Information Science and Electronic Engineering? They use computers and coding to design thegraphics, sounds, and gameplay that make these games so much fun.But it's not just about games! Information Science and Electronic Engineering also help us in so many other ways. Have you ever used a smartphone or a tablet? Those cool devices are packed with tiny electronic parts that make them work. Engineers design these parts and put them together in a way that lets us take pictures, send messages, and even watch movies on the go!Speaking of movies, did you know that special effects in movies are also created using Information Science and Electronic Engineering? They use computers to make things like explosions, spaceships, and even giant monsters look super realistic. Isn't that amazing?Another thing I find really cool is how Information Science and Electronic Engineering help us communicate with people all over the world. Think about the internet – it's like a giant network that connects computers and devices everywhere. Engineers make sure this network works properly, so we can send emails, watch videos, and even video chat with our friends and family who live far away.But it's not just about fun stuff like games and movies. Information Science and Electronic Engineering also help us in really important ways. For example, doctors use special machines and computers to take pictures of our bodies and diagnose illnesses. Engineers make sure these machines work correctly, so doctors can give us the right treatment.And did you know that even the cars we drive and the planes we fly on have computers and electronic systems inside them? Engineers design these systems to make sure our vehicles are safe and work properly.As you can see, Information Science and Electronic Engineering are all around us, making our lives easier and more exciting. But it's not just about using computers and electronics –it's also about solving problems and coming up with new ideas.Engineers have to think creatively and work hard to design new technologies that can help people in different ways. Maybe one day, they'll even invent a flying car or a robot that can do our homework for us! (Just kidding, but you never know!)Who knows, maybe some of you reading this will grow up to become the next great Information Science and Electronic Engineering experts! You could create the next big video game,design a faster computer, or even help build a rocket that goes to Mars. The possibilities are endless!So, if you love technology and want to make a difference in the world, keep learning and exploring. Ask questions, experiment with coding and electronics, and never stop being curious. 
The future of Information Science and Electronic Engineering is waiting for you!篇3My Favorite Things to Learn About - Information Science and Electronic Engineering!Hi everyone! My name is Tommy and I'm 9 years old. Today I want to tell you all about my favorite topics to learn about in school - information science and electronic engineering! I find this stuff super fascinating.Let's start with information science. This is all about how we collect, store, and use data and information. In class, we've learned about databases, which are like giant collections of information all stored together. Databases allow you to keep track of lots and lots of data in an organized way.For example, let's say you have a database of all the books in a library. The database would have an entry for each book, with details like the title, author, year it was published, and what section of the library it's located in. With a database, you can easily search for a specific book or books on a certain topic.We've also learned about data mining, which is how you can analyze all the data in a big database to find hidden patterns and trends. It's like being a detective and using clues to crack a case, except the clues are all the data points! Companies use data mining on their customer databases to better understand what people want to buy.Another key part of information science is data visualization. This is how you take data and present it in easy-to-understand visual ways like charts, graphs, and infographics. One of the coolest data viz techniques I've learned about is heat mapping, where you use different colors almost like a weather map to show things like website traffic or prices across a geographic area.Information science is all about turning giant pools of raw data into useful information and insights. Pretty amazing stuff when you think about just how much data exists today between the internet, business records, scientific research, and more!Now let's talk about the other topic I love - electronic engineering. This is figuring out how to design and build all the awesome electronic devices and systems that make the modern world go round.We've covered basic circuits in class, like how a simple circuit has a power source like a battery that provides the energy to make electrons flow through a conductor like a metal wire. By adding other components like switches, resistors, and lightbulbs, you can control and manipulate that flow of electricity to do useful things.From there, we've built up to more complex circuits using transistors, which are like solid-state switches that can amplify or transform electrical signals. Transistors are one of the fundamental building blocks of all modern electronics and computing. Billions of teeny tiny transistors are crammed into the microchips that act as the "brains" of smartphones, computers, and so much more.In addition to circuits and microchips, we've learned about other electronic devices and components like capacitors for storing electric charge, transformers for increasing or decreasing voltage, and all sorts of sensors for detecting things like light,sound, and motion. We even built our own basic robots by wiring up little motors and sensors to a microcontroller chip!Another amazing area of electronic engineering is communications systems - things like radio, television, internet, and wireless data networks. We learned how data can be encoded into electromagnetic waves and transmitted through the air, along cables, or via fiber optic lines. 
Multiplexing techniques allow tons of separate signals and channels to be combined together and transmitted simultaneously.On the receiving end, devices like radios and modems decode the incoming signals to recover the original data. Routers, switches, and other networking gear help route the data to its intended destination. Just think about how much electronic wizardry has to happen for you to simple send a text message or stream a video!My absolute favorite part of electronic engineering is imagining and designing all the cool future technologies that will one day exist. Technologies like laser-powered data transmission, quantum computers, and lifelike robotics and artificial intelligence seem like total science fiction now. But I can't wait to see what incredible new electronic marvels get invented as scientists and engineers keep pushing the boundaries.Well, that's my overview of why I find information science and electronic engineering so amazing. At their core, both fields are all about harnessing data and electricity to create powerful tools and technologies that can reshape how we live and understand the world around us. Just imagining where these disciplines will go next gets me incredibly excited!I have so much more to learn, but I feel like I'm getting a great foundation. Who knows, in 20 years I may be the one helping design quantum computer microchips or developing new data mining techniques to analyze zettabytes of information! For now though, I better get back to studying. This stuff is just endlessly fascinating to me. Thanks for reading, talk to you later!篇4My Big Sister Is An EngineerI love my big sister Emily! She's the coolest and smartest person I know. Emily is studying to be an engineer at university. At first, I didn't really understand what that meant, but now I think it's super awesome!Emily studies something called "Information Science and Electronic Engineering." That's a really long name, but she explained it to me in a way I could understand. You see,engineers are like modern-day wizards who use science and math to create amazing things that make our lives easier and more fun!Information Science is all about computers, the internet, and how we share and use information. Emily learns how to build websites, write computer programs, and keep our data safe from bad guys trying to steal it. Can you imagine a world without computers or the internet? I can't! It would be so boring.The Electronic Engineering part is even cooler. Emily gets to learn about electricity, circuits, and all the cool gadgets we use every day, like smartphones, TVs, and video game consoles. She knows how they work and can even build her own!One time, Emily let me visit her university lab, and it was like stepping into a magical workshop. There were tables filled with tiny electronic components, circuit boards, and tools I'd never seen before. Emily showed me a little robot she had built from scratch, and it could move around and follow simple commands.I thought that was the coolest thing ever!But being an engineer isn't just about building robots and gadgets. Emily also learns about communication systems, like how our voices travel through phone lines or how we can sendpictures and videos over the internet in seconds. It's like having a superpower!Emily says that engineers have to be good at math, science, and problem-solving. They have to think logically and creatively to find solutions to all kinds of problems. 
Sometimes, she has to work on projects with her classmates, which teaches her how to collaborate and communicate well with others.One of the things I admire most about Emily is how hard she works. She spends hours studying, doing homework, and working on projects. But she doesn't seem to mind because she's so passionate about what she's learning. She says that every day is an opportunity to discover something new and exciting.I can't wait until Emily graduates and gets a job as an engineer. Who knows, maybe she'll design the next big video game console or help build a rocket that takes people to Mars! The possibilities are endless when you're an engineer.Sometimes, when I'm playing with my Lego sets or trying to figure out a puzzle, I imagine myself as an engineer like Emily. I pretend that I'm building a giant robot or designing a new kind of computer. It's so much fun to let my imagination run wild and think about all the amazing things I could create someday.Emily says that the world needs more engineers to help solve problems and make life better for everyone. She always encourages me to study hard, especially in math and science, because those subjects will help me become a great engineer too if I want.I'm not sure yet if I want to be an engineer when I grow up, but I know that whatever I choose to do, I want to be just like my big sister Emily – smart, creative, hardworking, and always eager to learn new things.For now, I'll keep playing with my Legos, tinkering with my toys, and dreaming big dreams. Who knows, maybe one day I'll be the one building amazing robots or designing the next generation of computers and gadgets. With an engineer like Emily as my role model, anything seems possible!篇5My Big Sister's Awesome JobHi there! My name is Tommy, and I'm eight years old. I want to tell you all about my big sister Sarah and her really cool job. She's an Information Science and Electronic Engineering professional, which sounds super fancy and complicated, but I'lldo my best to explain it in a way that even a kid like me can understand.Sarah works with computers and electronics, which are like magic boxes that can do all sorts of amazing things. She helps make them work better, faster, and smarter. Isn't that awesome? Sometimes, she even gets to build new gadgets and gizmos from scratch!One of the things Sarah does is something called "coding." Now, I know what you're thinking – "Coding? That sounds boring!" But trust me, it's actually really exciting. Coding is like giving instructions to computers, sort of like telling your dog to sit or roll over, but way more complicated. Sarah uses special languages (not like French or Spanish, but like computer languages) to tell the machines what to do.For example, she might tell a computer to sort through a bunch of data and find specific information, or to control a robot arm in a factory. It's like she's a master magician, waving her wand (which is actually a keyboard) and making the computers do whatever she wants!Another part of Sarah's job is something called "hardware engineering." This means she gets to design and build the physical parts of computers and other electronic devices. It's likebeing a super-cool construction worker, but instead of building houses, she builds tiny little cities inside the machines!Sarah has shown me some of the gadgets she's worked on, and let me tell you, they're mind-blowing. 
There was this one thing that looked like a regular old box, but it could actually control the lights and temperature in our house just by talking to it. How crazy is that?But Sarah's job isn't just about coding and building stuff. She also has to solve lots of problems and figure out how to make things work better. Sometimes, things go wrong with computers or electronics, and it's her job to figure out what's causing the issue and fix it. It's like being a detective, but instead of looking for clues at a crime scene, she's investigating the inner workings of machines.Sarah says that one of the best parts of her job is getting to work with other really smart people. She's part of a team, and they all collaborate to come up with new ideas and solutions. It's like having a whole group of super-genius friends who all love computers and electronics as much as she does.Now, you might be thinking, "Wow, that sounds like a lot of hard work!" And you'd be right. Sarah has to study and learn new things all the time, and she often has to work late nights orweekends to meet deadlines. But she says it's all worth it because she loves what she does.You see, Sarah has always been fascinated by technology and how things work. When she was little, she used to take apart her toys and put them back together again (sometimes they even worked after!). She was always asking questions and trying to figure out how things worked.So, when it came time to choose a career, Information Science and Electronic Engineering was the perfect fit for her. She gets to work with the latest and greatest technology, and she's constantly learning and growing.Honestly, I'm still not entirely sure what Sarah does every day at her job, but I know it's something really important and exciting. And who knows, maybe one day I'll follow in her footsteps and become an Information Science and Electronic Engineering professional too!For now, though, I'm content to just listen to Sarah's stories and play with the cool gadgets she brings home from work. I might not understand all the technical details, but I know that my big sister has one of the coolest jobs in the world. And that's something to be proud of!篇6My Big Dream of Becoming an Awesome EngineerHi there! My name is Timmy and I'm 10 years old. I just finished 4th grade and can't wait for summer vacation! You know what I really love? I love science and building things. That's why I dream of becoming an amazing engineer when I grow up.Engineering is just the coolest thing ever. Do you know what engineers do? They get to design and create all sorts of awesome machines, robots, computers, phones, and other crazy gadgets! How awesome is that? Way better than being a boring teacher or accountant if you ask me.There are different types of engineers too. Information science engineers are like the brains behind all the technology we use every day. They design the software and programs that make our phones, laptops, and video games work. Without them, our devices would be basically just hunks of plastic and metal. Boring!Then you have electronic engineers who design the actual electronics and hardware inside our gadgets. They get to build circuit boards, processors, and all the tiny components that makeeverything run. It's like they get to play with ultra-tiny building blocks and wires all day long. So fun!I can't decide if I want to be an information science engineer or electronic engineer when I grow up. Maybe I'll do both! I'll design the software AND the electronics. 
Then I can create the whole product from scratch - nothing could stop me!Just imagine if I designed the next big gaming console or smart phone. I'd put in a blazing fast processor, tons of memory, and high-def graphics that would blow your mind. And I'd code all the games and apps too so you'd never get bored. It would be the hottest device ever and all the kids would beg their parents for one. I'd be rich AND famous!Or maybe I'll design robots instead. Not lame little robots that just vacuum, but awesome robots that can talk, play sports, or even fight crime. I'd make them super strong but also program them with artificial intelligence to be really smart. My robot squad could handle any problem or rescue any person in need. We'd be like a team of robotic superheroes protecting the world!Heck, if I'm really ambitious, I could even engineer entire cities of the future. Self-driving car systems, renewable energy grids, ultra-fast communication networks - I'd design it all. Thecities would basically run themselves with minimal human effort required. Everyone would live in modern, efficient luxury thanks to my brilliant engineering.Whoa, sorry, I got a little carried away dreaming there. Sometimes my big imagination gets the best of me. But isn't it crazy all the potential that engineers have? With their skills and knowledge, they can literally design our future world. That's why I'm going to study my butt off in math, science, and technology so I can grow up to be one.I may only be in elementary school now, but I'm already tinkering with electronics and coding as much as I can. My parents got me a basic robotics kit and I've been building simple rovers and robotic arms. I've also been teaching myself coding by making basic games and messing around on coding websites. It's never too early to start prepping to become an engineering prodigy!My friends think I'm a little weird always playing with circuits and computers instead of going outside. But who cares? I have huge plans that involve much more than just running around at the park all day. With hard work and perseverance, I'll get through all the tough engineering classes in college. Then I'llland my dream job designing the latest must-have technologies alongside other geniuses. We'll change the world together!Just you wait and see. In a couple decades, I may have engineered crazy holographic smartphones, sentient robot assistants, or self-sustaining colonies on Mars. The possibilities are endless because I'm going to be an engineering master of the future! Maybe you'll see my name in headlines or I'll become a legendary tech billionaire. Or maybe I'll just be quietly working behind the scenes, dreaming up must-have gadgets from my Silicon Valley lab. Whichever path I take, the world better brace itself for the insane creations of Timmy the Engineer!So yeah, that's my huge, epic dream career. What's your dream? Astronaut? Artist? Veterinarian? Whatever it is, just never stop dreaming big! Study hard, stay focused, and amazing things can happen, just like my coming engineering awesomeness. Okay, gotta go - I need to put some finishing touches on theAI-powered robotic puppy。
Five styles of Customer Knowledge Management, and how smart companies put them into action
Michael Gibbert 1, Marius Leibold 2, Gilbert Probst 3
1 University of St Gallen, Switzerland, and INSEAD, France
2 University of Stellenbosch, South Africa
3 University of Geneva, Switzerland
E-mail: Michael.Gibbert@unisg.ch, Leibold@mweb.co.za, Gilbert.Probst@hec.unige.ch
Abstract. In the aftermath of the knowledge economy, smart corporations are beginning to realize that the proverbial 'if we only knew what we know' also includes 'if we only knew what our customers know.' In this paper, the authors discuss the concept of Customer Knowledge Management (CKM), which refers to the management of knowledge from customers, i.e. knowledge resident in customers, in contrast to knowledge about customers, e.g. the customer characteristics and preferences prevalent in previous work on knowledge management and customer relationship management. Subsequently, five styles of CKM are proposed and practically illustrated by way of corporate examples.
Key words. Knowledge management, customer relationship management, customer knowledge management, value creation.
Our customers are more knowledgeable than we realize
In the emphasis on knowledge as a key competitive factor in the global economy, corporations may be overlooking a major element – customer knowledge. For example: Old Mutual, the largest insurance company in South Africa (and an internationally expanding FTSE 100 company quoted on the London Stock Exchange), has been incorporating the knowledge of patients concerning their own health condition and treatment directly, by electronic means, instead of relying only on medical doctors to provide this. Customer knowledge is being used by Old Mutual both to screen applicants for medical insurance and, more importantly, to develop new medical insurance products.
What is the reason for the increasing success of Old Mutual in South Africa and internationally? Partly a process called Customer Knowledge Management (CKM). It works like this: where patients' health evaluation forms were previously completed manually by their doctors, they have been replaced by electronic forms that can be filled in mainly by patients themselves, from the convenience of their homes. Patients still require medical examination by their doctors, but the advantages to all parties are speed, greater accuracy (doctors are notorious for poor handwriting), more information, and especially additional knowledge input from patients themselves. The issues of ethics and professionalism (e.g. security of information, the patient-doctor relationship) of course have to be carefully managed, but the knowledge customers contribute about their own condition, their treatment, the effects of particular drugs, and their perception of medical insurance companies and their products is substantial and valuable to pharmaceutical companies, insurance companies, doctors and other stakeholders in the health management industry.
Does customer knowledge management only happen in the pharmaceutical and insurance industries? We think not. Over the last six years, we have studied more than two dozen companies, and found that smart companies are prolific customer knowledge managers (see insert Appendix B: 'Our research'). Indeed, most companies today consider themselves market driven, or customer-oriented. Yet only a few companies actually manage well their perhaps most precious resource: the knowledge of, i.e. residing in, their customers, as opposed to knowledge about their customers.
Our research shows that by managing the knowledge of their customers, corporations are more likely to sense emerging market opportunities before their competitors, to constructively challenge the established wisdom of "doing things around here", and to more rapidly create economic value for the corporation, its shareholders and, last but not least, its customers. CKM is the strategic process by which cutting-edge companies emancipate their customers from being passive recipients of products and services to being empowered knowledge partners. CKM is about gaining, sharing, and expanding the knowledge residing in customers, to both customer and corporate benefit. It can take the form of prosumerism, mutual innovation, team-based co-learning, communities of practice, and joint intellectual property (IP) management. We have identified these as five styles of CKM, which are distinctively different practices, but not mutually exclusive.
Expanding on CRM and Knowledge Management
At first glance, CKM may seem just another name for Customer Relationship Management (CRM) or Knowledge Management (KM). But customer knowledge managers require a different mindset along a number of key variables (see Table 1). Customer knowledge managers first and foremost focus on knowledge from the customer (i.e. knowledge residing in customers), rather than on knowledge about the customer, as is characteristic of customer relationship management. In other words, smart companies realize that corporate customers are more knowledgeable than one might think, and consequently seek knowledge through direct interaction with customers, in addition to seeking knowledge about customers from their sales representatives. Similarly, conventional knowledge managers typically focus only on trying to convert employees from egoistic knowledge hoarders into altruistic knowledge sharers (Eisenhardt and Galunic, 2000).
In contrast, with CKM ‘If only we knew what we know’ turns into ‘if only we also knew what our customers know.’Table 1: CKM versus Knowledge Management & Customer Relationship Management KM CRM CKMKnowledge sought in Employee, team, company,networkCustomer Database Customer experience andcreativityAxioms ‘if only we know what weknew’ ‘retention is cheaper thanacquisition’‘if we only knew what ourcustomers know’Objectives Sharing knowledge aboutcustomers amongemployees Mining knowledge aboutthe customerGaining, sharing and expandingknowledge of (inside) thecustomerIndividual or group experiencesin applications, competitorbehavior, possible futuresolutions, etcRole of customer Passive, recipient ofproduct Captive, tied to productby loyalty schemesActive, knowledge partner.Recipient ofIncentivesEmployee Customer CustomerCorporate role Lobbying knowledgehoarding employeesCaptivate customers Emancipate customersBusiness objectives Efficiency and speedgains, avoidance of re-inventing the wheel Customer base nurturing,maintaining ourcustomersCollaboration with customers,joint value creationConceptual base Customer retention Customer satisfaction Customer success, innovation,organizational learningBusiness metrics Performance againstbudget; Customerretention rate Performance in terms ofcustomer satisfaction andloyaltyPerformance against competitorsin innovation and growth;Contribution to customer successThe logic of customer knowledge management seems counter-intuitive: the challenges of getting employees to share their knowledge with one another are daunting enough. Why would customers, of all people, want to share their knowledge to create value for the company and then pay for their own knowledge once it is deployed in the company’s products? This is further exacerbated because customers, like employees, are often not able to make knowledge, i.e. their experiences with the company’s products, their skills, and reflections explicit, and thereby easily transferable and shareable. The answer to these questions is customer knowledge managers put themselves in the shoes of corporate customers, kindling customers’ intrinsic, rather than extrinsic motivation to share their knowledge for the benefit of the company.Consider : The Internet retailer manages customer knowledge successfully through providing book reviews, the customer’s own order histories, order history of other customers, and customized suggestions based on prior orders. Effectively, , a commercial enterprise, developed into a platform of book enthusiasts that are keen to exchange knowledge about their favorite topics (intrinsic motivation). Motivating customers to share their knowledge the Amazon way is a remarkable achievement, particularly if contrasted with the often vain efforts to evangelize employees from egoistic knowledge hoarders to altruistic knowledge sharers by way of rewards systems that are mostly extrinsic. But customer knowledge management is not limited to successful Internet companies. Brick and mortar companies do it, too. Indeed, Holcim, an international cement company that produces the very stuff brick and mortars are made of, is a keen customer knowledge manager (see Insert: How Holcim manages customer knowledge).How the international cement manufacturer Holcim manages customer knowledge Holcim’s companies in the U.S. recently were conducting analysis how to deliver e-commerce solutions to their customers. But Holcim’s aspiration was much more ambitious than simply doing e-commerce. 
The idea was to create a knowledge sharing platform, where any member of the community of cement and aggregates consumers (concrete producers and distributors, but also engineers and architects) would be able not only to transact business (place orders, pay online), but also to share and exchange knowledge (e.g. share cement order forecasts, share good and bad experiences with specific applications, etc.).
In order to test and further develop this aspiration, Holcim's customer knowledge managers conducted meetings with selected customers in the US. To ensure that the different customer segments were adequately represented, the customer mix was intentionally varied, comprising selected large multinationals, medium-sized domestic firms and small family-owned companies. The objective of the meetings was to discuss current and emerging trends in the cement industry and the potential impact of these developments on Holcim's customers, thereby jointly ascertaining how Holcim could create value for its customers.
The discussion was open and free-flowing: although Holcim had developed a set of value-added services that were thought appropriate, Holcim did not implement these until after the customers had given their views, thereby adding value to the company's services. As one customer knowledge manager at the cement manufacturer put it: "As part of the focus group discussions, Holcim's customers were impressed the company was talking to them - no other supplier had chosen to do this - all they were seeing were press releases. This made customers feel ownership in our project."
In the meantime, Holcim has built and implemented the knowledge sharing platform in Canada, Belgium and France; Spain and the U.S. will follow shortly. What's more, during the entire "build" phase, the company kept close contact with the customers and permanently validated with them what it did – which was much appreciated by Holcim's customers. A representative of a large multinational mentioned: "I like your knowledge sharing platform, because you were listening what I told you during the first visit and really took my comments very serious." 1
A shift in mindset towards looking at the customer as a knowledgeable entity has far-reaching implications. Most importantly, the customer is emancipated from being a passive recipient of products and services, as in traditional knowledge management. Likewise, the customer is liberated from the ball and chain of the customer loyalty schemes prevalent in CRM.
Customer knowledge management also differs from traditional knowledge management in the objective pursued. Whereas traditional knowledge management is about efficiency gains (avoidance of 're-inventing the wheel'), customer knowledge management is about innovation and growth. Customer knowledge managers seek opportunities for partnering with their customers as equal co-creators of organizational value. This is also in stark contrast to the desire to maintain and nurture an existing customer base. The well-known CRM adage 'retention is cheaper than acquisition' comes to mind. Unfortunately, retention becomes increasingly difficult in an age where competitors' product offerings are often close imitations and only three mouse-clicks away. Therefore, customer knowledge managers are much less concerned with customer retention figures.
Instead, they focus on how to generate growth for the corporation through acquiring new customers and through engaging in an active and value-creating dialogue with them.How do customer knowledge managers create innovation and growth? Again consider . The book retailer’s customers not only provide their insights, tips and tricks in terms of book reviews, they also provide useful pointers for further reading on a given subject, giving a custom-tailored, non-intimidating impetus for other customers to investigate – and possibly buy – these sources. What is more, this customer knowledge can be shared with the authors of new books, giving them ideas for further publications and their market potential. This process bears all the hallmarks of knowledge management: it provides useful information that is used in actions, creates sense, asks for interpretation, and leads to new combinations. Only, the knowledge is not that of the employee, but that of the customer, leading to value creation through innovation and growth, rather than to cost savings as in traditional knowledge management.CKM in theory and practiceCustomer-driven companies need to harness their capabilities to manage the knowledge of those who buy their products (Baker, 2000; Davenport and Klahr, 1998). The question is, why do many customer-driven companies not access the knowledge of their customers directly? The problem is that the existing mindset, as evidenced by the literature, provides very little assistance to these companies.Traditionally, market research was used to shed more light on what the customer knew and thought about the product, and how this differed from what the company had to afford the customer, resulting in enormous CRM databases (Galbreath and Rogers, 1999; Wilkestrom, 1996; Woodruff, 1997). More recently, firms thought they had found a new approach to access customer knowledge. Drawing on best practices from service companies, such as the1 Alois Zwinggi and Lucas Epple provided this example.big consulting businesses, most large organizations have instituted knowledge management systems. These systems, however, are based in an indirect understanding of what customers want. KM systems are typically geared towards disseminating what their sales force or intermediary has understood from listening to the customers who bought - or didn’t buy - the company’s products.It’s ironic: the conceptual predecessor of knowledge management has surpassed its own offspring. Ten years ago, proponents of the resource based view to strategy have proclaimed that a company be best conceptualized as a bundle of unique resources, or competencies, rather than as a bundle of product market positions (Barney, 1991). More recent contributions to the resource-based view question this one-sided thinking about the locus of competence (Prahalad and Ramaswamy, 2000; Inkpen, 1996). It has now been claimed that such competence actually moved beyond corporate boundaries, and that it is therefore worthwhile to also look for competence in the heads of customers, rather than only in the heads of employees.Similarly, CRM has been traditionally popular as a means to tie customers to the company through various loyalty schemes, but left perhaps the greatest source of value under-leveraged: the knowledge residing in customers. 
While both KM and CRM focused on gaining knowledge about the customer, managing customer knowledge is geared towards gaining knowledge directly from the customer.Whilst the literature provides little guidance for aspiring knowledge managers, we have found in our research with two dozen companies in e.g. the medical, financial services, measurement, agricultural chemicals, telecommunications, and beverages industries is a wide variety of different approaches to managing customer knowledge. Indeed, the very chasm between the wealth of practical examples of (intuitive) customer knowledge management and the dearth of (explicit) literature and guidance for managers seems remarkable. While we detected a wide variety of different approaches used by companies who manage customer knowledge, what was even more intriguing were the similarities among the individual approaches. We have crystallized these similarities in five styles of customer knowledge management, as displayed in Table 2.Table 2: Five styles of CKMStyle/CharacteristicProsumerism Team-based Co-learning Mutual Innovation CommunitiesFocus Developing tangible assets andbenefits Creating corporate social capital Creating new products &processesMission-specificexpertiseObjective Improved products & resultingbenefits Facilitate team learning fordealing with systemic changeCreate max return from newideasObtain & explicaExpertiseProcesses Pre, -Concurrent- & Postproduction integration Teamwork, empowerment, casedevelopment, quality programsIdea Fairs; Brainstorming;Customer IncubationBest Practices CONetworksSystems Planning, Control and decisionsupp. systems Knowledge sharing systems,Digital “nervous” systems,customer visits in teamsIdea Generation SupportSystemsExpert Systems,workspaces, GroSystemsPerformance Measures Effectiveness & Efficiency,Customer satisfaction & successSystems Productivity, Quality,Customer satisfaction & successROI from new products &processes, Customer successK-sharing behaviof decisions,Rate of hyperlinkCase Examples Quicken; IKEA ; Xerox, Holcim,Mettler ToledoSilicon Graphics, Ryder Microsoft; Sony;Intensity ofinteractionRelatively low Low to high Relatively low Relatively highType ofknowledgeMore explicit Explicit and tacit More tacit More tacit Five styles of CKM and their applicationProsumerismAlvin Toffler (1980) first used the expression “prosumer” to denote that the customer couldfill the dual roles of producer and consumer. Such co-production is not new, e.g. Boschdevelops engine management systems in co-production with Mercedes-Benz, who conceivesand assembles the automobile. What is new is the way that knowledge co-production withthe customer expresses itself in role patterns and codes of interactivity. For example,Quicken enables the customer to learn more about the available resources in financial services, thus creating options and a predisposition within the customer to rapidly tailor-makean offering in the future, also based on creatively suggesting new ideas and benefits.The way IKEA, the living environment furniture retailer, presents itself to customers is all about co-production, about how benefits and activities have been reallocated between producer and customer. The CKM process in IKEA transforms the customer into a co-value creator, endowing him/her with new competencies and benefaction opportunities. 
Mining Business Topics in Source Code using Latent Dirichlet Allocation

Girish Maskeri, Santonu Sarkar
SETLabs, Infosys Technologies Limited
Bangalore 560100, India
girish_rama@, santonu_sarkar@

Kenneth Heafield
California Institute of Technology
CA, USA
kpu@kheafi

ABSTRACT
One of the difficulties in maintaining a large software system is the absence of documented business domain topics and of a correlation between these domain topics and the source code. Without such a correlation, people without any prior application knowledge find it hard to comprehend the functionality of the system. Latent Dirichlet Allocation (LDA), a statistical model, has emerged as a popular technique for discovering topics in large text document corpora, but its applicability to extracting business domain topics from source code has not been explored so far. This paper investigates LDA in the context of comprehending large software systems and proposes a human-assisted approach based on LDA for extracting domain topics from source code. The method has been applied to a number of open source and proprietary systems. Preliminary results indicate that LDA is able to identify some of the domain topics and is a satisfactory starting point for further manual refinement of topics.

Categories and Subject Descriptors
D.2.7 [Software Engineering]: Distribution, Maintenance, and Enhancement—Restructuring, reverse engineering, and reengineering

General terms
Theory, Algorithms, Experimentation

Keywords
Maintenance, Program comprehension, LDA

1. INTRODUCTION
Large legacy software systems often exist in a state of disorganization with poor or no documentation. Adding new features and fixing bugs in such a system is highly error prone and time consuming, since the original authors of the system are generally no longer available. Moreover, the people maintaining the code base do not comprehend the functional purpose of different program elements (functions, files, classes, data structures etc.) and the roles they play in fulfilling the various functional services offered by the system.

When a software system is small, one can understand its functional architecture by manually browsing the source code. For large systems, practitioners often rely on program analysis techniques such as call graphs, control and data flow, and slicing [2]. Though this is very helpful for understanding the structural intricacies of the system, it is of little help in comprehending the functional intent of the system. The reason is not difficult to understand. Structural analysis techniques work on the structural information derived from a set of program source code. This structural information is at a very low level of granularity - files, functions, data structures, variable usage dependencies, function calls and so on - and hardly reveals any underlying functional purpose. For a large system, this information becomes overwhelmingly large for a maintainer to manually derive any functional intent from it. Moreover, for a system with millions of lines of code, the memory requirement to hold the structural information and perform
various analyses often becomes a bottleneck.

An important step in comprehending the functional intent of the system (or the intended functional architecture) is to identify the business topics that exist in the system, around which the high-level components (or modules) have been implemented. For example, consider a large banking application that deals with customers, bank accounts, credit cards, interest and so on. A maintainer without any application knowledge will find it difficult to add new interest calculation logic. However, if it is possible to extract business topics such as "customer" and "interest" from the source code and establish an association between "interest" and various program elements, it would be of immense help to the maintainer in identifying the related functions, files, classes and data structures and carrying out the required changes for interest calculation. This in turn can make a novice programmer much more productive in maintaining a system, especially when the software is in the order of millions of lines of code with little documentation.

A plausible technique for identifying such topics is to derive semantic information by extracting and analyzing various important "keywords" from the source code text [13]. It is often the case that the original authors of the code leave hints to the meaning of program elements in the form of keywords in files, functions, data type names and so on. For instance, it is highly common to use keywords like "proxy" and "http" while implementing http-proxy functionality. Similarly, for a banking application one would surely use a keyword like "interest" in implementing functions, classes and data structures pertaining to interest calculation.

Assuming that meaningful keywords do exist in program elements, is it possible to semantically correlate these keywords into meaningful clusters, where each meaningful cluster of keywords can be interpreted as a "topic"? For instance, is it possible to group all "proxy" related keywords into one cluster and identify it as the "proxy" topic, and "authentication" related keywords into another cluster forming the "authentication" topic? If this is possible, one can subsequently establish an association between the topic "proxy" and the various program elements (files, functions or data structures) pertaining to "proxy".

This paper addresses the above problem and proposes a human-assisted approach based on Latent Dirichlet Allocation (LDA) [7] for identifying topics in source code. LDA has been quite popular in the realm of text document classification and identifying topics in text documents. To the best of our knowledge, this is the first attempt at applying LDA in the context of source code analysis.

The paper is organized as follows. In the next section we provide a brief review of the literature relevant to our work and the necessary background information on LDA. Section 3 discusses the applicability of LDA for extracting domain topics from source code and provides an interpretation of the LDA model in the context of source code. A detailed description of extracting domain topics from source code using LDA is presented in Section 4. We have applied LDA to a number of open source and proprietary systems; Section 5 presents the results obtained. Advantages and disadvantages of the presented LDA-based method and some of the interesting observations made during our experimentation are presented in Section 6. Finally, Section 7 discusses future research directions and concludes the paper.

2. BACKGROUND AND RELATED WORK
Researchers have long recognized the
importance of linguistic information such as identifier names and comments in program comprehension. For instance, Biggerstaff et al. [5] have suggested assignment of domain concepts as an approach to program comprehension. Tonella et al. [8] have proposed using function names and signatures to obtain domain-specific information. Anquetil et al. [3] have suggested that the information obtained from the name of a file often carries the functional intent of the source code specified in the file. Wilde et al. [24] have also suggested the use of linguistic information to identify the functional intent of a system. Since then, linguistic information has been used in various program analysis and maintenance tasks such as traceability between external documentation and source code [4, 16], feature location (sometimes referred to as concept location) [17, 21, 25], identifying high-level conceptual clones [15] and so on.

Recently, linguistic information has also been used to identify topics in source code and subsequently used for software clustering [13] and software categorization [11].

Kuhn et al. [13] have used a Latent Semantic Analysis (LSA) [9] based approach for identifying topics in source code by semantically clustering software artifacts such as methods, files or packages based on identifier names and comments. Our approach differs from that of Kuhn et al. in two ways. Firstly, and most importantly, our interpretation of a "topic" is different from that of Kuhn. Kuhn interprets semantically clustered software artifacts (like methods, files etc.) as topics, whereas we interpret a set of semantically related linguistic terms derived from identifier names and comments as a "topic". Another important difference is in the approach used for semantic clustering. While Kuhn et al. have adopted LSA to cluster a set of meaningful software artifacts, our approach to clustering linguistic terms is based on Latent Dirichlet Allocation.

Kawaguchi et al. [11] use linguistic information in source code for automatically identifying categories and categorizing open source repositories. A cluster of related identifiers is considered a "category". In our case, we consider a cluster of terms derived from identifiers as a "topic"; thus a topic can certainly be considered synonymous to a "category". However, our approach to clustering semantically related identifier terms differs from the approach suggested by Kawaguchi et al. [11]. Kawaguchi et al. first use LSA to derive pairwise similarity between the terms and then apply a clustering algorithm to cluster similar terms together. The LDA-based approach we have adopted alleviates the need for two steps: since LDA is essentially a topic modeling technique, it not only discovers similarity between terms, it also creates a cluster of similar terms to form a topic.

In the rest of this section, we provide a brief description of LDA and its use in extracting topics from text documents.

2.1 LDA
Latent Dirichlet Allocation (LDA) [7] is a statistical model, specifically a topic model, originally used in the area of natural language processing for representing text documents.
The basic idea of LDA is that a document can be considered a mixture of a limited number of topics, and each meaningful word in the document can be associated with one of these topics. Given a corpus of documents, LDA attempts to discover the following:

- It identifies a set of topics.
- It associates a set of words with each topic.
- It defines a specific mixture of these topics for each document in the corpus.

LDA has been applied to extract topics from text documents. For instance, Newman et al. [19] applied LDA to derive 400 topics such as "September 11 attacks", "Harry Potter", "Basketball" and "Holidays" from a corpus of 330,000 New York Times news articles and represented each news article as a mixture of these topics. LDA has also been applied for identification of topics in a number of different areas. For instance, LDA has been used to find scientific topics from abstracts of papers published in the Proceedings of the National Academy of Sciences [10]. McCallum et al. [18] have proposed LDA to extract topics from social networks and applied it to a collection of 250,000 Enron emails. A variation on LDA has also been used by Steyvers et al. [22] to analyze 160,000 abstracts from the CiteSeer computer science collection. Recently, Zheng et al. [6] have applied LDA to obtain various biological concepts from a protein-related corpus. These applications of LDA seem to indicate that the technique can be effective in identifying latent topics and summarizing large corpora of text documents.

2.1.1 LDA Model
For the sake of completeness, we briefly introduce the LDA model. A thorough and complete description of the LDA model can be found in [7]. The vocabulary for describing the LDA model is as follows:

word     A word is the basic unit, defined to be an item from a vocabulary of size W.
document A document is a sequence of N words denoted by d = (w_1, ..., w_N), where w_n is the n-th word in the sequence.
corpus   A corpus is a collection of M documents denoted by D = {d_1, ..., d_M}.

In statistical natural language processing, it is common to model each document d as a multinomial distribution θ(d) over T topics, and each topic z_j, j = 1...T, as a multinomial distribution φ(j) over the set of words W. In order to discover the set of topics used and the distribution of these topics in each document in a corpus of documents D, we need to obtain an estimate of φ and θ. Blei et al. [7] have shown that the existing techniques for estimating φ and θ are slow to converge and propose a new model - LDA. The LDA-based model assumes a prior Dirichlet distribution on θ, thus allowing the estimation of φ without requiring the estimation of θ.

LDA assumes a generative process for creating a document [7], as presented below.

1. Choose N ~ Poisson(ξ): select the number of words N.
2. θ ~ Dir(α): select θ from the Dirichlet distribution parameterized by α.
3. For each w_n in w do:
   (a) Choose a topic z_n ~ Multinomial(θ).
   (b) Choose a word w_n from p(w_n | z_n, β), a multinomial probability φ_{z_n}.

In this model, the various distributions - namely the set of topics, the topic distribution for each of the documents and the word probabilities for each of the topics - are in general intractable for exact inference [7]. Hence a wide variety of approximate algorithms are considered for LDA. These algorithms attempt to maximize the likelihood of the corpus given the model. A few algorithms have been proposed for fitting the LDA model to a text corpus, such as variational Bayes [7], expectation propagation [14], and Gibbs sampling [10].
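To make the generative process above concrete, the following minimal Python sketch (using numpy) draws a single synthetic document under the model. The vocabulary, the topic-word distributions φ and the values of α and ξ are invented purely for illustration; they are not values estimated by the tool described later.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative vocabulary and topic-word distributions (phi); each row sums to 1.
vocab = ["interest", "account", "customer", "proxy", "http", "auth"]
phi = np.array([
    [0.5, 0.3, 0.2, 0.0, 0.0, 0.0],   # a "banking" topic
    [0.0, 0.0, 0.1, 0.4, 0.3, 0.2],   # an "http-proxy" topic
])
T = phi.shape[0]
alpha = np.full(T, 0.1)               # symmetric Dirichlet prior on theta
xi = 8                                # Poisson mean for the document length

# 1. choose N ~ Poisson(xi)
N = rng.poisson(xi)
# 2. theta ~ Dir(alpha): per-document topic mixture
theta = rng.dirichlet(alpha)
# 3. for each word: choose a topic z_n ~ Multinomial(theta),
#    then a word w_n from the chosen topic's word distribution phi[z_n]
doc = []
for _ in range(N):
    z = rng.choice(T, p=theta)
    w = rng.choice(len(vocab), p=phi[z])
    doc.append(vocab[w])

print(theta, doc)

Inference runs this story in reverse: given only the observed documents, it recovers plausible φ and θ, which is what the approximate algorithms listed above are for.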
3. APPLYING LDA TO SOURCE CODE
Given that LDA has been successfully applied to large corpora of text data (as discussed in Section 2.1), it is interesting to explore i) how applicable it is in the context of source code and ii) how effective the technique is in identifying business topics in a large software system. To apply LDA to source code, we consider a software system to be a collection of source code files, and the software system is associated with a set of business domain concepts (or topics). For instance, the Apache web server implements functionality associated with http-proxy, authentication, server, caching and so on. Similarly, a database server like PostgreSQL implements functionality related to storage management. Moreover, there exists a many-to-many relationship between topics like authentication and storage management and the source code files that implement these topics. Thus a source code file can be thought of as a mixture of these domain topics.

Applying LDA to source code now reduces to mapping the source code entities of a software system to the LDA model, as described in Table 1.

LDA Model   Source Code Entities
word        We define domain-specific keywords extracted from the names of program elements such as functions, files, data structures and comments to be the vocabulary set with cardinality V. A word w is an item from this vocabulary.
document    A source code file becomes a document in LDA parlance. For our purpose, we represent a document f_d = (w_1, w_2, ..., w_N) as a sequence of N domain-specific keywords.
corpus      The software system S = {f_1, f_2, ..., f_M}, having M source code files, forms the corpus.

Table 1: Mapping LDA to Source Code

Given this mapping, application of LDA to a source code corpus is not difficult. Given a software system consisting of a set of source code files, domain-related words are extracted from each of the source code files. Using this, a source code file-word matrix is constructed where the source code files form the rows, the domain words form the columns and each cell represents the weighted occurrence of the word representing the column in the source code file representing the row. This source code file-word matrix is provided as input to LDA. The result of LDA is a set of topics and a distribution of these topics in each source code file. A topic is a collection of domain words along with the importance of each word to the topic, represented as a numeric fraction.
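Assuming the per-file keyword counts have already been extracted (Section 4 describes how), the mapping of Table 1 amounts to building the file-word matrix and handing it to an LDA implementation. The sketch below is illustrative only: the file names and keyword counts are invented, and scikit-learn's variational LDA is used as a convenient stand-in for the Gibbs-sampling implementation described in Section 4.

from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction import DictVectorizer

# Invented example: each source file mapped to weighted domain-keyword counts
# (the wd matrix of Section 4.2).
file_keywords = {
    "interest_calc.c": {"interest": 7, "rate": 3, "account": 2},
    "account_mgmt.c":  {"account": 9, "customer": 4, "balance": 3},
    "http_proxy.c":    {"proxy": 8, "http": 5, "auth": 2},
}

vec = DictVectorizer()
wd = vec.fit_transform(list(file_keywords.values()))   # files x keywords matrix

# n_components corresponds to T; doc_topic_prior and topic_word_prior play the
# roles of alpha and beta in the formulation above.
lda = LatentDirichletAllocation(n_components=2, doc_topic_prior=0.1,
                                topic_word_prior=0.01, random_state=0)
theta = lda.fit_transform(wd)                           # topic mixture per file
phi = lda.components_ / lda.components_.sum(axis=1, keepdims=True)

for t, row in enumerate(phi):
    top = row.argsort()[::-1][:3]
    print("topic", t, [vec.feature_names_[i] for i in top])

The output of either implementation is the same kind of object: a topic-word distribution φ from which the most probable keywords per topic are read off, and a per-file topic mixture θ.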
4. IMPLEMENTATION
We have implemented a tool for static analysis of code. Figure 1 shows the part of this tool that specifically deals with topic extraction, topic location identification [5, 24], visualization of the topic distribution and a modularity analysis based on domain topics.

The main input of LDA-based topic extraction is a document-word matrix wd[w, f_d] = η, where η is a value indicating the importance of the word w in the file f_d. We will shortly describe an approach to compute η based on the number and the place of occurrences of w in f_d. Our current implementation uses the Gibbs sampling method [10], which uses a Markov chain Monte Carlo method to converge to the target distributions in an iterative manner. A detailed description of this method is not in the scope of this paper.

Figure 1: Tool block diagram

Input Parameters.
Our tool, based on the LDA approach, takes two parameters α and β (as described in Section 2.1.1) and creates the distribution φ of words over topics and the distribution θ of topics over documents. In addition to the parameters α and β, the tool also requires the number of iterations. Recall that LDA defines a topic as a probability distribution over all the terms in the corpus. We have defined a user-specified cut-off value Ψ which is used to discern the most important words for each topic.

4.1 Keyword Extraction
In order to create a document-word matrix for LDA, it is extremely important to find meaningful domain-related words in the source code. Approaches that apply LDA to linguistic text documents consider every word in the document for this purpose. However, unlike a plain text document, a source code file is a structured document, and it is definitely not appropriate to assume that each word in the file is domain related. Firstly, a large percentage of the words in source code files constitute programming language syntax, such as for, if, class, return, while and so on. Moreover, domain keywords are often embedded inside identifier names as subwords, and identifier names need to be split appropriately to extract the relevant subwords.
Given this observation, we propose the following steps to extract meaningful keywords from source code files:

1. Fact extraction.
2. Splitting of identifiers and comments into keywords.
3. Stemming the keywords to their respective common roots.
4. Filtering of keywords to eliminate keywords that do not indicate any business concept.

Fact Extraction.
Fact extraction is the process of parsing source code documents and extracting meta-data of interest such as files, functions, function dependencies, data structures etc. We have used Source Navigator [1] for extracting facts from source code.

Identifier Splitting.
Splitting identifiers into meaningful subwords is essential because, unlike in natural language text, where each word is independent and can be found in a dictionary, source code identifier names are generally not independent words but a sequence of meaningful chunks of letters and acronyms delimited by some character or through some naming convention. For instance, a function named "add_auth_info" in the httpd-2.0.53 source code consists of three meaningful chunks of letters, "add", "auth" and "info", delimited by "_". The term we use for such meaningful chunks of letters is "keyword". Each identifier is split into a set of keywords. In order to perform this splitting it is essential to know how the identifier names are delimited. There are a number of schemes, such as underscore, hyphen, or capitalization of specific letters (camel case) as in "getLoanDetails". We have implemented a regular expression based identifier splitting program in Perl.

Stemming.
Keywords in source code are generally used both in singular and in plural form, for example "loan" and "loans". In our analysis it is not necessary to consider them as two different keywords. Hence we unify all such keywords by stemming them to their common root. It is also a common practice [23, 12] to stem terms in order to improve the results of analysis. We have used Porter's stemming algorithm [20] to stem all words to a common root.

Filtering.
Not all keywords are indicators of topics in the domain. For instance, keywords such as "get" and "set" are very generic, and a stop-words list is employed to filter out such terms.
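A minimal Python approximation of the splitting, stemming and filtering steps is sketched below. The regular expressions, the stop-word list and the use of NLTK's Porter stemmer are illustrative assumptions; they are not the exact rules of the Perl-based implementation described above.

import re
from nltk.stem import PorterStemmer   # stand-in for the Porter stemmer used by the tool

STOP_WORDS = {"get", "set", "tmp", "util"}   # illustrative stop-word list
stemmer = PorterStemmer()

def split_identifier(name):
    """Split an identifier such as add_auth_info or getLoanDetails into keywords."""
    # break on underscores/hyphens first, then on camel-case boundaries
    parts = re.split(r"[_\-]+", name)
    words = []
    for part in parts:
        words += re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+", part)
    return [w.lower() for w in words if w]

def extract_keywords(identifiers):
    """Split, stem and filter identifier names into domain keywords."""
    keywords = []
    for ident in identifiers:
        for word in split_identifier(ident):
            stem = stemmer.stem(word)            # "loans" and "loan" unify to "loan"
            if stem not in STOP_WORDS:
                keywords.append(stem)
    return keywords

print(extract_keywords(["add_auth_info", "getLoanDetails", "loans_total"]))
# e.g. ['add', 'auth', 'info', 'loan', 'detail', 'loan', 'total']

The same routine can be pointed at file names, data structure names, parameter names and comment text; only the weighting of the resulting keywords differs, as described next.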
4.2 File Keyword Mapping
Having identified a set of unique keywords for a given software system, we now compute the wd matrix. For this purpose, we compute a weighted sum of the number of occurrences of a word w in a file f_d as follows:

1. We define a heuristic weightage λ: {lt} → ℕ that assigns a user-defined positive integer to a "location type" lt. A location type lt denotes the structural part of a source code file - such as the file name, function names, formal parameter names, comments, data structure names and so on - from which a keyword has been obtained. The weightage given to a word obtained from a function name would be different from that given to a word obtained from a data structure name.

2. The importance factor wd[w, f_d] for a word w occurring in the file f_d is computed as the weighted sum of the frequencies of occurrence of w over the location types lt_i in the source code file f_d. That is,

   wd[w, f_d] = Σ_{lt_i} λ(lt_i) × ν(w, f_d, lt_i)

   where ν(w, f_d, lt_i) denotes the frequency of occurrence of the word w in the location type lt_i of the source file f_d.

To illustrate the calculation of the importance factor wd, consider the following code snippet from the file OrderDetails.java:

public class OrderDetails implements java.io.Serializable {
    private String orderId;
    private String userId;
    private String orderDate;
    private float orderValue;
    private String orderStatus;
    public String getOrderStatus() {
        return (orderStatus);
    }
    ......
}

As discussed in Section 4.1, identifier names are split to obtain meaningful domain words, and an importance factor is calculated for each word. One such word extracted from the above code snippet is "Order", which occurs in comments and in the names of different types of identifiers, such as the class name, attribute names and method names. These different types of sources for words constitute our set of location types lt. Generally, in an object-oriented system, classes represent domain objects and their names are more likely to yield domain words that are important for that class. Hence λ(class) is generally assigned a higher value by domain experts than λ(attribute). Let us assume that in this particular example λ(class) equals 2, λ(attribute) equals 1 and λ(method) equals 1. The importance factor of the word "Order" in the above code snippet, calculated according to the formula given above, is 7:

wd[Order, OrderDetails.java] = 2*1 + 1*4 + 1*1 = 7

Similarly, the weighted occurrence is calculated for other words such as "details", "user" and "status".
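The importance factor computation can be sketched in a few lines of Python. The location-type weights below are the ones assumed in the example above (λ(class) = 2, λ(attribute) = 1, λ(method) = 1); any additional location types and their weights would be chosen in the same user-defined way.

# Location-type weights (lambda) as assumed in the OrderDetails.java example.
WEIGHTS = {"class": 2, "attribute": 1, "method": 1}

def importance(occurrences, weights=WEIGHTS):
    """wd[w, f_d] = sum over location types lt of lambda(lt) * nu(w, f_d, lt)."""
    return sum(weights[lt] * count for lt, count in occurrences.items())

# Occurrences of the keyword "Order" in OrderDetails.java, read off the snippet:
# once in the class name, four times in attribute names, once in a method name.
order_occurrences = {"class": 1, "attribute": 4, "method": 1}
print(importance(order_occurrences))   # 2*1 + 1*4 + 1*1 = 7

Running this per keyword and per file fills in the wd matrix that is handed to the LDA step.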
4.3 Topic Labeling
LDA could not satisfactorily derive a human-understandable label for an identified topic. In most cases, the terms from which a label could be derived are abbreviations of business concepts or acronyms. As a result, it is hard to create a meaningful label for a topic automatically. In the current version of the tool, identified topics are labeled manually.

5. CASE STUDIES
We have tested our approach on a number of open source and proprietary systems. In the rest of this section we discuss the results obtained, using some of the extracted topics as examples.

5.1 Topic Extraction for Apache
We extracted 30 topics for Apache. For the sake of brevity we list only two topics, namely "SSL" and "Logging". Table 2(a) lists the top keywords for the topic "SSL" and their corresponding probabilities of occurrence when a random keyword is generated from the topic "SSL". Our tool is able to extract not just the domain topics, but also infrastructure-level topics and cross-cutting topics. For instance, "logging" is a topic that cuts across files and modules. Our tool, based on LDA, is able to cluster all logging-related keywords together, as shown in Table 2(b), which lists the top keywords for the topic "Logging" and their corresponding probability values.

(a) Topic labeled as SSL
Keyword   Probability
ssl       0.373722
expr      0.042501
init      0.033207
engine    0.026447
var       0.022222
ctx       0.023067
ptemp     0.017153
mctx      0.013773
lookup    0.012083
modssl    0.011238
ca        0.009548

(b) Topic labeled as Logging
Keyword    Probability
log        0.141733
request    0.036017
mod        0.0311
config     0.029871
name       0.023725
headers    0.021266
autoindex  0.020037
format     0.017578
cmd        0.01512
header     0.013891
add        0.012661

Table 2: Sample topics extracted from Apache source code

5.2 Topic Extraction for Petstore
In order to investigate the effect of naming on topic extraction results, we considered Petstore, a J2EE blueprint implementation by Sun Microsystems. Being a reference J2EE implementation, it follows good Java naming conventions, and a large number of identifiers have meaningful names.

(a) Topic labeled as Contact Information
Keyword    Probability
info       0.418520
contact    0.295719
email      0.050116
address    0.040159
family     0.040159
given      0.036840
telephone  0.026884
by         0.000332

(b) Topic labeled as Address Information
Keyword    Probability
address    0.398992
street     0.105818
city       0.055428
code       0.055428
country    0.055428
zip        0.055428
name1      0.050847
state      0.046267
name2      0.046267
end        0.005039
add        0.009548

Table 3: Sample topics extracted from Petstore source code

As shown in Table 3(a), we are able to successfully group all "contact information" related terms together. What is more significant in this example, however, is that the top keywords "info" and "contact" are meaningful and indicative of a probable name for the topic. For example, if we concatenate these two keywords into "info contact", it can be considered a valid label for the "contact information" topic. Similarly, in the case of the "address information" topic, shown in Table 3(b), the concatenation of the top keywords "address" and "street" can be used to label the "address information" topic. It can be observed from these sample topics that a good naming convention yields more meaningful terms, thereby simplifying the process of labeling the topics.

5.3 Synonymy and Polysemy Resolution
One of the key factors in extracting coherent topics and grouping semantically related keywords together is the ability of the algorithm employed to resolve synonymy - different words having the same meaning. We have observed that our tool is able to resolve synonymy to a good extent, since LDA models topics in a file and words in a topic using multinomial probability distributions. For instance, consider the topic labeled as "transaction" in PostgreSQL, shown in Table 4. LDA has identified that "transaction" and "xact" are synonymous and has grouped them together in a single cluster, as shown below.

Keyword      Probability
transaction  0.149284
namespace    0.090856
commit       0.035349
xact         0.035349
visible      0.029506
current      0.029506
abort        0.026585
names        0.026585
command      0.023663
start        0.020742
path         0.017821

Table 4: Transaction and xact synonymy resolution by LDA

What is more interesting is the fact that our tool has been able to resolve polysemy - the same word having different meanings in source code. A polyseme can appear in multiple domain topics depending on the context. The reason our tool is able to identify polysemes is not difficult to understand. Note that LDA models a topic as a distribution over terms; therefore it is perfectly valid for a term to appear in two topics with different probability values. Furthermore, LDA tries to infer a topic for a given term
with the knowledge of the context of the word, i.e. the document in which the word appears. For instance, in the Linux kernel source code we observed that the term "volume" has been used in the context of sound control as well as in the context of file systems. LDA is able to differentiate between these different uses of the term and has grouped the same term into different topics.

6. DISCUSSION
In this section we discuss various factors that impact the results obtained. Subsequently we discuss the benefits and limitations of our approach.

6.1 Effect of the Number of Topics
Our approach for topic extraction accepts the number of topics to be extracted as an input from the user. We have observed that varying the number of topics has a significant impact on polysemy resolution. For instance, consider the example of polysemy resolution for the keyword "volume" in the Linux kernel, discussed in Section 5.3. We conducted our experiment on the Linux kernel source code twice. Both times we kept all the parameters - namely α, β, the number of iterations and the cut-off threshold Ψ - the same, except for the number of topics. In the first experiment the number of topics T was set to 50 and in the second experiment T was set to 60. In both experiments, of the total topics extracted, two topics were related to "sound" and one topic to "file systems". Table 5 lists the probabilities of the keyword "volume" in the "sound" and "file systems" topics for both experiments.

Topic type          'volume' probability in     'volume' probability in
                    Experiment 1 (T = 50)       Experiment 2 (T = 60)
Sound topic 1       0.024                       0.032
Sound topic 2       0.009                       0.009
File systems topic  < 0.0002                    0.004

Table 5: Effect of number of topics on polysemy resolution in the Linux kernel

In our experiments we have used a threshold value Ψ of 0.001 for determining whether a keyword belongs to a topic. If the probability of a keyword associated with a topic is less than 0.001, we do not consider that keyword an indicator of that topic. In view of this, it can be observed from Table 5 that in experiment 1 the keyword "volume" has a probability of less than 0.0002 for the topic "file systems". Hence "volume" is associated only with the two sound-related topics and not with the "file systems" topic. However, in experiment 2, when the number of topics was increased to 60, the probability of "volume" for the topic "file systems" is 0.004. This probability is greater than our threshold of 0.001, and hence "volume" is also considered an indicator for the "file systems" topic, apart from the sound-related topics. The polysemy in the keyword "volume" is revealed only in the second experiment, with 60 topics.

6.2 Discovering the Optimal Number of Topics
The problem of identifying the optimal number of topics is not specific to source code alone; topic extraction from text documents faces a very similar problem. Griffiths and Steyvers recommend trying different numbers of topics T and suggest using the maximum likelihood method on P(w|T) [10]. Applying this technique to extract topics from Linux suggests that the optimal number of topics in the case of Linux is 270, as shown in Figure 2.

Figure 2: Inferring the optimum number of topics for Linux

However, automatically inferring the number of topics by maximizing the likelihood is not without problems.
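One way to operationalize this search, sketched below purely as an illustration, is to fit a model for each candidate T and keep the one with the best likelihood proxy. Here scikit-learn's variational LDA and its approximate bound (score) stand in for the Gibbs sampler and the P(w|T) estimate used by Griffiths and Steyvers; the wd matrix is assumed to have been built as described in Section 4.

from sklearn.decomposition import LatentDirichletAllocation

def best_num_topics(wd, candidates=(50, 100, 150, 200, 250, 300)):
    """Fit LDA for several values of T and keep the T with the best
    likelihood proxy (here, scikit-learn's variational lower bound)."""
    scores = {}
    for T in candidates:
        lda = LatentDirichletAllocation(n_components=T, random_state=0)
        lda.fit(wd)
        # Approximate log-likelihood of the corpus given the model; higher is better.
        # In practice this should be computed on a held-out subset of the files.
        scores[T] = lda.score(wd)
    return max(scores, key=scores.get), scores

# Hypothetical usage, assuming wd was built as in Section 4:
# T_best, all_scores = best_num_topics(wd)

Such a sweep is computationally expensive for large systems, and, as noted above, likelihood alone does not guarantee that the chosen T yields topics a maintainer finds meaningful.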