Building Ontologies for Knowledge Management Applications in Group Sessions
Ontological Representation for Learning Objects
Jian Qin and Christina Finneran
School of Information Studies, Syracuse University, Syracuse, NY 13244 USA
+1 315 443 5642
{jqin, cmfinner@}

ABSTRACT
Many of the existing metadata standards use content metadata elements that are coarse-grained representations of learning resources. These standards limit users' access to learning objects that may exist at the component level. The authors discuss the need for component-level access to learning resources and provide a conceptual framework for the knowledge representation of learning objects that would enable such access.

Keywords
Learning objects, component access, intelligent access, knowledge schemas, metadata, ontologies

INTRODUCTION
As the design of search interfaces advances, digital libraries are encountering limitations rooted in the underlying representation of the data. Using current metadata schemas, the describing author fixes the granularity of a resource statically [8]. The granularity of digital library resources will inevitably prevent the user from retrieving finer-grained resources. This paper discusses an ontological approach to representing data within a digital library to enable more component-level access.

Learning objects refer to any entity, digital or non-digital, that can be used, re-used, or referenced during technology-supported learning [9]. Broadly speaking, learning resources usually refer to documents or collections, whereas learning objects refer to the components of a document or collection. However, "learning objects" according to the IMS [6] standards refers to any object, regardless of granularity.

Within the domain of education digital libraries, one important goal of metadata is to enable the retrieval and adaptation of a learning object to another learning situation [1]. Providing the learning objects that users seek, whatever the granularity, is the essence of contextual design [3] and essential for an effective digital library interface. Representations of learning resources must support the finest-grained level of granularity required by the core technologies suggested in [7], in addition to application and support technologies. Objects within a learning resource need to be encoded so that they can be recognized, searched, referenced, and activated at different granularities. Other researchers have been addressing related technological solutions, such as dynamic metadata and automated component descriptions [2,8]. We focus on the knowledge representation needs and introduce a conceptual framework for an ontological approach to metadata. Finally, we discuss how component-level representation contributes to user-focused interface design.

ONTOLOGICAL APPROACH TO METADATA
Metadata standards pose different levels of representation granularity, as demonstrated in Figure 1. Dublin Core (DC) [4] provides basic factual description, which is most commonly used in creating collection- or resource-level metadata. The educational extension of DC specifies contextual factors, such as the resource's target audience and pedagogical goals. IEEE's Learning Object Metadata (LOM)/IMS metadata standard defines more specific educational and technical parameters for learning resources [5,6]. These three metadata standards are best suited to representing learning resources at the collection or resource level.
To reach a finer-grained level, where components in a resource are represented and correlated, knowledge schemas play an important role in in-depth representation and more refined user access.

Figure 1. Representation Framework

Through an informal survey of the NSDL collections, we found that search, browsing, and navigation capabilities vary widely depending on the purpose, scope, and subject area of the collection. However, collection- or document-level metadata dominates all types of searches available. The lack of finer-grained representation is becoming a crippling factor for user interfaces that aim to provide in-depth searching for learning objects.

SAMPLE MODEL FOR COMPONENT REPRESENTATION
An ontological representation defines concepts and relationships up front. It sets the vocabulary, properties, and relationships for concepts, the result of which can be a set of rich schemas. The elements accumulate more meaning through the relationships they hold and the potential inferences that can be made from those relationships. The key advantage of an ontological representation within the realm of learning objects is its ability to handle different granularities. In order to describe learning resources at the collection level (e.g., web site) and further describe each of the components (e.g., interactive applet, image), relationships must be identified when the data are input. Only with description at the component level will users be able to retrieve specific learning objects. Figure 2 demonstrates how, even with a seemingly simple laboratory learning object, fine-grained description at the component level can enable better access. For example, if an instructor is interested in a graph of steam gauge metrics, s/he should be able to search at the component level, rather than having to guess what type of resource (e.g., textbook, lab) might contain such a graph.

An ontological model may be created based on the example in Figure 2. Each component in the model is normalized into a group of classes under the class Lab. The attributes for Lab include "object subject," "object URI," and "parent source," which are inherited by all its subclasses. The "object content" attribute is local to the subclass Formula and is also reused in other subclasses. A unique feature of this sample model is the reuse of classes in defining attribute types (e.g., the Hydrogeology class is reused in the attribute objectSubject). This model can be converted directly into Resource Description Framework (RDF) format, which can then be used as the motor behind intelligent navigation and retrieval interfaces. By creating ontologies for learning resources, we will be able to generate a set of knowledge schemas for building knowledge bases and repositories that can be shared and reused by system developers.

Figure 3. A sample component representation model

CONCLUSION
The goal of education digital library interfaces is to support users, whether educators or learners, in accessing useful learning objects. The user will determine what is useful, and users should also be given the opportunity to search for components that may be useful. The ontological approach to representing learning objects provides a framework upon which to build more intelligent access within digital libraries.
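The paper does not show the RDF itself; as a rough illustration, the sample Lab model could be expressed with Python's rdflib along the following lines. The namespace URI, the steam-gauge instance, and the exact property spellings are assumptions for illustration; only the class and attribute names (Lab, Formula, objectSubject, objectURI, parentSource, Hydrogeology) come from the text.

```python
# A minimal sketch of the Lab component model in RDF, using rdflib.
# URIs and the example instance are illustrative assumptions.
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/learning-objects#")  # assumed namespace
g = Graph()
g.bind("ex", EX)

# Class Lab and a component subclass, per the sample model.
g.add((EX.Lab, RDF.type, RDFS.Class))
g.add((EX.Formula, RDF.type, RDFS.Class))
g.add((EX.Formula, RDFS.subClassOf, EX.Lab))  # inherits Lab's attributes

# Attributes of Lab, inherited by all its subclasses.
for prop in (EX.objectSubject, EX.objectURI, EX.parentSource):
    g.add((prop, RDF.type, RDF.Property))
    g.add((prop, RDFS.domain, EX.Lab))

# Reuse of a class as an attribute type: Hydrogeology is the range
# of objectSubject, as in the sample model.
g.add((EX.Hydrogeology, RDF.type, RDFS.Class))
g.add((EX.objectSubject, RDFS.range, EX.Hydrogeology))

# A component-level instance that a search interface could retrieve directly
# (hypothetical; stands in for the steam-gauge graph example).
g.add((EX.steamGaugeGraph, RDF.type, EX.Formula))
g.add((EX.steamGaugeGraph, EX.parentSource, Literal("Hydrogeology lab")))

print(g.serialize(format="turtle"))
```

Serialized this way, each component carries its own subject, URI, and parent-source triples, which is what lets a retrieval interface answer component-level queries directly.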
REFERENCES
1. Conlan, O., Hockemeyer, C., Lefrere, P., Wade, V. and Albert, D. Extending Educational Metadata Schemas to Describe Adaptive Learning Resources. HT 2001, Aarhus, Denmark.
2. Damiani, E., Fugini, M.G. and Bellettini, C. A Hierarchy-Aware Approach to Faceted Classification of Object-Oriented Components. ACM Transactions on Software Engineering and Methodology 8(3): 215-262, July 1999.
3. Dong, A. and Agogino, A.M. Design Principles for the Information Architecture of a SMET Education Digital Library. JCDL 2001, June 24-28, 2001, Roanoke, VA.
4. Dublin Core Metadata Initiative.
5. IEEE/LTSC.
6. IMS Learning Resource Metadata. /metadata/
7. Laleuf, J.R. and Spalter, A.M. A Component Repository for Learning Objects: A Progress Report. JCDL 2001, June 24-28, 2001, Roanoke, VA.
8. Saddik, A.E., Fischer, S. and Steinmetz, R. Reusability and Adaptability of Interactive Resources in Web-Based Educational Systems. ACM Journal of Educational Resources in Computing 1(1), Spring 2001.
9. Wiley, D.A. Learning Object Design and Sequencing Theory. Ph.D. Thesis. Brigham Young University, 2000.
Knowledge Engineering Ontologies, Constructivist Epistemology, Computer Rhetoric: A Trivium for the Knowledge Age
Arthur Stutt
Knowledge Media Institute, The Open University, UK
A.Stutt@

Abstract: In this (largely speculative) paper I want to suggest that certain ideas which have recently been developed by knowledge engineers can fruitfully be applied in the educational context. I have two main reasons for thinking that this is so: firstly, ideas from knowledge engineering have been applied in the past with some success; secondly, over the last five years or so there has been a paradigm shift in knowledge engineering away from a mining-and-transfer view of knowledge acquisition to a constructivist, modelling approach which has broad similarities to currently prevalent ideas in educational theory. Briefly put, knowledge engineering can provide the conceptual and other tools which can facilitate the construction of knowledge models. The relevant knowledge engineering ideas are illustrated with a knowledge model from the archaeology domain, and the possible benefits for learners are outlined.

1. Introduction
Our own Middle Ages, it has been said, will be an age of "permanent transition" for which new methods of adjustment will have to be employed. [Eco 1987 p. 84]

In the middle ages, students had to acquire skills in a set of three communicative arts (the trivium of grammar, dialectic and rhetoric) as well as studying the mathematical quadrivium (arithmetic, geometry, astronomy and music). In our postmodern age the variety of sources and types of knowledge has expanded considerably and is continually expanding. In this paper I want to suggest that recent developments in knowledge engineering have much to offer as the basis for a modern trivium which will enable learners to cope with this expansion. This trivium consists of ontological and other (problem solving) categories from knowledge engineering, a basically constructivist rather than mirroring approach to knowledge and, finally, computational models of and computer support for argumentational structures. I want to suggest that, while there are other possibilities for constructivist tools, any such tool intended to aid students in learning reasoning and problem solving skills would do well to incorporate recent knowledge engineering ideas. The paper outlines the use of knowledge engineering in education, introduces constructivist theory, discusses recent developments in knowledge engineering and illustrates these with a knowledge model from the archaeology domain. The benefits for learners and a possible tool are outlined.

2. The Use of Knowledge Engineering for Learning
GUIDON [Clancey 1982] exemplifies the use of a previously existing knowledge base or expert system as a teaching tool. GUIDON employs the rules from the MYCIN bacterial infection diagnoser to teach problem solving skills. More straightforward uses of expert systems as teaching systems are exemplified by the work of King and McAulay [King & McAulay 1991], who used expert systems as a tool in teaching management students aspects of standard costing. Other researchers have investigated the utility of active participation in knowledge based system (KBS) development. Wideman and Owston [Wideman & Owston 1993] examined the gains in cognitive skills displayed by students involved in creating expert systems for weather prediction, while Law and Ogburn [Law & Ogburn 1994] encouraged their students to build rule-bases describing commonsense knowledge about objects and motion.
Trollip and Lippert [Trollip & Lippert 1987] used the development of a system for the design of Computer Aided Instruction screens to teach students what was involved in this kind of design. Other uses of expert systems (in Scottish schools) are reported in Conlon and Bowman [Conlon & Bowman 1995] and Conlon and Pain [Conlon & Pain 1996].

Wideman and Owston offer qualified support for the "viability and efficacy of knowledge base creation as an instructional strategy for fostering cognitive development" (p. 194). Law and Ogburn point out (p. 511) that building expert systems "provided an opportunity to probe into a completely different aspect of the students' knowledge structures: the epistemic plane of operation represented by the elicited knowledge…". Trollip and Lippert (p. 46) describe the "process [a]s highly motivating" and conclude (p. 47): "The result for the student is not only a refinement of stored knowledge, but a refinement of cognitive skills and the conscious awareness of their extent and limitation". On the other hand, Clark [Clark 1990] declares that neither instruction with computers nor instruction in computing languages "makes any necessary psychological contribution to learning or to the transfer of what is learned during instruction" (p. 267). However, there is a prima facie case for assuming that immersing a learner in an environment which differentiates between different problem solving situations, and indicates how these are linked to domain contents, will be of some use in overcoming the inadequacies of other approaches to the acquisition of meta-cognitive skills (especially the skills involved in selecting problem solving strategies).

3. Constructivism and its Tools
In the papers discussed above it is generally agreed that when students are given tools which allow them to construct their own versions of some real-life domain, in interaction with other students and experts, their interest level and cognitive skills improve. These results are broadly compatible with the constructivist paradigm in educational practice. Constructivism can be seen as the view that meanings and knowledge are constructed by learners actively interpreting their experience [Nicaise & Barnes 1996; Jonassen et al. 1993a; Perkins 1991; Duffy & Jonassen 1991]. While social constructivists focus on this construction as a communal activity [see Prawat & Floden 1994], collaboration is part of constructivism generally. The implication for learning (as Prawat & Floden point out) is that teachers become facilitators for knowledge construction rather than knowledge transmitters: "This notion has led to calls for a dramatic shift in classroom focus away from the traditional transmission model of teaching toward one which is much more complex and interactive." (p. 37) Many constructivists [e.g. Jonassen et al. 1993a; Knuth & Cunningham 1991] also espouse the idea that students should be expected to cope with "authentic" problems, an idea from research on situated cognition [Brown, Collins & Duguid 1989].

The use of constructivist tools has been discussed by a variety of researchers [e.g., Duffy et al. 1991; Duffy & Jonassen 1991; Perkins 1991; Jonassen et al. 1993a].
As a constructivist approach to knowledge gains ground over objectivism, we can observe two main responses on the part of builders of computer systems for educational use: I will refer to them as the communication-centred and the development-centred.

The communication-centred approach is exemplified by Chee's work on the system MIND BRIDGES [Chee 1996]. MIND BRIDGES is a system for "collaborative knowledge building" (p. 137) which allows students to collaborate on building a shared representation of their knowledge about some area. There are six "knowledge-building environments" and a range of "thematic spaces". Thematic spaces correspond to traditional disciplinary areas such as physics and literature, while the environments (Overview, Explanation, Processes, Application, Conjecture, Location or Time) are included as a means of focusing the students' communications on a particular aspect of a problem (such as conjectures about the habitability of Saturn in the astronomy thematic space, to use their example).

The work of Jonassen and his colleagues [Jonassen et al. 1993a] can be described as development-centred. The applications they discuss "include expert systems as feedback facilitators, personal knowledge representation tools and cognitive study tools" (p. 93). As an example of the KBS as computer-based cognitive study tool, Jonassen and his colleagues outline the development of a rule base which models the decision-making used in deciding to authorise the Hiroshima bomb. They conclude (p. 93): "…it is clear that building expert systems requires learners to synthesize knowledge by making explicit their own reasoning".

Systems such as MIND BRIDGES represent a necessary but not sufficient response to the needs of the knowledge age. They provide the means of expressing knowledge models which, as Rowland [Rowland 1995] points out (p. 351), is necessary in the postmodern age. In addition, we need more complex tools for the creation of knowledge models. This need is met in the development-centred approach by coopting knowledge engineering ideas.

4. Recent Developments in Knowledge Engineering
As Conlon and Bowman [Conlon & Bowman 1995] point out (p. 129): "The current generation of educational shells is based on ideas that in wider KBS terms are out of date." What are these developments? In contrast to earlier views which saw knowledge as a stuff to be extracted and poured into representation formalisms, recent research sees the process as one in which the engineer, in cooperation with the expert and users, creates multiple models of the problem solving activities in a domain. The KADS knowledge engineering methodology [Schreiber et al. 1993; Tansley & Hayball 1993] is centred on the development of a series of models and is explicitly constructivist, at least in its recent forms [Breuker & Van de Velde 1994].

Figure 1: Knowledge modelling

Using ideas from KADS, a knowledge model is constructed in the following manner. From an informal description of a domain and/or some problems associated with it, the modeller selects both a description of the problem to be solved (the problem type), using a library (with types such as Interpretation, Diagnosis and Planning), and a pre-defined template for the contents of the domain (the domain ontology). The domain ontology is composed of a vocabulary for a domain (e.g., for the medical domain, diseases, temperature levels and so on) and a schema
[see Schreiber et al. 1993], or a way of structuring domain contents in terms of relations (e.g., again in the medical domain, "Bacterial infection causes high temperature"). Using the problem type (and domain ontology), the modeller selects a means of carrying out the processing needed to solve the problem (the problem solving method, or PSM) from a library of PSMs (such as Heuristic Classification, Systematic Refinement and Parametric Design). At the same time, using the domain ontology (and problem type), the modeller produces a detailed domain description, i.e., the facts, relations and heuristics which comprise the domain, couched in the vocabulary given by the ontology. Finally, the modeller produces a set of mappings between the problem solving method and the domain description. This is needed since the problem solving method uses its own terminology, and its terms need to be associated with the appropriate parts of the domain description. For example, the problem solving term "symptom" might be associated with assertions of the form "There's a problem with starting the car" in the car maintenance domain.
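One way to picture the result of this procedure is as plain data. The following is a minimal sketch in Python; the field values are illustrative assumptions built around the text's car-maintenance example, not an actual KADS model.

```python
# A KADS-style knowledge model as plain Python data (illustrative only).
knowledge_model = {
    # Problem type, selected from a library (Interpretation, Diagnosis, Planning, ...).
    "problem_type": "Diagnosis",
    # Domain ontology: a vocabulary plus a schema of relations over it.
    "ontology": {
        "vocabulary": ["flat-battery", "car-will-not-start"],  # assumed terms
        "schema": [("flat-battery", "causes", "car-will-not-start")],
    },
    # Domain description: instantiated relations couched in the ontology's vocabulary.
    "domain_description": [("car1", "exhibits", "car-will-not-start")],
    # Problem solving method, selected from a library of PSMs.
    "psm": "Heuristic Classification",
    # Mappings from PSM terminology to the domain description (example from the text).
    "mappings": {"symptom": "There's a problem with starting the car"},
}

print(knowledge_model["psm"])
```

Laid out this way, the five components the text names (problem type, ontology, domain description, PSM, mappings) are visible as separate, inspectable pieces, which is what makes the model shareable and discussable by learners.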
5. Modelling Components Illustrated
It is possible to construct a knowledge model of the interpretation process used by archaeologists in identifying structures in air photos in terms of a sequence of primitive tasks (in this case abstraction, matching and specialisation) which operate on a domain described in terms of expected structures (such as Romano-British military remains) and the features extracted from air photos (such as the shape, size, orientation and location of structures). In this case, the problem solution is largely a matter of mapping from the features to the expected structures.

Problem Types: The problem type (or task) is given trivially by the description of the problem. In this instance the task is the identification (or classification) of the building structures contained in an air photo. In other archaeological problem solving situations the problem type might be planning (an excavation), assessment (of a theory), verification (of an hypothesis), design (of a series of excavation trenches) and so on. The distinction between problem types enables the learner to think clearly about the different types of activities carried out in a domain.

Ontologies: The vocabulary which is relevant to the domain is an intersection of the set of terms for describing air photos (e.g., shadows, linear features, irregular features) and those for describing Romano-British building structures (e.g., domestic, military, commercial, fort, fortlet, camp). The domain schema here goes beyond the vocabulary to capture the possible relations between the domain concepts as given by the terms (e.g., fortlet is-smaller-than fort). The domain schema (and its instances) has much in common with the work on structural knowledge by Jonassen and his colleagues [Jonassen et al. 1993b]. While vocabularies provide the learner with a clear and consistent domain terminology which can be used in collaborative model construction, schemas provide a means of structuring the domain concepts and are part of the conceptual framework of any domain.

Domain Descriptions: In accordance with the vocabulary and schema given above, the domain description consists of a series of instantiated relations. The construction of these will involve the learner in a process of knowledge acquisition (in the engineering sense). In archaeology, the outcome of the process will be equivalent to the section of an archaeological report which includes full details of some aspect of the archaeological record. In addition, the engineering process assists in the acquisition of various relations between concepts which would not commonly figure in an archaeological report. This immersion of the learner in a domain should ensure that at least this static knowledge is retained.

PSM: Given a view of air photo interpretation in which particular features (e.g., rectangular foundations) are evidence for particular archaeological structures when other evidence, such as the relation to other structures, is taken into account, the most likely problem solving family is that of Classification and, in particular, Heuristic Classification, since more than one of its reasoning components are clearly present: lines, areas and features are abstracted into higher level descriptions, and descriptions are mapped to structures. The basic structure for reasoning using Heuristic Classification is given in [Fig. 2], which is based on that given in [Tansley & Hayball 1993]. See [Clancey 1985] for a full account. In [Fig. 2] ellipses represent tasks while rectangles represent domain roles (so-called because they indicate the sort of role to be played by domain items in problem solving).

Figure 2: The PSM for Heuristic Classification

Whether constructed or retrieved from libraries, PSMs are especially useful since they are not usually recorded or represented in normal disciplinary discourse. In addition, while PSMs are primarily models of problem solving skills, it is possible to model meta-cognitive processes such as PSM selection and validation, since these are also problem solutions.

Mappings: In this example, mappings need to be made between the PSM for Heuristic Classification shown in [Fig. 2] and the domain description. In this case there are two mappings needed, for the domain roles Observables and Solutions. Here Observables map onto domain relations (such as "structure11 has_shape square") while Solutions map onto relations such as "structure11 is_a auxiliary-fort". We also need mappings between the three Heuristic Domain Roles (abstraction-knowledge, matching-knowledge and specialisation-knowledge) and domain-specific or generic heuristics. Mappings are useful for the learner in that they show how problem solving knowledge gets a purchase on domain knowledge and how different types of domain role play a variety of roles in problem solving.
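To make the three primitive tasks concrete, here is a minimal Python sketch of Heuristic Classification over the air-photo example. The specific features, heuristics, and intermediate categories are invented for illustration; only the task sequence (abstraction, matching, specialisation) and the domain roles (Observables, Solutions) follow the text.

```python
# A sketch of Heuristic Classification for air-photo interpretation.
# Features and heuristics below are illustrative assumptions.

# Domain role: Observables, e.g. "structure11 has_shape square".
observables = {"structure11": {"has_shape": "square", "size": "large"}}

def abstract(features):
    """Abstraction: lift raw features into higher-level descriptions."""
    if features["has_shape"] in ("square", "rectangular"):
        return "regular-enclosure"
    return "irregular-feature"

def match(abstraction):
    """Matching: map abstract descriptions to candidate structure classes."""
    return {"regular-enclosure": "military-structure"}.get(abstraction, "unknown")

def specialise(candidate, features):
    """Specialisation: refine the candidate using further evidence
    (uses the schema relation 'fortlet is-smaller-than fort')."""
    if candidate == "military-structure":
        return "fortlet" if features["size"] == "small" else "auxiliary-fort"
    return candidate

# Domain role: Solutions, e.g. "structure11 is_a auxiliary-fort".
for name, features in observables.items():
    solution = specialise(match(abstract(features)), features)
    print(f"{name} is_a {solution}")
```

Running the sketch prints "structure11 is_a auxiliary-fort", i.e., the mapping from Observables to Solutions that the text describes, with each heuristic standing in for one of the three Heuristic Domain Roles.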
6. Learning as Knowledge Modelling
As Conlon and Bowman [Conlon & Bowman 1995] point out (following Clancey), "[t]he analogy between pupils and knowledge engineers can be taken too far" (p. 114). While there is a difference between knowledge engineering (where the goal is the production of models which can be used in the design and implementation of a fully working system) and education (where the goal is an increase in the ability of learners to differentiate between types of knowledge, to acquire new knowledge, to communicate about their knowledge and to apply their skills in real life situations), the following are nonetheless the likely benefits to the learner in making use of knowledge modelling components:
1. Knowledge models can be collaboratively developed (i.e. constructivism in practice), with opportunities for student role-playing;
2. The model components and domain description vocabularies will provide students with a common language (which can be more or less precise depending on their educational level);
3. Students interacting with knowledge level models will come to understand that there are many different types of knowledge, that domain descriptions can be used for many different problems, and that the same problem solving strategies can be used across a number of domains;
4. These models are useful as a means of learning domain contents, problem solving skills, meta-cognitive strategies, and domain discourse(s).

7. Future Work
I have formulated some theoretical and practical questions which need to be settled before moving to a final version of any system based on the knowledge modelling approach:
• How should relations (and other components) be represented (graphically, in natural language, in some formal notation), and how expressive should this notation be?
• Does this approach help with the acquisition of problem solving skills and/or meta-cognitive skills such as problem solving strategy selection, or does it only teach modelling skills? That is, to what extent are meta-cognitive processes "only … relevant in the context of a particular learning process", as [Jonassen et al. 1993a] claim (p. 92)?
• To what extent does the basic modelling component set need to be extended for learner domain modelling?
• Do different age groups and learners with different aims need different kinds of knowledge modelling system?
• Are learners unable to benefit from theories of problem solving, as [Conlon & Bowman 1995] imply (p. 114)?

In order to resolve at least some of these, I intend to produce a new knowledge modelling tool which will combine constructivist ideas, knowledge engineering components and the results of research into computer supported argumentation (thus extending the rhetorical aspects of systems such as MIND BRIDGES). There will be three phases in the development of the tool. In the first, a knowledge modeller (akin to Jonassen's personal knowledge representation tool) will be developed. This tool will include graphical and textual editors for models composed of the modelling components discussed above, along with libraries of PSMs, problem types and domain ontologies, and an animation tool. The second phase will add learner collaboration while, in the third phase, a means of attaching the decision rationales (or arguments) of modellers will be implemented, thus completing our trivium by adding the full rhetorical dimension. This system will have much in common with those discussed in [Conlon & Bowman 1995] and [Conlon & Pain 1996], where classification is selected as the most useful task (here a conflation of my problem type and PSM) to embody in a system for constructing learners' knowledge bases. The tool I envisage differs in that the full range of modelling components, the entire set of problem types and a library of associated methods will be made available to students. This is partly for flexibility in modelling but also because it is more likely that problem solving and meta-cognitive skills will be acquired if a range of problem types and problem solving methods is encountered.

8. Conclusion
A tool built using recent ideas from knowledge engineering is not only largely consistent with constructivist ideas; it may provide the means for achieving some of the larger ambitions of constructivist researchers (such as the development of meta-cognitive skills).
I say "largely consistent" since I am unsure to what extent the modelling framework provided by knowledge engineering is flexible enough for truly constructivist modelling. I suspect, however, that some structure (and therefore inflexibility) is needed if an environment is to act as a modelling environment. Given this, it is largely up to the learner to acquire the modelling and other cognitive skills necessary for the effective use of the new tool, just as medieval students had to learn logic, rhetoric and grammar. The question of whether knowledge engineering provides the appropriate structure will be resolved when a system based on these ideas is built and tried out by learners.

References
[Breuker & Van de Velde 1994] Breuker, J. A. & Van de Velde, W., eds. (1994). The CommonKADS Library for Expertise Modelling. Amsterdam: IOS Press.
[Brown, Collins & Duguid 1989] Brown, J.S., Collins, A. & Duguid, P. (1989). Situated Cognition and the Culture of Learning. Educational Researcher, 18(1), 32–42.
[Chee 1996] Chee, Y.S. (1996). MIND BRIDGES: A Distributed, Multimedia Learning Environment for Collaborative Knowledge Building. International Journal of Educational Telecommunications, 2(2/3), 137–153.
[Clancey 1982] Clancey, W.J. (1982). Tutoring rules for guiding a case method dialogue. In Sleeman, D. & Brown, J.S., eds. Intelligent Tutoring Systems. London: Academic Press. 201–225.
[Clancey 1985] Clancey, W.J. (1985). Heuristic Classification. Artificial Intelligence, 27, 289–350.
[Clark 1990] Clark, R.E. (1990). Facilitating Domain-General Problem Solving: Computers, Cognitive Processes and Instruction. In de Corte, E., Linn, M.C., Mandl, H. & Verschaffel, L., eds. Computer-Based Learning Environments and Problem Solving. Berlin: Springer-Verlag. 265–285.
[Conlon & Bowman 1995] Conlon, T. & Bowman, N. (1995). Expert Systems, Shells, and Schools: Present Practice, Future Prospects. Instructional Science, 23, 111–131.
[Conlon & Pain 1996] Conlon, T. & Pain, H. (1996). Persistent Collaboration: A Methodology for Applied AIED. Journal of Artificial Intelligence in Education, 7(3/4), 219–252.
[Duffy & Jonassen 1991] Duffy, T.M. & Jonassen, D.H. (1991). Constructivism: New Implications for Instructional Technology? Educational Technology, 31(5), 7–12.
[Duffy et al. 1991] Duffy, T.M., Lowyck, J., Jonassen, D.H. & Welsh, T.M. (1991). Designing Environments for Constructive Learning. Berlin: Springer-Verlag.
[Eco 1987] Eco, U. (1987). Travels in Hyperreality: Essays. London: Picador-Pan.
[Jonassen et al. 1993a] Jonassen, D.H., Wilson, B.G., Wang, S. & Grabinger, R.S. (1993). Constructivist Uses of Expert Systems to Support Learning. Journal of Computer-Based Instruction, 20(3), 86–94.
[Jonassen et al. 1993b] Jonassen, D.H., Beissner, K. & Yacci, M. (1993). Structural Knowledge: Techniques for Representing, Conveying, and Acquiring Structural Knowledge. Hillsdale, NJ: Lawrence Erlbaum.
[King & McAulay 1991] King, M. & McAulay, L. (1991). Experiments with expert systems in management education. Journal of Information Technology, 6, 34–38.
[Knuth & Cunningham 1991] Knuth, R.A. & Cunningham, D.J. (1991). Tools for Constructivism. In [Duffy et al. 1991]. 163–188.
[Law & Ogburn 1994] Law, N. & Ogburn, J. (1994). Students as Expert System Developers: A Means of Eliciting and Understanding Commonsense Reasoning. Journal of Research on Computing in Education, 26(4), 497–513.
[Nicaise & Barnes 1996] Nicaise, M. & Barnes, D. (1996). The union of technology, constructivism, and teacher education.
Journal of Teacher Education, 47, 205–212.
[Perkins 1991] Perkins, D.N. (1991). Technology meets Constructivism: Do they make a marriage? Educational Technology, 31(5), 18–23.
[Prawat & Floden 1994] Prawat, R.S. & Floden, R.E. (1994). Philosophical Perspectives on Constructivist Views of Learning. Educational Psychology, 29(1), 37–48.
[Rowland 1995] Rowland, R.C. (1995). In Defense of Rational Argument: A Pragmatic Justification of Argumentation Theory and Response to the Postmodern Critique. Philosophy and Rhetoric, 28(4), 350–364.
[Schreiber et al. 1993] Schreiber, G., Wielinga, B. & Breuker, J., eds. (1993). KADS: A Principled Approach to Knowledge-Based System Development. London: Academic Press.
[Tansley & Hayball 1993] Tansley, D.S.W. & Hayball, C.C. (1993). Knowledge-Based Systems Analysis and Design: A KADS Developer's Handbook. Hemel Hempstead: Prentice Hall.
[Trollip & Lippert 1987] Trollip, S.R. & Lippert, R.C. (1987). Constructing Knowledge Bases: A Promising Instructional Tool. Journal of Computer-Based Instruction, 14(2), 44–48.
[Wideman & Owston 1993] Wideman, H.H. & Owston, R.D. (1993). Knowledge Base Construction as a Pedagogical Activity. Journal of Educational Computing Research, 9(2), 165–196.

Acknowledgements
Many thanks to Enrico Motta and Stuart Watt from the Knowledge Media Institute for commenting on earlier versions of this paper. I have incorporated many of their suggestions.
Logics for Knowledge Representation
Bernhard Nebel
Albert-Ludwigs-Universität Freiburg, Germany

1 Introduction
Knowledge representation and reasoning plays a central role in Artificial Intelligence. Research in Artificial Intelligence (henceforth AI) started off by trying to identify the general mechanisms responsible for intelligent behavior. However, it quickly became obvious that general and powerful methods are not enough to get the desired result, namely, intelligent behavior. Almost all tasks a human can perform which are considered to require intelligence are also based on a huge amount of knowledge. For instance, understanding and producing natural language relies heavily on knowledge about the language, about the structure of the world, about social relationships, etc.

One way to address the problem of representing knowledge and reasoning about it is to use some form of logic. While this seems to be a natural choice, it took a while before this "logical point of view" became the prevalent approach in the area of knowledge representation. Below, we will give a brief sketch of how the field of knowledge representation evolved and what kind of logical methods have been used. In particular, we will argue that the important point about using formal logic is the logical method.

2 Logic-Based Knowledge Representation: A Historical Account
McCarthy (1968) stated very early on that mathematical, formal logic appears to be a promising tool for achieving human-level intelligence on computers. In fact, this is still McCarthy's (2000) vision, which he shares with many researchers in AI. However, in the early days of AI, there were also a number of researchers with a completely different opinion. Minsky (1975), for example, argued that knowledge representation formalisms should be flexible and informal. Moreover, he argued that the logical notions of correctness and completeness are inappropriate in a knowledge representation context.

While in those days heated arguments over the suitability of logic were exchanged, by the end of the eighties the logical perspective seemed to have gained the upper hand (Brachman 1990). During the nineties almost all research in the area of knowledge representation and reasoning was based on formal, logical methods, as demonstrated by the papers published in the bi-annual international conference on Principles of Knowledge Representation and Reasoning, which started in 1989.

It should be noted, however, that two perspectives on logic are possible. The first perspective, taken by McCarthy (1968), is that logic should be used to represent knowledge. That is, we use logic as the representational and reasoning tool inside the computer. Newell (1982), on the other hand, proposed in his seminal paper on the knowledge level to use logic as a formal tool to analyze knowledge. Of course, these two views are not incompatible. Furthermore, once we accept that formal logic should be used as a tool for analyzing knowledge, it is a natural consequence to use logic for representing knowledge and for reasoning about it as well.

3 Knowledge Representation Formalisms and Their Semantics
Saying that logic is used as the main formal tool does not say which kind of logic is used. In fact, a large variety of logics (Gabbay, Hogger and Robinson 1995) have been employed or developed in order to solve knowledge representation and reasoning problems. Often, one started with a vaguely specified problem, developed some kind of knowledge representation formalism without a formal semantics, and only later started to provide a formal semantics.
Using this semantics, one could then analyze the complexity of the reasoning problems and develop sound and complete reasoning algorithms. I will call this the logical method, which proved to be very fruitful in the past and has a lot of potential for the future.

3.1 Description Logics
One good example of the evolution of knowledge representation formalisms is the development of description logics, which have their roots in so-called structured inheritance network formalisms such as KL-ONE (Brachman 1979). These networks were originally developed in order to represent word meanings. A concept node connects to other concept nodes using roles. Moreover, the roles can be structured as well. These networks permit, e.g., the definition of the concept of a bachelor.

Later on, these structured inheritance networks were formalized as so-called concept languages, terminological logics, or description logics. Concepts were interpreted as unary predicates, roles as binary relations, and the connections between nodes as so-called value restrictions. For most such description logics, this leads to a particular fragment of first-order predicate logic, namely the two-variable fragment, in which only two different variable symbols are used. As it turns out, this is a decidable fragment of first-order logic. However, some of the more involved description logics go beyond it. They contain, e.g., relational composition or transitive closure. As it turns out, such description logics can be understood as variants of multi-modal logics (Schild 1991), and decidability and complexity results from these multi-modal logics carry over to the description logics. Furthermore, description logics are very close to the feature logics used in unification-based grammars. In fact, description logics and feature logics can be viewed as members of the same family of representation formalisms (Nebel and Smolka 1990).

All these insights, i.e., the determination of decidability and complexity as well as the design of decision algorithms (e.g. Donini, Lenzerini, Nardi and Nutt 1991), are based on the rigorous formalization of the initial ideas. In particular, it is not just one logic that is used to derive these results; it is the logical method that led to the success. One starts with a specification of how expressions of the language or formalism have to be interpreted in formal terms. Based on that, one can specify when a set of formulae logically implies a formula. Then one can start to find similar formalisms (e.g. modal logics) and prove equivalences, and/or one can specify a method to derive logically entailed sentences and prove it to be correct and complete.
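As a concrete illustration (a standard textbook-style example, not drawn from this article), the bachelor concept mentioned above can be written as a description logic definition whose first-order translation needs only two variable symbols:

\[ \mathit{Bachelor} \equiv \mathit{Man} \sqcap \neg\exists\,\mathit{marriedTo}.\top \]
\[ \forall x\,\big(\mathit{Bachelor}(x) \leftrightarrow \mathit{Man}(x) \wedge \neg\exists y\,\mathit{marriedTo}(x,y)\big) \]

Since every quantifier can reuse the variables $x$ and $y$, such definitions stay inside the decidable two-variable fragment.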
3.2 Nonmonotonic Logics
Another interesting area where the logical method has been applied is the development of so-called non-monotonic logics. These are based on the intuition that sometimes a logical consequence should be retracted if new evidence becomes known. For example, we may assume that our car will not be moved by somebody else after we have parked it. However, if new information becomes known, such as the fact that the car is not at the place where we parked it, we are ready to drop the assumption that our car has not been moved.

This general reasoning pattern was used quite regularly in early AI systems, but it took a while before it was analyzed from a logical point of view. In 1980, a special issue of the Artificial Intelligence journal appeared, presenting different approaches to non-monotonic reasoning, in particular Reiter's (1980) default logic and McCarthy's (1980) circumscription approach.

A disappointing fact about nonmonotonic logics appears to be that it is very difficult to formalize a domain such that one gets the intended conclusions. In particular, in the area of reasoning about actions, McDermott (1987) demonstrated that the straightforward formalization of an easy temporal projection problem (the "Yale shooting problem") does not lead to the desired consequences. However, it is possible to get around this problem. Once all underlying assumptions are spelled out, this and other problems can be solved (Sandewall 1994).

It took more than a decade before people started to analyze the computational complexity (of the propositional versions) of these logics. As it turned out, these logics are usually somewhat more difficult than ordinary propositional logic (Gottlob 1992). This, however, seems tolerable since we get many more conclusions than in standard propositional logic. At about the same time, the tight connection between nonmonotonic logic and belief revision (Gärdenfors 1988) was noticed. Belief revision, which models the evolution of beliefs over time, is just one way to describe how the set of nonmonotonic consequences evolves over time; this leads to a very tight connection on the formal level between these two forms of nonmonotonicity (Nebel 1991). Again, all these results and insights are based mainly on the logical method in knowledge representation.
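The parked-car assumption above can be rendered as a Reiter-style default rule (an illustration in standard default logic notation, not a formula taken from this article):

\[ \frac{\mathit{parked}(c) \;:\; \neg\mathit{moved}(c)}{\neg\mathit{moved}(c)} \]

Read: if $\mathit{parked}(c)$ is known and $\neg\mathit{moved}(c)$ is consistent with everything else believed, conclude $\neg\mathit{moved}(c)$. Learning that the car is gone makes the justification inconsistent, so the conclusion is retracted, which is exactly the nonmonotonic behavior described above.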
4 Outlook
The above description of the use of logics for knowledge representation is necessarily incomplete. For instance, we have left out the area of qualitative temporal and spatial reasoning completely. Nevertheless, one should have got an idea of how logics are used in the area of knowledge representation. As mentioned, it is the idea of providing knowledge representation formalisms with formal (logical) semantics that enables us to communicate their meaning, to analyze their formal properties, to determine their computational complexity, and to devise reasoning algorithms.

While the research area of knowledge representation is dominated by the logical approach, this does not mean that all approaches to knowledge representation must be based on logic. Probabilistic (Pearl 1988) and decision theoretic approaches, for instance, have lately become very popular. Nowadays a number of approaches aim at unifying decision theoretic and logical accounts by introducing a qualitative version of decision theoretic concepts (Benferhat, Dubois, Fargier, Prade and Sabbadin 2000). Other approaches (Boutilier, Reiter, Soutchanski and Thrun 2000) aim at tightly integrating decision theoretic concepts, such as Markov decision processes, with logical approaches. Although this is not pure logic, the two latter approaches demonstrate the generality of the logical method: specify the formal meaning and analyze!

Bibliography
Allen, J. A., Fikes, R. and Sandewall, E. (eds): 1991, Principles of Knowledge Representation and Reasoning: Proceedings of the 2nd International Conference (KR-91), Morgan Kaufmann, Cambridge, MA.
Benferhat, S., Dubois, D., Fargier, H., Prade, H. and Sabbadin, R.: 2000, Decision, nonmonotonic reasoning and possibilistic logic, in Minker (2000), pp. 333–360.
Boutilier, C., Reiter, R., Soutchanski, M. and Thrun, S.: 2000, Decision-theoretic, high-level agent programming in the situation calculus, Proceedings of the 17th National Conference of the American Association for Artificial Intelligence (AAAI-2000), MIT Press, Austin, TX.
Brachman, R. J.: 1979, On the epistemological status of semantic networks, in N. V. Findler (ed.), Associative Networks: Representation and Use of Knowledge by Computers, Academic Press, New York, NY, pp. 3–50.
Brachman, R. J.: 1990, The future of knowledge representation, Proceedings of the 8th National Conference of the American Association for Artificial Intelligence (AAAI-90), MIT Press, Boston, MA, pp. 1082–1092.
Donini, F. M., Lenzerini, M., Nardi, D. and Nutt, W.: 1991, The complexity of concept languages, in Allen, Fikes and Sandewall (1991), pp. 151–162.
Gabbay, D. M., Hogger, C. J. and Robinson, J. A. (eds): 1995, Handbook of Logic in Artificial Intelligence and Logic Programming, Vols. 1–5, Oxford University Press, Oxford, UK.
Gärdenfors, P.: 1988, Knowledge in Flux: Modeling the Dynamics of Epistemic States, MIT Press, Cambridge, MA.
Gottlob, G.: 1992, Complexity results for nonmonotonic logics, Journal of Logic and Computation 2(3), 397–425.
McCarthy, J.: 1968, Programs with common sense, in M. Minsky (ed.), Semantic Information Processing, MIT Press, Cambridge, MA, pp. 403–418.
McCarthy, J.: 1980, Circumscription - a form of non-monotonic reasoning, Artificial Intelligence 13(1–2), 27–39.
McCarthy, J.: 2000, Concepts of logical AI, in Minker (2000), pp. 37–58.
McDermott, D. V.: 1987, A critique of pure reason, Computational Intelligence 3(3), 151–160.
Minker, J. (ed.): 2000, Logic-Based Artificial Intelligence, Kluwer, Dordrecht, Holland.
Minsky, M.: 1975, A framework for representing knowledge, in P. Winston (ed.), The Psychology of Computer Vision, McGraw-Hill, New York, NY, pp. 211–277.
Nebel, B.: 1991, Belief revision and default reasoning: Syntax-based approaches, in Allen et al. (1991), pp. 417–428.
Nebel, B. and Smolka, G.: 1990, Representation and reasoning with attributive descriptions, in K.-H. Bläsius, U. Hedtstück and C.-R. Rollinger (eds), Sorts and Types in Artificial Intelligence, Vol. 418 of Lecture Notes in Artificial Intelligence, Springer-Verlag, Berlin, Heidelberg, New York, pp. 112–139.
Newell, A.: 1982, The knowledge level, Artificial Intelligence 18(1), 87–127.
Pearl, J.: 1988, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Francisco, CA.
Reiter, R.: 1980, A logic for default reasoning, Artificial Intelligence 13(1), 81–132.
Sandewall, E.: 1994, Features and Fluents, Oxford University Press, Oxford, UK.
Schild, K.: 1991, A correspondence theory for terminological logics: Preliminary report, Proceedings of the 12th International Joint Conference on Artificial Intelligence (IJCAI-91), Morgan Kaufmann, Sydney, Australia, pp. 466–471.
Ontology-based Reasoning about Lexical Resources
Jan Scheffczyk, Collin F. Baker, Srini Narayanan
International Computer Science Institute
1947 Center St., Suite 600, Berkeley, CA 94704
jan, collinb, snarayan@

Abstract
Reasoning about natural language most prominently requires combining semantically rich lexical resources with world knowledge, provided by ontologies. Therefore, we are building bindings from FrameNet, a lexical resource for English, to various ontologies, depending on the application at hand. In this paper we show the first step toward such bindings: we translate FrameNet to the Web Ontology Language OWL DL. That way, FrameNet and its annotations become available to Description Logic reasoners and other OWL tools. In addition, FrameNet annotations can provide a high-quality lexicalization of the linked ontologies.

1. Introduction
Combining large lexical resources with world knowledge, via ontologies, is a crucial step for reasoning over natural language, particularly for the Semantic Web. Concrete applications include semantic parsing, text summarization, translation, and question answering. For example, questions like "Could Y have murdered X?" may require several inference steps based on semantic facts that simple lexicons do not include. Moreover, they require the so-called open-world semantics offered by state-of-the-art Description Logic (DL) reasoners, e.g., FaCT (Horrocks, 1998) or Racer (Wessel and Möller, 2005). The FrameNet lexicon (Ruppenhofer et al., 2005) has a uniquely rich level of semantic detail; thus, we are building bindings from FrameNet to multiple ontologies that will vary depending on the application. That way, we enable reasoners to make inferences over natural-language text.

In this paper, we report on the first step toward this goal: we have automatically translated a crucial portion of FrameNet to OWL DL, and we show how state-of-the-art DL reasoners can make inferences over FrameNet-annotated sentences. Thus, annotated text becomes available to the Semantic Web, and FrameNet itself can be linked to other ontologies. This work gives a clear motivation for the design of our proposed ontology bindings and defines the baseline for measuring their benefits.

This paper proceeds as follows: in Sect. 2 we briefly introduce FrameNet, a lexical resource for English. We present our design decisions for linking FrameNet to ontologies in Sect. 3. Sect. 4 is the heart of this paper: a formalization of FrameNet and FrameNet-annotated sentences in OWL DL. In Sect. 5 we show how our OWL DL representation can be used by the DL reasoner RacerPro in order to implement tasks of a question answering system based on reasoning. We evaluate our approach in Sect. 6.
Sect. 7 concludes and sketches directions for future research.

2. The FrameNet Lexicon
FrameNet is a lexical resource for English, based on frame semantics (Fillmore, 1976; Fillmore et al., 2003; Narayanan et al., 2003). A semantic frame (hereafter simply frame) represents a set of concepts associated with an event or a state, ranging from simple (Arriving, Placing) to complex (Revenge, Criminal_process). For example, the frame Attack inherits from the frame Intentionally_affect (which in turn inherits from the frames Transitive_action and Intentionally_act). In addition, Attack uses the Hostile_encounter frame.

Figure 1: Abridged example frame Attack and some connected frames.

3. Linking FrameNet to Ontologies for Reasoning
NLP applications using FrameNet require knowledge about the possible fillers for FEs. For example, a semantic frame parser needs to know whether a certain chunk of text (or a named entity) might be a proper filler for an FE, so it will check whether the filler type of the FE is compatible with the type of the named entity. Therefore, we want to provide constraints on fillers of FEs, so-called semantic types (STs). Currently, FrameNet itself has defined about 40 STs, which are ordered by a subtype hierarchy. For example, the Assailant FE and the Victim FE in the Attack frame both have the ST Sentient, which in turn is a subtype of Animate_being, Living_thing, and Physical_entity.

It is obvious that FrameNet STs are somewhat similar to the concepts (often called classes) defined in ontologies like SUMO (Niles and Pease, 2001) or Cyc (Lenat, 1995). Compared to ontology classes, however, FrameNet STs are much more shallow, have fewer relations between them (only subtyping and no other relations), and are not context specific. Naturally, in a lexicographic project like FrameNet, STs play only a minor role. Therefore, we want to employ the STs from existing large ontologies such as SUMO or Cyc; in this way we gain a number of advantages almost for free:
• AI applications can use the knowledge provided by the target ontology.
• We can provide different STs suitable for particular applications by bindings to different ontologies.
• We can use ontologies in order to query and analyze FrameNet data. For example, we can measure the semantic distance between frames based on different target ontologies, or we can check the consistency and completeness of FrameNet w.r.t. some target ontology.
• The target ontologies would benefit from FrameNet, supplementing their ontological knowledge with a proper lexicon and annotated example sentences.

Compared to other lexicon-ontology bindings (Niles and Pease, 2003; Burns and Davis, 1999), our bindings offer a range of advantages due to specific FrameNet characteristics: FrameNet models semantic and syntactic valences plus the predicate-argument structure. FrameNet includes many high-quality annotations, providing training data for machine learning. In contrast to WordNet synset annotations, our annotations include role labelling. Frame semantics naturally provides cross-linguistic abstraction plus normalization of paraphrases and support for null instantiation (NI). Notice that a detour via WordNet would introduce additional noise through LU lookup (Burchardt et al., 2005).
In addition, WordNet synset relations are not necessarily compatible with FrameNet relations.

The bindings from FrameNet to ontologies should be described in the native language of the target ontologies, i.e., KIF (for bindings to SUMO), CycL (for bindings to Cyc), or OWL (for bindings to OWL ontologies). This allows the use of standard tools like reasoners directly, without any intermediate steps. Also, arbitrary class expressions can be used, and ad-hoc classes can be defined if no exactly corresponding class can be found in the target ontology. We expect this to be very likely because FrameNet is a lexicographic project, as opposed to ontologies, which are usually driven by a knowledge-based approach. Finally, the binding should be as specific as possible for the application at hand. For example, in a military context we would like to bind FEs to classes in an ontology about WMD or terrorism instead of using a binding to SUMO itself, which only provides upper-level classes.

The vital precondition for any such bindings is, however, to have FrameNet available in an appropriate ontology language (e.g., KIF, CycL, or OWL). A representation of FrameNet in an ontology language bears the additional advantages of formalizing certain properties of frames and FEs, and enabling us to use standard tools to view, query, and reason about FrameNet data. For querying, one could, e.g., use the ontology query language SPARQL. Next, we describe a formalization of a portion of FrameNet in OWL DL, which generalizes easily to more expressive ontology languages like KIF or CycL.

4. Formalizing FrameNet in OWL DL
Our major design decisions for representing FrameNet as an ontology are:
1. to represent frames, FEs, and STs formally as classes,
2. to model relations between frames and FEs via existential property restrictions on these classes, and
3. to represent frame and FE realizations in FrameNet-annotated texts as instances of the appropriate frame and FE classes, respectively.

Building on (Narayanan et al., 2003), we have chosen OWL DL as the representation language, mainly because better tools are available for it (particularly for reasoning) than for OWL Full or other similarly expressive languages. Our representation differs from many WordNet OWL representations, which represent synsets as instances and hence cannot use class expressions for ontology bindings. Instead, WordNet bindings to SUMO employ a proprietary mechanism, which cannot be used "out of the box" by ontology tools like reasoners.

In order to keep the size of our ontology manageable, we have chosen to split it into the FrameNet Ontology and Annotation Ontologies. The FrameNet Ontology includes FrameNet data like frames, FEs, and the relations between them. Annotation Ontologies represent FrameNet-annotated sentences and include the parts of the FrameNet Ontology that are necessary.

4.1. The FrameNet Ontology
Fig. 2 shows a simplified excerpt of the FrameNet Ontology. The subclasses of the Syntax class are used for annotations and are connected to frames and FEs via the evokes and fillerOf relations, respectively. Frames and FEs are connected via binary relations, e.g., the usesF property or the hasFE property, which connects a frame to its FEs. Consider our example frame Attack, which inherits from the frame Intentionally_affect and uses the frame Hostile_encounter. We model frame and FE inheritance via subclassing, and other frame and FE relations via existential property restrictions (owl:someValuesFrom). Thus, the class Attack is a subclass of Intentionally_affect and is connected to Hostile_encounter via the usesF property. The FEs of Attack are connected via an existential restriction on the hasFE property. FE relations are modeled similarly to frame relations.
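As an illustration of this modeling pattern, the Attack fragment could be generated with Python's rdflib roughly as follows. The namespace URI is an assumption; the class names and the usesF/hasFE properties come from the text.

```python
# A sketch of the OWL DL encoding of the Attack frame (Sect. 4.1 pattern).
from rdflib import BNode, Graph, Namespace, RDF, RDFS
from rdflib.namespace import OWL

FN = Namespace("http://example.org/framenet#")  # assumed namespace
g = Graph()
g.bind("fn", FN)

for cls in (FN.Attack, FN.Intentionally_affect, FN.Hostile_encounter, FN.Assailant):
    g.add((cls, RDF.type, OWL.Class))

# Frame inheritance becomes subclassing.
g.add((FN.Attack, RDFS.subClassOf, FN.Intentionally_affect))

def some_values_from(g, cls, prop, filler):
    """Attach 'cls subClassOf (prop someValuesFrom filler)'."""
    r = BNode()
    g.add((r, RDF.type, OWL.Restriction))
    g.add((r, OWL.onProperty, prop))
    g.add((r, OWL.someValuesFrom, filler))
    g.add((cls, RDFS.subClassOf, r))

# The uses relation and the frame's FEs become existential restrictions.
some_values_from(g, FN.Attack, FN.usesF, FN.Hostile_encounter)
some_values_from(g, FN.Attack, FN.hasFE, FN.Assailant)

print(g.serialize(format="turtle"))
```

Because restrictions are attached as superclasses, any subclass of Attack automatically inherits them, which is the property the next paragraph relies on.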
Recall that class restrictions are inherited. Therefore, the class Attack inherits the restrictions hasFE Patient and hasFE Agent from the class Intentionally_affect. Ideally, we would also state that each instance of Intentionally_affect has exactly one instance of type Patient connected via the hasFE property, which calls for a qualified cardinality restriction (see /2001/sw/BestPractices/OEP/QCR/). Even now, the FrameNet Ontology reaches a critical size of 100,000 triples. STs such as Sentient are represented as classes as well. We intend to use this mechanism for linking FrameNet to other ontologies also. So we can use arbitrary OWL DL class expressions for our bindings and at the same time achieve a homogeneous formal representation that OWL tools can make use of.

One could use the FrameNet Ontology for querying and reasoning over FrameNet itself. For reasoning over natural language text, however, we must find a way to incorporate this text into the FrameNet Ontology. We do this by means of Annotation Ontologies, which we generate from FrameNet-annotated text.

4.2. Annotation Ontologies
FrameNet-annotated text provides textual realizations of frames and FEs, i.e., the frames and FEs cover the semantics of the annotated sentences. In ontological terms, FrameNet-annotated text constitutes instances of the appropriate frame and FE classes, respectively. From an annotated sentence we generate an Annotation Ontology, which includes parts of the FrameNet Ontology and fulfills all its class restrictions. In other words, the FrameNet Ontology provides a formal specification for Annotation Ontologies.

Consider an example sentence, which we derived from an evaluation exercise within the AQUINAS project called "KB Eval", where sentences for analysis were contributed by various members of the consortium:

(S) 48 Kuwaiti jet fighters managed to escape the Iraqi invasion.

This sentence has three annotation sets:
1. The target word invasion evokes the Attack frame, where Iraqi fills the Assailant FE. The Victim FE has no filler, i.e., it is null instantiated (NI).
2. The target word escape evokes the Avoiding frame, with FE fillers 48 Kuwaiti jet fighters (Agent) and the Iraqi invasion (Undesirable_situation).
3. The target word managed evokes the Successful_action frame, with FE fillers 48 Kuwaiti jet fighters (Protagonist) and to escape the Iraqi invasion (Goal).

From this annotated sentence we first create a syntactic dependency graph and generate the appropriate frame and FE instances, as shown in Fig. 3. A Span represents a chunk of text that can evoke a frame or provide a filler for an FE.
4.2. Annotation Ontologies

FrameNet-annotated text provides textual realizations of frames and FEs, i.e., the frames and FEs cover the semantics of the annotated sentences. In ontological terms, FrameNet-annotated text constitutes instances of the appropriate frame and FE classes, respectively. From an annotated sentence we generate an Annotation Ontology, which includes the necessary parts of the FrameNet Ontology and fulfills all of its class restrictions. In other words, the FrameNet Ontology provides a formal specification for Annotation Ontologies. Consider an example sentence, which we derived from an evaluation exercise within the AQUINAS project called "KB Eval," where sentences for analysis were contributed by various members of the consortium.

(S) 48 Kuwaiti jet fighters managed to escape the Iraqi invasion.

This sentence has three annotation sets:

1. The target word invasion evokes the Attack frame, where Iraqi fills the Assailant FE. The Victim FE has no filler, i.e., it is null instantiated (NI).
2. The target word escape evokes the Avoiding frame, with FE fillers 48 Kuwaiti jet fighters (Agent) and the Iraqi invasion (Undesirable_situation).
3. The target word managed evokes the Successful_action frame, with FE fillers 48 Kuwaiti jet fighters (Protagonist) and to escape the Iraqi invasion (Goal).

From this annotated sentence we first create a syntactic dependency graph and generate the appropriate frame and FE instances, as shown in Fig. 3. A Span represents a chunk of text that can evoke a frame or provide a filler for an FE. We derive Spans, syntactic subsumption, and the relations to frames and FEs from the annotations. For example, invasion evokes the Attack frame. Thus we (1) generate a Span that represents the text invasion and place it properly into the Span dependency graph, (2) generate the frame instance Attack_S (with type Attack), and (3) connect the Span to Attack_S via the evokes property. We proceed similarly with the FE filler Iraqi (Assailant): we generate the FE instance Assailant_S, connect it to its frame instance Attack_S via the hasFE property, and connect the Span representing Iraqi to Assailant_S via the fillerOf property. Finally, we identify FEs that are evoked by the same Span via owl:sameAs.

Figure 2: Part of the FrameNet Ontology for the Attack frame and some connected frames.

Figure 3: Annotation Ontology for: 48 Kuwaiti jet fighters managed to escape the Iraqi invasion. (Step 1)

We can do this purely based on syntactic evidence. For example, the FE instances Protagonist_S and Agent_S are identified because they are both filled by the Span representing the text 48 Kuwaiti jet fighters. This significantly aids reasoning about FrameNet-annotated text. (Alternatively, we could formalize a SWRL rule of the form fillerOf(s, x) ∧ fillerOf(s, y) → owl:sameAs(x, y); we do not do so because not all reasoners provide a SWRL implementation.)

The second step in generating an Annotation Ontology is to satisfy the class restrictions of the FrameNet Ontology, i.e., to generate appropriate instances and to connect them properly. Thus, for a frame instance i of type C we

1. travel along each existential class restriction on a property p to a class D,
2. generate an instance j of type D,
3. connect the instances i and j via the property p, and
4. proceed with instance j.
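This traversal can be sketched as a small recursive procedure over the RDF graph, reusing the rdflib conventions from the previous sketch. The procedure is one reading of steps 1-4 above; the blank-node instances and the cycle guard are implementation choices of the sketch, not prescribed by the paper.

```python
from rdflib import Graph, BNode
from rdflib.namespace import OWL, RDF, RDFS

def satisfy_restrictions(g: Graph, instance, cls, seen=None):
    """For an instance of cls, walk each owl:someValuesFrom restriction,
    generate a filler instance, connect it, and recurse (steps 1-4)."""
    if seen is None:
        seen = set()
    if (instance, cls) in seen:   # the frame-relation graph may contain cycles
        return
    seen.add((instance, cls))
    for sup in g.objects(cls, RDFS.subClassOf):
        if (sup, RDF.type, OWL.Restriction) in g:
            prop = g.value(sup, OWL.onProperty)            # e.g. fn:usesF
            filler_cls = g.value(sup, OWL.someValuesFrom)  # e.g. fn:Hostile_encounter
            filler = BNode()       # fresh instance, e.g. Hostile_encounter_S
            g.add((filler, RDF.type, filler_cls))
            g.add((instance, prop, filler))
            satisfy_restrictions(g, filler, filler_cls, seen)
        else:
            # Also travel along the inheritance hierarchy (here by typing the
            # same instance with the superframe, e.g. Intentionally_affect).
            g.add((instance, RDF.type, sup))
            satisfy_restrictions(g, instance, sup, seen)
```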
Fig. 4 illustrates this algorithm for our example frame instance Attack_S. We generate the frame instance Hostile_encounter_S (with type Hostile_encounter) and connect it to Attack_S via usesF. Similarly, we connect Assailant_S to Side_1_S via usesFE. In addition, we identify the connected FE instances via owl:sameAs, which expresses the semantics of FE mappings: the Victim in an Attack is the Side_2 in a Hostile_encounter, i.e., their fillers are the same. In addition to the class restrictions, we also travel along the inheritance hierarchy, which could be useful, e.g., for paraphrasing; therefore, we also generate the instance Intentionally_affect_S.

For question answering, we match the Annotation Ontology of an annotated question against that of the text. If this matching succeeds, then the Spans bound to these FEs contain the answer; otherwise the question cannot be answered from the text. Consider the following example questions.

Q1 How many Kuwaiti jet fighters escaped the Iraqi invasion?
Q2 How many Kuwaiti jet fighters escaped?
Q3 Did Iraq clash with Kuwait?
Q4 Was there a conflict between Iraq and Kuwait?

Partial Annotation Ontologies for these questions are illustrated in Fig. 5. Given the Annotation Ontology of the question, we let RacerPro perform the following queries, which can be formalized in nRQL. In the following we will use question Q1 as an example of how the algorithm works.

1. For the question, get the evoked frame instances, their FEs, and Spans: for Q1 these are Avoiding_Q1 and its FE instance Undesirable_situation_Q1.
2. Match them against the corresponding instances generated from the text; for our sentence these include Undesirable_situation_S (the Iraqi invasion), Agent_S (48 Kuwaiti jet fighters), Assailant_S (Iraqi), and Victim_S (NI).

Figure 4: Connecting the Attack_S instance (Step 2 of Annotation Ontology generation)

Figure 5: Abridged Annotation Ontologies for the example questions

Since RacerPro is a reasoner (and not an NLP tool), checking the compatibility of Spans is limited to checking syntactic equality. Therefore, the Span 48 Kuwaiti jet fighters does not match the Span How many Kuwaiti jet fighters. We can, however, easily determine the Spans that are supposed to be compatible in order to yield an answer. Span compatibility can then be determined by other NLP tools, such as question-type recognizers.

Question Q2 is simpler than Q1 because we are asking for only one frame, in which one FE is null instantiated. In this case our approach, using only a reasoning engine, yields the final answer. Questions Q3 and Q4 concern the Hostile_encounter frame; here our approach proceeds as follows:

1. Get the evoked frame instances, FEs, and Spans: for Q3 these are Hostile_encounter_Q3 with the FE instances Side_1_Q3 (filled by Iraq) and Side_2_Q3 (filled by Kuwait).
2. On the text side, Side_1_S is filled by Iraqi, and Side_1_S is the same as Assailant_S (via owl:sameAs).

In future work, we plan to evaluate the utility of DL reasoners in a fully fledged question answering system. Finally, we will translate FrameNet to other ontology languages such as KIF or CycL, in order to link FrameNet to SUMO or Cyc ontologies.

Acknowledgements

The first author enjoys funding from the German Academic Exchange Service (DAAD). The FrameNet project is funded by the AQUINAS project of the AQUAINT program.

8. References

A. Burchardt, K. Erk, and A. Frank. 2005. A WordNet detour to FrameNet. In Proceedings of the GLDV 2005 Workshop GermaNet II, Bonn.
K. J. Burns and A. R. Davis. 1999. Building and maintaining a semantically adequate lexicon using Cyc. In Evelyne Viegas, editor, Breadth and Depth of Semantic Lexicons. Kluwer.
K. Erk and S. Padó. 2005. Analysing models for semantic role assignment using confusability. In Proceedings of HLT/EMNLP-05, Vancouver, Canada.
K. Erk and S. Padó. 2006. Shalmaneser – a toolchain for shallow semantic parsing. In Proceedings of LREC-06, Genoa, Italy. To appear.
C. J. Fillmore, C. R. Johnson, and M. R. L. Petruck. 2003. Background to FrameNet. International Journal of Lexicography, 16(3):235–250.
C. J. Fillmore. 1976. Frame semantics and the nature of language. Annals of the New York Academy of Sciences, (280):20–32.
I. Horrocks. 1998. The FaCT system. In H. de Swart, editor, Automated Reasoning with Analytic Tableaux and Related Methods: International Conference Tableaux'98, number 1397 in Lecture Notes in Artificial Intelligence, pages 307–312. Springer-Verlag, May.
D. B. Lenat. 1995. Cyc: a large-scale investment in knowledge infrastructure. Communications of the ACM, 38(11):33–38.
K. Litkowski. 2004. Senseval-3 task: Automatic labeling of semantic roles. In Senseval-3: Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, pages 9–12. Association for Computational Linguistics.
S. Narayanan and S. McIlraith. 2003. Analysis and simulation of Web services. Computer Networks, 42(5):675–693.
S. Narayanan, C. F. Baker, C. J. Fillmore, and M. R. L. Petruck. 2003. FrameNet meets the semantic web: Lexical semantics for the web. In The Semantic Web – ISWC 2003, pages 771–787. Springer-Verlag, Berlin.
S. Narayanan. 1999. Moving right along: A computational model of metaphoric reasoning about events. In Proceedings of the National Conference on Artificial Intelligence (AAAI'99), pages 121–128. AAAI Press.
I. Niles and A. Pease. 2001. Towards a standard upper ontology. In Proceedings of the 2nd International Conference on Formal Ontology in Information Systems (FOIS-2001), Ogunquit, Maine.
I. Niles and A. Pease. 2003. Linking lexicons and ontologies: Mapping WordNet to the Suggested Upper Merged Ontology. In Proceedings of the 2003 International Conference on Information and Knowledge Engineering (IKE03).
J. Ruppenhofer, M. Ellsworth, M. R. L. Petruck, and C. R. Johnson. 2005. FrameNet: Theory and Practice. ICSI Berkeley. /framenet/book/book.html.
J. Scheffczyk and M. Ellsworth. 2006. Improving the quality of FrameNet. In Proc. of the Workshop on Quality Assurance and Quality Measurement for Language and Speech Resources, Genoa, Italy. To appear.
M. Wessel and R. Möller. 2005. A high performance semantic web query answering engine. In Proc. International Workshop on Description Logics.
Ontologies and the Configuration of Problem-Solving Methods

Rudi Studer¹, Henrik Eriksson², John Gennari³, Samson Tu³, Dieter Fensel¹,⁴, and Mark Musen³

¹ Institute AIFB, University of Karlsruhe, D-76128 Karlsruhe; e-mail: studer@aifb.uni-karlsruhe.de
² Department of Computer and Information Science, Linköping University, S-58183 Linköping; e-mail: her@ida.liu.se
³ Section on Medical Informatics, Knowledge Systems Laboratory, Stanford University School of Medicine, Stanford, CA 94305-5479, USA; e-mail: {gennari,tu,musen}@
⁴ Department SWI, University of Amsterdam, NL-1018 WB Amsterdam; e-mail: dieter@swi.psy.uva.nl

Abstract

Problem-solving methods model the problem-solving behavior of knowledge-based systems. The PROTÉGÉ-II framework includes a library of problem-solving methods that can be viewed as reusable components. For developers to use these components as building blocks in the construction of methods for new tasks, they must configure the components to fit with each other and with the needs of the new task. As part of this configuration process, developers must relate the ontologies of the generic methods to the ontologies associated with other methods and submethods. We present a model of method configuration that incorporates the use of several ontologies at multiple levels of methods and submethods, and we illustrate the approach by providing examples of the configuration of the board-game method.

1. Introduction

Problem-solving methods for knowledge-based systems capture the problem-solving behavior required to perform the system's task (McDermott, 1988). Because certain tasks are common (e.g., planning and configuration), and are approachable by the same problem-solving behavior, developers can reuse problem-solving methods in several applications ((Chandrasekaran and Johnson, 1993), (Breuker and Van de Velde, 1994)). Thus, a library of reusable methods would allow the developer to create new systems by selecting, adapting, and configuring such methods. Moreover, development tools, such as PROTÉGÉ-II (Puerta et al., 1992), can support the developer in the reuse of methods.

Problem-solving methods are abstract descriptions of problem-solving behavior. The development of problem solvers from reusable components is analogous to the general approach of software reuse. In knowledge engineering as well as software engineering, developers often duplicate work on similar software components, which are used in different applications. The reuse of software components across several applications is a potentially useful technique that promises to improve the software-development process (Krueger, 1992). Similarly, the reuse of problem-solving methods can improve the quality, reliability, and maintainability of the software (e.g., through the reuse of quality-proven components). Of course, software reuse is only financially beneficial in the end if the indexing and configuration overhead is less than the effort needed to create the required component from scratch each time.

Although software reuse is an appealing approach theoretically, there are serious practical problems associated with reuse. Two of the most important impediments to software reuse are (1) the problem of finding reusable components (e.g., locating appropriate components in a library), and (2) the problem of adapting reusable components to their task and to their environment. The first problem is sometimes called the indexing problem, and the second problem is sometimes called the configuration problem.
These problems are also present in the context of reusable problem-solving methods. In the remainder of this paper we shall focus on the configuration problem.

Method configuration is a difficult task, because the output of one method may not correspond to the input of the next method, and because a method may have subtasks that are solved by submethods whose functionality differs from the assumptions of the subtask. Domain-independent methods use a method ontology, which, for example, might include concepts such as states, transitions, locations, moves, and constraints, whereas the user input and the (domain-specific) knowledge base use a domain-oriented ontology, which might include concepts such as office workers, office rooms, and room-assignment constraints. Thus, a related issue to the configuration problem is the problem of mappings between ontologies (Gennari et al., 1994).

In this paper, we shall address the configuration problem. The problems of how to organize a library of problem-solving methods and of how to select an appropriate method from such a library are beyond the scope of the paper. We shall introduce an approach for handling method ontologies when configuring a method from more elementary submethods. In our framework, configuration of a method means selecting appropriate submethods for solving the subtasks of which a method is composed. We introduce the notion of a subtask ontology in order to be able (i) to make the ontology of a method independent of the submethods that are chosen to solve its subtasks, and (ii) to specify how a selected submethod is adapted to its subtask environment. Our approach supports such a configuration process on multiple levels of subtasks and submethods. Furthermore, the access to domain knowledge is organized in such a way that no mapping of domain knowledge between the different subtask/submethod levels is required.

The approach described in this paper provides a framework for handling some important aspects of the method-configuration problem. However, our framework does not provide a complete solution to this problem, and it will require thorough practical evaluation in the future through application to various tasks.

The rest of this paper is organized as follows. Section 2 provides a background to PROTÉGÉ-II, the board-game method, and the MIKE approach. Section 3 introduces the notions of method and subtask ontologies and discusses their relationships with the interface specifications of methods and subtasks, respectively. Section 4 analyses the role of ontologies in the configuration of problem-solving methods and presents a model for configuring a problem-solving method from more elementary submethods that perform the subtasks of the given problem-solving method. In Sections 5 and 6, we discuss the results and draw conclusions, respectively.

2. Background: PROTÉGÉ-II, the Board-Game Method, and MIKE

In this section, we shall give a brief introduction to PROTÉGÉ-II and MIKE (Angele et al., 1996b) and describe the board-game method (Eriksson et al., 1995), since this method will be used to illustrate our approach.

2.1 Method Reuse for PROTÉGÉ-II

PROTÉGÉ-II (Puerta et al., 1992, Gennari et al., 1994, Eriksson et al., 1995) is a methodology and suite of tools for constructing knowledge-based systems.
The PROTÉGÉ-II methodology emphasizes the reuse of components, including problem-solving methods, ontologies, and knowledge bases. PROTÉGÉ-II allows developers to reuse library methods and to generate custom-tailored knowledge-acquisition tools from ontologies. Domain experts can then use these knowledge-acquisition tools to create knowledge bases for the problem solvers.

Figure 2-1: Method-subtask decomposition in PROTÉGÉ-II

In addition to developing tool support for knowledge-based systems, PROTÉGÉ-II is also a research project aimed at understanding the reuse of problem-solving methods, and at alternative approaches to reuse. Naturally, the configuration of problem-solving methods for new tasks is a critical step in the reuse process, and an important research issue for environments such as PROTÉGÉ-II.

The model of reuse for PROTÉGÉ-II includes the notion of a library of reusable problem-solving methods (PSMs) that perform tasks. PROTÉGÉ-II uses the term task to indicate the computations and inferences a method should perform in terms of its input and output. (Note that the term task is sometimes used in other contexts to indicate the overall role of an application system, or the application task.) In PROTÉGÉ-II, problem-solving methods are decomposable into subtasks. Other methods, sometimes called submethods, can perform these subtasks. Primitive methods that cannot be decomposed further are called mechanisms. This decomposition of tasks into methods and mechanisms is shown graphically in Figure 2-1.

Submethods and mechanisms should be reusable by developers as they build a solution to a particular problem. Thus, the developer should be able to select a generic method that performs a task, and then configure this method by selecting and substituting appropriate submethods and mechanisms to perform the method's subtasks. Note that because the input-output requirements of tasks and subtasks often differ from the input-output assumptions of preexisting methods and mechanisms, we must introduce mappings among the task (or subtask) and method (or submethod) ontologies.

PROTÉGÉ-II uses three major types of ontologies for defining various aspects of a knowledge-based system: domain, method, and application ontologies. Domain ontologies model concepts and relationships for a particular domain of interest. Ideally, these ontologies should be partitioned so as to separate those parts that may be more dependent on the problem-solving method. Method ontologies model concepts related to problem-solving methods, including input and output assumptions. To enable reuse, method ontologies should be domain-independent. In most situations, reusable domain and method ontologies by themselves are insufficient for a complete application system. Thus, PROTÉGÉ-II uses an application ontology that combines domain and method ontologies for a particular application. Application ontologies are used to generate domain-specific, method-specific knowledge-acquisition tools.

The focus of this paper is on the configuration of problem-solving methods and submethods. Thus, we will describe method ontologies (see Section 3), rather than domain or application ontologies, and the mappings among tasks and methods that are necessary for method configuration (see Section 4).

2.2 The Board-Game Method (BGM)

We shall use the board-game method ((Eriksson et al., 1995), (Fensel et al., 1996a)) as a sample method to illustrate method configuration in the PROTÉGÉ-II framework.
The basic idea behind the board-game method is that the method should provide a conceptual model of a board game where game pieces move between locations on a board (see Figure 2-2). The state of such a board game is defined by a set of assignments specifying which pieces are assigned to which locations. Developers can use this method to perform tasks that they can model as board-game problems.

Figure 2-2: The board-game method provides a conceptual model where pieces move between locations.

To configure the board-game method for a new game, the developer must define, among other things, the pieces, the locations, the moves, and the initial state and goal states of the game. The method operates by searching the space of legal moves, and by determining and applying the most promising moves until the game reaches a goal state. The major advantage of the board-game method is that the notion of a board game, as the basis for method configuration, makes the method convenient for the developer to reuse.

We have used the board-game method in different configurations to perform several tasks. Examples of such tasks are the towers-of-Hanoi, the cannibals-and-missionaries, and the Sisyphus room-assignment (Linster, 1994) problems. By modeling other types of tasks as board games, the board-game method can perform tasks beyond simple games. The board-game method can perform the Sisyphus room-assignment task, for instance, if we (1) model the office workers as pieces, (2) start from a state where all the workers are at a location outside the building, and (3) move the workers one by one to appropriate rooms under the room-assignment constraints.
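As a rough illustration of the behavior just described (searching the space of legal moves and applying the most preferred applicable move until a goal state is reached), the core loop of a board-game-style solver might look as follows in Python. This is a sketch of the conceptual model only; all names are invented for the example and do not correspond to PROTÉGÉ-II or KARL constructs.

```python
from typing import Callable, FrozenSet, Iterable, Optional, Tuple

# A state is a set of assignments of pieces to locations.
Assignment = Tuple[str, str]          # (piece, location)
State = FrozenSet[Assignment]

def board_game_method(initial: State,
                      applicable_moves: Callable[[State], Iterable[State]],
                      prefer: Callable[[State, State, State], bool],
                      is_goal: Callable[[State], bool],
                      max_steps: int = 10_000) -> Optional[State]:
    """Apply the most preferred applicable move until a goal state is reached.
    Moves are represented here by the successor states they produce."""
    state = initial
    for _ in range(max_steps):
        if is_goal(state):
            return state
        successors = list(applicable_moves(state))
        if not successors:
            return None               # no applicable move: dead end
        best = successors[0]
        for candidate in successors[1:]:
            # prefer(s, a, b): is successor a preferred over b in state s?
            # (a state-dependent preference, cf. the prefer_m relationship below)
            if prefer(state, candidate, best):
                best = candidate
        state = best
    return None
```

For the Sisyphus room-assignment task, the initial state would place every office worker (piece) at a location outside the building, applicable_moves would move one worker at a time into a room permitted by the room-assignment constraints, and is_goal would check that all workers are placed.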
2.3 The MIKE Approach

The MIKE approach (Model-based and Incremental Knowledge Engineering) (Angele et al., 1996b) aims at providing a development method for knowledge-based systems covering all steps from knowledge acquisition to design and implementation. As part of the MIKE approach, the Knowledge Acquisition and Representation Language KARL (Fensel et al., 1996c), (Fensel, 1995) has been developed. KARL is a formal and operational knowledge-modeling language which can be used to formally specify a KADS-like model of expertise (Schreiber et al., 1993). Such a model of expertise is split up into three layers:

The domain layer contains the domain model, with knowledge about concepts, their features, and their relationships. The inference layer contains a specification of the single inference steps as well as a specification of the knowledge roles, which indicate in which way domain knowledge is used within the problem-solving steps. In MIKE, three types of knowledge roles are distinguished: Stores are used as containers which provide input data to inference actions or collect output data generated by inference actions. Views and terminators are used to connect the (generic) inference layer with the domain layer: views provide means for delivering domain knowledge to inference actions and for transforming the domain-specific terminology into the generic, PSM-specific terminology. In an analogous way, terminators may be used to write the results of the problem-solving process back to the domain layer and thus to reformulate the results in domain-specific terms. The task layer contains a specification of the control flow for the inference steps as defined on the inference layer.

For the remainder of the paper, it is important to know that in KARL a problem-solving method is specified in a generic way on the inference and task layers of a model of expertise. A main characteristic of KARL is the integration of object orientation into a logical framework. KARL provides classes and predicates for specifying concepts and relationships, respectively. Furthermore, classes are characterized by single- and multi-valued attributes and are embedded in an is-a hierarchy. For all these modeling primitives, KARL offers corresponding graphical representations. Finally, sufficient and necessary constraints, which have to be met by class and predicate definitions, may be specified using first-order formulae.

Currently, a new version of KARL is under development which, among other things, will provide the notion of a method ontology and primitives for specifying pre- and postconditions for a PSM (Angele et al., 1996a). Thus, this new version of KARL includes all the modeling primitives that are needed to formally describe the knowledge-level framework introduced in Sections 3 and 4. However, this formal specification is beyond the scope of this paper.

3. Problem-Solving Method Ontologies

When describing a PSM, various characteristic features may be identified, such as the input/output behavior or the knowledge assumptions on which the PSM is based (Fensel, 1995a), (Fensel et al., 1996b). In the context of this paper, we consider a further characteristic aspect of a PSM: its ontological assumptions. These assumptions specify what kind of generic concepts and relationships are inherent to the given PSM. In the framework of PROTÉGÉ-II, these assumptions are captured in the method ontology (Gennari et al., 1994).

Subsequently, we define the notions of a method ontology and of a subtask ontology, and discuss the relationship between the method ontology and the subtask ontologies associated with the subtasks of which the PSM is composed. For that discussion, we assume that a PSM comes with an interface specification that describes which generic external knowledge roles (Fensel, 1995b) are used as input and output. Each role includes the definition of concepts and relationships for specifying the terminology used within the role.

Fig. 3-1 shows the interface specification of the board-game method. We see, for instance, that knowledge about moves, preferences among moves, and applicability conditions for moves is provided by the input roles "Moves", "Preferred_Moves", and "Applic_Moves" (applicable moves), respectively;
In this case, a PSM comes with a top-level ontology, its method ontology, specifying all the generic concepts and relationships that are used by the PSM for providing its functionality. This method ontology is divided into two parts:(i) Global definitions, which include all generic concept and relationship definitions that are partof the interface specification of the PSM (that is, the external input and output knowledge roles of the PSM, respectively). For each concept or relationship definition, it is possible to indicate whether it is used as input or as output (however, that does not hold for subordinate concept definitions, i.e. concepts that are just used as range restrictions of concept attributes, or for high-level concepts that are just used for introducing attributes which are inherited bysubconcepts). Thus, the ontology specifies clearly which type of generic knowledge is expected as input, and which type of generic knowledge is provided as output.(ii) Internal definitions, which specify all concepts and relationships that are used for defining the dataflow within the PSM (that is, they are defined within stores).Within both parts, constraints can be specified for further restricting the defined terminology. It should be clear that the global definitions are exactly those definitions that specify the ontological assumptions that must be met for applying the PSM.We assume that a PSM that is stored in the library comes with an ontology description at two levels of refinement. First, a compact representation is given that just lists the names of the concepts and relationships of which the ontology is composed. This compact representation also includes the distinctions of global and internal definitions. It is used for providing an initial, not too detailed overview about the method ontology.Fig. 3-2 shows this compact representation of the board-game method ontology. We see that for instance "moves" is an input concept, "prefer_m" (preference of moves) is an input relationship, and "goal-states" is an output concept; "assignments" is an example of a subordinate concept which is used within state definitions, whereas "movable_objects" is an example of a high level concept. Properties of "movable_objects" are for instance inherited by the concept "pieces" (see below). As we will see later on, the concept "current-states" is part of the internal definitions, since it is used within the board game method for specifying the data flow between subtasks (see Section 3.2)Figure 3-2: The compact representation of the ontology of the board-game method Second, a complete specification of the method ontology is given. We use KARL for formally specifying such an ontology which provides all concept and relationship definitions as well as all constraints. Fig. 3-3 gives a graphic KARL representation of the board-game method ontology(not including constraints). We can see that for instance a move defines a new location for a given piece (single-valued attribute "new_assign" with domain "moves" and range "assignments") or that the preference between moves is state dependent (relationship "prefer_m"). The attribute "assign" is an example of a multi-valued attribute since states consist of a set of assignments.: is-aFigure: 3-3: The graphic KARL representation of the board-game method ontology When comparing the interface specification (Fig. 3-1) and the method ontology (Fig. 
When comparing the interface specification (Fig. 3-1) and the method ontology (Fig. 3-3), we can see that the union of the terminology of the external knowledge roles is equal to the set of global definitions found in the method ontology.

3.2 Subtask Ontologies

In general, within the PROTÉGÉ-II framework, a PSM is decomposed into several subtasks. Each subtask may in turn be decomposed again by choosing more elementary methods for solving it. We generalize this approach in the sense that, for trivial subtasks, we do not distinguish between the subtasks and the likewise trivial mechanisms for solving them. Instead, we use the notion of an elementary inference action (Fensel et al., 1996c). In the given context, such an elementary inference action may be interpreted as a "hardwired" mechanism for solving a subtask. Thus, for trivial subtasks, we can avoid the overhead that is needed for associating a subtask with its corresponding method (see below). That is, in general we assume that a method can be decomposed into subtasks and elementary inference actions.

When specifying a PSM, a crucial design decision is the decomposition of the PSM into its top-level subtasks. Since subtasks provide the slots where more elementary methods can be plugged in, the type and number of subtasks determine in which way a PSM can be configured from other methods. As a consequence, the adaptability of a PSM is characterized by the knowledge roles of its interface description, and by its top-level task decomposition.

For a complete description of the decomposition structure of a PSM, one also has to specify the interfaces of the subtasks and inference actions, as well as the data and control flow among these constituents. The interface of a subtask consists of knowledge roles, which are either external knowledge roles or (internal) stores, which handle the input/output from/to other subtasks and inference actions. Some of these aspects are described for the BGM in Figures 3-4 and 3-5, respectively.

Figure 3-4: The top-level decomposition of the board-game method into elementary inference actions and subtasks

Fig. 3-4 shows the decomposition of the board-game method into top-level subtasks and elementary inference actions. We can see two subtasks ("Apply_Moves" and "Select_Best_State") and three elementary inference actions ("Init_State", "Check_Goal_State", "Transfer_Solution"). This decomposition structure specifies clearly that the board-game method may be configured by selecting appropriate methods for solving the subtasks "Apply_Moves" and "Select_Best_State".

Figure 3-5: The interface of the subtask "Apply_Moves"

In Fig. 3-5, the interface of the subtask "Apply_Moves" is shown. The interface specifies that "Apply_Moves" receives (board-game-method-internal) input from the store "Current_State" and delivers output to the store "Potential_Successor_States". Furthermore, three external knowledge roles provide the task- and/or domain-specific knowledge that is required for performing the subtask "Apply_Moves".

Having introduced the notion of a method ontology, the problem arises of how that method ontology can be made independent of the selection of (more elementary) methods for solving the subtasks of the PSM. The basic idea for solving this problem is that each subtask is associated with a subtask ontology. The method ontology is then essentially derived from these subtask ontologies by combining the different subtask ontologies, and by introducing additional superconcepts, like for example the concept "movable_objects" (compare Fig. 3-2).
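This combination step can be pictured with a small data-structure sketch over the compact (names-only) representation. The class and field names below are invented for illustration and are not PROTÉGÉ-II or KARL constructs.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

@dataclass
class SubtaskOntology:
    """Compact (names-only) subtask ontology, split into its
    external and internal input/output parts."""
    external_input: Set[str] = field(default_factory=set)
    external_output: Set[str] = field(default_factory=set)
    internal_input: Set[str] = field(default_factory=set)
    internal_output: Set[str] = field(default_factory=set)

def derive_method_ontology(subtasks: Dict[str, SubtaskOntology],
                           superconcepts: Set[str]) -> Dict[str, Set[str]]:
    """Combine the subtask ontologies: external parts become the global
    definitions, internal parts (the store terminology for the data flow
    between subtasks) become the internal definitions."""
    global_defs: Set[str] = set(superconcepts)   # e.g. {"movable_objects"}
    internal_defs: Set[str] = set()
    for ontology in subtasks.values():
        global_defs |= ontology.external_input | ontology.external_output
        internal_defs |= ontology.internal_input | ontology.internal_output
    return {"global": global_defs, "internal": internal_defs}

# E.g., for the "Apply_Moves" subtask (cf. Figs. 3-2 and 3-7):
apply_moves = SubtaskOntology(
    external_input={"moves", "prefer_m", "applic_m"},   # names illustrative
    internal_input={"current-states"},
    internal_output={"potential_successor_states"})
bgm = derive_method_ontology({"Apply_Moves": apply_moves},
                             superconcepts={"movable_objects"})
```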
Of course, the terminology associated with the elementary inference actions has to be considered in addition.

Figure 3-6: The method ontology and the related subtask ontologies

The notion of subtask ontologies has two advantages:

1) By building up the method ontology from the various subtask ontologies, the method ontology is independent of the decision of which submethod will be used for solving which subtask. Thus, a configuration-independent ontology definition can be given for each PSM in the library.

2) Subtask ontologies provide a context for mapping the ontologies of the submethods, which are selected to solve the subtasks, to the global method ontology (see Section 4).

The type of mapping that is required between the subtask ontology and the ontology of the method used to solve the subtask depends on the distinction between input and output definitions within the subtask ontology (see Section 4). Therefore, we again separate the subtask ontology appropriately. In addition, a distinction is made between internal and external input/output. The feature "internal" indicates that this part of the ontology is used within the method of which the subtask is a part (that is, for defining the terminology of the stores used for the data flow among the subtasks, which corresponds to the internal-definitions part of the method ontology). The feature "external" indicates that input is received from the calling environment of the method of which the subtask is a part, or that output is delivered to that calling environment (which corresponds to the global-definitions part of the method ontology). This distinction can be made on the basis of the data dependencies defined among the various subtasks (see Fig. 3-6).

In Fig. 3-7, we introduce the ontology for the subtask "Apply_Moves". According to the interface description given in Fig. 3-5, we assume that the current state is an internal input for "Apply_Moves", whereas the potential successor states are treated as internal output. Moves, preferences among moves, and applicability conditions for moves are handled as external input.

Figure 3-7: The compact representation of the "Apply_Moves" subtask ontology

When investigating the interface and ontology specification of a subtask, one can easily recognize that even after having introduced the subtask decomposition, it is still open what kind of external knowledge is taken from the domain and what kind of knowledge is received as output from another task. That is, in a complex application task environment, in which the board-game method is used to solve, for example, a subtask st1, it depends on the calling environment of st1 whether, e.g., the preference of moves ("prefer_m") has to be defined in a mapping from the domain, or is just delivered as output from another subtask st2, which is called before st1.

4. Configuring Problem-Solving Methods from More Elementary Methods

The basic idea when building up a library of PSMs is that one does not simply store completely defined PSMs. In order to achieve more flexibility in adapting a PSM from the library to its task environment, concepts are required for configuring a PSM from more elementary building blocks (i.e., from more elementary methods (Puerta et al., 1992)).
Besides being more flexible, such a configuration approach also provides means for reusing these building blocks in different contexts.

Based on the structure introduced in Section 3, the configuration of a PSM from building blocks requires the selection of methods that are appropriate for solving the subtasks of the PSM. Since we do not consider the indexing and adaptation problems in this paper, we assume in the following that we have found a suitable (sub-)method for solving a given subtask, e.g., by exploiting appropriate semantic descriptions of the stored methods. Such semantic descriptions could, for instance, be pre-/postconditions which specify the functional behavior of a method (Fensel et al., 1996b). Using the new version of KARL (Angele et al., 1996a), such pre-/postconditions can be specified in a completely formal way.
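To make the configuration idea concrete, the following sketch plugs a submethod into a subtask through a pair of terminology mappings. It is a deliberately simplified reading of the model described here; all names are invented, and the actual PROTÉGÉ-II/KARL machinery is considerably richer.

```python
from typing import Any, Callable, Dict

Roles = Dict[str, Any]   # knowledge-role name -> role content

def configure_subtask(input_map: Dict[str, str],
                      output_map: Dict[str, str],
                      submethod: Callable[[Roles], Roles]) -> Callable[[Roles], Roles]:
    """Adapt a submethod to a subtask: rename the subtask's knowledge roles
    into the submethod's own ontology, run the submethod, and rename the
    results back into subtask terminology. The two dicts play the part of
    the ontology mappings in the two directions."""
    def solve(subtask_roles: Roles) -> Roles:
        method_input = {input_map[role]: value
                        for role, value in subtask_roles.items()
                        if role in input_map}
        method_output = submethod(method_input)
        return {role: method_output[name]
                for role, name in output_map.items()
                if name in method_output}
    return solve

# E.g., a submethod that speaks of "candidates"/"results" can be plugged
# into "Apply_Moves", whose stores are named differently:
solver = configure_subtask(
    input_map={"Current_State": "candidates"},
    output_map={"Potential_Successor_States": "results"},
    submethod=lambda roles: {"results": [roles["candidates"]]})
```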
Prototype System for Knowledge Problem Definition

Ahmed M. Al-Ghassani, M.ASCE¹; John M. Kamara, M.ASCE²; Chimay J. Anumba, M.ASCE³; and Patricia M. Carrillo⁴

¹ Assistant Dean for Academic Affairs, Al-Musanna College of Technology, P.O. Box 191, P.C. 314, Al-Muluddah, Oman. E-mail: ahmed@
² Senior Lecturer, School of Architecture, Planning and Landscape, Univ. of Newcastle upon Tyne, 3-4 Claremont Terrace, Newcastle upon Tyne NE2 4AE, U.K. E-mail: j.m.kamara@
³ Professor, Dept. of Civil and Building Engineering, Loughborough Univ., Loughborough, Leicestershire, LE11 3TU, U.K. E-mail: c.j.anumba@
⁴ Professor, Dept. of Civil and Building Engineering, Loughborough Univ., Loughborough, Leicestershire, LE11 3TU, U.K. E-mail: p.m.carrillo@

Abstract: Attitudes to knowledge management (KM) have changed considerably as organizations are now realizing its benefits. Implementation, however, has been facing serious difficulties, attributed either to not being able to anticipate the barriers when planning KM strategies or to using inappropriate methods and tools for implementation. These difficulties are more critical in construction due to the fragmented nature of the industry. This paper suggests that proper definition of a KM problem at the early stages of developing KM initiatives will result in better control over the KM barriers. A methodology for identifying KM problems within a business context is then introduced. The methodology is encapsulated in a prototype software system, which facilitates its deployment in organizations and provides online help facilities. The methodology, development, operation, and evaluation of the prototype are described. The paper concludes that the prototype offers considerable potential for delivering a clarified KM problem and a distilled set of issues for an organization to address. This represents a significant first step in any KM initiative.

DOI: 10.1061/(ASCE)0733-9364(2006)132:5(516)

CE Database subject headings: Information technology (IT); Information management; Knowledge-based systems; Construction industry.

Introduction

Knowledge relates to data and information, and this relationship needs to be understood first. Data can be described as discrete facts about events which, when processed and given relevant associations and patterns, become information (Blumentritt and Johnston 1999). Knowledge is therefore information with context that provides the basis for actions and decision making (Kanter 1999).

There are several definitions of knowledge management (KM). It can be defined from a process perspective, an outcome perspective, or a combination. An example of a process-perspective definition sees KM as a process of controlling the creation, dissemination, and utilization of knowledge (Newman 1991; Kazi et al. 1999). Another considers KM as the "identification, optimization, and active management of intellectual assets, either in the form of explicit knowledge held in artifacts or as tacit knowledge possessed by individuals or communities to hold, share, and grow the tacit knowledge" (Snowden 1998). The outcome perspective, on the other hand, focuses on the benefits that an organization gets from managing its knowledge. An example is a definition by Kanter (1999), who sees KM as being concerned with the way an organization gains competitive advantage and builds an innovative and successful organization. Another example considers KM as the "management of organizational knowledge for creating business value and generating competitive advantage" (Tiwana 2000). A third example defines KM as "the ability to create and retain greater value from core business competencies" (Klasson 1999). A combined definition describes both the process by which knowledge is managed and the outcome that is likely to result. Tiwana (2000), for example, states that "Knowledge management enables the creation, communication, and application of knowledge of all kinds to achieve business goals." Another definition, by Scarbrough et al. (1999), states that KM is any process or practice of creating, acquiring, capturing, sharing, and using knowledge, wherever it resides, to enhance learning and performance in organizations.
Regardless of the different perspectives for defining KM, all definitions focus on the fact that knowledge is a valuable asset that must be managed, and that KM is important for providing strategies to retain knowledge and to improve performance.

The concept of KM is better understood when compared to other concepts such as the organization of work, the learning organization, the management of people, sources of wealth creation in contemporary society, the emergence of the information age, organizational learning, the strategic management of core competencies, the management of knowledge-intensive firms, the value of intellectual capital, and the management of research and development (Scarbrough et al. 1999). For example, the information age is more broadly focused and affects all firms, while the management of research and development is more narrowly focused and is relevant to a limited number of firms. On the other hand, KM is more narrowly focused on the ways in which firms facing highly turbulent environments can mobilize their knowledge base
(or knowledge assets) in order to ensure continuous innovation in projects (Scarbrough et al. 1999).

This paper presents a methodology developed for identifying KM problems, within a business context, for construction and manufacturing organizations. It then describes the development, operation, and evaluation of a prototype system produced to encapsulate this methodology. The paper first reviews current KM practices.

Current Knowledge Management Practices

Overview of Research Surveys

Knowledge management is making its mark on organizations of all sizes and in all sectors. Surveys show that the attitude to KM has changed since 1998 (KPMG 1998). This change in attitude had several causes, mainly the successful implementation of KM in some organizations and the publication of two important books on KM, by Nonaka and Takeuchi (1995) and Davenport and Prusak (1998). A survey by the Information Systems Research Center of Cranfield School of Management in September 1997 found that almost one-third of respondents thought that KM was just a passing fad (KPMG 1998). Surveys after 1998 show that there is a strong belief in the benefits of KM and that it is no longer seen as a passing fad (TFPL 1999; Gottschalk 1999; Robinson et al. 2001). The results of a number of surveys that investigated the awareness of KM and the breadth with which it is implemented are discussed below.

A survey of leading organizations representing different industry sectors, with turnover exceeding £200 million ($360 million) a year, was undertaken by KPMG Management Consulting (KPMG 1998). The results from the 100 respondents show that KM is no longer seen as a fad but is increasingly taken seriously. This was confirmed by the 43% of respondents who considered their organizations to have KM initiatives in place. The survey also shows that the awareness of KM increases with the size of the organization, and that some organizations implementing KM have already seen real benefits while others have had difficulties.

Another survey, by TFPL Ltd., covered 500 organizations implementing KM or equivalent initiatives from all business sectors around the world (TFPL 1999). Results show that users believe that the concept of KM relies on other concepts, such as succeeding in business and enabling organizations to meet their corporate objectives. Of the 80 respondents, 29% had corporate-wide KM programs and 18% were planning a corporate-wide KM program. 50% had no corporate-wide KM program, but of those, 42% had another corporate program with similar objectives. The survey concludes that the level of interest in KM and the number of organizations implementing its initiatives were growing exponentially. Moreover, many chief executives placed KM second on their list of "must-dos," after globalization. The survey also shows that most of the KM literature before 1998 concentrated on selling the concept of KM using examples of a few innovative adopters, and that much of the focus was on demonstrating the value of intellectual capital, the leveraging of knowledge for competitive advantage, the structural issues that underpin KM, and the point that KM is mostly about people and culture.

Other surveys that focused on particular industry sectors show similar results. A survey covering 73 respondents from 256 Norwegian law firms, together with interviews with ten of the largest firms, shows that there was a strong belief in the potential benefits of KM (Gottschalk 1999).
Another survey, covering 170 construction organizations (consultants and contractors) in the United Kingdom, shows that about 40% already had a KM strategy, another 41% had plans to have a strategy within a year, and 19% did not have a strategy (Robinson et al. 2001). The research also found that about 50% of U.K. construction organizations had already appointed a knowledge manager or a special group with responsibility for implementing their KM strategy.

It is therefore agreed that KM is an important concept that attracts an increasing number of organizations, and that those organizations that implement it first will gain the most benefit. Tiwana (2000) believes that organizations which decide to wait until KM becomes a hot issue are likely to fall behind, perhaps never to recover.

Knowledge Management in Construction

Although the term "KM" is relatively new to construction organizations (Carrillo et al. 2000), many have adopted strategies for its implementation, because the design and construction of projects involve several knowledge-intensive activities. These activities (e.g., preliminary design, analysis, detailed design, planning and managing the construction, maintenance, etc.) are influenced by factors that are linked to human intelligence and knowledge, such as experience and engineering judgment. The availability of knowledge is an important factor that affects construction projects, as even experienced staff face difficulties when the required knowledge is not available, and this can result in assumptions or judgments that may be disproved when knowledge becomes available (Kumar and Topping 1991).

Construction knowledge takes many forms, e.g., experiences, best practices, lessons learned, drawings, documents, etc. This knowledge is developed within design offices or on construction sites. Knowledge developed within a design office is easy to access by individuals or groups working within the same office, but those located in geographically dispersed offices will not have the same ease of access to this knowledge. Knowledge generated on a construction site is rarely shared, and this can result in its loss. For example, if a problem relevant to the performance of a structure occurs on a construction site, engineers in the design office need to know about this problem (its nature, why it occurred, and how it was solved) so that they do not repeat mistakes in future designs. Improving the "knowledge flow" within an organization adds value, increases the ability to compete, and helps to improve the quality of future projects.

Construction organizations can benefit from KM by implementing initiatives that help in capturing the knowledge that is generated during the different stages of a project life cycle, to make it available and accessible in a timely fashion throughout the organization. Managing construction knowledge not only contributes to increased safety and improved stability, but also saves time spent in design and construction and provides scope for innovation. Time can be saved, for example, by reducing the number of design cycles or by reducing the time spent searching for knowledge. Therefore, there is a need for an approach that facilitates knowledge sharing within the construction industry.

Barriers to Knowledge Management Implementation

All the aforementioned surveys agree that organizations implementing KM face some barriers. The presence and strength of these barriers depend on many factors, such as the type of business processes, products, and clients.
Construction organizations need to carefully investigate the barriers that are most relevant to construction activities in order to develop reliable KM strategies. Long lists of barriers are identified in the literature. However, they can be categorized into three main "barrier groups": the status of knowledge, the location of knowledge, and the culture surrounding the knowledge. Other barriers, such as the high cost of KM systems, are not considered main barriers because they are easier to address, e.g., by allocating a budget for implementing KM. The main barrier groups are discussed in turn below.

Status of Knowledge

Knowledge exists in a tacit or explicit status (Nonaka and Takeuchi 1995), and allowing for an easy conversion from one status to another is important. Each status has its own characteristics, which may support or resist the conversion process. Knowledge, whether tacit or explicit, is either fully developed or still developing, and this can also affect the conversion process, as discussed below.

Tacit Knowledge. This is stored in people's brains as mental models, experiences, and skills and is difficult to communicate externally (Vail 1999). The conversion of tacit knowledge takes two forms (Nonaka and Takeuchi 1995). It can be converted to other tacit knowledge through socialization in face-to-face interactions, or to explicit knowledge through externalization, by codifying an individual's knowledge. Capturing tacit knowledge and codifying it is one of the biggest challenges in KM (Bair and O'Connor 1998). This knowledge, whether still "developing" (in ongoing projects) or already "developed" (in previous projects), requires methodologies to help manage it wherever it exists. Capturing the developing knowledge during the life cycle of a project creates additional tasks in the congested agenda of employees. One of the evolving solutions is to integrate KM into the organizational processes. However, this could lead to delays in finishing the work on time due to the added activities. Furthermore, capturing the developed tacit knowledge is far more difficult and requires advanced tools and strategies.

Explicit Knowledge. This is encoded in organizational formal models, rules, documents, drawings, products, services, facilities, systems, and processes and is easily communicated externally (Vail 1999). Its conversion also takes two forms (Nonaka and Takeuchi 1995). It can be converted to tacit knowledge through internalization, when an individual reads and understands well-coded knowledge. It can also be converted to another type of explicit knowledge through combination, in which more than one form of knowledge is combined to generate new knowledge. Although conversion of explicit knowledge is easier than that of tacit knowledge, it still requires several resources, such as time, technology, and commitment. Many organizations find that developed explicit knowledge is more difficult to manage than developing knowledge, because it is hard to manage knowledge that was developed (in the past) by other individuals. In the long term, however, developed knowledge is more critical to the organization's knowledge base, as it constitutes the memory of the organization. The difficulty associated with managing developed knowledge lies in allocating resources, especially time and people, to find and store this knowledge.

Location of Knowledge and Its Intended Users

Knowledge, whether tacit or explicit, exists in different sources and is required by different users.
A source of knowledge can also be a user of other knowledge; for example, a person is a source when he contributes to a knowledge base but a user when he reads from the knowledge base. The transformation of knowledge from sources to users is facilitated or resisted by several factors. Understanding the relationships between these sources and users, and their associated enablers and resistors, is a difficult task requiring extensive examination. Moreover, it becomes more difficult to manage knowledge when its sources or users are geographically dispersed.

Geographically Dispersed Sources. In the construction context, these include offices, employees, designers, contractors, suppliers, customers, and partners. The coordination of the process of capturing knowledge from geographically dispersed sources is obstructed by many factors. For example, capturing knowledge from people who spend most of their time on construction sites is hindered by limited time and by the difficulty of getting knowledge-capture technologies onto the sites. Furthermore, having dispersed sources of knowledge (e.g., staff scattered in different countries) requires a central unit that plans and monitors the process to ensure that correct and valid knowledge is collected (Conheeney et al. 2000).

Geographically Dispersed Users. Organizations with geographically dispersed offices and employees need to ensure that the captured knowledge is safely delivered to users whenever they require it. Making this knowledge available to its intended users presents several challenges with regard to the methodology for transferring and sharing it, the supporting technology, and the availability of time for accessing the transferred knowledge (Patel et al. 2000).

Culture

Cultural barriers that obstruct the implementation of KM relate both to the culture of individuals and to organizational culture. The culture of individuals concerns the way individuals act and respond. For example, not all employees in an organization will have the same willingness to share their knowledge. Organizational culture is more about the processes used within the organization. For example, some organizations consider review meetings part of the design process, and this encourages knowledge sharing. The cultural barriers that affect the implementation of KM are given below.

Willingness to Share. "Why would I give my knowledge to others?" This is a typical question asked by people when they are asked to share their knowledge. Most of the literature focuses on the organizational benefits of KM, neglecting the fact that tacit knowledge cannot be captured unless its holders realize that they also benefit (Scarbrough et al. 1999). Employees need to understand that shared knowledge multiplies and that everyone needs knowledge from others. Second, people like to be rewarded for their contribution to the organizational knowledge base, and this necessitates the adoption of carefully developed reward systems that do not focus only on monetary figures but also include other schemes such as promotion, recognition, etc.

Availability of Time. Employees find themselves under the pressure of increased job tasks and delivery deadlines. Codifying tacit knowledge is difficult and time consuming, and even storing and indexing codified knowledge is time intensive. Many organizations therefore find it difficult to allocate time for their staff to contribute their knowledge and to manage it. Employees also need time for training in order to understand the KM system so that they can use it efficiently.
Employees who need to search for an answer to a question also find that they do not have enough time to search the knowledge base. Instead, they normally prefer to ask an experienced colleague (Carrillo et al. 2000).

Type and Nature of Projects. Organizations differ according to the industry sector within which they perform. Some organizations have most of their work indoors, e.g., design offices, while others have most of their work outdoors, e.g., civil engineering projects. Capturing knowledge from outdoor projects is very difficult due to two types of mobility. One relates to the formation of new teams for each project, and the second to the mobility of staff within the project site. Some project sites do not have a suitable environment for capturing knowledge or even for accessing a knowledge base.

Technology Infrastructure. Although technology is only a facilitator of KM, many KM processes depend on it to allow for faster storage, retrieval, and transfer (Tiwana 2000). Technology therefore provides support for KM with a range of hardware and software tools. However, many organizations find it difficult to identify the tools that address their needs, since this requires an understanding of the KM requirements of the organization and of what these tools can do. Using the wrong tools can result in a technology infrastructure that is not compatible with the existing technology within the organization or that does not address the organization's business goals.

Size of Organization. The amount of knowledge available in an organization may be considered proportional to the size of the organization. Knowledge within a large organization is scattered throughout its offices and is therefore more difficult to manage. This necessitates that organizations identify what knowledge they need to manage, and where it exists, in order to achieve the organizational business goals (CIRIA 2000). This identification may not be a straightforward process, as managing more than one knowledge type may be required. In this case, an organization needs to prioritize these types of knowledge based on the priorities of its business goals.

Rewarding Schemes May Also Create a Barrier. Many KM initiatives focus on some kind of rewarding scheme, especially short-term schemes. The problem with these schemes is that they reward people for what they do. This can create a new barrier, because people may become more materialistic and will not accept new work unless they are explicitly rewarded for doing it. Therefore, reward schemes need to be carefully designed so that they do not create a new barrier.

Overcoming the Barriers

Literature and surveys show that many barriers obstruct the implementation of KM (Davenport 1997; KPMG 1998; Gottschalk 1999; TFPL 1999; Scarbrough et al. 1999; Carrillo et al. 2000; CIRIA 2000; Patel et al. 2000; Storey and Barnet 2000; Tiwana 2000; Robinson et al. 2001). These barriers, if not properly addressed, can result in a loss of trust in the concept of KM (McConalogue 1999; Storey and Barnet 2000). New barriers can also result from poor practices or badly designed systems. Overcoming KM barriers is not an easy task and requires extensive planning (Al-Ghassani et al. 2001b). This necessitates fully understanding the nature of the knowledge that needs to be managed and the barriers that resist its implementation.
the early stages of developing strategies for KM (Al-Ghassani et al. 2000b). Early identification of KM problems is crucial because systems are more flexible during their design phase, while rectifying or altering a KM system at a later stage is difficult, if at all possible, time consuming, and expensive (CIRIA 2000). Organizations therefore need a tool that helps capture and define the KM problem so that reliable KM initiatives can be developed. Although numerous KM tools exist to support some elements of the KM process (Ruggles 1997; Jackson 1998; Bair and O'Connor 1999; Tiwana 2000; Al-Ghassani et al. 2001a; Carrillo et al. 2000), no tool has been found that facilitates understanding and defining the KM problem.

Methodology for Identifying Knowledge Management Problems

Background

Defining a KM problem requires intensive examination of several issues. A methodology for identifying KM problems was therefore developed within the cross-sectoral learning in the virtual enterprise research (CLEVER) project (Kamara et al. 2001). The CLEVER framework (Fig. 1) was developed to support KM in construction and manufacturing organizations.

Fig. 1. The CLEVER framework for knowledge management

It aims to clarify a vague KM problem into a set of specific KM issues, established within a business context, in order to provide appropriate and relevant processes to solve the identified problems by
• Defining the KM problem and linking it to business drivers or goals;
• Creating the desired characteristics of KM solutions;
• Identifying the critical migration paths to achieve the desired characteristics;
• Selecting appropriate KM processes to use on those paths; and
• Developing a strategy for implementing the selected KM processes.

The framework addresses these objectives through the four main stages illustrated in Fig. 1. The first stage, "Identify KM Problem," aims to clarify the overall KM problem within a business context to deliver a refined KM problem and a set of KM issues from the overall problem. The second stage, "Identify Current and Required KM Characteristics," introduces a series of knowledge dimensions, e.g., explicit/tacit, critical/auxiliary, generic/project-specific, etc. This stage asks users to identify certain characteristics of the knowledge that they are interested in managing. Following this, users are asked to identify the existing knowledge characteristics and the desired knowledge characteristics. The third stage, "Identify Critical Knowledge Migration Paths," aims to allow users to select how best they would like to migrate from the existing characteristics to the desired characteristics, i.e., either directly or using intermediate steps. The last stage, "Select Generic KM Processes," aims to ensure that the organization is in a position to implement KM and helps in selecting the appropriate KM processes which, when tailored to a particular organization's needs, will address the stated KM problem.

The subsequent sections discuss, in some detail, the methodology, development, and operation of a prototype developed to address the first stage, "Identify KM Problem," through introducing the problem definition template (PDT).

The Problem Definition Template

The problem definition template represents a methodology developed for clarifying the overall KM problem within an organizational business context. It aims to assist users to "think through" the
problem in a "structured" way. It therefore covers issues that, if not properly addressed, can cause a weak definition of KM problems. These issues are identified by Al-Ghassani et al. (2004) as improper identification of the type and nature of knowledge that needs to be managed; unclear business goals for implementing KM initiatives; improper identification of the characteristics of knowledge; and poor understanding of the relationships between sources and users of knowledge and their associated enablers and resistors. In order to address these issues, the process consists of several stages, each comprising a set of investigations that address relevant KM issues. The developed approach allows organizations to
• describe a "vague" KM problem, which does not need to be too specific at this stage;
• investigate the problem against the organizational business drivers;
• characterize the knowledge problem;
• identify the sources and users of the knowledge;
• identify the enablers and resistors for transferring the knowledge from sources to users;
• link the knowledge problem to the relevant KM process(es); and finally
• refine the previously stated KM problem.

Using this approach allows organizations to identify whether their business drivers have a KM dimension and enables them to develop strategies to address the KM problem. This new approach of gathering information for the identification of KM problems is simple to use and cost-effective. It is applicable to large organizations as well as small-to-medium-sized organizations, which

Fig. 2. Section of the paper version of the problem definition template
Building a Large Knowledge Base from a Structured Source: The CIA World Fact Book

Gleb Frank, Adam Farquhar, Richard Fikes
{frank,farquhar,fikes}@
Knowledge Systems Laboratory, Computer Science Department, Stanford University

Abstract

The on-line world is populated by an increasing number of knowledge-rich resources. Furthermore, there is a growing trend among authors to provide semantic markup of these resources. This presents a tantalizing prospect. Perhaps we can leverage the person-years of effort invested in building these knowledge-rich resources to create large-scale knowledge bases.

The World Fact Book knowledge base has been an experiment in the construction of a large-scale knowledge base from a source authored using semantic markup. The content of the knowledge base is, in large part, derived from the CIA World Fact Book, and covers a broad range of information about the world's nations. The World Fact Book is a highly structured document with a complex underlying ontology. The structure makes it possible to parse the document in order to carry out the knowledge extraction. However, irregularities of the text written by humans and the complexity of the domain make the knowledge extraction process non-trivial.

We describe the process we used to construct the World Fact Book knowledge base, including parsing the source, refining the implicit knowledge, constructing a substantial supporting ontology, and reusing existing ontologies. We also discuss some of the key representational issues addressed and show how the resulting axioms can be used to answer a variety of queries.

We hope that the broad accessibility of the resulting knowledge base and its neutral representational format will enable others to work with and extend the content, as well as explore issues of structuring and inferencing in large-scale knowledge bases.

1 Introduction

The on-line world is being populated by an increasing number of knowledge-rich sources, including dictionaries, encyclopedias, and documents with specialized information on many domains. Furthermore, there is a growing trend to provide these sources in a highly structured form that includes both syntactic markup, in a language such as the hypertext markup language (HTML), and semantic markup, in a language such as the extensible markup language (XML or SGML). These documents are a critical resource for the construction of the next generation of large-scale knowledge bases. They provide the opportunity to leverage many person-years of effort to create useful, high-quality knowledge bases in a variety of domains. Of course, these documents are not, in themselves, knowledge bases. They use natural language fragments, have many irregularities in their structure, and rely on the reader to provide substantial amounts of background knowledge to interpret their content. We must overcome these barriers in order to exploit this potentially valuable knowledge work.

This paper describes an effort to build a very large-scale knowledge base (KB) from a knowledge-rich source. The content of the knowledge base derives from the CIA's World Fact Book (WFB), which is a collection of geographical, economic, sociological, and other facts about the countries and territories of the world [1]. The version of the WFB that we used is specified using SGML, which makes substantial portions of its content accessible. We have converted and refined the textual content of the WFB into an extensive knowledge base with an underlying formal ontology.
The result is a computer-usable, very large-scale knowledge base that, besides being an artifact of intrinsic interest, can be used as a testbed for exploring the scalability of reasoning tools and problem solving architectures.

Much of the research in Knowledge Representation and Reasoning has been organized around the solution of small canonical problems such as the Yale shooting problem, the Nixon diamond, cascading bathtubs, and so on. A good canonical problem encapsulates the critical issues that must be addressed in order to solve an important class of real-world problems. There are, however, important classes of problems whose answers can only be empirically determined by working with large knowledge bases. For example, we would like to understand how to structure many thousands of classes in order to support efficient inference and human understanding; we want to develop efficient inference methods in the presence of many thousands of irrelevant axioms. Many current knowledge representation systems are effectively incapable of working with knowledge bases of the scale of the WFB KB. We hope that the WFB KB can serve as a substrate upon which these and other empirical questions can be resolved. We think that the use of large-scale knowledge bases such as the WFB KB can help to close the gap between small canonical problems and the real world. Two notable efforts that have developed large-scale knowledge bases are the botany knowledge base project [2] and the CYC project [3].

The WFB KB is also an experiment in knowledge reuse. We have reused definitions and axioms from the Ontolingua library [4], as well as the Upper Level ontology that has been developed within the High Performance Knowledge Base (HPKB) project of the Defense Advanced Research Projects Agency. This reduces the task of designing equivalent representations and increases the likelihood of interoperation with other knowledge bases. Knowledge reuse experiments also provide a foundation for current research on establishing the best ways to structure and combine ontologies.

1.1 Architecture

The process of creating the WFB KB consisted of three stages. The first stage was knowledge extraction. During the knowledge extraction stage, knowledge was extracted from the SGML source using a custom parser written specifically for this purpose. The second stage was knowledge refinement. During the knowledge refinement stage, many of the terms identified in the first phase were organized and linked into taxonomies. The third stage was ontology construction. During the ontology construction phase, new classes and axioms were written that expand the definitions introduced during the knowledge refinement stage. The facts extracted during the first stage were also connected by lifting axioms to the ontology.

The first stage was largely automatic and consisted of using a parser to transliterate the SGML document into relational sentences in the knowledge interchange format (KIF) [9]. The parser generated two major types of information: sentences expressing the explicit facts from the WFB, and terminological information about the constants used in the sentences expressing the facts. For example, a WFB entry might express the explicit fact that "Oil makes up 90% of the exports of Saudi Arabia" and thereby imply the terminological information that "Oil is a trading commodity". The terms described in the terminological information are those that appear in the WFB.
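Concretely, the two output types for the Saudi Arabia example might look as follows. This is a sketch in Java (the paper does not state the parser's implementation language, and the predicate names wfb::exports-commodity and wfb::trading-commodity are illustrative assumptions, not the paper's actual vocabulary):

import java.util.ArrayList;
import java.util.List;

// Sketch of the parser's dual output: explicit KIF facts plus
// terminological sentences about the constants those facts use.
public class DualOutputSketch {
    static List<String> facts = new ArrayList<>();
    static List<String> terms = new ArrayList<>();

    // Emit the fact "Oil makes up 90% of the exports of Saudi Arabia"
    // together with the implied terminological information
    // "Oil is a trading commodity".
    static void emitExport(String country, String commodity, double percent) {
        facts.add("(wfb::exports-commodity " + country + " " + commodity
                + " (* " + percent + " %))");
        terms.add("(wfb::trading-commodity " + commodity + ")");
    }

    public static void main(String[] args) {
        emitExport("saudi-arabia", "oil", 90.0);
        facts.forEach(System.out::println);
        terms.forEach(System.out::println);
    }
}

Running this prints one explicit-fact sentence and one terminological sentence, mirroring the split described in the preceding paragraph.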
No attempt was made in this stage to translate these WFB terms into the terminology of existing ontologies.

The parser we used to perform this knowledge extraction also produced files that contained all of the data values from the SGML input file; an exceptions file that contained those irregular entries that could not be parsed; and a "source" file that contained the parsed entries in the following format:

(wfb::source-text <field> <country> [<subfield>] "<data>")

e.g.,

(wfb::source-text exports haiti partners "US 81%, Europe 12% (1993)")

The second stage of the knowledge extraction process consisted of producing an ontology for representing the implicit facts and was done by hand. The flat lists of terms of a particular type that the parser had generated (e.g., lists of natural resources and languages) were transformed into taxonomies. We also represented obvious functional connections between terms of different types, e.g., between a natural resource and an industry that exploits it, or a language and an ethnic entity that speaks it. We then organized and modularized these taxonomies, and connected the classes and relations in them to the corresponding super-classes and super-relations in the HPKB upper level ontology. Finally, we did the difficult and large task of designing ontologies that could be used to express the WFB facts in standard vocabularies and in a form that enables their effective use by general-purpose reasoners.

The third stage of the extraction process consisted of writing new axioms covering the terms introduced during refinement, and lifting axioms that transform the KIF sentences produced by the parser in the first stage into sentences expressed in these ontologies. Instead of transforming the parser output into a new format, we decided to keep the KIF files produced by the knowledge extraction process and connect them to the target OKBC-compliant [6] ontology through lifting axioms. The facts are kept in the format of the parser output. The queries are formulated in the terms of the target frame-language ontology. They are then reformulated into the terms of the facts file through the corresponding lifting axiom.

Figure 1: The diagram shows the flow of information during the extraction process. Solid black arrows indicate parsing. The block arrow from terms to taxonomies indicates the manual construction of taxonomies from flat lists of terms. Dashed arrows indicate runtime information flow.

2 Knowledge Extraction

Knowledge extraction from the WFB is possible because it is a highly structured document. The WFB is available in several different forms: as an HTML document, as a text document, as an interactive web service, and in SGML. Of these forms, the SGML provides both the most regular syntactic structure as well as the most semantic markup. Although the SGML version that we used is not the most current (we had access to the 1995 WFB in SGML), its advantages outweighed any disadvantages, so we chose it as the basis for our work. Our primary interest was the task of representing the knowledge in the WFB, rather than assuring the currency and accuracy of the information contained therein.

The information about countries is organized alphabetically. The facts about each country are structured into fields and, sometimes, subfields.
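As a rough illustration of this automatic first stage (the parser itself is not shown in the paper; the regular expression, the class name, and the field-name normalization below are assumptions made for the sketch, and the optional subfield is folded into the field name for brevity), the following Java fragment transliterates one SGML-style entry into a wfb::source-text sentence of the format shown earlier:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: transliterate one <field>...</field><data>...</data> entry
// into a (wfb::source-text <field> <country> "<data>") sentence.
public class SourceTextSketch {
    private static final Pattern ENTRY =
            Pattern.compile("<field>(.*?)</field>\\s*<data>(.*?)</data>");

    static String toSourceText(String country, String sgml) {
        Matcher m = ENTRY.matcher(sgml);
        if (!m.find()) {
            return null; // irregular entries would go to the exceptions file
        }
        String field = m.group(1).trim().toLowerCase().replace(' ', '-');
        return "(wfb::source-text " + field + " " + country
                + " \"" + m.group(2).trim() + "\")";
    }

    public static void main(String[] args) {
        System.out.println(toSourceText("haiti",
                "<field>Exports partners</field><data>US 81%, Europe 12% (1993)</data>"));
    }
}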
The fields and subfields are almost completely regular; that is, all countries have almost identical sets of fields and subfields.

To each field (and subfield, if it is present) corresponds a data value, which is a numeric value, a fragment of English text, or a list of data values. Unfortunately, the format is different for almost all fields, so the data values for each field must be parsed in a separate way. Furthermore, the format of data values is often quite irregular even within a given field, and the same type of information about different countries may be presented in different ways. A typical entry in the WFB, within the section about Italy, can look like this:

<field>Population growth rate</field><data>0.21%</data>

This is parsed into the KIF formula:

(wfb::population-growth-rate italy (* 0.21 %))

We tried to extract as much information as we could without involving serious natural language processing techniques. The amount and quality of information therefore depended on the regularity of the format of the corresponding data values in the input document. Those fields whose data values were unstructured English text, such as the descriptions of climate and terrain, and those fields whose data values were extremely irregular, were not parsed. The data values in such cases were simply left as English text fragments.

Even those fields that were structured and regular enough to be parsed often were not uniform throughout the input file. If the majority of the data values for a particular field displayed a particular structure, and only several entries were constructed in a different way, these abnormal entries were left unparsed and were redirected to a special "exceptions" file. Many of these records just say that the information is not available.

When the information in the data values described an object with a relation to a country (e.g., a city), or a class of such objects (e.g., coal deposits), the parser created a symbolic constant for the object or class. That is, the symbolic constant, as opposed to the string, was used in the corresponding KIF fact, and the appropriate taxonomic information was placed in the terms file. Extra attention was necessary to prevent name clashes. For example, the word "textile" appears both as a commodity and as an industry. The method of solving name clashes depended on the context in which the objects appeared. For example, if two countries both had a natural resource called "coal", it was represented by a single constant. When two countries had a city called Santiago, however, two distinct constants were created.

Another problem of object creation was the existence of synonyms. Some objects in the WFB are referred to by several distinct names (e.g., US, USA, United States, and United States of America), and others are close semantic neighbors (e.g., wolfram and tungsten). Most of the synonymy relationships cannot be established at the parsing stage. As a consequence, the problem of synonyms and related problems are resolved during the knowledge refinement stage.

In the case that a data value specifies a list of items (e.g., for a nation's industries or natural resources), it may be difficult to decompose the list into its constituents. For example, the value "small deposits of coal and natural gas" means "small deposits of coal" and "small deposits of natural gas", whereas "coal and natural gas" means "coal" and "natural gas".
Such problematic cases introduce a single constant, which is decomposed manually during the knowledge refinement stage.

3 Knowledge Refinement

The raw facts contained in the WFB can be quite useful by themselves; however, they also provide an opportunity for much more powerful reasoning. The knowledge extraction process produces terms organized into otherwise unstructured groups. For example, the group of industries includes the terms manufacturing and automobile manufacturing. If a group contains individuals, then the group will correspond to a class in the target ontology (e.g., cities); otherwise it will correspond to a metaclass (e.g., industries). Clearly, if we brought outside knowledge to bear on these term groups, we could define a richer structure that would support stronger inference.

While there are thousands of terms in the WFB, the number of terms in each group is much more limited. A typical group will include fewer than 500 terms. This makes it possible to organize the terms by hand. This is the primary task of knowledge refinement. The knowledge refinement stage, however, involves three subtasks. First, synonyms are eliminated through the use of simple rewrite rules that establish a preferred term in each set of synonyms (e.g., (preferred-term oil petroleum)). Second, the remaining unique terms are organized into a taxonomy. This allows a system to answer more concrete queries (e.g., "What fossil fuels are there in Saudi Arabia?" as opposed to "What natural resources are there in Saudi Arabia?"). It also allows a system to answer more general queries (e.g., "Are there any fossil fuels in Saudi Arabia?" as opposed to "Is there any petroleum in Saudi Arabia?"). Finally, the list elements that the parser was unable to decompose are manually split.

3.1 Taxonomy construction

The parser emits several types of objects for which meaningful taxonomies can be built. They are industries, commodities, natural resources, languages, religions, and ethnic groups. The information about how to taxonomize these objects is not in the WFB; at this stage we relied exclusively on outside knowledge.

Many of the classes in the WFB are non-primitive. That is, they are defined by sufficient conditions. The obvious reason for this is that these classes were originally introduced by people; they are not "natural" classes. When the authors of the WFB associated these classes with countries, they based their decisions on some implicit criteria. For example, the grounds for the statement that India's industrial sectors include machinery production is probably that there are a significant number of businesses in India that can be described as producing machinery. These rules are sufficient conditions of membership in the class of machinery-production businesses. Therefore, the machinery industrial sector, viewed as a class of businesses, is a non-primitive class. Many of the classes in the WFB can be naturally organized into taxonomies according to multiple attributes. Oil resources, for example, can be classified into offshore and inland deposits, according to the size and significance of the deposit, or according to how fully they have been exploited. One solution is to classify objects along several orthogonal dimensions, introduce classes for each such dimension, and then subclass each WFB class from the appropriate dimensions.
For example, the WFB class "small unexploited deposits of iron" is a subclass of iron deposits, of low-abundance natural resources, and of unexploited natural resources. Another way to accomplish this would be to introduce attributes for these dimensions. For example, we could introduce the attributes Abundance-Of-Natural-Resource and Is-Exploited, and assign "small unexploited deposits of iron" the values Low-Abundance and False correspondingly. Note that the two approaches can be made semantically equivalent by introducing axioms such as:

(<=> (low-abundance-natural-resource ?x)
     (abundance-of-natural-resource ?x low-abundance))

3.2 Ontology development goals and criteria

Taking into account that taxonomies of these types of objects can be valuable in their own right, we decided to distinguish between two separate tasks: taxonomizing the terms in the WFB, and building a useful ontology of, say, natural resources. The goal was to do both. However, many of the terms in the WFB are not useful enough to be present in a general-purpose ontology. Some of them are direct intersections of other classes ("small deposits of coal"), and some terms are just too bizarre ("two small steel rolling mills" as an industrial sector of Saudi Arabia). To achieve both goals, the result of this stage of knowledge base building was subdivided into two smaller subontologies for all taxonomies constructed. The terms that were present in the WFB but had no particular value as classes in a general-purpose ontology were exiled to one of the subontologies. Each of these terms is either an alias for a term in the main ontology, or a subclass that was considered too narrow to be retained (e.g., "Coastal-Climate-And-Soils-Allow-For-Important-Tea-And-Citrus-Growth" was abstracted to "fertile soil").

If the reasoner that is using the WFB has an effective means of dealing with equalities, then the simplest solution is to assert such redundant terms equal to their more useful synonyms. However, with our reasoning tools we used simple rewrite rules: before the knowledge base is imported into the theorem prover, the facts file undergoes a separate parsing step in which the unwanted synonyms are replaced with the appropriate preferred terms.

Some terms have a particular relation to terms in a different WFB taxonomy; for example, hydropower potential as a natural resource is used by the energy industry, and the refined product of bauxite is aluminum. We tried to capture these connections.

3.3 Criteria for introducing new terms

Sometimes several classes present in the WFB have an evident common superclass that fits nicely into the general taxonomy but does not occur explicitly in the original document. In these cases, we introduced the missing class. An example is fossil fuels as a class of natural resources, with subclasses petroleum, coal, natural gas, and shale oil.

4 Ontology Construction

The base representation used by the WFB, and retained by the parser, is not always ontologically well motivated, well suited for automated inference, or compatible with existing ontologies, such as the HPKB Upper Level ontology or the ontologies found in the Ontolingua ontology library [4]. For this reason, we shift the representation used in the WFB into one that is compatible with existing ontologies and is well suited for inference. It would be possible to effect this shift within the parser itself. We chose, however, not to burden the parser with the task of reformulating the semantic representation.
Instead, we use lifting axioms, which lift the facts from the base representation into our target ontology. The target representation is strongly object oriented and includes definitions from the HPKB Upper Level ontology as well as from the Ontolingua ontology library. The use of lifting axioms to shift representations is common in knowledge-rich approaches to information integration.

Consider the following typical simple lifting axiom:

(=> (and (wfb::area ?georef total-area (* ?n sq-km))
         (number ?n))
    (has-total-area ?georef (* ?n square-kilometer)))

There are several important properties of this axiom. First, it shifts from a relational model, in which area is a three-place predicate, to an object model. Second, the predicate vocabulary in the base facts is distinct from the predicate vocabulary used elsewhere in the knowledge base. Predicates used in the base facts are prefixed with "wfb::". Third, the object vocabulary is actually retained during the representation shift. In this case, the ?georef object will be present in both the base and target vocabularies. Finally, the axiom uses a representation of quantities and units from the Standard-Units ontology [8] from the Ontolingua library. In this ontology, mathematical operations, such as multiplication, are polymorphically extended to apply to units, which have dimensionality, as well as numbers. Thus, (* 3 meter) is distinct from (* 3 liter), but identical to (* 0.003 kilometer). An additional function, magnitude, may be applied to a quantity and a unit to compute the numeric magnitude of that quantity given the unit.

The following lifting axiom is somewhat more complex:

(=> (and (wfb::age-structure-by-sex ?georef 0-14-years female ?number)
         (number ?number)
         (population ?georef ?pop))
    (exists ?af
      (and (fraction-of-country-population ?af)
           (fraction-of ?af ?pop)
           (fraction-defining-set ?af age-0-14)
           (fraction-defining-set ?af female-person)
           (:cardinality fraction-defining-set ?af 2)
           (set-cardinality ?af ?number))))

The WFB decomposes populations by both age and gender. This axiom lifts the base predicate wfb::age-structure-by-sex into the object-oriented model used in the target ontology. In order to accomplish this, it must introduce a new object, ?af, that represents the subset of the population of a country. The fraction vocabulary introduced in Section 5.2 is used for this purpose.

The classes and relations in the WFB ontology are hooked up to the HPKB Upper Level Ontology. This is a fairly large ontology (on the order of 2000 classes and 800 relations). It is used as a common representation within the High Performance Knowledge Base project. Only a small part of the Upper Level ontology was used in the WFB KB (on the order of 20 terms out of more than 3000). This is not altogether unexpected, because the Upper Level ontology covers a much broader range than the WFB. Consequently, most of its classes and relations were irrelevant for our purposes. The other ontology that we used was the KSL Standard-Units ontology for the representation of physical quantities and units of measure [8].

5 Representation issues

5.1 Time

The representation of time in the WFB KB is a topic deserving careful attention. The WFB itself is temporal in nature. A new release of the WFB is made each year. In addition to changes in format, one anticipates changes in the values of many of the fields. For example, the GDP, population, birth rate, and many of the quantitative measures change on an annual basis. Other facts may also change on a regular, but less frequent, basis.
For example, the membership of the UN Security Council changes every two years (except for the five permanent members). Other facts change irregularly and with less frequency. For example, the flag of the United States changes when new states are added. Still others change with even less frequency. For example, Greece has bordered on the Mediterranean Sea, and Egypt has produced textiles, since ancient times.

There are several options for representing the temporal aspects of the WFB KB contents. The simplest is to assume that the facts in the KB hold over the year with which the WFB is associated and to explicitly assert that each fact holds during that year. For example, every fact in the 1995 WFB might be asserted to hold throughout the year 1995. Even this simple approach admits several possible realizations, such as (holds-in 1995 (capital-of France Paris)) using a modal operator holds-in, or (capital-of 1995 France Paris) using an explicit temporal argument to each predicate. These temporal assertions, of course, require an accompanying set of axioms to relate the truth of a proposition over the year to its truth during the year. One could, for example, state that each predicate holds densely throughout the time interval:

(=> (and (capital-of ?t1 ?x ?y) (subinterval-of ?t1 ?t2))
    (capital-of ?t2 ?x ?y))

Although this sort of axiom holds for the relation capital-of, it does not, in general, hold for many of the predicates within the WFB.

• Many predicates apply to an entire year as an aggregate. For example, the GDP of Japan during 1995 is not the same as its GDP during the month of July. Other measures, such as birth rate, use mid-year estimates to compute a full-year measure, and are not meaningfully applied to any interval other than the full year.
• Some values are not measured annually. For such predicates, the value listed in the WFB may actually be the value for a previous year. The old value is retained because it is the best available. For example, the budget figures for Austria are stamped with the year 1993.
• Other values are computed based on estimates from previous years. GDP growth, for example, is the difference between the GDP for the current year and the GDP for the previous year.

Thus, it is simply not correct to associate each fact with the temporal interval associated with a particular edition of the WFB. It is more correct to observe that each fact is true in the context of a particular edition. We can make this notion of context explicit by using a context logic and asserting each fact within a context associated with a specific edition of the WFB. For example: (is-true (wfb 1995) (capital-of France Paris)). In the case that all of the facts are true in a single context and there are no assertions explicitly relating different contexts within the WFB, the context assertion can effectively be elided. This is the choice that has been made in Version 0.8 of the WFB KB. Another motivation for this choice is that for many problem-solving scenarios, the time interval under consideration is much smaller than a year. In such scenarios, it is desirable to treat the content of the WFB KB as fixed or atemporal. It is primarily for problem-solving scenarios in which the temporal interval under consideration spans multiple years that the explicit representation of time is valuable.

The next step in the WFB KB development, however, is to make the contextual and temporal properties of each predicate explicit.
By combining the contextual information with properties at the level of predicates, we intend to support a reformulation process in which the WFB KB contents can be rewritten using either modal operators, explicit temporal arguments and axioms, or atemporally as in Version 0.8.

5.2 Sets and Quantities

When we examine the information in the WFB, we observe that many of the data values measure properties of sets of objects associated with a country (e.g., the number of airports). Furthermore, many of these sets are partitioned into subsets. Sometimes there are several partitions according to different criteria. For example, the airports are partitioned into paved and unpaved, as well as into three classes according to runway length. Another example is a country's exports, which are partitioned into classes based on commodities, as well as classes based on the partner country.

Representing the relationships between objects, the sets associated with these objects, and the properties of these sets is critical in the WFB. Cardinality is one example of a property associated with some of these sets. We can identify a class of such measures, most of which are additive.

The general case seems to be when there is some set associated with the country. The set has some additive quantity related to it, which we call the additive set function. In the general case it is a weighted sum over the set's elements. For example, for population statistics, the weights are all equal to one and the additive set function is the cardinality of the set, i.e., the number of people in the country. For a country's exports, the additive set function is the total dollar amount of sales. If the exports are viewed as a set of business transactions, then the weight of an element will be the dollar amount of a particular transaction.
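In symbols (the notation is introduced here for clarity; the paper states this in prose): if S is a set associated with a country and w(e) is the weight of element e, the additive set function is

f(S) = \sum_{e \in S} w(e)

With w(e) = 1 for all e, f(S) reduces to the cardinality |S|, as for population counts; with w(e) equal to the dollar amount of transaction e, f(S) gives the total dollar value of the country's exports.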
OWL Ontologies and SWRL Rules Applied to Energy Management

Ana Rosselló-Busquet, Lukasz J. Brewka, José Soler and Lars Dittmann, IEEE members
Networks Technology & Service Platforms group, Department of Photonics Engineering,
Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
{aros,ljbr,joss,ladit}@fotonik.dtu.dk

Abstract—Energy consumption has increased considerably in the last years. How to reduce energy consumption and make it more efficient in home environments has become of great interest for researchers. This could be achieved by introducing a Home Energy Management System (HEMS) into user residences. This system might allow the user to control the devices in the home network through an interface and apply energy management strategies to reduce and optimize their consumption. Furthermore, the number of devices and appliances found in users' residences is increasing, and these devices are usually manufactured by different producers and can use different communication technologies. This paper tackles this problem: a home gateway which integrates different devices with a HEMS based on a set of semantic-web tools, providing a common control interface and energy management strategies.

Index Terms—Home gateway, Home energy management systems, ontology, OSGi, rule engine, electrical grid, reasoner

I. INTRODUCTION

A home network is a residential local area network (LAN) containing all the devices in the home environment; it may also provide connection to the internet or other networks. This network can be composed of different types of devices which provide a variety of functionalities and applications. In addition, these devices can be manufactured by different producers and can use different communication technologies. Therefore, the main challenge in home networks is the variety of technologies, providing different communication methods, as well as the diversity of producers, providing different types of devices and services.

A Home Energy Management System (HEMS) [1] is a system from which the user can control the devices in the home network through an interface and apply energy management strategies to reduce and optimize the devices' energy consumption. The home gateway presented herein accesses a knowledge base data repository from which device capabilities can be obtained. Furthermore, based on the knowledge implemented by an ontology, the home devices are classified according to their functionalities and capabilities. Moreover, the ontology also contains a classification of the different commands and statuses a home device can have. This offers interoperability between the different devices at the service level. Energy management strategies can be performed by applying a set of rules which are based on home devices' consumption, information from the electrical grid, and the user's preferences. In addition, the home gateway offers a graphical user interface (GUI) from which the user can interact with all the devices in the home network. The users can also define rules to apply energy management strategies to minimize the energy consumption of their home devices.

In this article, each logical component contained in the developed home gateway is described. The logical component containing the knowledge base data repository is explained in detail. This paper gives implementation details of how the ontology is accessed and queried to obtain information about the devices contained in the home network. Furthermore, rules are applied to create the energy management strategies, which makes this home gateway suitable for HEMS. The rest of this
article is organized as follows: section II introduces relevant related work, while section III briefly describes the OSGi framework. Section IV provides an overview of the home gateway architecture, and section V gives a detailed description of the logical component containing the knowledge base data repository. Section VI describes the implementation of SQWRL queries, and section VII gives details about the integration of SWRL into the system to apply energy management strategies. Finally, section VIII provides final remarks and conclusions.

II. RELATED WORK

The implemented home gateway uses the Open Service Gateway initiative (OSGi) [2] framework running over Equinox, which is an Eclipse project that provides a certified implementation of OSGi. In addition, OWL ontologies [3] have been used to create the knowledge base data repository for the implementation, by incorporating Protégé-OWL API 3.4.4 [4] mechanisms into our source code. OSGi and ontologies have been used previously to create home gateways. An example of this is the Domotic OSGi Gateway (DOG) [5] by Politecnico di Torino. The DogOnt [6] ontology is used there to implement the knowledge base data repository. This ontology is reused in our implementation, as it provides a good classification of the devices that can be found in a home environment. However, DOG is centered on domotics, while this implementation focuses on energy management. This is achieved by offering the users the opportunity to create their own energy management system by creating, modifying, and deleting rules which may reduce the total electricity consumption.

Another example of a home gateway developed using OSGi can be found in [7]. However, neither ontologies nor rules are used there; therefore, energy management strategies cannot be applied.

The authors in [8] describe the IntelliDomos learning model, which is an ontology-based system able to control a home automation system and to learn users' periodic patterns when using home appliances. However, the authors do not describe how they deal with the fact that an ontology cannot be dynamically modified [9], which we will explain in more detail in section VII.

III. OSGi

The OSGi Framework is an open service platform for the delivery and control of different Java-based applications, called bundles. Bundles are JAR files containing extra metadata which makes them fully self-describing. Bundles consist of packages, which contain Java classes, and a MANIFEST.MF file, which contains the metadata. Packages are by default private; however, they can be used by other bundles if they are exported by the bundle containing them. Exporting a package makes it visible to other bundles, but these other bundles must import the package in order to use it. Furthermore, the OSGi framework allows bundles to register services so that other bundles can use them. A service is defined using a public Java interface which must reside in an exported package. A bundle then implements the service interface, instantiates it, and registers the instance using the OSGi Service Registry under the service interface name. The class that implements the service is usually private and not available to the other bundles. However, by using the Service Registry and importing the service interface, other bundles can look up and consume the service.

IV. HOME GATEWAY ARCHITECTURE

The home gateway has been developed using OSGi-Equinox to control the devices of a home network and to apply energy management strategies. The DogOnt ontology is used to create the
initial knowledge base data repository of our home gateway. However, this article will not describe this ontology, as an extensive description can be found in [6]. Furthermore, the home gateway consists of the following bundles, which are shown in Fig. 1:

• Knowledge Base Bundle: This bundle handles all the interactions with the knowledge base data repository and rule engine. An ontology is used to implement the home gateway's knowledge base data repository. Information about the devices, such as their capabilities, status, and network parameters, among others, can be obtained from the knowledge base data repository by using queries. In addition, this bundle contains a rule engine which executes rules in order to apply the chosen energy management strategy.
• Interface Bundle: This bundle is used to create a graphical user interface (GUI), from which the users can communicate with the devices and obtain information about them. Moreover, users with advanced knowledge about the underlying ontology can introduce more devices into the knowledge base data repository and modify the rules in the system.
• Network n Bundle: As mentioned before, more than one communication technology, for example power line or wireless, may be found in the home network. Each of these n bundles will handle the communication with the devices in subnetwork n of the home network. These bundles will send messages to the devices and forward notification messages from the devices, behaving as technology bridges. For example, a notification message could contain information about a device changing status. These bundles have not been implemented in the work herein described, as the aim of this paper is to test the bundle design and the interaction with the ontology. However, a bundle to support communication with KNX networks is under construction.
• Networks Manager Bundle: This bundle handles the communication with the different Network bundles. When a message has to be sent to a device, this bundle will receive a notification and forward the message to the Network bundle containing the recipient device.
• Manager Bundle: This is the central bundle, which handles the interaction between the different bundles that make up the implemented home gateway.
• Network Emulator Bundle: As currently no devices are connected to the home gateway, this bundle emulates the network and its devices, which allows us to test the home gateway implementation and its specific application.

Fig. 1. Home Gateway Architecture
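The bundles above cooperate through the service mechanism described in section III. As a minimal sketch of that mechanism (the DeviceControl interface and its implementation are hypothetical names chosen for illustration, not code from the paper; only the org.osgi.framework calls are the actual OSGi API), a bundle activator registering a service could look like this:

import java.util.Hashtable;
import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;
import org.osgi.framework.ServiceRegistration;

// Hypothetical service interface; in a real bundle it would live in an
// exported package so that other bundles can import it.
interface DeviceControl {
    void sendCommand(String deviceName, String command);
}

// Private implementation class, not visible to other bundles.
class DeviceControlImpl implements DeviceControl {
    public void sendCommand(String deviceName, String command) {
        System.out.println("Sending " + command + " to " + deviceName);
    }
}

public class Activator implements BundleActivator {
    private ServiceRegistration<?> registration;

    public void start(BundleContext context) {
        // Register the instance in the OSGi Service Registry under the
        // service interface name, as described in section III.
        registration = context.registerService(DeviceControl.class.getName(),
                new DeviceControlImpl(), new Hashtable<String, Object>());
    }

    public void stop(BundleContext context) {
        registration.unregister();
    }
}

A consuming bundle would then look the service up through the Service Registry (via BundleContext.getServiceReference) after importing the package that exports DeviceControl.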
More information about the home gateway architecture can be found in [10], where each bundle and the interactions between bundles are explained in more detail.

V. KNOWLEDGE BASE BUNDLE

The Knowledge Base bundle is the bundle in charge of the knowledge base data repository and rule engine. It manages the interaction of the knowledge base data repository and the rule engine with the rest of the architecture.

An ontology has been used as the knowledge base data repository. This ontology contains the information about the different devices that can be found in the home network, including the devices' features and functionalities. The ontology used is DogOnt. For each device found in the home network, an OWLIndividual is created. This OWLIndividual is an instance of the corresponding class in the ontology and represents the actual device found in the home network. Furthermore, OWLIndividuals representing the device's functionalities and status are also created and linked by object properties to the OWLIndividual of the device, as seen in Fig. 2. Information about devices is accessed by using SQWRL (Semantic Query-Enhanced Web Rule Language) [11] queries, which query the knowledge base data repository to obtain the necessary information. Users are not able to access or modify the ontology or the queries directly. However, a user can obtain information about the devices and their capabilities and introduce new OWLIndividuals into the knowledge base data repository through the GUI.

In addition, in this home gateway, rules can be defined to reduce the energy consumption of the devices, thereby creating a HEMS. A rule will be triggered when certain conditions are satisfied and will apply the energy saving strategy defined. The Jess Rule Engine [12] is used to execute rules. The rules are written in SWRL (Semantic Web Rule Language) [13]. The user is able to create, modify, and delete SWRL rules in the system through the GUI. In order for users to create valid rules, they need to have some knowledge about the ontology and the SWRL language.

For accessing the OWL files and creating the knowledge base data repository, Protégé-OWL API 3.4.4 [4] is used. This API provides methods to load the ontology from an OWL file, modify the ontology, and get classes and OWLIndividuals. In addition, the Protégé-OWL API allows running SQWRL queries and defining SWRL rules that are executed by using the Jess Rule Engine. This API is integrated into this bundle to enable all these capabilities.

A. Knowledge Base Operator Service

The Knowledge Base Operator Service is offered by the Knowledge Base bundle. This bundle exports the package containing the Java interface of this service. Moreover, it implements the service interface, instantiates it, and registers the instance using the OSGi Service Registry under the service interface name. The Knowledge Base Operator service offers access to the knowledge base data repository, allowing the subscribed bundles to query and obtain information about the devices and the energy management rules. This service is only used by the Manager bundle, as an intermediate step to query the knowledge base data repository; the Knowledge Base Operator service directly accesses and queries the knowledge base data repository, while the Manager bundle is used by the bundles that need to obtain that information.
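To ground the description above, a sketch of how this bundle might load the ontology and create an OWLIndividual for a device follows. The method names follow the Protégé-OWL 3.4 API as documented in its Programmer's Guide [4]; the file URI, the use of the DogOnt classes Lamp and OnOffState, and the individual names are assumptions made for the example (real DogOnt names also carry namespace prefixes, omitted here for brevity):

import edu.stanford.smi.protegex.owl.ProtegeOWL;
import edu.stanford.smi.protegex.owl.jena.JenaOWLModel;
import edu.stanford.smi.protegex.owl.model.OWLIndividual;
import edu.stanford.smi.protegex.owl.model.OWLNamedClass;

public class KnowledgeBaseLoader {
    public static void main(String[] args) throws Exception {
        // Load the DogOnt-based knowledge base from an OWL file
        // (the URI is a placeholder).
        JenaOWLModel owlModel =
                ProtegeOWL.createJenaOWLModelFromURI("file:dogont.owl");

        // Look up the device class and create an OWLIndividual for a
        // lamp found in the home network.
        OWLNamedClass lampClass = owlModel.getOWLNamedClass("Lamp");
        OWLIndividual lamp = lampClass.createOWLIndividual("Lamp_Bedroom");

        // Create a State individual and link it to the device through the
        // hasState object property; Functionality and Command individuals
        // would be linked in the same way.
        OWLIndividual state = owlModel.getOWLNamedClass("OnOffState")
                .createOWLIndividual("Lamp_Bedroom_State");
        lamp.addPropertyValue(owlModel.getOWLObjectProperty("hasState"), state);
    }
}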
This separation is made to avoid interoperability problems if the Knowledge Base bundle is modified, and it facilitates the modularity of the system.

VI. QUERYING THE KNOWLEDGE BASE DATA REPOSITORY

The Protégé-OWL API also provides methods to load, create, and execute SQWRL queries, which are used to obtain information from the knowledge base data repository. SQWRL is based on the SWRL language for querying OWL ontologies and provides SQL-like operations to retrieve knowledge from them. The SQWRLQueryAPI [14] is used to run SQWRL queries and retrieve their results. This API is available in the Protégé-OWL API. However, in order to run SQWRL queries, the Jess rule engine is needed. Jess is not open source and must be downloaded separately. Furthermore, the Jess rule engine requires a license, which is free for academic use.

Fig. 2. System Architecture

Fig. 2 shows, in section a, part of the DogOnt ontology and, in section b, a partial sample lamp model in DogOnt. The Controllable class is the class representing devices or home appliances that can be controlled. Any OWLIndividual of the class Controllable is linked to another OWLIndividual of the class Functionality with the object property hasFunctionality. The Functionality class is used to represent the functionalities of the device. Furthermore, to select between the different functionalities of a device, a command is needed, which is represented by the Command class. Functionality OWLIndividuals are linked to Command OWLIndividuals with the object property hasCommand. RealCommandName is a data property which links a Command OWLIndividual to a human-readable string. In addition, Controllable OWLIndividuals are also linked to State OWLIndividuals by hasState and to BuildingEnvironment by isIn. The State class is used to represent the state of the device and contains a data property, value, which holds the actual state of the device. The BuildingEnvironment class is the representation of domestic environments, such as rooms in the user's premises.

For instance, consider that the knowledge base data repository contains a device Lamp_Bedroom, which is located in the bedroom and has only On/Off functionality. To obtain the state of Lamp_Bedroom at a certain point in time, without sending a command to it, the following SQWRL query can be used (to simplify, DogOnt namespaces have been omitted):

hasState(Lamp_Bedroom, ?Lstate) ∧ value(?Lstate, ?value) → sqwrl:select(?value)

This query searches the knowledge base data repository to find the State OWLIndividual linked to the Lamp_Bedroom device and returns the string stored in the data property value.

SQWRL queries are also used to retrieve other information from the knowledge base data repository. For instance, the following query is used to retrieve the valid commands that can be sent to the device Lamp_Bedroom:

hasFunctionality(Lamp_Bedroom, ?Lfunc) ∧ hasCommand(?Lfunc, ?Lcmd) ∧ realCommandName(?Lcmd, ?LcmdName) → sqwrl:select(?LcmdName)

This query will return two results: one with the value On and one with the value Off. Depending on how many commands the device has, this query will return a different number of results. Furthermore, by changing Lamp_Bedroom to ?device, the query will return all the possible commands for all the devices found in the knowledge base data repository. This query is used by the
GUI when it is initialized, as all the commands are displayed in a combo box.

VII. RULE ENGINE

SWRL (Semantic Web Rule Language) is based on a combination of OWL-DL and OWL-Lite, which are sublanguages of OWL. SWRL can be used to reason about OWLIndividuals by using OWL classes and properties as terms. SWRL is therefore used to create the rules which will be used to achieve energy management strategies.

To implement energy management strategies, a rule engine is used. The user can create, modify, and delete rules through the GUI. The SWRLRuleEngineAPI [15], [16] is provided in the Protégé-OWL API for creating and manipulating SWRL rule engines. The Protégé-OWL API supports inference with SWRL rules using the Jess Rule Engine. The SWRLJessBridge [17] is also provided in the Protégé-OWL API to allow the interaction between the knowledge base data repository, the SWRL rules, and the Jess rule engine. SWRLJessBridge is used to perform the syntax transformation from OWL syntax to Jess syntax. Furthermore, SWRLJessBridge calls the Jess Rule Engine, which performs Jess reasoning and returns assertion results to the SWRL bridge. Even though SWRLJessBridge is provided in the Protégé-OWL API, the Jess rule engine must be downloaded separately, and an academic license must be obtained in order for it to work.

SWRL rules have a drawback for this home gateway implementation. The state of the devices is saved in the knowledge base data repository by using OWLIndividuals of the State class. As stated in [9], "SWRL rules cannot be used to modify existing information in an ontology". For example, assume we have a rule that indicates that if the light intensity in a room is higher than 50% of the capacity of the sensor, the lamp in that room should be turned off (to simplify, the DogOnt namespaces have been omitted):

BuildingEnvironment(?room) ∧ Lamp(?lamp) ∧ isIn(?room, ?lamp) ∧ hasState(?lamp, ?Lstate) ∧ value(?Lstate, ?value) ∧ swrlb:stringEqualsIgnoreCase(?value, "On") ∧ LightSensor(?sensor) ∧ isIn(?room, ?sensor) ∧ hasState(?sensor, ?Sstate) ∧ value(?Sstate, ?intens) ∧ swrlb:greaterThan(?intens, 50) → value(?Lstate, "Off")

Fig. 3. Rules Window

This rule will add the value "Off" to the value property of the lamp's state. It will not change the existing value of that property, so a successful firing of this rule will result in the property value having two values. In order to solve this problem, SWRLBuiltInBridge [18], contained in the Protégé-OWL API, is used. SWRLBuiltInBridge provides a mechanism that allows the use of user-defined methods in rules. These methods are called built-ins and can be seen as functions of the rule system. These user built-ins are implemented in a Java class and dynamically loaded into the rule engine, which can then invoke them when necessary. In order to create user-defined built-ins, first an ontology containing the definitions of these built-ins must be created and imported into the Java implementation. The package name should end with the name of the ontology defining the built-ins. Additionally, the SWRLBuiltInLibraryImpl class must extend the abstract base class AbstractSWRLBuiltInLibrary. The user built-in definitions for our implementation are contained in owlmod.owl and implemented in the Java class SWRLBuiltInLibraryImpl contained in edu.stanford.smi.protegex.owl.swrl.bridge.builtins.owlMod, which must be included in the Protégé-OWL API. These user built-ins can be used in SWRL rules by calling them using the owlmod: prefix followed by the name of the method found in the Java class. This class will also import the knowledge base data repository in order to be able to change the
information about the devices. In order for the above rule to work, value(?Lstate, "Off") is replaced by owlmod:changestate(?Lstate, "Off") (a minimal skeleton of such a built-in library class is sketched after the reference list below).

All rules will be evaluated every time the knowledge base data repository is changed. For instance, if the home gateway receives a notification of a sensor sending a status update, the Jess rule engine will evaluate and trigger the necessary rules.

The GUI offers the user the possibility to view the system rules and to create new rules, modify them, or delete them at any moment. This window is shown in Fig. 3. Furthermore, when the user clicks on create or modify, a new window opens where the chosen rule, in the case of pressing modify, will be displayed, as shown in Fig. 4. In the case of pressing create, the window will be empty and the user can introduce the new rule.

Fig. 4. Modify Window

If the rule the user introduces is invalid, an error message will be displayed, giving the user details about the error in the rule. This method of creating energy management strategies by using rules gives users the freedom to define different rules, adapting the system to the users' requirements and personal preferences.

VIII. CONCLUSION

In this article we have presented a home gateway developed using the OSGi-Equinox framework and the Protégé-OWL API. An overview of the bundles composing the home gateway has been given. Furthermore, SQWRL has been used to create queries to extract information from the knowledge base data repository. On the other hand, SWRL is used for rules, which can be created, deleted, and modified to apply energy management strategies according to the users' priorities. SWRLBuiltInBridge has been used to create our own built-ins to enable rules to modify the knowledge base data repository when necessary. This home gateway provides the means to create energy management strategies, which makes it suitable for HEMS.

REFERENCES

[1] A. Rosselló-Busquet, Georgios, J. Soler, and L. Dittmann, "Towards Efficient Energy Management: Defining HEMS, AMI and Smart Grid Objectives," in The Tenth International Conference on Networks, Jan. 2011.
[2] OSGi Alliance, "OSGi Service Platform Core Specification Release 4," Accessed Dec. 2010. [Online].
[3] W3C OWL Working Group, "OWL 2 Web Ontology Language Document Overview," Accessed Dec. 2010. [Online]. Available: /TR/owl2-overview/
[4] H. Knublauch, "Protege-OWL API Programmer's Guide," Accessed Dec. 2010. [Online]. Available: /wiki/ProtegeOWL API Programmers Guide
[5] D. Bonino, E. Castellina, and F. Corno, "The DOG gateway: enabling ontology-based intelligent domotic environments," Consumer Electronics, IEEE Transactions on, November 2008.
[6] D. Bonino and F. Corno, "DogOnt - Ontology Modeling for Intelligent Domotic Environments," The Semantic Web - ISWC 2008, 2008.
[7] S. Ok and H. Park, "Implementation of initial provisioning function for home gateway based on open service gateway initiative platform," in Advanced Communication Technology, 2006. ICACT 2006. The 8th International Conference, Feb. 2006.
[8] P. Valiente-Rocha and A. Lozano-Tello, "Ontology and SWRL-Based Learning Model for Home Automation Controlling," in Ambient Intelligence and Future Trends - International Symposium on Ambient Intelligence (ISAmI), ser. Advances in Soft Computing. Springer Berlin / Heidelberg, 2010.
[9] M. J. O'Connor, "SWRL Language FAQ," Accessed Dec. 2010. [Online]. Available: /cgi-bin/wiki.pl?SWRLLanguageFAQ
[10] A. Rosselló-Busquet, J. Soler, and L. Dittmann, "A Novel Home Energy Management System Architecture," UKSim 13th International Conference on Computer Modelling and Simulation, 2011.
[11] M. J. O'Connor, "SQWRL," Accessed Dec. 2010. [Online]. Available: /cgi-bin/wiki.pl?SQWRL
[12] Sandia National Laboratories, "Jess Rule Engine," Accessed Dec. 2010. [Online]. Available: /
[13] I. Horrocks, P. F. Patel-Schneider, H. Boley, S. Tabet, B. Grosof, and M. Dean, "SWRL: A Semantic Web Rule Language Combining OWL and RuleML," W3C Member Submission, 2004.
[14] M. J. O'Connor, "SQWRLQueryAPI," Accessed Dec. 2010. [Online]. Available: /cgi-bin/wiki.pl?SQWRLQueryAPI
[15] ——, "SWRLRuleEngineAPI," Accessed Dec. 2010. [Online]. Available: /cgi-bin/wiki.pl?SWRLRuleEngineAPI
[16] ——, "SWRLRuleEngineBridgeFAQ," Accessed Dec. 2010. [Online]. Available: /cgi-bin/wiki.pl?SWRLRuleEngineBridgeFAQ
[17] ——, "SWRLJessBridge," Accessed Dec. 2010. [Online]. Available: /cgi-bin/wiki.pl?SWRLJessBridge
[18] ——, "SWRLBuiltInBridge," Accessed Dec. 2010. [Online]. Available: /cgi-bin/wiki.pl?SWRLBuiltInBridge
A Practical Guide To Building OWL Ontologies Using Protégé 4 and CO-ODE Tools
Edition 1.3

Matthew Horridge

Contributors:
v1.0 Holger Knublauch, Alan Rector, Robert Stevens, Chris Wroe
v1.1 Simon Jupp, Georgina Moulton, Robert Stevens
v1.2 Nick Drummond, Simon Jupp, Georgina Moulton, Robert Stevens
v1.3 Sebastian Brandt

The University Of Manchester
Copyright © The University Of Manchester
March 24, 2011

Contents
1 Introduction
  1.1 Conventions
2 Requirements
3 What are OWL Ontologies?
  3.1 Components of OWL Ontologies
    3.1.1 Individuals
    3.1.2 Properties
    3.1.3 Classes
4 Building An OWL Ontology
  4.1 Named Classes
  4.2 Disjoint Classes
  4.3 Using Create Class Hierarchy To Create Classes
  4.4 OWL Properties
  4.5 Inverse Properties
  4.6 OWL Object Property Characteristics
    4.6.1 Functional Properties
    4.6.2 Inverse Functional Properties
    4.6.3 Transitive Properties
    4.6.4 Symmetric Properties
    4.6.5 Asymmetric properties
    4.6.6 Reflexive properties
    4.6.7 Irreflexive properties
  4.7 Property Domains and Ranges
  4.8 Describing And Defining Classes
    4.8.1 Property Restrictions
    4.8.2 Existential Restrictions
  4.9 Using A Reasoner
    4.9.1 Invoking The Reasoner
    4.9.2 Inconsistent Classes
  4.10 Necessary And Sufficient Conditions (Primitive and Defined Classes)
    4.10.1 Primitive And Defined Classes
  4.11 Automated Classification
  4.12 Universal Restrictions
  4.13 Automated Classification and Open World Reasoning
    4.13.1 Closure Axioms
  4.14 Value Partitions
    4.14.1 Covering Axioms
  4.15 Adding Spiciness to Pizza Toppings
  4.16 Cardinality Restrictions
  4.17 Qualified Cardinality Restrictions
5 Datatype Properties
6 More On Open World Reasoning
7 Creating Other OWL Constructs In Protégé 4
  7.1 Creating Individuals
  7.2 hasValue Restrictions
  7.3 Enumerated Classes
  7.4 Annotation Properties
  7.5 Multiple Sets Of Necessary & Sufficient Conditions
A Restriction Types
  A.1 Quantifier Restrictions
    A.1.1 someValuesFrom - Existential Restrictions
    A.1.2 allValuesFrom - Universal Restrictions
    A.1.3 Combining Existential And Universal Restrictions in Class Descriptions
  A.2 hasValue Restrictions
  A.3 Cardinality Restrictions
    A.3.1 Minimum Cardinality Restrictions
    A.3.2 Maximum Cardinality Restrictions
    A.3.3 Cardinality Restrictions
    A.3.4 The Unique Name Assumption And Cardinality Restrictions
B Complex Class Descriptions
  B.1 Intersection Classes (⊓)
  B.2 Union Classes (⊔)
C Plugins
  C.1 Installing Plugins
  C.2 Useful Plugins
    C.2.1 Matrix Plugin

Acknowledgements

I would like to acknowledge and thank my colleagues at the University Of Manchester and also Stanford University for proof reading this tutorial/guide and making helpful comments and suggestions as to how it could be improved. In particular I would like to thank my immediate colleagues: Alan Rector, Nick Drummond, Hai Wang and Julian Seidenberg at the University Of Manchester, who suggested changes to early drafts of the tutorial in order to make things clearer and also ensure the technical correctness of the material. Alan was notably helpful in suggesting changes that made the tutorial flow more easily. I am grateful to Chris Wroe and Robert Stevens who conceived the original idea of basing the tutorial on an ontology about pizzas.
Finally, I would also like to thank Natasha Noy from Stanford University for using her valuable experience in teaching, creating and giving tutorials about Protégé to provide detailed and useful comments about how initial drafts of the tutorial/guide could be made better.

This work was supported in part by the CO-ODE project funded by the UK Joint Information Services Committee and the HyOntUse Project (GR/S44686) funded by the UK Engineering and Physical Science Research Council and by 21XS067A from the National Cancer Institute.

Chapter 1
Introduction

This guide introduces Protégé 4 for creating OWL ontologies. Chapter 3 gives a brief overview of the OWL ontology language. Chapter 4 focuses on building an OWL-DL ontology and using a Description Logic Reasoner to check the consistency of the ontology and automatically compute the ontology class hierarchy. Chapter 7 describes some OWL constructs such as hasValue Restrictions and Enumerated classes, which aren't directly used in the main tutorial.

1.1 Conventions

Class, property and individual names are written in a sans serif font like this. Names for user interface views are presented in a style 'like this'. Where exercises require information to be typed into Protégé 4, a typewriter font is used like this. Exercises and required tutorial steps are presented like this:

Exercise 1: Accomplish this
1. Do this.
2. Then do this.

Tips and suggestions are presented like this. Explanations as to what things mean are presented like this. Potential pitfalls and warnings are presented like this. General notes are presented like this. Vocabulary explanations and alternative names are presented like this.

Chapter 2
Requirements

In order to follow this tutorial you must have Protégé 4, which is available from the Protégé website, and the Protégé Plugins, which are available via the CO-ODE web site. It is also recommended (but not necessary) to use the OWLViz plugin, which allows the asserted and inferred classification hierarchies to be visualised; it is available from the CO-ODE web site, or can be installed when Protégé 4 is installed. For installation steps, please see the documentation for each component.

Chapter 3
What are OWL Ontologies?

Ontologies are used to capture knowledge about some domain of interest. An ontology describes the concepts in the domain and also the relationships that hold between those concepts. Different ontology languages provide different facilities. The most recent development in standard ontology languages is OWL from the World Wide Web Consortium (W3C). Like Protégé, OWL makes it possible to describe concepts, but it also provides new facilities. It has a richer set of operators, e.g. intersection, union and negation. It is based on a different logical model which makes it possible for concepts to be defined as well as described. Complex concepts can therefore be built up in definitions out of simpler concepts. Furthermore, the logical model allows the use of a reasoner which can check whether or not all of the statements and definitions in the ontology are mutually consistent and can also recognise which concepts fit under which definitions. The reasoner can therefore help to maintain the hierarchy correctly. This is particularly useful when dealing with cases where classes can have more than one parent.

3.1 Components of OWL Ontologies

OWL ontologies have similar components to Protégé frame-based ontologies. However, the terminology used to describe these components is slightly different from that used in Protégé.
An OWL ontology consists of Individuals, Properties, and Classes, which roughly correspond to the Protégé frame notions of Instances, Slots and Classes.

3.1.1 Individuals

Individuals represent objects in the domain in which we are interested. An important difference between Protégé and OWL is that OWL does not use the Unique Name Assumption (UNA). This means that two different names could actually refer to the same individual. For example, "Queen Elizabeth", "The Queen" and "Elizabeth Windsor" might all refer to the same individual. In OWL, it must be explicitly stated that individuals are the same as each other, or different to each other; otherwise they might be the same as each other, or they might be different to each other. Figure 3.1 shows a representation of some individuals in some domain; in this tutorial we represent individuals as diamonds in diagrams.

Individuals are also known as instances. Individuals can be referred to as being 'instances of classes'.

3.1.2 Properties

Properties are binary relations on individuals, i.e. properties link two individuals together. For example, the property hasSibling might link the individual Matthew to the individual Gemma, or the property hasChild might link the individual Peter to the individual Matthew. Properties can have inverses. For example, the inverse of hasOwner is isOwnedBy. Properties can be limited to having a single value, i.e. to being functional. They can also be either transitive or symmetric. These 'property characteristics' are explained in detail in Section 4.8. Figure 3.2 shows a representation of some properties linking some individuals together.

Properties are roughly equivalent to slots in Protégé. They are also known as roles in description logics and relations in UML and other object oriented notions. In GRAIL and some other formalisms they are called attributes.
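Protégé 4 itself is built on the Java OWL API, so the points above can also be illustrated in code. The sketch below, written in OWL API 3.x style, links two individuals with a property and then explicitly asserts that they are different, which is needed precisely because OWL makes no Unique Name Assumption. The individual and property names come from the tutorial's examples, while the ontology IRI and class layout are illustrative assumptions.

import org.semanticweb.owlapi.apibinding.OWLManager;
import org.semanticweb.owlapi.model.*;

public class UnaExample {
    public static void main(String[] args) throws OWLOntologyCreationException {
        OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
        OWLDataFactory factory = manager.getOWLDataFactory();
        IRI base = IRI.create("http://example.org/family.owl"); // illustrative IRI
        OWLOntology ontology = manager.createOntology(base);

        OWLNamedIndividual matthew = factory.getOWLNamedIndividual(IRI.create(base + "#Matthew"));
        OWLNamedIndividual gemma = factory.getOWLNamedIndividual(IRI.create(base + "#Gemma"));
        OWLObjectProperty hasSibling = factory.getOWLObjectProperty(IRI.create(base + "#hasSibling"));

        // Property assertion linking two individuals: hasSibling(Matthew, Gemma)
        manager.addAxiom(ontology,
                factory.getOWLObjectPropertyAssertionAxiom(hasSibling, matthew, gemma));

        // Without this axiom a reasoner is free to treat Matthew and Gemma as
        // the same individual, since OWL has no Unique Name Assumption.
        manager.addAxiom(ontology,
                factory.getOWLDifferentIndividualsAxiom(matthew, gemma));
    }
}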
3 Knowledge Representation and Ontologies
Logic, Ontologies and Semantic Web Languages

Stephan Grimm¹, Pascal Hitzler², Andreas Abecker¹
¹ FZI Research Center for Information Technologies, University of Karlsruhe, Germany, {grimm,abecker}@fzi.de
² Institute AIFB, University of Karlsruhe, Germany, hitzler@aifb.uni-karlsruhe.de

Summary. In Artificial Intelligence, knowledge representation studies the formalisation of knowledge and its processing within machines. Techniques of automated reasoning allow a computer system to draw conclusions from knowledge represented in a machine-interpretable form. Recently, ontologies have evolved in computer science as computational artefacts to provide computer systems with a conceptual yet computational model of a particular domain of interest. In this way, computer systems can base decisions on reasoning about domain knowledge, similar to humans. This chapter gives an overview on basic knowledge representation aspects and on ontologies as used within computer systems. After introducing ontologies in terms of their appearance, usage and classification, it addresses concrete ontology languages that are particularly important in the context of the Semantic Web. The most recent and predominant ontology languages and formalisms are presented in relation to each other and a selection of them is discussed in more detail.

3.1 Knowledge Representation

As a branch of symbolic Artificial Intelligence, knowledge representation and reasoning aims at designing computer systems that reason about a machine-interpretable representation of the world, similar to human reasoning. Knowledge-based systems have a computational model of some domain of interest in which symbols serve as surrogates for real world domain artefacts, such as physical objects, events, relationships, etc. [45]. The domain of interest can cover any part of the real world or any hypothetical system about which one desires to represent knowledge for computational purposes. A knowledge-based system maintains a knowledge base, which stores the symbols of the computational model in form of statements about the domain, and it performs reasoning by manipulating these symbols. Applications can base their decisions on domain-relevant questions posed to a knowledge base.

3.1.1 A Motivating Scenario

To illustrate principles of knowledge representation in this chapter, we introduce an example scenario taken from a B2B travelling use case. In this scenario, companies frequently book business trips for their employees, sending them to international meetings and conference events. Such a scenario is a relevant use case for Semantic Web Services, since companies desire to automate the online booking process, while they still want to benefit from the high competition among various travel agencies and no-frills airlines that sell tickets via the internet. Automation is achieved by computational agents deciding about whether an online offer of some travel agency fits a request for a business trip or not, based on the knowledge they have about the offer and the request. Knowledge represented in this domain of "business trips" is about flights, trains, booking, companies and their employees, cities that are source or destination for a trip, etc.

Knowledge-based systems use a computational representation of such knowledge in form of statements about the domain of interest. Examples of such statements in the business trips domain are "companies book trips for their employees", "flights and train rides are special kinds of trips" or "employees
are persons employed at some company". This knowledge can be used to answer questions about the domain of interest. From the given statements, and by means of automated deduction, a knowledge-based system can, for example, derive that "a person on a flight booked by a company is an employee" or "the company that booked a flight for a person is this person's employer". In this way, a knowledge-based computational agent can reason about business trips, similar to the way a human would. It could, for example, tell apart offers for business trips from offers for vacations, or decide whether the destination city for a requested flight is close to the geographical region specified in an offer, or conclude that a participant of a business flight is an employee of the company that booked the flight.

3.1.2 Forms of Representing Knowledge

If we look at current Semantic Web technologies and use cases, knowledge representation appears in different forms, the most prevalent of which are based on semantic networks, rules and logic. Semantic network structures can be found in RDF graph representations [30] or Topic Maps [41], whereas a formalisation of business knowledge often comes in form of rules with some "if-then" reading, e.g. in business rules or logic programming formalisms. Logic is used to realise a precise semantic interpretation for both of the other forms. By providing formal semantics for knowledge representation languages, logic-based formalisms lay the basis for automated deduction. We will investigate these three forms of knowledge representation in the following.

Semantic Networks

Originally, semantic networks stem from the "existential graphs" introduced by Charles Peirce in 1896 to express logical sentences as graphical node-and-link diagrams [43]. Later on, similar notations have been introduced, such as conceptual graphs [45], all differing slightly in syntax and semantics. Despite these differences, all the semantic network formalisms concentrate on expressing the taxonomic structure of categories of objects and the relations between them. We use a general notion of a semantic network, abstracting from the different concrete notations proposed.

A semantic network is a graph whose nodes represent concepts and whose arcs represent relations between these concepts. They provide a structural representation of statements about a domain of interest. In the business trips domain, typical concepts would be "Company", "Employee" or "Flight", while typical relations would be "books", "isEmployedAt" or "participatesIn". Figure 3.1 shows an example of a semantic network for the business trips domain.

[Fig. 3.1. A Semantic Network for Business Trips]

Semantic networks provide a means to abstract from natural language, representing the knowledge that is captured in text in a form more suitable for computation. The knowledge expressed in the network from Figure 3.1 coincides with the content of the following natural language text.

"Employees of companies are persons, while both persons and companies are legal entities. Companies book trips for their employees. These trips can be flights or train rides which start and end in cities of Europe or the US. Companies themselves have locations which can be cities. The company UbiqBiz books the flight FL4711 from London to New York for Mister X."

Typically, concepts are chosen to represent the meaning of nouns in such a text, while relations are mapped to verb phrases. The fragment Company --books--> Trip is read as "companies book trips", expressed as a binary relation between the two concepts. However, this is not mandatory; the relation
books could also be "lifted" to a concept Booking with relations hasActor, hasParticipant and hasObject pointing to Company, Employee and Trip, respectively. In this way, its ternary character would be made explicit, in contrast to the original network where the information about an employee's involvement in booking is implicit.

In principle, the concepts and relations in a semantic network are generic and could stand for anything relevant in the domain of interest. However, some particular relations for some standard knowledge representation and reasoning cases have evolved.

The semantic network in Figure 3.1 illustrates the distinction between general concepts, like Employee, and individual concepts, like MisterX. While the latter represent concrete individuals or objects in the domain of interest, the former serve as classes to group together such individuals that have certain properties in common, as e.g. all employees. The particular relation which links individuals to their classes is that of instantiation, denoted by --isA-->. Thus, MisterX is called an instance of the concept employee. The lower part of the network is concerned with knowledge about individuals, reflecting a particular situation of the employee MisterX participating in a certain flight, while the upper part is concerned with knowledge about general concepts, reflecting various possible situations.

The most prominent type of relation in semantic networks, however, is that of subsumption, which we denote by --kindOf-->. A subsumption link connects two general concepts and expresses specialisation or generalisation, respectively. In the network in Figure 3.1, a flight is said to be a special kind of trip, i.e. Trip subsumes Flight. This means that any flight is also a trip; however, there might be other trips which are not flights, such as train rides. Subsumption is associated with the notion of inheritance in that a specialised concept inherits all the properties from its more general parent concepts. For example, from the network one can read that a company can be located in a European city, since locatedAt points from Company to Location while EUCity is a kind of City, which is itself a kind of Location. The concept EUCity inherits the property of being a potential location for a company from the concept Location.

Other particular relations that can be found in semantic network notations are, for example, --partOf--> to denote part-whole relationships, etc.

Semantic networks are closely related to another form of knowledge representation called frame systems. In fact, frame systems and semantic networks can be identical in their expressiveness but use different representation metaphors [43]. While the semantic network metaphor is that of a graph with concept nodes linked by relation arcs, the frame metaphor draws concepts as boxes, i.e. frames, and relations as slots inside frames that can be filled by other frames. Thus, in the frame metaphor the graph turns into nested boxes.

The semantic network form of knowledge representation is especially suitable for capturing the taxonomic structure of categories for domain objects and for expressing general statements about the domain of interest. Inheritance and other relations between such categories can be represented in and derived from subsumption hierarchies. On the other hand, the representation of concrete individuals or even data values, like numbers or strings, does not fit well the idea of semantic networks.

Rules

Another natural form of expressing knowledge in some domain
of interest are rules that reflect the notion of consequence. Rules come in the form of IF-THEN constructs and allow one to express various kinds of complex statements. Rules can be found in logic programming systems, like the language Prolog [31], in deductive databases [34] or in business rules systems.

The following is an example of rules expressing knowledge in the business trips domain, specified in their intuitive if-then reading.

(1) IF something is a flight THEN it is also a trip
(2) IF some person participates in a trip booked by some company THEN this person is an employee of this company
(3) FACT the person MisterX participates in a flight booked by the company UbiqBiz
(4) IF a trip's source and destination cities are close to each other THEN the trip is by train

The IF-part is also called the body of a rule, while the THEN-part is also called its head. Typically, rule-based knowledge representation systems operate on facts, which are often formalised as a special kind of rule with an empty body. They start from a given set of facts, like rule (3) above, and then apply rules in order to derive new facts, thus "drawing conclusions". However, the intuitive reading with natural language phrases is not suitable for computation, and therefore such phrases are formalised to predicates and variables over objects of the domain of interest. A formalisation of the above rules in the typical style of rule languages looks as follows.

(1) Trip(?t) :- Flight(?t)
(2) Employee(?p) ∧ isEmployedAt(?p,?c) :- Trip(?t) ∧ books(?c,?t) ∧ Company(?c) ∧ participatesIn(?p,?t) ∧ Person(?p)
(3) Person(MisterX) ∧ participatesIn(MisterX,FL4711) ∧ Flight(FL4711) ∧ books(UbiqBiz,FL4711) ∧ Company(UbiqBiz) :-
(4) TrainRide(?t) :- Trip(?t) ∧ startsFrom(?t,?s) ∧ endsIn(?t,?d) ∧ close(?s,?d)

In most logic programming systems a rule is read as an inverse implication, starting with the head followed by the body, which is indicated by the symbol :- that resembles a backward arrow. In this formalisation, the intuitive notions from the text, that were concepts and relations in the semantic network case, became predicates linked through variables and constants that identify objects in the domain of interest. Variables start with the symbol ?
and take as their values the constants that occur in facts such as (3). Rule (1) captures inheritance, or subsumption, between trips and flights by stating that "everything that is a flight is also a trip". Rule (2) draws conclusions about the status of employment for participants of business flights. From the facts (3), these two rules are able to derive the implicit fact that "MisterX is an employee of UbiqBiz". While the rules (1) and (2) express general domain knowledge, rule (4) can be interpreted as part of some company's travelling policy, stating that trips between close cities shall be conducted by train. In business rules, for example, rule-based formalisms are used with the motivation to capture complex business knowledge in companies, like pricing models or delivery policies.

Rule-based knowledge representation systems are especially suitable for reasoning about concrete instance data, i.e. simple facts of the form Employee(MisterX). Complex sets of rules can efficiently derive such implicit facts from explicitly given ones. They are problematic if more complex and general statements about the domain shall be derived which do not fit a rule's head.

Logic

Both forms, semantic networks as well as rules, have been formalised using logic to give them a precise semantics. Without such a precise formalisation they are vague and ambiguous, and thus problematic for computational purposes. From just the graphical representation of the semantic network in Figure 3.1, for example, it is not clear whether companies can only book flights for their own employees or for employees of partner companies as well. Neither is it clear from the fragment Company --books--> Trip whether every company books trips or just some company. Also for rules, despite their much more formal appearance, the exact meaning remains unclear when, for example, forms of negation are introduced that allow for potential conflicts between rules. Depending on the choice of procedural evaluation or flavour of formal semantics, different derivation results are being produced.

The most prominent and fundamental logical formalism classically used for knowledge representation is the "first-order predicate calculus", or first-order logic for short, and we choose this formalism to present logic as a form of knowledge representation here. First-order logic allows one to describe the domain of interest as consisting of objects, i.e. things that have individual identity, and to construct logical formulas around these objects formed by predicates, functions, variables and logical connectives [43]. We assume that the reader is familiar with the notation of first-order logic from formalisations of various mathematical disciplines.

Similar to semantic networks, most statements in natural language can be expressed in terms of logical sentences about objects of the domain of interest with an appropriate choice of predicate and function symbols. Concepts are mapped to unary, relations to binary predicates. We illustrate the use of logic for knowledge representation by axiomatising parts of the semantic network from Figure 3.1 more precisely. Subsumption, for example, can be directly expressed by a logical implication, which is illustrated in the translation of the following fragment:

Employee --kindOf--> Person
∀x: (Employee(x) → Person(x))

Due to the universal quantifier, the variable x in the logical formula ranges over all domain objects and its reading is "everything that is an employee is also a person". Other parts of the network can be further restricted using logical
formulas, as shown in the following example:

Company --books--> Trip
∀x,y: (books(x,y) → Company(x) ∧ Trip(y))
∀x: ∃y: (Trip(x) → Company(y) ∧ books(y,x))

The graphical representation of the network fragment leaves some details open, while the logical formulas capture the booking relation between companies and trips more precisely. The first formula states that domain and range of the booking relation are companies and trips, respectively, while the second formula makes sure that for every trip there does actually exist a company that booked it.

In particular, more complex restrictions that range over larger fragments of a network graph can be formulated in logic, where the intuitive graphical notation lacks expressivity. As an example consider the relations between companies, trips and employees in the following fragment:

Company <--employedAt-- Employee --participatesIn--> Trip
Company --books--> Trip
∀x: ∃y: (Trip(x) → Employee(y) ∧ participatesIn(y,x) ∧ books(employer(y),x))

The logical formula expresses additional knowledge that is not captured in the graph representation. It states that, for every trip, there must be an employee that participates in this trip while the employer of this participant is the company that booked the flight.

Rules can also be formalised with logic. An IF-THEN rule can be represented as a logical implication with universally quantified variables. For example, a common formalisation of the rule

IF a trip's source and destination cities are close to each other THEN the trip is by train

is the translation to the logical formula

∀x,y,z: (Trip(x) ∧ startsFrom(x,y) ∧ endsIn(x,z) ∧ close(y,z) → TrainRide(x)).

However, the typical rule-based systems do not interpret such a formula in the classical sense of first-order logic but employ different kinds of semantics, which are discussed in Section 3.2. Since a precise axiomatisation of domain knowledge is a prerequisite for processing knowledge within computers in a meaningful way, we focus on logic as the dominant form of knowledge representation. Therefore, we investigate different kinds of logics and formal semantics more closely in a subsequent section. In the context of the Semantic Web, two particular logical formalisms have gained momentum, reflecting the semantic network and rules forms of knowledge representation.
The graph notations of semantic networks have been formalised through description logics, which are fragments of first-order logic with typical Tarskian model-theoretic semantics but restricted to unary and binary predicates to capture the notions of concepts and relations. On the other hand, rules have been formalised through logic programming formalisms with minimal model semantics, focusing on the derivation of simple facts about individual objects. Both description logics and logic programming can be found as underlying formalisms in various knowledge representation languages in the Semantic Web, which are addressed in Section 3.4.

3.1.3 Reasoning about Knowledge

The way in which we, as humans, process knowledge is by reasoning, i.e. the process of reaching conclusions. Analogously, a computer processes the knowledge stored in a knowledge base by drawing conclusions from it, i.e. by deriving new statements that follow from the given ones. The basic operations a knowledge-based system can perform on its knowledge base are typically denoted by tell and ask [43]. The tell-operation adds a new statement to the knowledge base, whereas the ask-operation is used to query what is known. The statements that have been added to a knowledge base via the tell-operation constitute the explicit knowledge a system has about the domain of interest. The ability to process explicit knowledge computationally allows a knowledge-based system to reason over a domain of interest by deriving implicit knowledge that follows from what has been told explicitly.

This leads to the notion of logical consequence or entailment. A knowledge base KB is said to entail a statement α if α "follows" from the knowledge stored in KB, which is written as KB |= α. A knowledge base entails all the statements that have been added via the tell-operation plus those that are their logical consequences. As an example, consider the following knowledge base with sentences in first-order logic.

KB = { Person(MisterX),
       participates(MisterX, FL4711),
       Flight(FL4711),
       books(UbiqBiz, FL4711),
       ∀x,y,z: (Flight(y) ∧ participates(x,y) ∧ books(z,y) → employedAt(x,z)),
       ∀x,y: (employedAt(x,y) → Employee(x) ∧ Company(y)),
       ∀x: (Person(x) → ¬Company(x)) }

The knowledge base KB explicitly states that "MisterX is a person who participates in the flight FL4711 booked by UbiqBiz", that "participants of flights are employed at the company that booked the flight", that "the employment relation holds between employees and companies" and that "persons are different from companies". If we ask the question "Is MisterX employed at UbiqBiz?" by saying

ask(KB, employedAt(MisterX, UbiqBiz))

the answer will be yes. The knowledge base KB entails the fact that "MisterX is employed at UbiqBiz", i.e. KB |= employedAt(MisterX, UbiqBiz), although it was not "told" so explicitly. This follows from its general knowledge about the domain. A further consequence is that "UbiqBiz is a company", i.e. KB |= Company(UbiqBiz), which is reflected by a positive answer to the question

ask(KB, Company(UbiqBiz)).

This follows from the former consequence together with the fact that "employment holds between employees and companies". Another important notion related to entailment is that of consistency or satisfiability.
Intuitively, a knowledge base is consistent or satisfiable if it does not contain contradictory facts. If we were to add the fact that "UbiqBiz is a person" to the above knowledge base KB by saying

tell(KB, Person(UbiqBiz)),

it would become unsatisfiable, because persons are said to be different from companies. We explicitly said that UbiqBiz is a person while at the same time it can be derived that it is a company. In general, an unsatisfiable knowledge base is not very useful, since in logical formalisms it would entail any arbitrary fact. The ask-operation would always return a positive result independent from its parameters, which is clearly not desirable for a knowledge-based system.

The inference procedures implemented in computational reasoners aim at realising the entailment relation between logical statements [43]. They derive implicit statements from a given knowledge base or check whether a particular statement is entailed by a knowledge base. An inference procedure that only derives entailed statements is called sound. Soundness is a desirable feature of an inference procedure, since an unsound inference procedure would potentially draw wrong conclusions. If an inference procedure is able to derive every statement that is entailed by a knowledge base then it is called complete. Completeness is also a desirable property, since a complex chain of conclusions might break down if only a single statement in it is missing. Hence, for reasoning in knowledge-based systems we desire sound and complete inference procedures.

3.2 Logic-Based Knowledge-Representation Formalisms

First-order (predicate) logic is the prevalent and single most important knowledge representation formalism. Its importance stems from the fact that basically all current symbolic knowledge representation formalisms can be understood in their relation to first-order logic.
Its roots can be traced back to the ancient Greek philosopher Aristotle, and modern first-order predicate logic was created in the 19th century, when the foundations for modern mathematics were laid. First-order logic captures some of the essence of human reasoning by providing a notion of logical consequence as already mentioned. It also provides a notion of universal truth in the sense that a logical statement can be universally valid (and thus called a tautology), meaning that it is a statement which is true regardless of any preconditions.

Logical consequence and universal truth can be described in terms of model-theoretic semantics. In essence, a model for a logical theory (a logical theory denotes a set of logical formulas, seen as the axioms of some theory to be modelled) describes a state of affairs which makes the theory true. A tautology is a statement for which all possible states of affairs are models. A logical consequence of a theory is a statement which is true in all models of the theory.

How to derive logical consequences from a theory, a process called deduction or inferencing, is obviously central to the study of logic. Deduction allows to access knowledge which is not explicitly given but implicitly represented by a theory. Valid ways of deriving logical consequences from theories also date back to the Greek philosophers, and have been studied since. At the heart of this is what has become known as proof theory. Proof theory describes syntactic rules which act on theories and allow to derive logical consequences without explicit recurrence to models. The notion of universal truth can thus be reduced to syntactic manipulations. This allows to abstract from model theory and enables deduction by symbol manipulation, and thus by automated means.

Obviously, with the advent of electronic computing devices in the 20th century, the automation of deduction has become an important and influential field of study. The field of automated reasoning is concerned with the development of efficient algorithms for deduction. These algorithms are usually required to be sound, and completeness is a desired feature. The fact that sound and complete deduction algorithms exist for first-order predicate logic is reflected by the statement that first-order logic is semi-decidable. More precisely, semi-decidability of first-order logic means that there exist algorithms which, given a theory and a query statement, terminate with positive answer in finite time whenever the statement is a logical consequence of the theory. Note that for semi-decidability, termination is not required if the statement is not a logical consequence of the theory, and indeed, termination (with the correct negative answer) cannot be guaranteed in general for first-order logical theories.

For some kinds of theories, however, sound and complete deduction algorithms exist which always terminate. Such theories are called decidable, and they have certain more-or-less obvious advantages, including the following.

- Decidability guarantees that the algorithm always comes back with a correct answer in finite time. Under semi-decidability, an algorithm which runs for a considerable amount of time may still terminate, or may not terminate at all, and thus the user cannot know whether he has waited long enough for an answer. Decidability is particularly important if we want to reason about the question of whether or not a given statement is a logical consequence of a theory.
- Experience shows that practically efficient algorithms are often
available for decidable theories due to the effective use of heuristics. Often, this is even the case if worst-case complexity is very high.

3.2.1 Description Logics

Description logics [3] are essentially decidable fragments of first-order logic, and we have just seen why the study of these is important. At the same time, description logics are expressive enough such that they have become a major knowledge representation paradigm, in particular for use within the Semantic Web. We will describe one of the most important and influential description logics, called ALC. Other description logics are best understood as restrictions or extensions of ALC. We introduce the standard description logic notation and give a formal mapping into standard first-order logic syntax.

The Description Logic ALC

A description logic theory consists of statements about concepts, individuals, and their relations. Individuals correspond to constants in first-order logic, and concepts correspond to unary predicates. In terms of semantic networks, description logic concepts correspond to general concepts in semantic networks, while individuals correspond to individual concepts. We deal with concepts first, and will talk about individuals later. Concepts can be named concepts or anonymous (composite) concepts. Named concepts consist simply of a name, say "human", which will be mapped to a unary predicate in

Footnotes:
It should be noted that there are practical limitations to decidability guarantees due to the fact that computing resources are always limited. A theoretically sound, complete and terminating algorithm may thus run into resource limits and terminate without an answer.
To be precise, there do exist some description logics which are not decidable. And there exist some which are not straightforward fragments of first-order logics. But for this general introduction, we will not concern ourselves with these.
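The excerpt breaks off before the promised mapping into first-order logic is spelled out, so we sketch the standard textbook translation here; this is the usual translation for ALC, not a reconstruction of this chapter's own notation. A concept C is translated to a formula π_x(C) with one free variable x:

π_x(A) = A(x)   for a named concept A
π_x(¬C) = ¬π_x(C)
π_x(C ⊓ D) = π_x(C) ∧ π_x(D)
π_x(C ⊔ D) = π_x(C) ∨ π_x(D)
π_x(∃r.C) = ∃y: (r(x,y) ∧ π_y(C))
π_x(∀r.C) = ∀y: (r(x,y) → π_y(C))

A subsumption axiom C ⊑ D then becomes ∀x: (π_x(C) → π_x(D)); for example, Flight ⊑ Trip maps to ∀x: (Flight(x) → Trip(x)), which is exactly the form of implication used for subsumption earlier in this chapter.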
An Ontology-based Knowledge Management System for Industry Clusters

Pradorn Sureephong¹, Nopasit Chakpitak¹, Yacine Ouzrout², Abdelaziz Bouras²
¹ Department of Knowledge Management, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai, Thailand. {dorn | nopasit}@
² LIESP, University Lumiere Lyon 2, Lyon, France. {yacine.ouzrout | abdelaziz.bouras}@univ-lyon2.fr

Abstract

The knowledge-based economy forces companies in every country to group together as clusters in order to maintain their competitiveness in the world market. Cluster development relies on two key success factors: knowledge sharing and collaboration between the actors in the cluster. Thus, our study proposes a knowledge management system to support knowledge management activities within the cluster. To achieve the objectives of the study, ontology takes a very important role in the knowledge management process in various ways, such as building reusable and faster knowledge-bases and better ways of representing knowledge explicitly. However, creating and representing an ontology creates difficulties for an organization due to the ambiguity and unstructured nature of the sources of knowledge. Therefore, the objective of this paper is to propose a methodology to capture, create and represent ontology for organization development by using the knowledge engineering approach. The handicraft cluster in Thailand is used as a case study to illustrate our proposed methodology.

Keywords: Ontology, Semantic, Knowledge Management System, Industry Cluster

1.1 Introduction

In the past, the three production factors (land, labor and capital) were abundant and accessible and were considered the source of economic advantage, so knowledge did not get much attention [1]. Nowadays, in the knowledge-based economy era, which is shaped by the increasing use of information technologies, the previous production factors are no longer enough to sustain a firm's competitive advantage; knowledge is being called on to play a key role [2]. Most industries try to use available information to gain more competitive advantage than others. The knowledge-based economy is based on the production, distribution and use of knowledge and information [3]. The study of Yoong and Molina [1] assumed that one way for business organizations to survive in today's turbulent business environment is to form strategic alliances or mergers with other similar or complementary business companies. The conclusion of Yoong and Molina's study supports the idea of the industry cluster [3] proposed by Porter in 1990.

The objective of grouping firms as a cluster is to maintain collaboration and knowledge sharing among the partners in order to gain competitiveness in their market. Therefore, Knowledge Management (KM) becomes a critical activity in achieving these goals. In order to manage the knowledge, ontology plays an important role in enabling the processing and sharing of knowledge between experts and knowledge users. Besides, it also provides a shared and common understanding of a domain that can be communicated across people and application systems. On the other hand, creating an ontology for an industry cluster can create difficulties for the Knowledge Engineer (KE) as well, because of the complexity of the structure and the time consumed.
In this paper, we propose a methodology for ontology creation using knowledge engineering methodology in the industry cluster context.

1.2 Literature Review

1.2.1 Industry Cluster and Knowledge Management

The concept of the industry cluster was popularized by Prof. Michael E. Porter in his book "The Competitive Advantage of Nations" [3] in 1990. Since then, the industry cluster has become a current trend in economic development planning. However, there is considerable debate regarding the definition of the industry cluster. Based on Porter's definition [4], a cluster can be seen as a "geographically proximate group of companies and associated institutions (for example universities, government agencies, and related associations) in a particular field, linked by commonalities and complementarities". The general view of an industry cluster map is shown in Figure 1.1. Until now, the literature on the industry cluster and cluster building has been growing rapidly both in academic and policy-making circles [5].

[Figure 1.1. Industry Cluster Map]

After the concept of the industry cluster [3] was tangibly applied in many countries, companies in the same industry tended to link to each other to maintain their competitiveness in their market and to gain benefits from being a member of the cluster. According to the study of ECOTEC in 2005 [6] regarding the critical success factors in cluster development, the two most critical success factors are collaboration in networking partnership and knowledge creation for innovative technology in the cluster, which were mentioned as success criteria in about 78% and 74% of the articles, respectively. This knowledge is created through various forms of local inter-organizational collaborative interaction [7]. It is collected in the form of tacit and explicit knowledge held by experts and institutions within the cluster. We applied knowledge engineering techniques to the industry cluster in order to capture and represent the tacit knowledge in explicit form.

1.2.2 Knowledge Engineering Techniques

Initially, knowledge engineering was just a field of artificial intelligence, used to develop knowledge-based systems. Since the last decade, knowledge engineers have developed their principles to improve the process of knowledge acquisition [8]. These principles are used to apply knowledge engineering to many practical issues. Firstly, there are different types of knowledge, defined as "know what" and "know how" [9], or "explicit" and "tacit" knowledge in Nonaka's definition [10]. Secondly, there are different types of experts and expertise. Thirdly, there are many ways to represent knowledge and to use knowledge. Finally, structured methods are used to relate these differences together to perform knowledge-oriented activity [11].

In our study, many knowledge engineering methods were compared [12] in order to select a suitable method to apply to the problem of industry cluster development, i.e. SPEDE, MOKA, CommonKADS. We adopted the CommonKADS methodology because it provides sufficient tools, such as a model suite (Figure 1.2) and templates for different knowledge-intensive tasks.

[Figure 1.2. CommonKADS Model Suite (context, concept and artifact levels)]

1.2.3 Ontology and Knowledge Management

The definition of ontology by Gruber (1993) [13] is "explicit specifications of a shared conceptualization".
A conceptualization is an abstract model of facts in the world, built by identifying the relevant concepts of a phenomenon. Explicit means that the types of concepts used and the constraints on their use are explicitly defined. Shared reflects the notion that an ontology captures consensual knowledge; that is, it is not private to an individual but accepted by a group.

Basically, the role of ontology in the knowledge management process is to facilitate the construction of a domain model. It provides a vocabulary of terms and relations in a specific domain. In building a knowledge management system, we need two types of knowledge [14]:

- Domain knowledge: knowledge about the objective realities in the domain of interest (objects, relations, events, states, causal relations, etc. that obtain in the domain).
- Problem-solving knowledge: knowledge about how to use the domain knowledge to achieve various goals. This knowledge often takes the form of a problem-solving method (PSM) that can help achieve goals in different domains.

In this study, we focus on ontology creation and representation, adopting a knowledge engineering methodology to support both dimensions of knowledge. We use the ontology as the main mechanism to represent information and knowledge, and to define the meaning of the terms used in the content language and the relations in the knowledge management system.

1.3 Methodology

Our proposed methodology divides ontology into three types: generic ontology, domain ontology and task ontology. A generic ontology is reusable across domains, e.g. organization, product specification, contact, etc. A domain ontology conceptualizes a particular domain, e.g. the handicraft business, logistics, import/export, marketing, etc. A task ontology specifies the terminology associated with a type of task and describes the problem-solving structure of all the existing tasks, e.g. paper production, product shipping, product selection, etc.

In our approach to implementing ontology-based knowledge management, we integrated existing knowledge engineering methodologies and ontology development processes. We adopted CommonKADS as the knowledge engineering methodology and On-To-Knowledge (OTK) as the ontology development methodology. Figure 1.3 shows the assimilation of CommonKADS and On-To-Knowledge (OTK) [15].

Figure 1.3. Steps of the OTK methodology and the CommonKADS model suite

1.3.1 Feasibility Study Phase

The feasibility study serves as decision support on economic, technical and project feasibility, in order to select the most promising focus area and target solution. This phase identifies problems, opportunities and potential solutions for the organization and its environment. Most knowledge engineering methodologies provide an analysis method for analyzing the organization before the knowledge engineering process begins; this helps the knowledge engineer understand the environment of the organization. CommonKADS likewise provides the context level of its model suite (figure 1.2) for analyzing the organizational environment and the corresponding critical success factors for a knowledge system [16]. The organization model provides five worksheets for analyzing feasibility in the organization, as shown in figure 1.4.

Figure 1.4. Organization Model Worksheets

The knowledge engineer can use the OM-1 to OM-5 worksheets when interviewing the knowledge decision makers of an organization. The outputs of the organization model (OM) are a list of knowledge-intensive tasks and the agents related to each task. The KE can then interview the experts on each task using the TM and AM worksheets in the next step. Finally, the KE validates the result of each model with the knowledge decision makers again, assessing impacts and changes with the OTA worksheet.
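To make these outputs concrete, the sketch below organizes feasibility-phase results as small data structures. It is a minimal illustration only: the task, bottleneck and agent names are hypothetical, since the OM/TM/AM worksheets themselves are free-text forms and only their outputs are modeled here.

```python
# Minimal sketch of the feasibility-phase outputs; all names are invented
# examples from a handicraft-cluster setting, not actual worksheet content.
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str                    # a domain expert or knowledge worker (AM)
    knowledge_items: list = field(default_factory=list)

@dataclass
class KnowledgeTask:
    name: str                    # knowledge-intensive task found via OM-1..OM-5
    bottlenecks: list            # detailed later in the TM-2 worksheet
    agents: list                 # relevant agents, from the AM worksheet

export_expert = Agent("export expert", ["export regulations", "market criteria"])
tasks = [KnowledgeTask(
    name="product selection for export",
    bottlenecks=["criteria are tacit", "experts are scarce"],
    agents=[export_expert],
)]

# The OM output is the task list; each task links to the agents that the
# KE interviews next with the TM and AM worksheets.
for task in tasks:
    print(task.name, "->", [a.name for a in task.agents])
```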
1.3.2 Ontology Kick-Off Phase

The objective of this phase is to model the requirements specification for the knowledge management system in the organization. The Ontology Requirement Support Document (ORSD) [17] guides knowledge engineers in deciding on the inclusion and exclusion of concepts and relations and on the hierarchical structure of the ontology. It contains useful information such as the domain and goal of the ontology, design guidelines, knowledge sources, users and usage scenarios, competency questions, and the applications supported by the ontology [15].

The Task and Agent Models are separated into the TM-1, TM-2 and AM worksheets, which help the KE complete the ORSD. The TM-1 worksheet identifies the features of the relevant tasks and the knowledge sources available. The TM-2 worksheet concentrates in detail on bottlenecks and improvements relating to specific areas of knowledge. The AM worksheet lists all relevant agents who possess knowledge items, such as domain experts or knowledge workers.

1.3.3 Refinement Phase

The goal of the refinement phase is to produce a mature and application-oriented target ontology according to the specification given by the kick-off phase [18]. The main tasks in this phase are knowledge elicitation and formalization.

A knowledge elicitation process is performed with the domain experts, based on the initial input from the kick-off phase. CommonKADS provides a set of knowledge templates [11] to support the KE in capturing knowledge for different types of tasks. CommonKADS classifies knowledge-intensive tasks into two categories: analytic tasks, which concern systems that already exist, and synthetic tasks, which concern systems that do not yet exist. The KE should therefore be aware of the type of task he is dealing with. Figure 1.5 shows the different knowledge task types.

Figure 1.5. Knowledge-intensive task types based on the type of problem

Knowledge formalization is the transformation of knowledge into a formal representation language such as the Ontology Inference Layer (OIL) [19]; the choice of language depends on the application. The knowledge engineer therefore has to consider the advantages and limitations of the different languages in order to select the appropriate one.

1.3.4 Evaluation Phase

The main objectives of this phase are to check whether the target ontology satisfies the ontology requirements and whether the ontology-based knowledge management system supports or answers the competency questions analyzed in the feasibility and kick-off phases of the project. The ontology should thus be tested in the target application environment. A prototype should already show the core functionalities of the target system. Feedback from users of the prototype is valuable input for further refinement of the ontology [18].

1.3.5 Maintenance and Evolution Phase

The maintenance and evolution of an ontology-based application is primarily an organizational process [18]. The knowledge engineers have to update and maintain the knowledge and ontology under their responsibility. To maintain the knowledge management system, an ontology editor module is developed to help the knowledge engineers.
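As an illustration of the formalization step in the refinement phase, the following sketch writes a few elicited concepts out as RDF-style triples in Turtle syntax. OIL is layered on RDFS, so a plain triple emitter serves here as a stand-in for real OIL tooling, and all concept names (hc:Product and so on) are invented for the example.

```python
# A minimal sketch of knowledge formalization under the assumptions above;
# the hc: namespace and its concepts are hypothetical.
triples = [
    ("hc:Product",    "rdf:type",        "rdfs:Class"),
    ("hc:Handicraft", "rdfs:subClassOf", "hc:Product"),
    ("hc:exportable", "rdf:type",        "rdf:Property"),
    ("hc:exportable", "rdfs:domain",     "hc:Product"),
]

def to_turtle(trips):
    prefixes = [
        "@prefix hc: <http://example.org/handicraft#> .",
        "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
    ]
    return "\n".join(prefixes + [f"{s} {p} {o} ." for s, p, o in trips])

print(to_turtle(triples))
```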
1.4 Case Study

The initial investigations were carried out with 10 firms from the two biggest handicraft associations in Thailand and Northern Thailand. The Northern Handicraft Manufacturer and Exporter (NOHMEX) association is the biggest handicraft association in Thailand and includes 161 manufacturers and exporters. The other association, the biggest in Chiang Mai, is named Chiang Mai Brand and includes 99 enterprises; it is a group of qualified manufacturers who have the capability to export their products and have passed the standard of Thailand's Ministry of Commerce.

The objective of this study is to create a Knowledge Management System (KMS) to support this handicraft cluster. One of the critical tasks in implementing this system is creating ontologies for the knowledge tasks, because ontology is recognized as an appropriate methodology for reaching a common consensus on communication, as well as for supporting a diversity of KM activities such as knowledge repository, retrieval, sharing, and dissemination [20]. In this case, knowledge engineering methodology was applied to ontology creation in the domain of Thailand's handicraft cluster.

Domain ontology: this can be created by using the three models at the context level of the model suite, i.e. the organization model, task model and agent model. At the beginning of domain ontology creation, we adopt the generic ontology plus the information acquired from the worksheets as an outline. Then, the more information can be acquired from the organization and its environment, the more domain-oriented the ontology that can be filled in.

Task ontology: this specifies the terminology associated with the type of task and describes the problem-solving structure.

The objective of knowledge engineering methods is to solve problems in a specific domain. Thus, most knowledge engineering approaches provide a collection of predefined sets of model elements for the KE [16]. The CommonKADS methodology also provides a set of templates to support the KE in capturing knowledge for different types of tasks. As shown in figure 1.5, the various types of knowledge tasks need different ontologies, so the KE has to select the appropriate template in order to capture the right knowledge and ontology. For illustration, we use the classification template for analytic tasks as an example of task ontology creation. Figure 1.6 shows the inference structure for the classification method (left side) and the task ontology (right side).

Figure 1.6. CommonKADS classification template and task ontology

In the case study of the handicraft cluster, one of the knowledge-intensive tasks concerns product selection for export. Not all handicraft products are exportable, due to their specifications, functions, attributes, etc. Moreover, there are many criteria for selecting a product to be exported to a specific country. We therefore defined the task ontology of the product selection task (see the right side of figure 1.6).
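To make the use of the classification template concrete, the sketch below runs a product description through a candidate-generation and matching cycle in the spirit of the CommonKADS classification inference structure. The classes and criteria are invented for illustration; they are not the cluster's actual export rules.

```python
# Sketch of the classification task: candidate classes are narrowed by
# matching obtained attribute values against (hypothetical) class constraints.
CLASS_CONSTRAINTS = {
    "exportable":     {"fragile": False, "meets_standard": True},
    "not exportable": {},                      # unconstrained fallback class
}

def classify(product: dict) -> str:
    # generate candidates, then match attribute values against constraints
    candidates = [cls for cls, constraints in CLASS_CONSTRAINTS.items()
                  if all(product.get(attr) == value
                         for attr, value in constraints.items())]
    # prefer the most specific (most constrained) surviving candidate
    return max(candidates, key=lambda cls: len(CLASS_CONSTRAINTS[cls]))

print(classify({"fragile": False, "meets_standard": True}))  # -> exportable
print(classify({"fragile": True,  "meets_standard": True}))  # -> not exportable
```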
1.5 Conclusion

The most important role of ontology in knowledge management is to enable and enhance knowledge sharing and reuse. Moreover, it provides a common mode of communication among the agents and the knowledge engineer [14]. However, the difficulty of ontology creation is noted in most of the literature. This study therefore focused on creating ontology by adopting a knowledge engineering methodology, which provides tools to support the structuring of knowledge. The ontology was then applied to help the Knowledge Management System (KMS) for the industry cluster achieve its goals. The architecture of this system consists of three parts: the knowledge system, the ontology, and knowledge engineering. The proposed methodology was used to create the ontology in the handicraft cluster context. During the manipulation stage, when users access the knowledge base, the ontology supports KM tasks as well as searching. The knowledge base and the ontology are linked to one another via the ontology module. In the maintenance stage, knowledge engineers or domain experts can add, update, revise, and delete knowledge or domain ontology via the knowledge acquisition module [21].

To test and validate our approach and architecture, we used the handicraft cluster in Thailand as a case study. As perspectives of this study, we will finalize the specification of the shareable knowledge/information and the conditions of sharing among the cluster members. Then, we will capture and maintain the knowledge (for reuse when required) and work on a specific infrastructure to enhance collaboration. At the end of the study, we will develop the knowledge management system for the handicraft cluster, starting from the requirements specification acquired from the cluster.

1.6 References

[1] Young P, Molina M, (2003) Knowledge Sharing and Business Clusters, In: 7th Pacific Asia Conference on Information Systems, pp. 1224-1233.
[2] Romer P, (1986) Increasing Return and Long-run Growth, Journal of Political Economy, vol. 94, no. 5, pp. 1002-1037.
[3] Porter M E, (1990) Competitive Advantage of Nations, New York: Free Press.
[4] Porter M E, (1998) On Competition, Boston: Harvard Business School Press.
[5] Malmberg A, Power D, (2004) (How) do (firms in) clusters create knowledge?, In: DRUID Summer Conference 2003 on creating, sharing and transferring knowledge, Copenhagen, June 12-14.
[6] DTI, (2005) A Practical Guide to Cluster Development, Report to the Department of Trade and Industry and the English RDAs by Ecotec Research & Consulting.
[7] Malmberg A, Power D, On the role of global demand in local innovation processes, In: Rethinking Regional Innovation and Change, Shapiro P, and Fuchs G (Eds), Dordrecht: Kluwer Academic Publishers.
[8] Chua A, (2004) Knowledge management system architecture: a bridge between KM consultants and technologists, International Journal of Information Management, vol. 24, pp. 87-98.
[9] Lodbecke C, Van Fenema P, Powell P, Co-opetition and Knowledge Transfer, The DATA BASE for Advances in Information Systems, vol. 30, no. 2, pp. 14-25.
[10] Nonaka I, Takeuchi H, (1995) The Knowledge-Creating Company, Oxford University Press, New York.
[11] Shadbolt N, Milton N, (1999) From knowledge engineering to knowledge management, British Journal of Management, vol. 10, no. 4, pp. 309-322, Dec.
[12] Sureephong P, Chakpitak N, Ouzrout Y, Neubert G, Bouras A, (2006) Economic based Knowledge Management System for SMEs Cluster: case study of handicraft cluster in Thailand, SKIMA Int. Conference, pp. 10-15.
[13] Gruber TR, (1991) The Role of Common Ontology in Achieving Sharable, Reusable Knowledge Bases, In J. A. Allen, R. Fikes, & E. Sandewall (Eds.), Principles of Knowledge Representation and Reasoning: Proceedings of the Second International Conference, Cambridge, MA, pp. 601-602.
[14] Chandrasekaran B, Josephson JR, Richard BV, (1998) Ontology of Tasks and Methods, In: Workshop on Knowledge Acquisition, Modeling and Management (KAW'98), Canada.
[15] Sure Y, Studer R, (2001) On-To-Knowledge Methodology, evaluated and employed version, On-To-Knowledge deliverable D-16, Institute AIFB, University of Karlsruhe.
[16] Schreiber A Th, Akkermans H, Anjewierden A, de Hoog R, Shadbolt N, van de Velde W, Wielinga B, (1999) Knowledge Engineering and Management: The CommonKADS Methodology, The MIT Press.
[17] Sure Y, Studer R, (2001) On-To-Knowledge Methodology, final version, On-To-Knowledge deliverable D-18, Institute AIFB, University of Karlsruhe.
[18] Staab S, Schnurr HP, Studer R, Sure Y, (2001) Knowledge processes and ontologies, IEEE Intelligent Systems, 16(1):26-35.
[19] Fensel D, van Harmelen F, Horrocks I, McGuinness DL, Patel-Schneider PF, (2001) OIL: An Ontology Infrastructure for the Semantic Web, IEEE Intelligent Systems, 16(2):38-45.
[20] Gruber TR, (1995) Toward principles for the design of ontologies used for knowledge sharing, Int. J Hum Comput Stud, vol. 43, no. 5-6, pp. 907-928.
[21] Chau KW, (2007) An ontology-based knowledge management system for flow and water quality modeling, Advances in Engineering Software, vol. 38, pp. 172-181.
Use of Ontologies for Cross-lingual Information Management in the Web

Ben Hachey, Claire Grover, Vangelis Karkaletsis†, Alexandros Valarakos†, Maria Teresa Pazienza, Michele Vindigni, Emmanuel Cartier‡, José Coch‡

Division of Informatics, University of Edinburgh, bhachey, grover@
† Institute for Informatics and Telecommunications, NCSR "Demokritos", vangelis, alexv@iit.demokritos.gr
D.I.S.P., Universita di Roma Tor Vergata, pazienza, vindigni@info.uniroma2.it
‡ Lingway, emmanuel.cartier, Jose.Coch@

Abstract

We present the ontology-based approach for cross-lingual information management of web content that has been developed by the EC-funded project CROSSMARC. CROSSMARC can be perceived as a meta-search engine which identifies domain-specific information from the Web. To achieve this, it employs agents for web crawling, spidering, information extraction from web pages, data storage, and data presentation to the user. Domain ontologies are exploited by each of these agents in different ways. The paper presents the ontology structure and its maintenance before describing how the domain ontologies are exploited by the CROSSMARC agents.

1 Introduction

The EC-funded R&D project CROSSMARC proposes a methodology for the management of information from web pages across languages. It is a full-scale approach, starting with the identification of web sites in various languages that contain pages in a specific domain. Next, the system locates domain-specific web pages within the relevant sites and extracts specific product information from these pages. Finally, the end user interacts with the system through a search interface allowing them to select and view products according to the characteristics they deem important. A unique ontology structure is exploited throughout this process in different ways. (Two domains are being implemented during the term of the project: laptops and job offers.)

Domain-specific spidering is managed by the Spidering Agent. The Spidering Agent identifies domain-specific web pages grouped under the sites discovered by the Crawling Agent and feeds them to the Information Extraction Agent.

The Information Extraction Agent manages communication with remote information extraction systems (four such systems are employed for the four languages of the project). These systems process web pages collected by the Spidering Agent and extract domain facts from them (Grover et al., 2002). The facts are stored in the system's database.

Information storage and retrieval is managed by the Data Storage Agent. Its tasks consist of maintaining a database of facts for each domain, adding new facts, updating already stored facts and performing queries on the database. Finally, information presentation is managed by the Personalisation Agent, which allows the presentation to be adapted to user preferences and locale.
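The agent flow sketched above can be pictured as a simple pipeline. The classes and method names below are invented for illustration and are not CROSSMARC's actual API; each stage merely hands its result to the next.

```python
# Hypothetical sketch of the CROSSMARC agent pipeline; placeholder logic only.
class CrawlingAgent:
    def run(self, domain):                    # find candidate web sites
        return ["http://example.org/shop"]

class SpideringAgent:
    def run(self, sites):                     # keep domain-specific pages
        return [s + "/laptops.html" for s in sites]

class InformationExtractionAgent:
    def run(self, pages):                     # extract domain facts per page
        return [{"page": p, "facts": {}} for p in pages]

class DataStorageAgent:
    def __init__(self):
        self.db = []
    def store(self, facts):                   # maintain the facts database
        self.db.extend(facts)

storage = DataStorageAgent()
pages = SpideringAgent().run(CrawlingAgent().run("laptops"))
storage.store(InformationExtractionAgent().run(pages))
print(storage.db)
```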
CROSSMARC is a cross-lingual, multi-domain system for product comparison. The goal is to cover a wide area of possible knowledge domains and a wide range of conceivable facts in each domain; hence, the CROSSMARC model implements a shallow representation of knowledge for each domain in an ontology (Pazienza et al., 2003). A domain ontology reflects a degree of expert knowledge for that domain. Cross-linguality is achieved through the lexical layer of the ontology, which provides language-specific synonyms for all ontology entries. In the overall processing flow, the ontology plays several key roles:

During Crawling & Spidering, it comes into use as a "bag of words", that is, a rough terminological description of the domain that helps CROSSMARC crawlers and spiders to identify the interesting web pages.

During Information Extraction, it drives the identification and classification of relevant entities in textual descriptions. It is also used during fact extraction for the normalisation and matching of named entities.

During Data Storage & Presentation, the lexical layer of the ontology makes possible an easy rendering of a product description from one language to another. User stereotypes maintained by the Personalisation Agent include ontology attributes in order to represent stereotype preferences according to the ontology. Thus, results can be adapted to the preferences of the end user, who can also compare uniform summaries of offer descriptions from web sites written in different languages.

This paper first presents the CROSSMARC ontology and discusses ontology management issues. It then details the manner in which CROSSMARC agents exploit domain-specific ontologies at various processing stages of the multi-agent architecture. It next presents related work before concluding with a summary of the current status of the project and future plans.

2 The CROSSMARC Ontology

2.1 Ontology Structure

The structure of the CROSSMARC ontology has been designed, first, to be flexible enough to be applied to different domains and languages without changing the overall structure and, second, to be easily maintainable by modifying only the appropriate features. For this reason, we have constructed a three-layered structure. The ontology consists of a meta-conceptual layer, a conceptual layer, and an instances layer. The instances layer can be further divided into concept instances and lexical instances, which provide support for multilingual product information. For use by CROSSMARC agents, the concept instances and lexical instances are exported into XML (Figures 2 and 3).

The meta-conceptual layer defines the top-level commitments of the CROSSMARC ontology architecture, defining the language used in the conceptual layer. It denotes three meta-elements (features, attributes, and values), which are used in the conceptual level to assign computational semantics to elements of the ontology. This layer also defines the structure of the templates that will be used in the information extraction phase. In essence, the meta-conceptual layer specifies the top-level semantics of CROSSMARC across domains.

The conceptual layer is composed of the concepts that populate the specific domain of interest.
These concepts follow the structure defined in the meta-conceptual layer for their internal representation and for the relationships amongst them. Each concept element is discriminated by the use of a unique identity (ID) number, which is called an onto-reference. This conceptual layer defines the semantics of a given domain. An important aspect of this is the domain-specific information extraction template.

Finally, the instances layer represents domain-specific individuals. It consists of two types of instances: (1) concept instances, which act as the normalised values of each individual, and (2) lexical instances, which denote linguistic relationships between concepts or instances for each natural language. Concepts are instantiated in this layer by populating their attributes with appropriate values. Every instance is unique, and a unique identity number, named an onto-value, is attributed to it.

As previously mentioned, lexical instances support multilingual information. They are instantiated in a domain-specific lexicon for each natural language supported (currently English, Greek, French, and Italian). Here, possible instantiations of ontology concepts for each language are listed as synonyms. The "idref" attribute on synonym list nodes associates lexical items with the ontology concept or instance that they correspond to. In addition, regular expressions can be provided for each node of a lexicon for a broader coverage of synonyms.

We can illustrate the overall ontology structure with an example concept instantiation from the laptop domain. Again, the way we describe the structure of the domain is constrained by the meta-conceptual layer. The conceptual layer defines the items of interest in the domain; for laptops, these include information about the brand (e.g. manufacturer name, model), about the processor (e.g. brand, speed), about preinstalled software (e.g. OS, applications), and so on. Finally, in the instances layer, we declare instances of concepts and provide a list of possible lexical realisations. For example, the exported domain ontology in Figure 2 lists 'Fujitsu-Siemens' as an instance of the manufacturer name concept, and the exported English lexicon in Figure 3 lists alternative lexical instantiations of 'Fujitsu-Siemens'.
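Since the exported XML of Figures 2 and 3 is not reproduced here, the following sketch mimics the same three-layer arrangement in Python: a concept with an onto-reference, a concept instance with an onto-value, and per-language synonym lists linked back by idref. The identifiers and the extra synonym "FSC" are invented stand-ins.

```python
# Sketch of the layered ontology and multilingual lexicon; IDs are invented.
concepts = {"c1": "manufacturer name"}          # conceptual layer (onto-reference)
instances = {"i7": ("c1", "Fujitsu-Siemens")}   # concept instance (onto-value)

lexicons = {                                    # lexical instances per language
    "en": {"i7": ["Fujitsu-Siemens", "Fujitsu Siemens", "FSC"]},
    "fr": {"i7": ["Fujitsu-Siemens", "Fujitsu Siemens"]},
}

def normalise(surface, lang):
    """Map a surface form found in a page to its onto-value, if any."""
    for idref, synonyms in lexicons.get(lang, {}).items():
        if surface.lower() in (s.lower() for s in synonyms):
            return idref, instances[idref][1]
    return None

print(normalise("fsc", "en"))   # -> ('i7', 'Fujitsu-Siemens')
```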
Though it is common knowledge that conceptual clustering differs from one language to the next, the ontology structure described is sufficient for product comparison. Firstly, this is because commercial products are fairly international, cross-cultural concepts. Secondly, the ontology design phase of adding a new domain provides a forum for discussing and addressing linguistic and cultural differences.

2.2 Ontology Maintenance

After a survey of existing ontology editors and tools, we decided to use Protégé-2000 as the tool for ontology development and maintenance in CROSSMARC. We modified and improved the Protégé model of representation and the user interface in order to fit CROSSMARC's user needs and to facilitate the process of editing CROSSMARC ontologies. This work has led to the development and release of a range of tab plug-ins dedicated to the editing of sections of the ontology related to specific steps in the ontology maintenance process.

The default Protégé editing tabs are divided into Class, Slots and Instances. Although this organisation is quite logical, it was impractical for the purposes of CROSSMARC, as the Class view of the knowledge base puts together the domain model, the lexicons, and the meta layers. For this reason we developed several plug-in tabs (described in Table 1) that focus the attention on each different aspect of the knowledge base, allowing for more functional inspection and editing of the specific component under analysis. For more information on ontology maintenance in CROSSMARC, refer to (Pazienza et al., 2003).

Table 1: CROSSMARC Protégé tabs and associated maintenance tasks: world modelling; creation of a task-oriented model to be used as a template for purposes of fact extraction; upgrade of the lexicon for the ontology; import and export of the ontology and lexicons in XML according to the schema adopted in CROSSMARC.

3 Ontology Use in CROSSMARC

3.1 Crawling & Spidering

The CROSSMARC implementation of crawling exploits the topic-based website hierarchies used by various search engines to return web sites under given points in these hierarchies. It also takes a given set of queries, exploiting the CROSSMARC domain ontologies and lexicons, submits them to a search engine, and then returns the sites that correspond to the pages returned. The list of web sites output by the crawler is filtered using a light version of the site-specific spidering tool (NEAC) implemented in CROSSMARC, which also exploits the ontology.

The CROSSMARC web spidering tool explores each site's hierarchy starting at the top page of the site, scoring the links in the page and following "useful" links. Each visited page is evaluated, and if it describes one or more offers, it is classified as positive and is stored in order to be processed by the information extraction agent. Thus, the CROSSMARC web spidering tool integrates decision functions for page classification (filtering) and link scoring.

Supervised machine learning methods are used to create the page classification and link scoring tools. The development of these classifiers requires the construction of a representative training set that will allow the identification of important distinguishing characteristics for the various classes. This is not always a trivial task, particularly so for web page classification. We devised a simple approach based on an interactive process between the user (the person responsible for corpus formation) and a simple nearest-neighbour classifier. The resulting Corpus Formation Tool presents difficult pages to the user for manual classification in order to build an accurate domain corpus with positive and negative examples.
For the feature vector representation of the web pages, which is required both by the corpus formation tool and by the supervised learning methods, we use the domain ontology and lexicons. A specialised vectorisation module has been developed that translates the ontology and the lexicons into patterns to be matched in web pages. These patterns vary from simple key phrases and their synonyms to complex regular expressions that describe numerical ranges and associated text. The vectorisation module generates such a pattern file (the feature definition file), which is then used by an efficient pattern matcher to translate a web page into a feature vector. In the resulting binary feature vector, each bit represents the existence of a specific pattern in the corresponding web page. A detailed discussion and evaluation of the CROSSMARC crawling and spidering agents can be found in (Stamatakis et al., 2003).
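A rough sketch of this vectorisation step is given below: each ontology or lexicon entry becomes a pattern, and a page is mapped to one bit per pattern. The three patterns are invented examples of the kinds described above (a key phrase, a named instance, and a numeric range).

```python
# Sketch of ontology-driven page vectorisation; the patterns are examples only.
import re

feature_definitions = [
    (r"\blaptops?\b",           "concept: laptop"),
    (r"\bFujitsu[- ]Siemens\b", "instance: Fujitsu-Siemens"),
    (r"\b\d{2,4}\s*MHz\b",      "numeric range: processor speed"),
]

def vectorise(page_text):
    # one bit per pattern: 1 if the pattern occurs in the page, else 0
    return [1 if re.search(pattern, page_text, re.IGNORECASE) else 0
            for pattern, _label in feature_definitions]

page = "Fujitsu-Siemens laptop with a 1200 MHz processor"
print(vectorise(page))          # -> [1, 1, 1]
```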
3.2 Information Extraction

Information extraction from the domain-specific web pages collected by the crawling and spidering agents involves two main sub-stages. First, an entity recognition stage identifies named entities (e.g. product manufacturer name, company name) in descriptions inside the web page written in any of the project's four languages (Grover et al., 2002). After this, a fact extraction stage identifies those named entities that fill the slots of the template specifying the information to be extracted from each web page. For this we combine wrapper-induction approaches to fact extraction with language-based information extraction in order to develop site-independent wrappers for the domain.

Although each monolingual information extraction system (four such systems are currently under development) employs different methodologies and tools, the ontology is exploited in roughly the same way. During the named-entity recognition stage, all the monolingual IE systems employ a gazetteer look-up process in order to annotate in the web page those words and phrases that belong to its gazetteers. These gazetteers are produced from the ontology and the corresponding language-specific lexicon through an automatic or semi-automatic process.

During the fact extraction stage, most of the IE systems employ a normalisation module. This runs after the identification of the named entities or expressions that fill a fact slot according to the information extraction template (i.e. the entities representing the product information that will eventually be presented to the end user). The ontology and the language-dependent lexicons are used for the normalisation of the recognised names and expressions that fill fact slots. As a first step, names and expressions are matched against entries in the ontology. If a match is not found, names and expressions are matched against all synonyms in the four lexicons. Whenever a match is found, the text is annotated with the characteristic "ontoval", which takes as its value the ID of the corresponding node from the domain ontology. If no match is found for a name or expression belonging to a closed class, its "ontoval" characteristic takes the value of the ID of the corresponding "unknown" node. If the name or expression belongs to an open set, the ID of the category is returned. In the case of annotated numeric expressions, the module returns not only the corresponding ID of the ontology node but also the normalised value and unit.
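The normalisation chain just described can be summarised in a short sketch: a slot filler is first checked as a numeric expression (returning a normalised value and unit), and otherwise looked up in the ontology, with an "unknown" node as the closed-class fallback. All node IDs are invented, and the lexicon-synonym step is omitted for brevity.

```python
# Sketch of slot-filler normalisation with "ontoval" annotations; invented IDs.
import re

ontology_nodes = {"fujitsu-siemens": "manu_01"}    # known ontology entries
unknown_nodes = {"manufacturer": "manu_unknown"}   # per-slot "unknown" nodes

def normalise_filler(text, slot):
    m = re.match(r"([\d.]+)\s*(GHz|MHz|GB|MB)", text, re.IGNORECASE)
    if m:   # numeric expression: ontoval plus normalised value and unit
        return {"ontoval": "num_" + slot, "value": float(m.group(1)),
                "unit": m.group(2)}
    node = ontology_nodes.get(text.lower())
    return {"ontoval": node if node else unknown_nodes.get(slot)}

print(normalise_filler("1.4 GHz", "speed"))
print(normalise_filler("Fujitsu-Siemens", "manufacturer"))
print(normalise_filler("NoName Corp", "manufacturer"))
```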
3.3 Information Storage & Presentation

The information extracted and normalised by the monolingual IE systems is stored in the CROSSMARC database by the Data Storage Agent. A separate database is constructed for each domain covered. The structure of the database is determined by the fact extraction schema, which is generated by the Template Editor Tab implemented in Protégé.

The ontology is also exploited for the presentation of information in the CROSSMARC end-user interface. The user interface design (see Figures 4 and 5) is based on a web server application which accesses the data source (i.e. the Data Storage output) and provides the end user with a web interface for querying data sets. This interface is customised according to a user profile or stereotype maintained by the personalisation agent and defined with respect to the domain ontology. Each query is forwarded to the Data Storage component, and query results are presented to the user after subsequent XSL transformation stages. These transformations select the relevant features according to the user's profile and apply appropriate lexical information to them by accessing the normalised lexical representations corresponding to the user's language preferences.

Figure 4: Screen shot of the CROSSMARC search form.

Figure 5: Screen shot of the CROSSMARC search results display.

4 Related Work

In recent years, the increasing importance of the Internet has re-oriented the information extraction community somewhat toward tasks involving texts such as e-mails, web pages, web logs and newsgroups. The main problems encountered by this generation of IE systems are the high heterogeneity and the sparseness of the data in such domains. Machine learning techniques and ontologies have been employed to overcome those problems and improve system performance. RAPIER (Califf and Mooney, 1997) is such a system; it extracts information from computer job postings on a USENET newsgroup. It uses a lexical ontology, exploiting the hypernym relationship to generalise over a semantic class of a pre- or post-filler pattern. Following the same philosophy, CRYSTAL (Soderland et al., 1995) uses a domain ontology to relax the semantic constraints of its concept node definitions by moving up the semantic hierarchy or dropping constraints in order to broaden the coverage. The WAVE (Aseltine, 1999) algorithm exploits a semantic hierarchy restricted to a simple table look-up process to assign a semantic class to each term. And in (Vargas-Vera et al., 2001), an ontology is used to recognise the type of objects and to resolve ambiguities in order to choose the appropriate template for extraction.

IE systems have encountered another limitation as regards the static nature of the background knowledge (i.e. the ontology) they use. For that reason, bootstrapping techniques for semantic lexicon and ontology extension during the IE process have been introduced. (Brewster et al., 2002) use an ontology to retrieve examples of lexicalisations of relations amongst concepts in a corpus, in order to discover new instances which can be inserted into the ontology after the user's validation. In (Maedche et al., 2002) and (Roux et al., 2000), the initial IE model is improved through extension of the ontology's instances or concepts, exploiting syntactic resources.

Ontologies are also used to alleviate the lack of annotated corpora. (Poibeau and Dutoit, 2002) employ an ontology to overcome this limitation for an information extraction task. The use of ontologies in this work is twofold. First, an ontology is used to normalise the corpus by replacing instances with their corresponding semantic class, using a named entity recogniser to identify the instances. Second, it generates patterns exploiting the semantic proximity between two words (where one of them is the word that should be extracted) in order to propose new patterns for extraction. The ontology used in this work is a multilingual net over five languages with more than 100 different kinds of links.

Kavalec (2002) conducted an ontological analysis of web directories and constructed a meta-ontology of directory headings plus a collection of interpretation rules that accompany the meta-ontology. He treats the meta-ontology schema as a template for IE and uses the ontology's schema and interpretation rules to drive the information extraction process in the sense of filling a template. Another work (Craven et al., 1999) uses an ontology that describes classes and relationships of interest, in conjunction with labelled regions of hypertext representing instances of the ontology, to create an information extraction method for each desired type of knowledge and to construct a knowledge base from the WWW.

5 Current Work and Conclusions

CROSSMARC is a novel, cross-lingual approach to e-retail comparison that is rapidly portable to new domains and languages. The system crawls the web for English, French, Greek, and Italian pages in a particular domain, extracting information relevant to product comparison.

We have recently performed a user evaluation of the CROSSMARC system in the first domain. This evaluation consisted of separate user tasks concerning the crawling, spidering, and information extraction agents as well as the end-user interface (Figures 4 and 5). We are in the process of analysing the results and are scheduling further user evaluations.

We are also currently porting the system into the domain of job offers. An important result of this will be the formalised customisation strategy. This will detail the engineering process for creating a product comparison system in a new domain, a task that consists broadly of developing a new domain ontology, filling lexicons, and training the crawling, spidering, and information extraction tools.

The CROSSMARC system benefits from a novel, multi-level ontology structure which constrains customisation to new domains. Furthermore, domain ontologies and lexicons provide an important knowledge resource for all component agents. The resulting system deals automatically with issues that semantic web advocates hope to solve: namely, the web is built for human consumption and thus uses natural language and visual layout to convey content, making it difficult for machines to effectively exploit Web content.
CROSSMARC explores an approach to extracting and normalising product information that is adapted to new domains with minimal human effort.

References

J. H. Aseltine. 1999. WAVE: An incremental algorithm for information extraction. In Proceedings of the 16th National Conference on Artificial Intelligence (AAAI 1999).

C. Brewster, F. Ciravegna, and Y. Wilks. 2002. User-centred ontology learning for knowledge management. In Proceedings of the 7th International Workshop on Applications of Natural Language to Information Systems.

M. E. Califf and R. J. Mooney. 1997. Relational learning of pattern-match rules for information extraction. In Proceedings of the 1st Workshop on Computational Natural Language Learning (CoNLL-97).

M. Craven, D. DiPasquo, D. Freitag, A. McCallum, K. Nigam, T. Mitchell, and S. Slattery. Learning to construct knowledge bases from the world wide web. Artificial Intelligence, 118:69-113.

C. Grover, S. McDonald, D. Nic Gearailt, V. Karkaletsis, D. Farmakiotou, G. Samaritakis, G. Petasis, M. Pazienza, M. Vindigni, F. Vichot and F. Wolinski. 2002. Multilingual XML-based named entity recognition for e-retail domains. In Proceedings of the 3rd International Conference on Language Resources and Evaluation.

M. Kavalec and V. Svátek. 2002. Information extraction and ontology learning guided by web directory. In Proceedings of the 15th European Conference on Artificial Intelligence.

A. Maedche, G. Neumann, and S. Staab. 2002. Bootstrapping an ontology-based information extraction system. In P. S. Szczepaniak, J. Segovia, J. Kacprzyk, and L. A. Zadeh (eds), Intelligent Exploration of the Web.

M. T. Pazienza, A. Stellato, M. Vindigni, A. Valarakos, and V. Karkaletsis. 2003. Ontology integration in a multilingual e-retail system. To appear in Proceedings of the Human Computer Interaction International (HCII'2003).

T. Poibeau and D. Dutoit. 2002. Generating extraction patterns from large semantic networks and an untagged corpus. In Proceedings of the 19th International Conference on Computational Linguistics.

C. Roux, D. Proux, F. Rechenmann, and L. Julliard. An ontology enrichment method for a pragmatic information extraction system gathering data on genetic interactions. In Proceedings of the ECAI 2000 Workshop on Ontology Learning.

S. Soderland, D. Fisher, J. Aseltine, and W. Lehnert. 1995. Issues in inductive learning of domain-specific text extraction rules. In Proceedings of the Workshop on New Approaches to Learning for Natural Language Processing.

K. Stamatakis, V. Karkaletsis, G. Paliouras, J. Horlock, C. Grover, J. Curran, and S. Dingare. 2003. Domain-specific web site identification: The CROSSMARC focused web crawler. To appear in Proceedings of the Second International Workshop on Web Document Analysis.

M. Vargas-Vera, J. Domingue, Y. Kalfoglou, E. Motta, and S. Shum. 2001. Template-driven information extraction for populating ontologies. In Proceedings of the IJCAI 2001 Workshop on Ontologies Learning.
Building Ontologies for Knowledge Management Applications in Group Sessions

Doris Meier, Fraunhofer Institute for Industrial Engineering, Nobelstrasse 12, 70569 Stuttgart, Germany, doris.meier@iao.fhg.de
Carsten Tautz, empolis Knowledge Management GmbH, Sauerwiesen 2, 67661 Kaiserslautern, Germany, Carsten.Tautz@
Ralph Traphöner, empolis Knowledge Management GmbH, Sauerwiesen 2, 67661 Kaiserslautern, Germany, Ralph.Traphoener@
Michael Wissen, Fraunhofer Institute for Industrial Engineering, Nobelstrasse 12, 70569 Stuttgart, Germany, Michael.Wissen@iao.fhg.de
Jürgen Ziegler, Fraunhofer Institute for Industrial Engineering, Nobelstrasse 12, 70569 Stuttgart, Germany, Jürgen.Ziegler@iao.fhg.de

ABSTRACT

This paper describes a system supporting the engineering of ontologies in group sessions. Here we use the term ontology as a synonym for the corporate vocabulary which is used as the basis for a knowledge management system. We discuss the MetaChart method for distributed groups working in creative sessions with intuitive system support. The groups may create information structures in these sessions and save their knowledge. We show the application of the method and finally outline the organization of the information.

KEYWORDS

Distributed Knowledge Capture, Group Knowledge Capture, Ontologies

INTRODUCTION

In today's economy, people become more and more specialized as the processes to be performed become increasingly knowledge-intensive. Consequently, knowledge is not "duplicated" among the company staff without a dedicated effort. Instead, specialized knowledge is "owned" by experts. To avoid needless double work (because some colleague has already solved the problem) and to avoid the unplanned "forgetting" of knowledge (e.g., if an expert leaves the company), knowledge needs to be shared explicitly. The sharing of knowledge becomes vital to the competitiveness of a company.

However, to be able to share knowledge effectively, it is most helpful to establish a common vocabulary. Without such a common vocabulary, it is difficult to share knowledge because not everybody in the company interprets the information in the same way. Hence, the corporate vocabulary is the basis for any knowledge management system. As such, it acts as meta-knowledge enabling and guiding both the indexing and recording of new documents as well as the search for existing documents. We use an object-oriented representation for this vocabulary, consisting of classes (one for each kind of document) and typed attributes. The value ranges of the attribute types are either numeric (integer, real, date, etc.) or enumerations. Enumerations are specified using concepts and keys, where keys are the various word forms in which a particular concept can appear in a document (e.g. the keys "house", "houses", "maison", "Haus", "Häuser" for the concept "house"). This allows for automatic knowledge extraction from existing documents.

If the vocabulary is specified explicitly and is used by a knowledge management system, it is also often referred to as an ontology (Gruber, 1995). Clearly, all stakeholders must be involved in defining such an ontology, as it decides on the usefulness of the resulting knowledge management system (Tautz, 2001). Each stakeholder (sponsor, implementers, future users, etc.) has his own distinct objectives, interests, and views. Therefore, bringing together the ideas of all stakeholders and detailing them to the degree needed is often a formidable and time-consuming task.
This is especially true for knowledge management systems, because existing knowledge (often from many knowledge sources) needs to be identified and structured.

Yet, existing ontology engineering methods (for an overview see (López, 1999)) and the authoring environments of existing knowledge management systems fail to support this process in its entirety. Although there are systems guiding knowledge acquisition from users (e.g., Blythe et al., 2001) or supporting the collaborative construction of ontologies (e.g., Farquhar et al., 1996), group discussions in which an ontology is incrementally developed from vague ideas to a concrete and detailed structure are not supported.

In a typical group discussion, some areas will have very detailed descriptions from the beginning (e.g., if a database exists, its schema can be readily used as part of the ontology), while others will be vague (this is usually the case for knowledge that has not been captured explicitly so far). These pieces of information are then gradually detailed and extended until a complete ontology results.

The difficulty in engineering ontologies lies mainly in the transformation of human knowledge that is distributed among several stakeholders into structured knowledge that can be processed by computers. The complexity of this integration process requires a communication and cooperation model which goes considerably beyond simple group support and takes into account the further use of the elaborated results.

A successful approach should support the following tasks:

- Centralized as well as distributed group work
- Personalized views of the information
- Simultaneous work on the same objects, with notification events

With this background in mind, we concentrate on finding an intuitive method for capturing knowledge about ontologies, that is, about objects and their associations. Information objects generated in the context of group cooperation (concerning computer-supported cooperative work see Greif, 1988) may be used to derive object structures as well as meta-activities. At the end of such a session we have a complete overview of the structure of the information.

SCENARIO

For the following parts of the paper, a simple scenario will be used to show the advantages of the MetaChart method. In this scenario, the task of a group of stakeholders is to identify and structure all existing types of knowledge within the company "Smartco". Smartco produces home entertainment equipment. The task focuses on the exploitation of these knowledge sources for a knowledge management system which is about to be installed at Smartco. Four stakeholders are involved:

- The sponsor Susan (who can decide whether a given knowledge source is to be exploited or not)
- The database expert Greg (who has detailed knowledge of the trouble ticket database)
- The project manager Sharon (who has already managed several projects and has a good overview of what kind of knowledge could be helpful in future projects)
- The implementer John (who is responsible for setting up the knowledge management system)

While Sharon and John have met personally for this discussion, Susan and Greg can join the meeting only remotely due to their tight schedules.

The goal is to come up with an object-oriented model which describes the ontology and can be used as the basis for the later knowledge management system.

CHARACTERISTICS OF GROUP SESSIONS

What we aim at is enhanced support for the user in the early phase of information structuring.
Here it is of utmost importance which interaction possibilities are available to the user and what results are obtained. Group-based interaction possibilities for cooperative work in creative environments, with their special needs, are to be supported.

To work productively with a computer system in a creative group session, the system must fulfill some requirements. First, the system has to be highly intuitive. The user must not be hindered by having to struggle with the system; it must not take more time to edit results within the system than it would on plain paper. Furthermore, there should not be a high effort in learning to operate the system; every group member should be able to use it instantly.

Second, it should be possible for different group members to personalize their views of the objects. For example, a session may have been started in a group situation based on a powerwall. After the session, every member takes the results with him. To make use of these results, he should be able to personalize his view of the structure, for example to emphasize certain parts without losing the rest of the information.

Another requirement of a creative group session is the possibility to begin a session at a very low level and add semantics to the structures step by step. First, for example, one could outline the main objects to be modeled. Then, associations between the objects like inheritance, aggregation or other interconnections could be added. Some objects should perhaps be extracted from the session and modeled in a group session of their own, whose results flow back into the originating session. It is important that the system is open to any kind of semantic structure that the group desires to build; the group is completely free and not handicapped by predefined structures it must conform to.

Creativity and cooperation support do not depend solely on the development of software-based tools. The physical environment where these tools are applied, with its interaction possibilities, is of equal importance. The implementation of the MetaChart method provides not only for use on customary PCs but also on large interactive interfaces such as the powerwall "Interactive Wall" at the Fraunhofer IAO (Wissen et al., 2001), with their various types of interaction.

Emphasizing the aspect of distributed work, it is of utmost importance that every group member can participate in the session. Certainly this has to be combined with traditional methods of distributed group work like video conferencing. Using MetaChart, it is furthermore not only possible to discuss with other group members, but also to work together on the same structures and to see any change instantly.

OVERVIEW OF THE METACHART METHOD

The conception of MetaChart presented in this contribution finds its application in the preparation and exchange of information and information structures within groups. We aim at the creation of a work environment in which group members can intuitively prepare information contents and link them into structures as simply as possible. The specific phases of building up information structures as well as generating information and ideas are of particular importance. In each of these phases we have several scenarios reflecting different kinds of group work.
These scenarios consider cooperation taking place at a single location as well as in a distributed environment. Following the model-view-controller paradigm (Krasner and Pope, 1988), each member of the group may simultaneously work on the same model and can therefore interact from any place on a shared work space. Modifications of the model layer, e.g. adding attributes, creating links or structuring information objects, are visible to all participants. The distributed work space, accessible to all team members, simplifies the process of building up information as a basis for organizational and structural tasks.

Supplementing this support for group work is the possibility to use the MetaChart method in a distributed environment. A group member who cannot join a session personally may connect via the Internet to a running session and interact directly with the group, working actively and receiving all changes instantly, as well as view the results later on, for example in the included web interface.

The work surface does not only serve as a pure presentation medium; in fact, it simplifies the process of information building as a central platform for organizational and structural activities in combination with different input and interaction possibilities, fulfilling the requirements of a group session as mentioned above.

USER INTERFACE

As outlined before, the user interface has to be highly flexible and intuitive to be accepted as a proper instrument during creative group sessions. For intuitiveness there are two important aspects:

1. the GUI must meet the modern requirements of usability engineering
2. the methods for inputting data must match the special requirements of group sessions

The MetaChart application offers different suitable methods for gathering input, like the recognition of mouse gestures or character recognition. Furthermore, while working in a session, arbitrary data can be added to that session. Objects, which stand in associations to other objects, may contain content in formats as different as plain text, images, web links, results of program calls and so on.

To point out a typical work situation, we use the scenario to show how MetaChart can be used.

- Greg imports the schema of the trouble ticket database, resulting in a list of attributes: organization and contact information of who sent the trouble ticket, date of the trouble ticket, short title, description, date when the problem was solved
- John argues that "date when problem was solved" does not make sense. Instead, it should be captured how long it takes until a trouble ticket is closed.
- Susan adds that the solution should be added to the trouble ticket. Up to now, problem and solution were not stored together.
- John intervenes that trouble tickets need to be further characterized to enable meaningful search.
- Sharon remembers that she received several solution reports for her trouble tickets and opens three exemplary solution reports.
- John and Sharon analyze the solution reports for commonalities and differences to come up with additional attributes.
- They realize that (a) a trouble ticket may have several (alternative) solutions and (b) a solution may be a remedy for several trouble tickets. This calls for an n:m relationship.
This calls fora n:m relationship.- Sharon suggests a taxonomy for characterizing trouble tickets.- Meanwhile Greg queries the database to retrieve the most recent 20 trouble tickets.- Sharon and Greg try to classify those according to Sharon’s taxonomy.- In parallel, John analyzes the reports further and marks several words in the texts. These constitutenew objects. Example for report (excerpt): “To en-sure that the screw does not gradually becomeloose, do not reinsert the screw to fix the cassettemechanics. Rather glue it. …” (italic words arethose marked by John – they should become partof the ontology)In MetaChart, the following screenshot (Figure 1) could be a result of the group session.Figure 1: Snapshot of MetaChart Application ORGANISATION OF INFORMATIONFor implementing knowledge management solutions, we use orenge, a software system by empolis1 that is specifi-cally designed for the effective and efficient structuring and retrieval of already existing knowledge sources, e.g., a col-lection of documents or databases. orenge’s behavior is based on a meta-model. Each knowledge source is de-scribed by a set of attribute value pairs. The meta-model defines which set of attributes is to be used for each type of knowledge source. Such a definition is called an aggregate. Therefore, documents can be organized into classes where each class has its own characteristic set of attributes or ag-gregate.All attributes of the meta-model are typed. In orenge an extensive array of predefined types is available which can be further detailed for a specific application. One of the most common specializations concerns the Text type. It is typically done by enumerating possible values called con-cepts for its range. The taxonomy used for the trouble tick-ets is an example. Of course, such a scheme would have to be agreed on by the stakeholders of a knowledge manage-ment system.This basic meta-model is enhanced by several other models which describe various aspects of the final knowledge man-agement application. At this point, only two shall be de-scribed in detail:1. The import model describes which databases ordocuments should be analysed for structuring thealready available information. In addition, it de-fines which columns (in case of databases) or1 which text sections (in case of documents) will beused for analysis. For example, the section labelled“Affected Product” could be used to analyse theproduct a trouble ticket is about.2. The analysis model describes the linguistic analy-sis of knowledge sources. The analysis automati-cally extracts the values for the aggregate’s attrib-utes from a textual description. One way of identi-fying values is using so-called keys. For example,if the keys “TV”, “TV set”, “television”, and“television set” were associated with the concept“TV”, a trouble ticket would be associated to theproduct “TV” if one of these keys would appear inthe “Affected Product” section.To define a complete meta-model, empolis conducts model-ling workshops with its customers. The experience in these workshops has been that attributes, types (incl. ranges), keys, and import models tend not to be defined in a sequen-tial order but rather based on whatever comes to mind first. Therefore, a knowledge capturing approach as it is sup-ported by the MetaChart approach is very helpful.In contrast to the orenge model with its strictly defined se-mantics, the MetaChart data model is more generic. It al-lows defining objects and relationships between these ob-jects. 
In contrast to the orenge model with its strictly defined semantics, the MetaChart data model is more generic. It allows defining objects and relationships between these objects. Objects (visualized by a rectangle) can be arranged graphically in two ways: one rectangle can be contained in another, or two rectangles can be associated via an arrow. In addition, each rectangle can be associated with a set of attributes (content) and meta data (describing data). An export tool interprets the MetaChart and generates an initial (meaningful) orenge meta-model.

In the MetaChart application, there is not a lot of pre-built semantics in the data model. The semantics has to be added while transforming it into an ontology. For example, a hierarchical relation between two objects can mean inheritance or aggregation; at the moment of creation this is of no importance to the system. Later, in the process of transformation, it is specified exactly how the objects relate to each other. For our purpose, we defined the following translation rules, among others (a sketch of these rules follows the list):

- One meta-attribute of each object describes its type (aggregate, attribute, type, concept, import model, or knowledge source).
- The containment relationship is interpreted according to its elements. For example, if the surrounding object is typed as an aggregate, its elements are interpreted as attributes. If the surrounding object is typed as a type and its inner elements are typed as concepts, the inner elements constitute the range of the type. If the inner elements are also typed as types, the union of the ranges of the inner elements will be taken as the range of the surrounding type (this corresponds to orenge's "compound type").
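The sketch below interprets a small MetaChart structure along the lines of these rules; the node layout and names are illustrative and do not reproduce the actual export module.

```python
# Sketch of the containment translation rules: a node is a typed object with
# children, and containment is interpreted by the types involved.
def interpret(node):
    kind, name = node["type"], node["name"]
    children = node.get("children", [])
    if kind == "aggregate":                     # inner elements become attributes
        return {name: [c["name"] for c in children]}
    if kind == "type":
        inner = [c["type"] for c in children]
        if all(t == "concept" for t in inner):  # concepts form the range
            return {name: {c["name"] for c in children}}
        if all(t == "type" for t in inner):     # compound type: union of ranges
            merged = set()
            for c in children:
                merged |= interpret(c)[c["name"]]
            return {name: merged}
    return {}

colour = {"type": "type", "name": "colour",
          "children": [{"type": "concept", "name": "red"},
                       {"type": "concept", "name": "blue"}]}
print(interpret({"type": "aggregate", "name": "TroubleTicket",
                 "children": [{"type": "attribute", "name": "title"}]}))
print(interpret(colour))
```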
FURTHER PROCESSING OF THE CONTENT IN ORENGE
Once the initial meta-model has been created with the MetaChart tool, it is exported as an orenge model. This model can then be further enhanced using orenge's authoring environment, tengerine. A knowledge engineer completes the model in various respects, thereby defining the retrieval behavior of the resulting knowledge management system. Some of the most important aspects include:
1. the similarity model, which describes how to compute the relevance of a given knowledge source to a user's query; this enables orenge to return not only exact matches but also "almost matching, but still interesting" hits (the general idea is sketched after this list);
2. explanations, which describe why a particular knowledge source is part of the search result (not always obvious when a system returns more than exact matches);
3. the dialog strategy, which defines how to guide the user to reasonable results as quickly as possible;
4. rules, which describe how to complete a user query (automatic inference or correction of query values based on the information given by the user).
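orenge's similarity model itself is proprietary; the following sketch merely illustrates the general idea of weighted, attribute-wise similarity that makes near-misses retrievable. The local similarity measure, the weights, and the example data are our own assumptions.

```python
# Illustrative sketch of weighted attribute similarity (NOT orenge's actual
# similarity model). Each attribute contributes a local similarity in [0, 1];
# the global score is their weighted average, so near-misses still rank.

def local_sim(a: str, b: str) -> float:
    """Crude local similarity: exact match scores 1, shared words score 0.5."""
    if a == b:
        return 1.0
    if set(a.lower().split()) & set(b.lower().split()):
        return 0.5
    return 0.0

def global_sim(query: dict, source: dict, weights: dict) -> float:
    """Weighted average over the attributes present in the query."""
    total = sum(weights[attr] for attr in query)
    return sum(weights[attr] * local_sim(value, source.get(attr, ""))
               for attr, value in query.items()) / total

# Hypothetical trouble-ticket query against a stored ticket.
weights = {"Affected Product": 2.0, "Symptom": 1.0}
query   = {"Affected Product": "TV", "Symptom": "loose screw"}
ticket  = {"Affected Product": "TV", "Symptom": "screw fell out"}

print(round(global_sim(query, ticket, weights), 2))  # 0.83: not exact, still a hit
```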
CONCLUSIONS
Experience has shown that the construction of ontologies for knowledge management systems (KMS) must consider the input of all stakeholders of the later KMS. However, to our knowledge, group-based knowledge capturing tools for this purpose, supporting the process of going from vague ideas to complete structured models, do not exist yet.
The outlined MetaChart approach supports this process, allowing information and content of cooperative work to be structured in the context of group sessions. Generated information objects and their incidental structures can be explored and modified in a user-specific and web-based way. Furthermore, this method offers an approach to coordinating tasks generated within a group session, and it also offers the possibility of collaborative system design.
Therefore, we have adopted this approach as a front-end authoring system for initial meta-models of orenge, a knowledge management system by empolis which
- automatically extracts meta-information from existing knowledge sources using a meta-model, and
- retrieves knowledge based on the meta-model and the extracted meta-information.
Currently, we are developing the export module for the MetaChart tool to validate the improvement in tool support for the development of group-based meta-models for knowledge management applications.
Besides the advantage of defining a model in a group session, other major benefits of the MetaChart approach lie in the possibility of late "typing" and structuring. For example, a list of words can be written down first without specifying whether it is to be interpreted as a set of attributes (making up an aggregate describing a knowledge source), a set of concepts (making up the range of a text attribute), or a set of terms (making up the list of possible occurrences of a concept). Only later in the design process does the group need to decide how to type the objects and how they relate to each other. Typing and structuring are done incrementally, as the stakeholders understand the domain increasingly better (based on their discussion of the MetaChart), and can be deferred until the MetaChart is exported as an initial orenge model. empolis found these two features to be vital during the modeling workshops it conducts with its customers, as experts tend to jump between subject areas and (at least at first) do not want to be bothered with technical details.
ACKNOWLEDGEMENTS
This work has been conducted in the context of the German Lead Project INVITE with support of the German Ministry of Education and Research.