CS 8520: Artificial Intelligence
Knowledge Representation
Paula Matuszek
Fall 2015
Introduction
- Knowledge representation means:
  - Capturing human knowledge
  - In a form a computer can reason about
- Why?
  - Model human cognition
  - Add power to search-based methods
- Actually a component of all software development
KR Introduction
- General problem in CS: solutions = data structures
  - words, arrays
  - records
  - lists, queues
  - objects
- More specific problem in AI: solutions = knowledge structures
  - decision trees
  - logic and predicate calculus
  - rules: production systems
  - description logics, semantic nets, frames, scripts
  - ontologies
We've been here before!
- Informed search: a heuristic for informed search is adding knowledge
- Constraint satisfaction: heuristics for choosing which constraint next
- Logical agents: FOL is one of the oldest forms of knowledge representation in AI
- In fact, how to formulate and describe our problems has been a part of everything we have talked about
Characteristics of a good KR
- It should:
  - Be able to represent the knowledge important to the problem
  - Reflect the structure of knowledge in the domain; otherwise our development is a constant process of distorting things to make them fit
  - Capture knowledge at the appropriate level of granularity
  - Support incremental, iterative development
- It should not:
  - Be too difficult to reason about
  - Require that more knowledge be represented than is needed to solve the problem
Kinds of Knowledge
- Things we need to talk about and reason about; what do we know?
  - Objects: descriptions, classifications
  - Events: time sequence, cause and effect
  - Relationships: among objects, between objects and events
  - Meta-knowledge
- Distinguish between knowledge and its representation
Representation Mappings
[Diagram: facts, English representation, internal representation, and reasoning programs, spanning the knowledge level and the symbol level]
- Mappings are not one-to-one
- Never get it complete or exactly right
Knowledge engineering!
- Modeling the right conditions and the right effects at the right level of abstraction is difficult
- Knowledge engineering (creating and maintaining knowledge bases for intelligent reasoning) is an entire field of investigation
- Research goal: automated knowledge acquisition and machine learning tools to fill the gap
  - We would like intelligent systems which learn about the conditions and effects, just like we do!
  - We would like intelligent systems which learn when to pay attention to, or reason about, certain aspects of processes, depending on the context!
Kinds of KR
- Decision Trees
- Logic and Predicate Calculus
- Rules: Production Systems
- Description Logics, Semantic Nets, Frames, Scripts
- Ontologies
Decision Trees
- Knowledge captured as a series of questions and responses, or decisions and outcomes
- Common in troubleshooting manuals, medical domains, etc.
- Well-known example: Animals
- Often a binary tree, but doesn't need to be
Decision Trees Example: Guessing an animal
- Does it live in the water?
  - Yes: Does it have scales?
    - Yes: Your animal is a fish.
    - No: Your animal is a frog.
  - No: Is it bigger than a breadbox?
    - Yes: Your animal is a horse.
    - No: Your animal is a chipmunk.
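A minimal Python sketch of the animal tree above, showing why inference is "just a tree walker". The tuple layout, the name `ANIMAL_TREE`, and the function `walk` are illustrative assumptions, not part of the original slides.

```python
# Internal node: (question, yes_subtree, no_subtree). Leaf: the animal's name.
ANIMAL_TREE = (
    "Does it live in the water?",
    ("Does it have scales?", "fish", "frog"),
    ("Is it bigger than a breadbox?", "horse", "chipmunk"),
)


def walk(tree, answer):
    """Follow yes/no answers from the root to a leaf.

    `answer` is any callable that maps a question string to a truthy
    (yes) or falsy (no) value.
    """
    node = tree
    while not isinstance(node, str):
        question, yes_branch, no_branch = node
        node = yes_branch if answer(question) else no_branch
    return node


# Hypothetical session: lives in water, has no scales.
answers = {"Does it live in the water?": True, "Does it have scales?": False}
result = walk(ANIMAL_TREE, answers.get)
```

Each answer discards one whole subtree, so the number of questions asked is at most the depth of the tree.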
Decision Trees
- Decision trees are relatively simple representations leading to a single conclusion or action
- Nodes represent questions/tests/decisions
- Arcs represent answers/results
- Often but not necessarily binary
- Familiar in troubleshooting, biological keys, etc.
Decision Trees: Advantages
- Easy to implement. We know a lot about trees in computer science.
- Explanations and the inference process are clear-cut and easy to explain.
- Low startup cost.
- Capture simple domains well.
- Fast to do inference: it's just a tree walker.
- Given cases with answers, can use machine learning to develop.
Decision Trees: Disadvantages
- Decision trees reflect a semi-procedural view of expertise which is often not a good match to a domain
- Difficult to modify
- Difficult to maintain tree shape, even if you allow multiple inheritance
- Intermediate state of a problem is only captured implicitly
- Don't scale well. For a complex domain, difficult to elicit from a human, hard to maintain and debug.
- May give an illusion of structure which doesn't actually reflect the domain
Decision Tree for Choosing a Wine
- Suppose you are writing an app to be used in a wine store to help a customer choose a wine, and you want to use a decision tree.
- What would it look like?
Decision Tree for Wine
Logic and Predicate Calculus
- We have already discussed this at length in conjunction with logical agents
- Very rich representation
- Formal; reasoning is well understood
- For big real-world problems it has some significant issues:
  - very bushy inference
  - does not always match human thinking well
  - excluded middle: no good choice for "don't know"
Production Rules
- Common formalism in expert systems
- Knowledge is represented as if-then rules:
  - if <condition> (LHS, left-hand side)
  - then <action> (RHS, right-hand side)
- If the car won't start, then see if the battery is dead.
- If a person is a student, then that person has an ID card.
Rules Continued
- LHS may be a test, an observation, a symptom, an already-known fact
  - If the printer won't print
  - If power test is passed
  - If strep diagnosed
- RHS may be a new fact to be asserted, an action to take, a message
  - Then see if it has power
  - Then assert (power, yes)
  - Then give antibiotics
Inference with rules
- Production rule systems typically have the usual three components:
  - the knowledge base or KB (the rules)
  - the fact base (details for this instance)
  - the inference engine (application which uses rules)
- The inference engine repeatedly applies rules from the KB to create additional facts until a stopping point is reached
Inference Engines
- As with logic, may be forward-chaining or backward-chaining. May also be mixed.
- Iterative process:
  - find rules which can be applied and add them to the agenda (analogous to a fringe)
  - pick a rule from the agenda and fire it, updating the KB
- Conflict resolution is the method of choosing a rule from among those on the agenda
  - we already know about forward and backward chaining
  - recency, specificity, explicit priority
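The match, pick, fire loop can be sketched in Python as a tiny forward-chaining engine. This is a sketch under assumptions: the rule tuples, the fact strings, and the choice of explicit priority as the conflict-resolution strategy are all made up for illustration, loosely following the printer examples above.

```python
# Each rule: (name, priority, set of facts required on the LHS, fact asserted by the RHS).
RULES = [
    ("check-power", 1, {"printer won't print"}, "test power"),
    ("power-ok", 2, {"test power", "power test passed"}, "power: yes"),
    ("check-paper", 1, {"printer won't print", "power: yes"}, "test paper"),
]


def forward_chain(rules, initial_facts):
    """Fire rules until no rule can add a new fact."""
    facts = set(initial_facts)
    fired = []
    while True:
        # Match: rules whose LHS is satisfied and whose RHS fact is not yet known.
        agenda = [
            rule for rule in rules
            if rule[2].issubset(facts) and rule[3] not in facts
        ]
        if not agenda:  # stopping point: nothing left to fire
            return facts, fired
        # Conflict resolution: highest explicit priority; ties go to the earlier rule.
        name, _priority, _lhs, new_fact = max(agenda, key=lambda rule: rule[1])
        facts.add(new_fact)
        fired.append(name)


facts, fired = forward_chain(RULES, {"printer won't print", "power test passed"})
```

Swapping the `max(...)` line for a different selection function (recency, specificity) changes which rule fires first without touching the rules themselves.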
Some Additional Issues
- Non-monotonicity. A rule may retract a fact.
  - e.g.: If printer(unplugged), then plug it in and retract printer(unplugged)
- Truth maintenance. Rules on the agenda may no longer belong there.
- Uncertainty
  - In the facts
  - In the rules
Rule-based System Exercises
- Create a small rule-based system for recommending a wine.
- Create a small rule-based system for diagnosing why your car won't start.
Rule-based Exercises
- Rules for wine?
- Rules for car?
- Which worked better for wine, DT or rules? Why?
- How did wine and car rules differ?
Evaluation of Rule-based Inference
- Advantages
  - Relatively fast
  - Captures natural human patterns
  - Modular
  - Can capture uncertainty and non-monotonicity
  - Restricted syntax simplifies editors, learning, etc.
- Disadvantages
  - Neither sound nor complete
  - Requires conflict resolution
  - Restricted syntax reduces expressiveness
  - System behavior is reliant on the conflict resolution strategy
  - Adding new rules may produce unusual effects under conflict resolution
Structured Knowledge Representations
- Modeling-based representations reflect the structure of the domain, and then reason based on the model.
  - Semantic Nets
  - Frames
  - Scripts
- Sometimes called associative networks
Basics of Associative Networks
- All include
  - Concepts
  - Various kinds of links between concepts
    - has-part or aggregation
    - is-a or specialization
    - More specialized, depending on domain
- Typically also include
  - Inheritance
  - Some kind of procedural attachment
Semantic Nets
- graphical representation for propositional information
- originally developed by M. R. Quillian as a model for human memory
- labeled, directed graph
- nodes represent objects, concepts, or situations
  - labels indicate the name
  - nodes can be instances (individual objects) or classes (generic nodes)
- links represent relationships
  - the relationships contain the structural information of the knowledge to be represented
  - the label indicates the type of the relationship
Nodes and Arcs
- Arcs define binary relationships that hold between objects denoted by the nodes.
[Diagram: nodes Sue, John, Max, 5, and 34, joined by arcs labeled mother, father, husband, wife, and age]
- mother(john,sue)
- age(john,5)
- wife(sue,max)
- age(max,34)
- ...
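Since every arc is a binary relationship, one simple way to store such a net is as a list of (node, arc label, node) triples. The sketch below encodes only the four predicates listed on the slide, reading `mother(john,sue)` as the triple `("john", "mother", "sue")`; the names `TRIPLES` and `related` are illustrative assumptions.

```python
# One triple per arc: (source node, arc label, target node).
TRIPLES = [
    ("john", "mother", "sue"),
    ("john", "age", 5),
    ("sue", "wife", "max"),
    ("max", "age", 34),
]


def related(triples, source, label):
    """Return every node reached from `source` along arcs labeled `label`."""
    return [target for s, l, target in triples if s == source and l == label]


johns_mother = related(TRIPLES, "john", "mother")
max_age = related(TRIPLES, "max", "age")
```

A query for an arc that was never recorded, such as Sue's age here, simply returns an empty list.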
Semantic Networks
- The ISA (is-a) relation is often used to link instances to classes and classes to superclasses
- Some links (e.g. haspart) are inherited along ISA paths
- The semantics of a semantic net can be relatively informal or very formal
  - often defined at the implementation level
[Diagram: nodes Animal, Bird, Robin, Rusty, Red, and Wing, joined by isa and haspart links]
Individuals and Classes
- Many semantic networks distinguish
  - nodes representing individuals and those representing classes
  - the subclass relation from the instance-of relation
[Diagram: nodes Animal, Bird, Robin, Rusty, Red, Wing, and Genus, joined by subclass, instance, and haspart links]
Inference by Inheritance
- One of the main kinds of reasoning done in a semantic net is the inheritance of values along the subclass and instance links.
- Semantic networks differ in how they handle the case of inheriting multiple different values.
  - All possible values are inherited, or
  - Only the lowest value or values are inherited
Multiple inheritance
- A node can have any number of superclasses that contain it, enabling a node to inherit properties from multiple parent nodes and their ancestors in the network.
- These rules are often used to determine inheritance in such tangled networks where multiple inheritance is allowed:
  - If X < A < B and both A and B have property P, then X inherits A's property.
  - If X < A and X < B but neither A < B nor B < A, and A and B have property P with different and inconsistent values, then X does not inherit property P at all.
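The two inheritance rules can be sketched in Python as follows. The network (a penguin chain and a Quaker/Republican diamond) and all names are made-up illustrations, not taken from the slides, and the function assumes the network has no cycles.

```python
# Hypothetical tangled network: node -> list of direct superclasses.
PARENTS = {
    "opus": ["penguin"],
    "penguin": ["bird"],
    "nixon": ["quaker", "republican"],
}

# Properties stated directly on a node.
PROPS = {
    "bird": {"flies": True},
    "penguin": {"flies": False},
    "quaker": {"pacifist": True},
    "republican": {"pacifist": False},
}


def ancestors(node):
    """All strict ancestors of `node`, found by following superclass links."""
    seen = set()
    stack = list(PARENTS.get(node, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(PARENTS.get(parent, []))
    return seen


def inherit(node, prop):
    """Return the inherited value of `prop`, or None if nothing is inherited."""
    if prop in PROPS.get(node, {}):
        return PROPS[node][prop]
    definers = [a for a in ancestors(node) if prop in PROPS.get(a, {})]
    # Rule 1: a definer is overridden if it lies above another, more specific definer.
    nearest = [a for a in definers if not any(a in ancestors(b) for b in definers)]
    values = {PROPS[a][prop] for a in nearest}
    # Rule 2: incomparable ancestors with inconsistent values block inheritance.
    return values.pop() if len(values) == 1 else None
```

With this data, `inherit("opus", "flies")` follows rule 1 and takes the penguin's value, while `inherit("nixon", "pacifist")` hits rule 2 and inherits nothing.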
Conflicting inherited values
Description Logics
- Description logics provide a family of KR systems with a formal semantics.
  - E.g., KL-ONE, LOOM, Classic, ...
- An additional kind of inference done by these systems is automatic classification
  - finding the right place in a hierarchy of objects for a new description
- Semantic nets can be considered a form of description logic.
Description Logics
- Notations to make it easier to describe definitions and properties of categories
- Taxonomic structure is the organizing principle
- Subsumption: Determine if one category is a subset of another
- Classification: Determine the category in which an object belongs
- Consistency: Determine if membership criteria are logically satisfiable
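A deliberately simplified Python sketch of subsumption and classification. It assumes each category is defined by a set of required features, which is far weaker than a real description logic; the category names, feature names, and functions are all hypothetical.

```python
# Toy definitions: a category is the set of features every member must have.
CATEGORIES = {
    "Wine": frozenset({"made_from_grapes"}),
    "RedWine": frozenset({"made_from_grapes", "color_red"}),
    "DessertWine": frozenset({"made_from_grapes", "sugar_high"}),
}


def subsumes(general, specific):
    """True if every member of `specific` must also be a member of `general`."""
    return CATEGORIES[general].issubset(CATEGORIES[specific])


def most_specific_categories(features):
    """Classify an object: the matching categories that subsume no other match."""
    matches = [c for c, required in CATEGORIES.items() if required.issubset(features)]
    return sorted(
        c for c in matches
        if not any(other != c and subsumes(c, other) for other in matches)
    )


port_like = {"made_from_grapes", "color_red", "sugar_high"}
placement = most_specific_categories(port_like)
```

Here the taxonomy is never written down by hand: `Wine` ends up above `RedWine` only because its definition is a subset of theirs, which is the idea behind automatic classification.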
Current Best-known Description Logic
- OWL: Web Ontology Language
- A language for the semantic web
- Flavors: OWL-Lite, OWL-DL, OWL Full, OWL 2
- W3C recommendation as of 11 December 2012
- http://www.w3.org/tr/owl2-overview/
Ontologies: structuring knowledge in a useful fashion
- An ontology formally represents concepts in a domain and relationships between those concepts
- The concept originated in philosophy: a model of a theory of nature or existence
- An ontology describes the things we want to talk about, including both objects and relationships
What Is An Ontology
- An ontology is an explicit description of a domain:
  - concepts
  - properties and attributes of concepts
  - constraints on properties and attributes
  - individuals (often, but not always)
- An ontology defines
  - a common vocabulary
  - a shared understanding
Ontology Examples
- Catalogs for on-line shopping
  - Amazon.com product catalog
- Broad general ontologies
  - The Open Directory Project: https://www.dmoz.org/
- Domain-specific standard terminology
  - Unified Medical Language System (UMLS) and MeSH: https://www.nlm.nih.gov/pubs/factsheets/mesh.html
- Upper ontologies: represent everything in the world!
  - Cyc: http://www.opencyc.org/
  - Suggested Upper Merged Ontology (SUMO): http://www.adampease.org/op/
MeSH https://www.nlm.nih.gov/mesh/2015/mesh_browser/meshtree.b.html#link_id
CYC Ontology http://www.opencyc.org/
A Portion of SUMO http://www.adampease.org/op/
Why Develop an Ontology?
- To share common understanding of the structure of information
  - among people
  - among software agents
- To enable reuse of domain knowledge
  - to avoid re-inventing the wheel
  - to introduce standards to allow interoperability
More Reasons
- To make domain assumptions explicit
  - easier to change domain assumptions (consider a genetics knowledge base)
  - easier to understand and update legacy data
- To separate domain knowledge from the operational knowledge
  - re-use domain and operational knowledge separately (e.g., configuration based on constraints)
Special- and General-purpose Ontologies
- Special-purpose ontology:
  - Designed to represent a specific domain of knowledge, e.g., genetics (GO, http://geneontology.org/)
- General-purpose ontology:
  - Should be applicable in any domain
  - Unifies different domains of knowledge
- An upper ontology provides the highest-level framework that all other concepts follow
Developing an Ontology
- Consider questions like:
  - Which wine should I serve with seafood today?
  - What wines should I buy next Monday for my reception in Villanova, PA?
  - Is there a market for the products of another small winery in this area?
  - What online source is the best for wines for my party in Texas next fall?
What Do We Need to Know?
- what types of wines have been served at different occasions
- events (occasions)
- wines and wine types
- grape types
- wine locations and wineries
- soil properties, weather, climate
- ratings, preferences
- retailers
- foods
Which wine should I serve with seafood today?
- A shared ONTOLOGY of wine and food
- French wines and wine regions
- California wines and wine regions
http://protege.stanford.edu/conference/2004/slides/ontology101_tutorial.pdf
Wines and Wineries http://protege.stanford.edu/conference/2004/slides/ontology101_tutorial.pdf
An Ontology Is Often Just the Beginning
[Diagram: ontologies "declare structure" and "provide domain description"; connected items are databases, knowledge bases, software agents, problem-solving methods, and domain-independent applications]
http://protege.stanford.edu/conference/2004/slides/ontology101_tutorial.pdf
What Does an Ontology Look Like?
- Can be any knowledge representation
[Diagram: facts, English representation, internal representation, and reasoning programs]
- Typically a description logic or associative network of some kind
Tools
- Protégé:
  - is a graphical ontology-development tool
  - supports a rich knowledge model
  - is open-source and freely available (http://protege.stanford.edu)
- Some other available tools:
  - Ontolingua and Chimaera
  - OntoEdit
  - OilEd
  - OpenCyc
Protégé Ontology-Development Process
- General approach:
  - Determine scope
  - Enumerate terms
  - Define classes
  - Define properties and constraints
  - Create instances
- Usually a highly iterative process
http://protege.stanford.edu/conference/2004/slides/ontology101_tutorial.pdf
Determine Domain and Scope
- What is the domain that the ontology will cover?
- What are we going to use the ontology for?
- For what types of questions should the information in the ontology provide answers (competency questions)?
http://protege.stanford.edu/conference/2004/slides/ontology101_tutorial.pdf
Competency Questions
- Which wine characteristics should I consider when choosing a wine?
- Is Bordeaux a red or white wine?
- Does Cabernet Sauvignon go well with seafood?
- What is the best choice of wine for grilled meat?
- Which characteristics of a wine affect its appropriateness for a dish?
- Does the flavor or body of a specific wine change with vintage year?
- What were good vintages for Napa Zinfandel?
http://protege.stanford.edu/conference/2004/slides/ontology101_tutorial.pdf
Enumerate Important Terms
- What are the terms we need to talk about?
- What are the properties of these terms?
- What do we want to say about the terms?
http://protege.stanford.edu/conference/2004/slides/ontology101_tutorial.pdf
Enumerating Terms: Wine
Enumerating Terms - The Wine Ontology
- wine, grape, winery, location
- wine color, wine body, wine flavor, sugar content
- white wine, red wine, Bordeaux wine
- food, seafood, fish, meat, vegetables, cheese
http://protege.stanford.edu/conference/2004/slides/ontology101_tutorial.pdf
Define Classes and the Class Hierarchy
- A class is a concept in the domain
  - a class of wines
  - a class of wineries
  - a class of red wines
- A class is a collection of elements with similar properties
- Instances of classes
  - a glass of California wine you'll have for lunch
http://protege.stanford.edu/conference/2004/slides/ontology101_tutorial.pdf
Wine Classes
Wine Classes
- White, red, rosé
- Red: Bordeaux, Chianti, Pinot Noir...
- French, California, Australian, Italian,...
- Aperitif, dinner, dessert
Levels in the Hierarchy
[Figure: class hierarchy with top level, middle level, and bottom level marked]
http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html
Define Properties of Classes
- Properties in a class definition describe attributes of instances of the class and relations to other instances
- A subclass inherits all the properties from the superclass
  - If a wine has a name and flavor, a red wine also has a name and flavor
- If a class has multiple superclasses, it inherits properties from all of them
  - Port is both a dessert wine and a red wine. It inherits "sugar content: high" from the former and "color: red" from the latter
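The Port example maps directly onto multiple inheritance in Python, used here only as an analogy: class attributes stand in for ontology properties, and the class and attribute names are illustrative assumptions. Ontology languages define inheritance in their own terms, so treat this as a sketch.

```python
class Wine:
    """Every wine has a name and a flavor (left unset at the class level)."""
    name = None
    flavor = None


class RedWine(Wine):
    color = "red"


class DessertWine(Wine):
    sugar_content = "high"


class Port(DessertWine, RedWine):
    """Inherits properties from both superclasses, and from Wine through them."""


port_properties = {
    "color": Port.color,
    "sugar_content": Port.sugar_content,
    "has_name_slot": hasattr(Port, "name"),
}
```

`Port` declares nothing itself: its color comes from `RedWine`, its sugar content from `DessertWine`, and its name and flavor slots from `Wine`.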
Properties of Wines
- Each wine will have color, sugar content, producer, etc.
- Intrinsic properties: flavor and color of wine
- Extrinsic properties: name and price of wine
- Parts: ingredients in a dish
- Relations to other objects: producer of wine (winery)
- Simple properties (attributes): color, flavor, etc.
- Complex properties: producer
Properties for the Class Wine
http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html
Create Instances
- Create an instance of a class
  - The class becomes a direct type of the instance
  - Any superclass of the direct type is a type of the instance
- Assign property values for the instance frame
  - Property values should conform to the constraints
  - Knowledge-acquisition tools often check that
http://protege.stanford.edu/conference/2004/slides/ontology101_tutorial.pdf
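A small Python sketch of both points: direct type versus inherited types, and a constraint check on a property value. The classes, the allowed-color set, and the instance name are hypothetical; a knowledge-acquisition tool would perform this kind of check for you.

```python
class Wine:
    # Assumed constraint: color must be one of these values.
    ALLOWED_COLORS = {"red", "white", "rosé"}

    def __init__(self, name, color):
        if color not in self.ALLOWED_COLORS:
            raise ValueError(f"color must be one of {sorted(self.ALLOWED_COLORS)}")
        self.name = name
        self.color = color


class RedWine(Wine):
    def __init__(self, name):
        super().__init__(name, "red")


bottle = RedWine("Example Chianti")   # RedWine is the direct type of the instance
direct_type = type(bottle)
also_a_wine = isinstance(bottle, Wine)  # the superclass is a type of it as well
```

Creating `Wine("Mystery", "purple")` raises `ValueError`, because the value violates the constraint on `color`.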
An Instance Example http://protege.stanford.edu/conference/2004/slides/ontology101_tutorial.pdf
Modes of Development
- top-down: define the most general concepts first and then specialize them
- bottom-up: define the most specific concepts and then organize them in more general classes
- combination: define the more salient concepts first and then generalize and specialize them
Documentation
- Classes (and properties) usually have documentation
  - Describing the class in natural language
  - Listing domain assumptions relevant to the class definition
  - Listing synonyms
- Documenting classes and properties is as important as documenting computer code!
Defining Classes and a Class Hierarchy
- The things to remember:
  - There is no single correct class hierarchy
  - But there are some guidelines
- The question to ask: "Is each instance of the subclass an instance of its superclass?"
- Classes vs properties:
  - We could consider country as a class or a property
  - The question to ask: "Do I want to inherit other properties based on it?"
http://protege.stanford.edu/conference/2004/slides/ontology101_tutorial.pdf
Limiting the Scope
- An ontology should not contain all the possible information about the domain
  - No need to specialize or generalize more than the application requires
  - No need to include all possible properties of a class
    - Only the most salient properties
    - Only the properties that the applications require
http://protege.stanford.edu/conference/2004/slides/ontology101_tutorial.pdf
Limiting the Scope (II)
- An ontology of wine, food, and their pairings probably will not include
  - Bottle size
  - Label color
  - My favorite food and wine
- An ontology of biological experiments will contain
  - Biological organism
  - Experimenter
- Is the class Experimenter a subclass of Biological organism?
http://protege.stanford.edu/conference/2004/slides/ontology101_tutorial.pdf
Reasoning with Default Information
- Open and Closed worlds
  - Open World: Information provided is not assumed to be complete, therefore inferences may result in sentences whose truth value is unknown
  - Closed World: Information provided is assumed complete, therefore ground sentences not asserted to be true are assumed false
- Negation as Failure: A negative literal, not P, can be proved true if the proof of P fails
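The difference between the two assumptions shows up in what a failed lookup returns. In this Python sketch, "proof" is reduced to membership in a set of ground facts, which is a strong simplification; the facts and function names are made up for illustration.

```python
# Hypothetical fact base of ground sentences asserted to be true.
FACTS = {("red", "chianti"), ("white", "chablis")}


def closed_world(query):
    """Closed world: a ground sentence that cannot be proved is assumed false."""
    return query in FACTS


def open_world(query, known_false=frozenset()):
    """Open world: True, False, or None when the truth value is unknown."""
    if query in FACTS:
        return True
    if query in known_false:
        return False
    return None


def not_(query):
    """Negation as failure: `not P` succeeds exactly when the proof of P fails."""
    return not closed_world(query)


unknown_case = ("red", "port")  # never asserted either way
```

For `unknown_case`, `closed_world` answers `False`, `not_` answers `True`, and `open_world` answers `None`.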
Knowledge Engineering
- Actually capturing the information from the human subject matter expert (SME) in any of these formats is difficult and time-consuming
- An iterative process of add knowledge, then test
- Often a knowledge engineer or ontological engineer works with the SME
- The question "What is the system for?" is critical
- Automated learning of knowledge is a very active research field right now
- KBs and KE vs statistics and ML