Why use ontologies




















Figure 6. Subclasses of the Red Burgundy class. Having a single subclass of a class usually points to a problem in modeling.

Suppose now that we list all types of wines as direct subclasses of the Wine class. This list would then include more general types of wine, such as Beaujolais and Bordeaux, as well as more specific types, such as Pauillac and Margaux (Figure 7a). Having intermediate categories such as Red wine and White wine would also reflect the conceptual model of the wine domain that many people have (Figure 7b).

After all, the ontology is a reflection of the real world, and if no categorization exists in the real world, then the ontology should reflect that.

Figure 7. Categorizing wines. Having all the wines and types of wine at one level versus having several levels of categorization.

Most knowledge-representation systems allow multiple inheritance in the class hierarchy: a class can be a subclass of several classes. Suppose we would like to create a separate class of dessert wines, the Dessert wine class.

The Port wine is both a red wine and a dessert wine. All instances of the Port class will be instances of both the Red wine class and the Dessert wine class. The Port class will inherit its slots and their facets from both its parents. Thus, it will inherit the value SWEET for the slot Sugar from the Dessert wine class and the tannin level slot and the value for its color slot from the Red wine class.
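As a rough illustration of the Port example, the following Python sketch uses ordinary class inheritance to stand in for a frame-based class hierarchy. The class and attribute names (RedWine, DessertWine, tannin_level, and so on) are assumptions made for this sketch; a real knowledge-representation system would express slots and facets in its own way.

```python
class Wine:
    """Hypothetical root class of the wine hierarchy."""


class RedWine(Wine):
    color = "RED"        # slot value shared by every red wine
    tannin_level = None  # slot introduced here; filled in per wine


class DessertWine(Wine):
    sugar = "SWEET"      # slot value shared by every dessert wine


class Port(RedWine, DessertWine):
    """Port has two superclasses and inherits slots from both."""


port = Port()
# The instance belongs to both parent classes...
print(isinstance(port, RedWine), isinstance(port, DessertWine))
# ...and sees the slot values defined on each of them.
print(port.color, port.sugar)
```

Python resolves attribute lookups through its own method resolution order, which is only an approximation of how a frame system combines inherited facets.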

One of the hardest decisions to make during modeling is when to introduce a new class or when to represent a distinction through different property values.

It is hard to navigate both an extremely nested hierarchy with many extraneous classes and a very flat hierarchy that has too few classes with too much information encoded in slots. Finding the appropriate balance, though, is not easy.

There are several rules of thumb that help decide when to introduce new classes in a hierarchy. Subclasses of a class usually (1) have additional properties that the superclass does not have, or (2) have restrictions different from those of the superclass, or (3) participate in different relationships than the superclasses. Red wines can have different levels of tannin, whereas this property is not used to describe wines in general. Pinot Noir wines may go well with seafood whereas other red wines do not.

In other words, we introduce a new class in the hierarchy usually only when there is something that we can say about this class that we cannot say about the superclass. In practical terms, each subclass should either have new slots added to it, or have new slot values defined, or override some facets for the inherited slots.

However, sometimes it may be useful to create new classes even if they do not introduce any new properties. Classes in terminological hierarchies do not have to introduce new properties. For example, some ontologies include large reference hierarchies of common terms used in the domain. An ontology underlying an electronic medical-record system, for instance, may include a classification of various diseases.

In that case, it is still useful to organize the terms in a hierarchy rather than a flat list because it will (1) allow easier exploration and navigation and (2) enable a doctor to easily choose a level of generality of the term that is appropriate for the situation.

Another reason to introduce new classes without any new properties is to model concepts among which domain experts commonly make a distinction even though we may have decided not to model the distinction itself. Finally, we should not create subclasses of a class for each additional restriction.

For example, we introduced the classes Red wine, White wine, and Rose wine because this distinction is a natural one in the wine world.

We did not introduce classes for delicate wine, moderate wine, and so on. When defining a class hierarchy, our goal is to strike a balance between creating new classes useful for class organization and creating too many classes. Do we create a class White wine or do we simply create a class Wine and fill in different values for the slot color? The answer usually lies in the scope that we defined for the ontology.

How important is the concept of White wine in our domain? For a domain model used in a factory producing wine labels, rules for wine labels of any color are the same and the distinction is not very important. Alternatively, for the representation of wine, food, and their appropriate combinations, a red wine is very different from a white wine: it is paired with different foods, has different properties, and so on.

Similarly, the color of wine is important for a wine knowledge base that we may use to determine wine-tasting order. Thus, we create a separate class for White wine.

If the concepts with different slot values become restrictions for different slots in other classes, then we should create a new class for the distinction. Otherwise, we represent the distinction in a slot value.

Similarly, our wine ontology has such classes as Red Merlot and White Merlot, rather than a single class for all Merlot wines: red Merlots and white Merlots are really different wines (made from the same grape), and if we are developing a detailed ontology of wine, this distinction is important.

If a distinction is important in the domain and we think of the objects with different values for the distinction as different kinds of objects, then we should create a new class for the distinction. Considering potential individual instances of a class may also be helpful in deciding whether or not to introduce a new class.

A class to which an individual instance belongs should not change often. Usually when we use extrinsic rather than intrinsic properties of concepts to differentiate among classes, instances of those classes will have to migrate often from one class to another. For example, Chilled wine should not be a class in an ontology describing wine bottles in a restaurant.

The property chilled should simply be an attribute of wine in a bottle, since an instance of Chilled wine can easily cease being an instance of this class and then become an instance of this class again. Usually numbers, colors, and locations are slot values and do not cause the creation of new classes. Wine, however, is a notable exception since the color of the wine is so paramount to the description of wine.
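Returning to the chilled example, a minimal Python sketch of the recommended design might look as follows. WineBottle and its fields are hypothetical names chosen for illustration; the point is only that an extrinsic property is stored as a value that can change, while the class of the instance stays the same.

```python
from dataclasses import dataclass


@dataclass
class WineBottle:
    """Hypothetical class for a physical bottle in a restaurant cellar."""
    label: str
    chilled: bool = False  # extrinsic property kept as a slot value


bottle = WineBottle("Sterling Vineyards Merlot")
bottle.chilled = True    # the bottle goes into the cooler
bottle.chilled = False   # and comes back out again

# The instance never had to migrate between classes.
print(type(bottle).__name__, bottle.chilled)
```

Had Chilled wine been a class, each of the two assignments above would have required moving the instance to a different class.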

For another example, consider the human-anatomy ontology. When we represent ribs, do we create a class for each individual rib, such as the first left rib and the second left rib? Or do we have a class Rib with slots for the order and the lateral position (left or right)? If we want to represent detailed adjacency and location information (which is different for each rib), as well as the specific functions that each rib plays and the organs it protects, we want the classes.

If we are modeling anatomy in slightly less detail, and all ribs are very similar as far as our potential applications are concerned (we just talk about which rib is broken on the X-ray without implications for other parts of the body), we may want to simplify our hierarchy and have just the class Rib, with two slots: lateral position and order.
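The simpler of the two designs, a single Rib class with two slots, could be sketched in Python roughly as below. The names LateralPosition and Rib, and the check that the order lies between 1 and 12 (humans have twelve pairs of ribs), are assumptions added for this illustration.

```python
from dataclasses import dataclass
from enum import Enum


class LateralPosition(Enum):
    LEFT = "left"
    RIGHT = "right"


@dataclass(frozen=True)
class Rib:
    """One class for all ribs; the distinctions live in two slots."""
    order: int
    lateral_position: LateralPosition

    def __post_init__(self):
        # Reject orders outside the twelve pairs of human ribs.
        if not 1 <= self.order <= 12:
            raise ValueError(f"rib order must be 1..12, got {self.order}")


# "The third left rib is broken" needs only slot values, not a new class.
broken = Rib(order=3, lateral_position=LateralPosition.LEFT)
print(broken)
```

The alternative design, one class per rib, would only pay off if each rib carried its own adjacency, location, or function information.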

Deciding whether a particular concept is a class in an ontology or an individual instance depends on what the potential applications of the ontology are. Deciding where classes end and individual instances begin starts with deciding what is the lowest level of granularity in the representation.

The level of granularity is in turn determined by a potential application of the ontology. In other words, what are the most specific items that are going to be represented in the knowledge base?

Going back to the competency questions we identified in Step 1 in Section 3, the most specific concepts that will constitute answers to those questions are very good candidates for individuals in the knowledge base. Individual instances are the most specific concepts represented in a knowledge base. For example, if we are only going to talk about pairing wine with food, we will not be interested in the specific physical bottles of wine.

Therefore, such terms as Sterling Vineyards Merlot are probably going to be the most specific terms we use. Therefore, Sterling Vineyards Merlot would be an instance in the knowledge base. On the other hand, if we would like to maintain an inventory of wines in the restaurant in addition to the knowledge base of good wine-food pairings, individual bottles of each wine may become individual instances in our knowledge base.

Similarly, if we would like to record different properties for each specific vintage of the Sterling Vineyards Merlot, then the specific vintage of the wine is an instance in a knowledge base and Sterling Vineyards Merlot is a class containing instances for all its vintages. If concepts form a natural hierarchy, then we should represent them as classes.

Consider the wine regions. Initially, we may define main wine regions, such as France, United States, Germany, and so on, as classes and specific wine regions within these large regions as instances. For example, Bourgogne region is an instance of the French region class. However, we would also like to say that the Côte d'Or region is a Bourgogne region. Therefore, Bourgogne region must be a class in order to have subclasses or instances. Therefore, we define all wine regions as classes. In our case, all region classes are abstract (Figure 8).

Figure 8. Hierarchy of wine regions. The "A" icons next to class names indicate that the classes are abstract and cannot have any direct instances.

If we dropped the word "region" from the class names, the hierarchy would be incorrect: we cannot say that the class Alsace is a subclass of the class France, because Alsace is not a kind of France.

However, Alsace region is a kind of French region. Therefore, if there is a natural hierarchy among terms, such as in the terminological hierarchies from Section 4, we should define these terms as classes.

As a final note on defining a class hierarchy, the following set of rules is always helpful in deciding when an ontology definition is complete:

The ontology should not contain all the possible information about the domain: you do not need to specialize or generalize more than you need for your application (at most one extra level each way).

For our wine and food example, we do not need to know what paper is used for the labels or how to cook shrimp dishes. Similarly, the ontology should not contain all the possible properties of and distinctions among classes in the hierarchy.

In our ontology, we certainly do not include all the properties that a wine or food could have. We represented the most salient properties of the classes of items in our ontology.

Even though wine books would tell us the size of grapes, we have not included this knowledge. Similarly, we have not added all relationships that one could imagine among all the terms in our system. For example, we do not include relationships such as favorite wine and favorite food in the ontology just to allow a more complete representation of all of the interconnections between the terms we have defined.

The last rule also applies to establishing relations among concepts that we have already included in the ontology.

Consider an ontology describing biology experiments. The ontology will likely contain a concept of Biological organisms. It will also contain a concept of an Experimenter performing an experiment with his name, affiliation, etc. It is true that an experimenter, as a person, also happens to be a biological organism.

However, we probably should not incorporate this distinction in the ontology: for the purposes of this representation an experimenter is not a biological organism and we will probably never conduct experiments on the experimenters themselves. If we were representing everything we can say about the classes in the ontology, an Experimenter would become a subclass of Biological Organism. However, we do not need to include this knowledge for the foreseeable applications.

In fact, including this type of additional classification for existing classes actually hurts: now an instance of an Experimenter will have slots for weight, age, species, and other data pertaining to a biological organism, but absolutely irrelevant in the context of describing an experiment.

However, we should record such design decisions in the documentation for the benefit of users who will be looking at this ontology and who may not be aware of the application we had in mind.

Many systems allow us to specify explicitly that several classes are disjoint. Classes are disjoint if they cannot have any instances in common. For example, the Dessert wine and the White wine classes in our ontology are not disjoint: there are many wines that are instances of both. The Rothermel Trochenbierenauslese Riesling instance of the Sweet Riesling class is one such example.

At the same time, the Red wine and the White wine classes are disjoint: no wine can be simultaneously red and white. Specifying that classes are disjoint enables the system to validate the ontology better. If we declare the Red wine and the White wine classes to be disjoint and later create a class that is a subclass of both Riesling (a subclass of White wine) and Port (a subclass of Red wine), a system can indicate that there is a modeling error.

In this section we discuss several more details to keep in mind when defining slots in the ontology (Step 5 and Step 6 in Section 3).

Mainly, we discuss inverse slots and default values for a slot.

A value of a slot may depend on a value of another slot. For example, if a wine was produced by a winery, then the winery produces that wine. These two relations, maker and produces, are called inverse relations.

When we know that a wine is produced by a winery, an application using the knowledge base can always infer the value for the inverse relation that the winery produces the wine.

However, from the knowledge-acquisition perspective it is convenient to have both pieces of information explicitly available. This approach allows users to fill in the wine in one case and the winery in another. The knowledge-acquisition system could then automatically fill in the value for the inverse relation, ensuring consistency of the knowledge base.

Our example has a pair of inverse slots: the maker slot of the Wine class and the produces slot of the Winery class. When a user creates an instance of the Wine class and fills in the value for the maker slot, the system automatically adds the newly created instance to the produces slot of the corresponding Winery instance.
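The behaviour described above can be sketched in plain Python. This is only an illustration under assumed names (the Wine and Winery classes and the property-based update are inventions of this sketch), not the mechanism any particular frame system uses.

```python
class Winery:
    def __init__(self, name):
        self.name = name
        self.produces = []  # inverse of Wine.maker, kept up to date by Wine


class Wine:
    def __init__(self, name, maker=None):
        self.name = name
        self._maker = None
        if maker is not None:
            self.maker = maker  # goes through the setter below

    @property
    def maker(self):
        return self._maker

    @maker.setter
    def maker(self, winery):
        # Remove this wine from the previous winery's produces list, if any.
        if self._maker is not None:
            self._maker.produces.remove(self)
        self._maker = winery
        # Fill in the inverse slot automatically.
        if winery is not None:
            winery.produces.append(self)


sterling = Winery("Sterling Vineyards")
merlot = Wine("Sterling Merlot", maker=sterling)
print([wine.name for wine in sterling.produces])  # ['Sterling Merlot']
```

Only the maker slot is writable in this sketch; a fuller version would also let users add a wine to produces and update maker in response.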

For instance, when we say that Sterling Merlot is produced by the Sterling Vineyard winery, the system would automatically add Sterling Merlot to the list of wines that the Sterling Vineyard winery produces (Figure 9).

Figure 9. Instances with inverse slots. The slot produces for the class Winery is an inverse of the slot maker for the class Wine. Filling in one of the slots triggers an automatic update of the other.

Many frame-based systems allow specification of default values for slots. If a particular slot value is the same for most instances of a class, we can define this value to be a default value for the slot. Then, when each new instance of a class containing this slot is created, the system fills in the default value automatically. We can then change the value to any other value that the facets will allow.

That is, default values are there for convenience: they do not enforce any new restrictions on the model or change the model in any way.
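A small Python sketch can make the idea of a default concrete. It assumes a hypothetical Wine class with a body slot whose allowed values are LIGHT, MEDIUM, and FULL; both the slot and the value set are assumptions for illustration.

```python
from dataclasses import dataclass

ALLOWED_BODY = {"LIGHT", "MEDIUM", "FULL"}  # assumed facet: allowed values


@dataclass
class Wine:
    name: str
    body: str = "FULL"  # default value, filled in when none is given

    def __post_init__(self):
        # The default adds no restriction; only the facet limits the value.
        if self.body not in ALLOWED_BODY:
            raise ValueError(f"body must be one of {sorted(ALLOWED_BODY)}")


hearty = Wine("Example Cabernet")            # takes the default
lighter = Wine("Example Rose", body="LIGHT")  # overrides the default
print(hearty.body, lighter.body)  # FULL LIGHT
```

Removing the default from this class would change nothing about which values are legal; it would only force every wine to state its body explicitly.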

For example, if most of the wines we are going to discuss are full-bodied, we can make full the default value for the body slot of the Wine class. Then, unless we say otherwise, all wines we define would be full-bodied.

Note that default values are different from slot values. Slot values cannot be changed. For example, once the sugar slot has the value SWEET for the Dessert wine class, this value cannot be changed in any of the subclasses or instances of the class.

Defining naming conventions for concepts in an ontology and then strictly adhering to these conventions not only makes the ontology easier to understand but also helps avoid some common modeling mistakes. There are many alternatives in naming concepts. Often there is no particular reason to choose one or another alternative.

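However, we need to define a naming convention for classes and slots and adhere to it. The following features of a knowledge-representation system affect the choice of naming conventions:

Does the system use the same name space for classes, slots, and instances? That is, does the system allow having a class and a slot with the same name (such as a class winery and a slot winery)?

Is the system case-sensitive? That is, does the system treat names that differ only in case as different names (such as Winery and winery)?

Which delimiters does the system allow in names? That is, can names contain spaces, commas, asterisks, and so on?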

Some systems maintain a single name space for all classes, slots, and instances and are case-sensitive. In such a system, we cannot have a class winery and a slot winery. We can, however, have a class Winery (note the upper case) and a slot winery. CLASSIC, on the other hand, is not case-sensitive and maintains different name spaces for classes, slots, and individuals.

Thus, from a system perspective, there is no problem in naming both a class and a slot Winery.

First, we can greatly improve the readability of an ontology if we use consistent capitalization for concept names. For example, it is common to capitalize class names and use lower case for slot names (assuming the system is case-sensitive).

When a concept name contains more than one word (such as Meal course) we need to delimit the words. Possible choices include separating the words with a space, running the words together and capitalizing each new word, or joining the words with a delimiter such as an underscore or a dash. If you use delimiters, you will also need to decide whether or not each new word is capitalized. If the knowledge-representation system allows spaces in names, using them may be the most intuitive solution for many ontology developers. It is, however, important to consider other systems with which your system may interact. If those systems do not use spaces or if your presentation medium does not handle spaces well, it can be useful to use another method.
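To see the delimiter options side by side, here is a short Python sketch. The helper name_variants and the four convention labels are made up for this illustration; they simply render one multi-word concept name in several common styles.

```python
def name_variants(words):
    """Render a multi-word concept name under several delimiter conventions."""
    capitalized = [word.capitalize() for word in words]
    lowered_rest = [word.lower() for word in words[1:]]
    return {
        # Words separated by a space, only the first one capitalized.
        "space": " ".join([capitalized[0]] + lowered_rest),
        # Words run together, each new word capitalized.
        "run_together": "".join(capitalized),
        # Words joined by a delimiter, each new word capitalized.
        "underscore": "_".join(capitalized),
        "dash": "-".join(capitalized),
    }


for convention, name in name_variants(["meal", "course"]).items():
    print(f"{convention:>12}: {name}")
```

Whichever style is chosen, generating names through one helper like this is a cheap way to keep the convention consistent across the ontology.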

A class name represents a collection of objects. For example, a class Wine actually represents all wines. Therefore, it could be more natural for some designers to call the class Wines rather than Wine.

No alternative is better or worse than the other although singular for class names is used more often in practice. However, whatever the choice, it should be consistent throughout the whole ontology. Some systems even require their users to declare in advance whether or not they are going to use singular or plural for concept names and do not allow them to stray from that choice.

Using the same form all the time also prevents a designer from making such modeling mistakes as creating a class Wines and then creating a class Wine as its subclass (see Section 4).

Some knowledge-base methodologies suggest using prefix and suffix conventions in the names to distinguish between classes and slots.

Thus, our slots become has-maker and has-winery if we choose the has- convention. The slots become maker-of and winery-of if we choose the -of convention. This approach allows anyone looking at a term to determine immediately whether the term is a class or a slot.

Likewise, a practitioner "specialty" will be the same attribute and have the same meaning on both sites. This standardization allows for more flexibility and will enable more rapid development of applications and sharing of information. However, a less likely application of ontologies, and an area of interest for me, is Data Integration and Knowledge Management systems.

So why haven't ontologies been a big buzzword like Hadoop, Spark, and Big Data? Technologies expand in the areas where they can and where there is a market, not necessarily in the areas most needed. Ontologies won't lend themselves to enterprise-level applications, and developing multiple ontologies is just not that exciting.

For one thing, it requires cooperation between groups within a domain to standardize on a common ontology for communication. The financial industry has made an attempt at this with the development of the Financial Industry Business Ontology (FIBO), but its adoption has been slow.

One reason for this is that the power of ontologies is not well understood. To take the next leap in technology advancement will require mature software, capable of rapidly ingesting, understanding, analyzing, and presenting interpretable results; this is where ontologies come in.

In my opinion, ontologies will be the enabler of the next generation of disruptive technologies. Voice recognition systems like Amazon's Echo, Apple's Siri, and Google Home are very sophisticated but, at the same time, immature technologies. It is not for lack of hardware processing speed that we have laughed, or been frustrated, when asking one of these devices a simple question only to receive a nonsensical response. We are capable of writing the code to answer questions correctly if we have the information needed to properly interpret the question.

Computers are good at computing, but not so good at cognitive processing. Computers struggle with the inherent difficulty of performing human-like tasks that we take for granted.

These human-like tasks appear very simple, but are actually quite difficult to emulate on a computer. IBM's Watson is probably the most advanced in these areas, but there is still a long way to go. When we see a picture of a cat, we immediately identify it as a cat. When asked "What can I do for my cold?", we understand that the question is about an illness. Ask Siri this same question, and it pulls up a Wikipedia page explaining "cold" in the sense of temperature. However, if you type the same question into a Google search, you get what would be expected (more on Google search below).

Examining the word "cold" in WordNet, you will see that it has thirteen different senses as an adjective, but only three as a noun; everything from "a cold climate" (cold vs. hot), to "cold in his grave" (lacking the warmth of life), to "will they never find a cure for the common cold?"

In linguistics, the coexistence of several possible meanings for a word or phrase is called polysemy, and the number of senses is the polysemy count. However, there are other ontology applications that I find just as interesting, and these are in the area of Data Integration and Knowledge Management.

The video below explains Google's Knowledge Graph better than I ever could, so please, check it out. Hopefully, from the discussion above, and with some help from Google, you understand better my interest in this area.

An ontology is a formal description of knowledge as a set of concepts within a domain and the relationships that hold between them.

To enable such a description, we need to formally specify components such as individuals (instances of objects), classes, attributes and relations as well as restrictions, rules and axioms. As a result, ontologies not only introduce a sharable and reusable knowledge representation but can also add new knowledge about the domain.

The ontology data model can be applied to a set of individual facts to create a knowledge graph: a collection of entities, where the types and the relationships between them are expressed by nodes and edges between these nodes. By describing the structure of the knowledge in a domain, the ontology sets the stage for the knowledge graph to capture the data in it.
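As a toy illustration of applying an ontology to individual facts, the Python sketch below checks a handful of subject-property-object facts against a tiny schema and collects the resulting nodes and edges. The schema layout, the reserved "type" property, and all names are assumptions of this sketch; real systems would typically use RDF and an ontology language.

```python
# Hypothetical ontology: known classes, and each property's (domain, range).
CLASSES = {"Wine", "Winery"}
PROPERTIES = {"maker": ("Wine", "Winery")}

# Individual facts as (subject, property, object) triples.
FACTS = [
    ("Sterling Merlot", "type", "Wine"),
    ("Sterling Vineyards", "type", "Winery"),
    ("Sterling Merlot", "maker", "Sterling Vineyards"),
]


def build_graph(facts):
    """Return (node_types, edges) after checking facts against the ontology."""
    node_types = {}
    for subject, prop, obj in facts:
        if prop == "type":
            if obj not in CLASSES:
                raise ValueError(f"unknown class: {obj}")
            node_types[subject] = obj

    edges = []
    for subject, prop, obj in facts:
        if prop == "type":
            continue
        if prop not in PROPERTIES:
            raise ValueError(f"unknown property: {prop}")
        domain, range_ = PROPERTIES[prop]
        # The ontology constrains which node types an edge may connect.
        if node_types.get(subject) != domain or node_types.get(obj) != range_:
            raise ValueError(f"{prop} must link a {domain} to a {range_}")
        edges.append((subject, prop, obj))
    return node_types, edges


node_types, edges = build_graph(FACTS)
print(node_types)
print(edges)
```

The ontology here plays the role of a schema: the facts supply the nodes and edges, and the class and property declarations decide which of them are well-formed.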

There are, of course, other methods that use formal specifications for knowledge representation such as vocabularies, taxonomies, thesauri, topic maps and logical models.

However, unlike taxonomies or relational database schemas, for example, ontologies express relationships and enable users to link multiple concepts to other concepts in a variety of ways. As one of the building blocks of Semantic Technology, ontologies are part of the W3C standards stack for the Semantic Web.

They provide users with the necessary structure to link one piece of information to other pieces of information on the Web of Linked Data. Because they are used to specify common modeling representations of data from distributed and heterogeneous systems and databases, ontologies enable database interoperability, cross-database search and smooth knowledge management. Some of the major characteristics of ontologies are that they ensure a common understanding of information and that they make explicit domain assumptions.

As a result, the interconnectedness and interoperability of the model make it invaluable for addressing the challenges of accessing and querying data in large organizations. Also, by improving metadata and provenance, and thus allowing organizations to make better sense of their data, ontologies enhance data quality.

In recent years, there has been an uptake in expressing ontologies using ontology languages such as the Web Ontology Language (OWL). OWL is a semantic web computational logic-based language, designed to represent rich and complex knowledge about things and the relations between them.

It also provides detailed, consistent and meaningful distinctions between classes, properties and relationships. By specifying both object classes and relationship properties as well as their hierarchical order, OWL enriches ontology modeling in semantic graph databases, also known as RDF triplestores. OWL, used together with an OWL reasoner in such triplestores, enables consistency checks to find any logical inconsistencies and satisfiability checks to find whether there are classes that cannot have instances.
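To give a feel for what a satisfiability check looks for, the following plain-Python sketch reuses the Red wine and White wine disjointness example from earlier in this document. It is not OWL and not a real reasoner; the data structures, the class "Odd blend", and the function names are all hypothetical.

```python
# Hypothetical class hierarchy: each class maps to its direct superclasses.
SUBCLASS_OF = {
    "Red wine": {"Wine"},
    "White wine": {"Wine"},
    "Port": {"Red wine"},
    "Riesling": {"White wine"},
    "Odd blend": {"Port", "Riesling"},  # deliberate modeling error
}

# Pairs of classes declared disjoint (no instances in common).
DISJOINT = [frozenset({"Red wine", "White wine"})]


def ancestors(cls):
    """Collect every direct and indirect superclass of cls."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def unsatisfiable_classes():
    """Return classes that fall under both members of a disjoint pair."""
    bad = []
    for cls in SUBCLASS_OF:
        lineage = ancestors(cls) | {cls}
        if any(pair <= lineage for pair in DISJOINT):
            bad.append(cls)
    return bad


print(unsatisfiable_classes())  # ['Odd blend']
```

A class flagged this way can never have instances, because any instance would have to be both a red wine and a white wine; an actual OWL reasoner detects this and many subtler cases.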


