What is an Ontology (Database?)?

I was just reading this article and it mentions that some organization had an Ontology as(?) their database(?) layer, and that the decision to do this was bad. Problem is I hadn't heard about this before, so I can't understand why it's bad.

So I tried googling about databases and ontology, and came across quite a few PDFs from 2006 that were full of incomprehensible content (to my mind). I read a few of these and at this point still have absolutely no idea what they are talking about.

My current impression is that it was some crazy fad of 2006 that some academics were trying to sell us, but that failed miserably because of how their ideas were worded. But I'm still curious whether anyone knows what this is actually all about.

Karussell already provided the Wikipedia definition:

"a formal representation of the knowledge by a set of concepts within a domain and the relationships between those concepts".

In order to implement such a representation, several languages have been developed. The one that currently gets the most attention is probably the Web Ontology Language (OWL).

In a traditional relational database, concepts can be stored using tables, but the system does not contain any information about what the concepts mean and how they relate to each other. Ontologies do provide the means to store such information, which allows for a much richer way to store information. This also means that one can construct fairly advanced and intelligent queries. Query languages such as SPARQL have been developed specifically for this purpose.

For my master's thesis I worked with OWL ontologies, but that was part of fairly academic research. I don't know if any of this technology is currently used in practice very much, but I'm sure the potential is there.

Update: example

An example of 'meaning' and reasoning over the ontologies: say you define in your ontology a class Pizza, and a class Vegetarian Pizza, which is a Pizza that has no Ingredients that belong to the class Meat. If you now create some instance of a Pizza that just happens not to have any meat ingredients, the system can automatically infer that your pizza is also a Vegetarian Pizza, even if you did not explicitly specify it. (In OWL you would also have to state that the listed ingredients are the only ones, because an OWL reasoner treats missing information as unknown rather than false.)
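The pizza inference can be sketched in a few lines of plain Python. This is only a toy illustration, not OWL: the names `MEAT`, `Pizza`, and `inferred_classes` are made up for the example, and it assumes the ingredient list of each pizza is complete.

```python
from dataclasses import dataclass, field

# Hypothetical toy vocabulary: ingredients that belong to the class Meat.
MEAT = {"ham", "salami", "bacon"}


@dataclass
class Pizza:
    name: str
    ingredients: set = field(default_factory=set)


def inferred_classes(pizza: Pizza) -> set:
    """Return the classes the pizza belongs to, including inferred ones."""
    classes = {"Pizza"}
    # Rule: a Pizza with no ingredient in the class Meat is a VegetarianPizza.
    # Assumes the ingredient list is complete (closed world).
    if not (pizza.ingredients & MEAT):
        classes.add("VegetarianPizza")
    return classes


margherita = Pizza("margherita", {"tomato", "mozzarella", "basil"})
hawaii = Pizza("hawaii", {"tomato", "mozzarella", "ham", "pineapple"})

print(inferred_classes(margherita))
print(inferred_classes(hawaii))
```

Nobody labelled the margherita as vegetarian; the membership follows from the class definition alone.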

An ontology is a schema (model) describing the types (and possibly some individuals) in a domain, the relationships that may exist between types and individuals, and constraints on the way that individuals and properties may be combined.

One analogy is with the UML class diagrams - but ontologies have formal semantics, so can be machine-interpreted, rather than just being diagrams for human consumption.

Example:

Classes: Project, Person, ProjectManager. ProjectManager is a subclass of Person (apparently). Person and Project are disjoint.

Relationships: worksOn, manages. manages is a sub-property of worksOn.

Constraints: People work on Projects, not the other way around. Only Project Managers can manage projects.

This simple example enables machine inferences, e.g. if X manages Y, then we can infer that Y is a Project, and X is a Project Manager and therefore a Person.
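A minimal sketch of how a machine could draw those inferences, in plain Python. The dictionary encoding and the function name `infer_types` are assumptions made for illustration; a real reasoner works on OWL axioms, not on dictionaries like these.

```python
# Hypothetical encoding of the ontology above as plain dictionaries.
SUBCLASS_OF = {"ProjectManager": "Person"}
SUBPROPERTY_OF = {"manages": "worksOn"}
DOMAIN = {"worksOn": "Person", "manages": "ProjectManager"}
RANGE = {"worksOn": "Project", "manages": "Project"}


def infer_types(facts):
    """Derive the class memberships implied by (subject, property, object) facts."""
    types = set()
    for subj, prop, obj in facts:
        # Walk up the property hierarchy: manages implies worksOn.
        while prop is not None:
            if prop in DOMAIN:
                types.add((subj, DOMAIN[prop]))
            if prop in RANGE:
                types.add((obj, RANGE[prop]))
            prop = SUBPROPERTY_OF.get(prop)
    # Close under the class hierarchy: every ProjectManager is a Person.
    changed = True
    while changed:
        changed = False
        for individual, cls in list(types):
            parent = SUBCLASS_OF.get(cls)
            if parent and (individual, parent) not in types:
                types.add((individual, parent))
                changed = True
    return types


types = infer_types([("X", "manages", "Y")])
print(sorted(types))
```

The single fact "X manages Y" is enough to conclude that Y is a Project and that X is both a ProjectManager and a Person.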

At some point, AI people thought that if we want to build a system that can somehow think, we should enable that system to know what we know about the world. In other words, they wanted to impart our own understanding of the world to computers by building a database of information and concise definitions for the concepts and entities we know. Such databases have been built using different approaches, but none has turned out to be very precise. Have a look at one that is known to be among the best, called CYC: http://sw.opencyc.org/. Enter a few words in the box and see what you get back. Best wishes

Once upon a time I assigned this very question to a good developer as a task, because my superior believed in ontologies. It never produced a clear answer, and my superior was fired some time later. I'm still curious.

My current understanding is that this is an idea of words in a natural language (or "entities") being connected to each other with different relations. Then we generalize that idea to any DB entities. And basically, we end up with nothing interesting and with no useful query language.

I may be wrong.

What about Wikipedia?

an ontology is a formal representation of the knowledge by a set of concepts within a domain and the relationships between those concepts

See 'Domain ontologies' and this and that for more details.

Some of the comments above seem a bit dismissive. I've used an ontology database in a real product and it was the only way to solve the problem. An ontology can be used to create a database that can encompass the complexities of the real world much better than something like a relational database. More "information" than "data". It's especially good when the relationships are complex and the information set is large and incomplete. Especially neat is the query mechanism in a good ontology database: it intelligently uses the schema/ontology (such as any class hierarchies) to return answers that would not be found otherwise.
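To make the point about class hierarchies concrete, here is a rough Python sketch of a query that consults the hierarchy. The classes, individuals, and function names are invented for the example and do not come from any particular ontology database.

```python
# Hypothetical class hierarchy: child class -> parent class.
SUBCLASS_OF = {"Dog": "Mammal", "Cat": "Mammal", "Mammal": "Animal"}

# Each individual is stored with only its most specific class.
INSTANCES = {"rex": "Dog", "tom": "Cat", "nemo": "Fish"}


def is_a(cls, target):
    """True if cls equals target or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def instances_of(target):
    """All individuals whose class is target or any subclass of it."""
    return sorted(name for name, cls in INSTANCES.items() if is_a(cls, target))


print(instances_of("Mammal"))
```

A plain lookup for rows whose class equals "Mammal" would find nothing here, since no individual is stored with that class directly; the hierarchy-aware query returns rex and tom.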

Coming from the biological sciences, I see "ontology" as a word for a really easy idea that happens to be defined using other, less commonly used words.

a formal representation of the knowledge by a set of concepts within a domain and the relationships between those concepts

A representation of knowledge, or a "model"
A domain, or "a topic"
A set of concepts, or "things in the domain"
A set of relationships between concepts

So, in computer science terms, it's a graph, where the nodes correspond to things which are all part of the same topic, are annotated with topic-related data, and are connected to other nodes with relationship annotated edges.

As it is a model that doesn't fit into relational databases well, if you intend to store an Ontology you might want to use a graph database, or one of the popular relational database graph storage techniques.
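One common way to keep such a graph in a relational database is a node table plus an edge table whose rows carry the relationship name. The sketch below uses Python's built-in `sqlite3` module; the table names, column names, and sample data are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
    CREATE TABLE edge (
        source   INTEGER NOT NULL REFERENCES node(id),
        relation TEXT    NOT NULL,   -- the annotation on the edge
        target   INTEGER NOT NULL REFERENCES node(id)
    );
""")
conn.executemany(
    "INSERT INTO node (id, label) VALUES (?, ?)",
    [(1, "Pizza"), (2, "VegetarianPizza"), (3, "Margherita")],
)
conn.executemany(
    "INSERT INTO edge (source, relation, target) VALUES (?, ?, ?)",
    [(2, "subClassOf", 1), (3, "instanceOf", 2)],
)


def neighbours(label, relation):
    """Labels of the nodes reached from `label` over edges named `relation`."""
    rows = conn.execute(
        """
        SELECT t.label FROM edge e
        JOIN node s ON s.id = e.source
        JOIN node t ON t.id = e.target
        WHERE s.label = ? AND e.relation = ?
        ORDER BY t.label
        """,
        (label, relation),
    )
    return [row[0] for row in rows]


print(neighbours("Margherita", "instanceOf"))
```

Every hop along the graph costs another join or another query, which is the usual argument for a dedicated graph database once paths get long.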

The primary reason ontologies haven't overtaken relational databases in all aspects is that relational databases provide a simple, even if less flexible, means of connecting two items: the foreign key. While this key doesn't permit a lot of annotation to describe the relationship, it does limit the number of approaches to data structuring, preventing people from creating every kind of relationship (which thankfully means limiting the number of wasteful relationships).

For example, in a "family tree" database based on ontologies:

The domain is one family's tree
The model is the individuals and their relationships within the family tree.
The concepts are the people in the family.
The relationships would be the edges indicating "mother", "father", "brother", "sister", etc.

Now comes the tricky part. You have "mother" and "father", but what about "parent"? If you omit "parent", your lookup logic is more complex, so let's include a new relationship "parent", which means the "mother" of a person now has two links, "mother" and "parent" (as does the father).

What about "grandparent"? Again, doing it logically leaves some of the information out of the database, but storing it increases the overhead of maintaining the database.
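The trade-off can be seen in a small Python sketch that stores only the "mother" and "father" edges and derives "parent" and "grandparent" at query time. The people and the helper names are invented for the example.

```python
# Only "mother" and "father" edges are stored; the names are invented.
EDGES = {
    ("alice", "mother", "carol"),  # alice's mother is carol
    ("alice", "father", "dave"),
    ("carol", "mother", "erin"),
    ("dave", "father", "frank"),
}


def related(person, relation):
    """People reachable from person over one stored edge."""
    return {o for s, r, o in EDGES if s == person and r == relation}


def parents(person):
    # "parent" is derived: the union of "mother" and "father".
    return related(person, "mother") | related(person, "father")


def grandparents(person):
    # "grandparent" is composed: a parent of a parent,
    # which costs one extra lookup per parent.
    return {gp for p in parents(person) for gp in parents(p)}


print(sorted(parents("alice")))
print(sorted(grandparents("alice")))
```

Nothing beyond the two base relationships has to be maintained, but each composed relationship is paid for in additional lookups at query time.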

"uncle", "aunt", "in-law", "father-in-law", etc. all add in one new relationship, and the power behind Ontologies is that you are not constrained as to the kinds of relationships you wish to add; however, the difficulties lie in knowing which relationships directly impact the solution (and the general lack of performance if you don't store the relationships directly, as you need to do multiple database lookups to find a "composed relationship").

A long time ago, I used an ontology database developed at Stanford (Protege).

The idea was to keep track of references. Books had authors and quotes. A quote had a link to a book, along with a page number. An author had links to books, books had publisher, publication date, links to authors. Similarly for articles and videos.

The idea was to insert a quote, and have ready access to the attribution, so I no longer had to keep track of what book and page a quotation was found in, the next time I used it.
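The shape of that model can be sketched with Python dataclasses. This is not how Protege represents things; the class names, the fields, and the `attribution` helper are assumptions for illustration, and the publication date is reduced to a year.

```python
from dataclasses import dataclass, field


@dataclass
class Author:
    name: str


@dataclass
class Book:
    title: str
    publisher: str
    year: int  # simplified stand-in for the publication date
    authors: list = field(default_factory=list)


@dataclass
class Quote:
    text: str
    book: Book  # link from the quote to its source
    page: int


def attribution(quote: Quote) -> str:
    """Follow the quote's links to build a citation string."""
    b = quote.book
    names = ", ".join(a.name for a in b.authors)
    return f"{names}, {b.title} ({b.publisher}, {b.year}), p. {quote.page}"


# Placeholder data, not a real reference.
book = Book("Example Title", "Example Press", 2001, [Author("A. Writer")])
quote = Quote("Some quoted sentence.", book, 42)
print(attribution(quote))
```

Once the quote is linked to its book, the attribution is a matter of following the links rather than looking anything up again.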

The ontology database provided a superb way to model the data. But using it was another matter. It took more time to pull the parts of a reference out of the database than it took to copy a complete quotation-and-reference-info from a Word doc.

All it would take to make something like that really useful would be an integration into a word processor. (Ideally, you would add references more or less normally, but then save them for later re-use, along with a link to the location where you used them! :)

I am a total layman, but it appears to me that artificial intelligence research has a 50-year history that goes round in cycles.