Can triple stores be made scalable

开发者 https://www.devze.com 2023-01-08 04:46 Source: web

Most triple stores I read about are said to scale to around 0.5 billion (500 million) triples.

I am interested to know whether people think there is a theoretical reason why they have to have an upper limit, and whether you know of any particular ways to make them more scalable.

I am curious to know if existing triple stores do things like this:

Represent URIs with integers

Keep those integers in sorted order

Search the integers instead of the URIs, which I would imagine is faster (because you can do things like binary search)
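To make the three ideas above concrete, here is a minimal sketch of dictionary encoding with a sorted integer index. It is only an illustration under stated assumptions: the class `TinyTripleStore`, its method `match_subject`, and the `ex:`/`foaf:` sample terms are hypothetical names invented for this example, not taken from any real triple store. It keeps a single subject-predicate-object ordering in memory, whereas a real store would presumably keep several orderings on disk.

```python
from bisect import bisect_left


class TinyTripleStore:
    """Hypothetical toy store: integer-encoded terms plus one sorted SPO index."""

    def __init__(self, triples):
        # Represent URIs with integers: each distinct term gets one ID.
        terms = sorted({term for triple in triples for term in triple})
        self.id_of = {term: i for i, term in enumerate(terms)}
        self.term_of = terms  # list index == integer ID
        # Integers in order: store encoded triples as a sorted list of tuples.
        self.spo = sorted(
            {tuple(self.id_of[term] for term in triple) for triple in triples}
        )

    def match_subject(self, subject):
        """Return all triples with the given subject, located by binary search."""
        sid = self.id_of.get(subject)
        if sid is None:
            return []
        # (sid,) sorts before every (sid, p, o), so these two bounds
        # bracket exactly the run of triples whose subject ID is sid.
        lo = bisect_left(self.spo, (sid,))
        hi = bisect_left(self.spo, (sid + 1,))
        return [tuple(self.term_of[i] for i in t) for t in self.spo[lo:hi]]


store = TinyTripleStore([
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:bob", "rdf:type", "foaf:Person"),
])
alice_triples = store.match_subject("ex:alice")
```

Comparing small integer tuples is cheaper than comparing long URI strings, and the sorted order turns a lookup into two binary searches plus a contiguous slice.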

Thoughts ...

Just to get to 500 million triples, a triple store has to do all of that and more. I have spent several years working on a triple store implementation, and I can tell you that breaking 1 billion triples is not as simple as it may seem.

The problem is that many RDF queries are 2nd or 3rd order (and higher orders are far from unheard of). This means that you are querying not only a set of entities, but simultaneously the data about that set of entities, data about the entities' schemas, and data describing the schema language used to describe those schemas.

All of this happens without any of the constraints that allow a relational database to make assumptions about the shape of this data/metadata/metametadata/etc.
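As a rough illustration of a query that touches instance data and schema data at once, consider "find every instance of a class, including instances of its subclasses". The sketch below is a toy under stated assumptions: the `ex:` terms and the helper names `subclasses_of` and `instances_of` are made up for this example, and a linear scan over a Python list stands in for whatever indexes and query planning a real store would use.

```python
from collections import defaultdict

# Instance data and schema data live side by side in the same set of triples.
TRIPLES = [
    ("ex:rex", "rdf:type", "ex:Dog"),
    ("ex:tom", "rdf:type", "ex:Cat"),
    ("ex:oak", "rdf:type", "ex:Tree"),
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Cat", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
]


def subclasses_of(cls, triples):
    """Return cls plus every class that reaches it via rdfs:subClassOf."""
    children = defaultdict(set)
    for s, p, o in triples:
        if p == "rdfs:subClassOf":
            children[o].add(s)
    seen, stack = {cls}, [cls]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def instances_of(cls, triples):
    """Join schema-level results (the subclass closure) with instance-level rdf:type triples."""
    classes = subclasses_of(cls, triples)
    return sorted(s for s, p, o in triples if p == "rdf:type" and o in classes)


animals = instances_of("ex:Animal", TRIPLES)
```

Nothing in the data says up front how deep the subclass chain is or which triples are schema and which are instances, so the store has to discover that shape while answering the query.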

There are ways to get beyond 500 million, but they are far from trivial, and the low-hanging fruit (i.e. the approaches you have mentioned) was required just to get to where we are now.

That being said, the flexibility provided by an RDF store, combined with the denotational semantics available via its interpretation in Description Logics, makes it all worthwhile.