开发者

Lossless Join Decomposition

开发者 https://www.devze.com 2023-01-28 10:18 出处:网络
I am studying for a test, and this is on the study guide sheet.This is not homework, and will not be graded.

I am studying for a test, and this is on the study guide sheet. This is not homework, and will not be graded.

Relation 开发者_如何学JAVASchema R = (A,B,C,D,E)

Functional Dependencies = (AB->E, C->AD, D->B, E->C)

Is r1 = (A,C,D) r2 = (B,C,E) OR

x1 = (A,C,D) x2 = (A,B,E) a lossless join decomposition? and why?


My relational algebra is horribly rusty, but here is how I remember it to go

If r1 ∩ r2 -> r1 - r2 or r1 ∩ r2 -> r2 - r1 in FDs then you have lossless decomposition.

r1 ∩ r2 = C
r1 - r2 = AD

C->AD is in functional dependencies => lossless

for x1 and x2

x1 ∩ x2 = A
x1 - x2 = CD

A->CD is not in FDs now check x2 - x1

x2 - x1 = BE

A->BE is not in FDs either, therefore lossy

references here, please check for horrible mistakes that I might have committed


Here is my understanding, basically you look at your decompositions and determine whether the common attributes between the relations is a key to at least one of the relations.

So with R1 and R2 - the only thing common between them is C. C would be a Key to R1, since you are given C -> A,D. So it's lossless.

For X1 and X2, the only thing common is A, which by itself is neither a key for X1 or X2 from the functional dependencies you are given.


Functional Dependencies = (AB->E, C->AD, D->B, E->C)

Is r1 = (A,C,D) r2 = (B,C,E) is lossless when you perform the Chase algorithm. It can be seen that both tables agree on 'C' and the dependency C->AD is preserved in the table ACD.

x1 = (A,C,D) x2 = (A,B,E) is lossy as you will conclude after performing Chase Algorithm. Alternately, it can be seen that both the tables only agree on A and there is no such dependency that is fully functionally dependent on A.


As described here, decomposition of R into R1 and R2 is lossless if

  1. Attributes(R1) U Attributes(R2) = Attributes(R)
  2. Attributes(R1) ∩ Attributes(R2) ≠ Φ
  3. Common attribute must be a key for at least one relation (R1 or R2)

EDIT


with the assumption that only the non-trivial cases are considered here, I think OP intended that too (so that (2) holds under this non-trivial assumption):

e.g., not considering the trivial corner case where all tuples of R1 / R2 are unique, i.e., the empty set {} is a key (as @philipxy pointed out), hence any decomposition will be lossless and hence not interesting (since spurious tuples cant be created upon joining) - the corner cases for which decomposition can be lossless despite Attributes(R1) ∩ Attributes(R2) = Φ are ruled out.


We can check the above conditions with the following python code snippet:

def closure(s, fds):
    c = s
    for f in fds:
        l, r = f[0], f[1]
        if l.issubset(c):
            c = c.union(r)
    if s != c:
        c = closure(c, fds)
    return c

def is_supekey(s, rel, fds):
    c = closure(s, fds)
    print(f'({"".join(sorted(s))})+ = {"".join(sorted(c))}')
    return rel.issubset(c) #c == rel

def is_lossless_decomp(r1, r2, r, fds):
    c = r1.intersection(r2)
    if r1.union(r2) != r:
        print('not lossless: R1 U R2 ≠ R!')
        return False
    if r1.union(r2) != r or len(c) == 0:
        print('not lossless: no common attribute in between R1 and R2!')
        return False
    if not is_supekey(c, r1, fds) and not is_supekey(c, r2, fds):
        print(f'not lossless: common attribute {"".join(c)} not a key in R1 or R2!')
        return False
    print('lossless decomposition!')
    return True     

To process the given FDs in standard form, to convert to suitable data structure, we can use the following function:

import re

def process_fds(fds):
    pfds = []
    for fd in fds:
        fd = re.sub('\s+', '', fd)
        l, r = fd.split('->')
        pfds.append([set(list(l)), set(list(r))])
    return pfds

Now let's test with the above decompositions given:

R = {'A','B','C','D','E'}
fds = process_fds(['AB->E', 'C->AD', 'D->B', 'E->C'])

R1, R2 = {'A', 'C', 'D'}, {'B', 'C', 'E'} 
is_lossless_decomp(R1, R2, R, fds)
# (C)+ = ACD
# lossless decomposition!

R1, R2 = {'A', 'C', 'D'}, {'A', 'B', 'E'} 
is_lossless_decomp(R1, R2, R, fds)
# (A)+ = A
# not lossless: common attribute A not a key in R1 or R2!
0

精彩评论

暂无评论...
验证码 换一张
取 消