开发者

Generating counts from closed frequent itemsets

开发者 https://www.devze.com 2023-03-08 01:51 出处:网络
I am reading note where it seems it is said: Given collection of all closed frequent itemsets and their support counts, the support count of any frequent itemset can be obtained.

I am reading note where it seems it is said: Given collection of all closed frequent itemsets and their support counts, the support count of any frequent itemset can be obtained.

A frequent itemset is 开发者_StackOverflow社区called closed if no larger itemset properly contains it and has the same support count.

Trying to prove this but can not work it out.

Here are some link to definitions regarding association rule mining:

Association rule mining


A closed itemset X is an itemset that is not included in another itemset having the same support.

All the itemsets Y1, Y2, Y3 .. YN that are included in X and have the same support are said to be in the same equivalence class. They are not closed itemsets because they are included in a larger itemset that has the same support (X).

Now let's sayt that you have the set of all frequent closed itemsets C and that you want to know the support of an itemset F.

What you need to do is very simple. You need to compare F with all the frequent closed itemsets. You have to find the smallest closed itemsets W such that F is included in W. Then the support of F is the support of W.

If you want more details about closed itemsets, i suggest to read the paper by Pasquier:

http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=7956B5A50ED076203227367503FA7958?doi=10.1.1.37.1102&rep=rep1&type=pdf

If you want some algorithms source code for mining closed itemsets, you can check my Java project:

http://www.philippe-fournier-viger.com/spmf/

It offers AprioriClose and DCI_Closed.


You know that no set can have higher support than its subsets... so the support of any given itemset equals the support of the most frequent superset: sup(x) = max{y.support | y is superset of x and y is in the closed frequent itemsets}

There exists an algorithm to produce support of all itemsets, given closed frequent itemsets and their supports:

kmax = size of largest closed itemset
Fmax = closed frequent itemsets of size kmax
for k = kmax downto 1 do
    Fk = {f | f immediate subset of f' in Fk+1 or f is closed | |f|=k}
    for every f in Fk do
        if f is not closed
            f.support = max{f'.support | f' in Fk+1 , f' is a superset of f}
        endif
    endfor
endfor

Source: http://www.cs.helsinki.fi/group/bioinfo/teaching/dami_s10/dami_lecture4.pdf

0

精彩评论

暂无评论...
验证码 换一张
取 消