In my previous post, I mentioned that confidence is the conditional probability that a transaction containing X also contains Y. An example makes this clearer, so I'm using the same supermarket item list as before (see Table 1).
Transaction ID | Items Bought |
---|---|
T100 | M,O,N,K,E,Y |
T200 | D,O,N,K,E,Y |
T300 | M,A,K,E |
T400 | M,U,C,K,Y |
T500 | C,O,O,K,I,E |
Table 1
In this example, we only considered patterns with a minimum support of 60%. We found patterns such as K, KE, or EKO. Minimum support tells us that each of those patterns appears in at least 60% of the transactions. But it says nothing about the relationship among the items within a pattern.
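To make the support check concrete, here is a minimal Python sketch that recomputes the support of those three patterns from Table 1. The names `transactions`, `support` and `MIN_SUPPORT` are my own choices for illustration, and each basket is stored as a set, so the repeated O in T500 counts only once.

```python
# Transactions from Table 1; each basket is a set of single-letter items.
transactions = {
    "T100": set("MONKEY"),
    "T200": set("DONKEY"),
    "T300": set("MAKE"),
    "T400": set("MUCKY"),
    "T500": set("COOKIE"),
}

MIN_SUPPORT = 0.6  # the 60% threshold used in this post


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    hits = sum(1 for basket in transactions.values() if items <= basket)
    return hits / len(transactions)


for pattern in ["K", "KE", "EKO"]:
    frequent = support(pattern) >= MIN_SUPPORT
    print(pattern, support(pattern), "frequent" if frequent else "not frequent")
```

All three patterns pass the 60% threshold, while a pattern such as MY (found only in T100 and T400) would not.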
For example, support alone does not tell us how likely a transaction is to contain K and O, given that it contains E. We can say that Y occurs whenever X occurs in a transaction, with 80% confidence, if and only if the value of sup(X ∩ Y)/sup(X) is at least 80%. Here sup(X ∩ Y) counts the transactions that contain both X and Y.
Let's make this clearer with some examples.
conf(E=>KO) = sup(E ∩ (K ∩ O)) / sup(E)
= 3/4 = 0.75 (or 75%)
conf(K=>EO) = sup(K ∩ (E ∩ O)) / sup(K)
= 3/5 = 0.6 (or 60%)
conf(O=>KE) = sup(O ∩ (K ∩ E)) / sup(O)
= 3/3 = 1 (or 100%)
conf(O=>K) = sup(O ∩ K) / sup(O)
= 3/3 = 1 (or 100%)
conf(OE=>K) = sup(O ∩ E ∩ K) / sup(O ∩ E)
= 3/3 = 1 (or 100%)
Let's set the minimum confidence to 80%. With that threshold, we cannot accept E=>KO (75%) or K=>EO (60%). But we can accept O=>KE, O=>K and OE=>K, since each of them holds with 100% confidence.
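The same calculation can be sketched in Python. This is only an illustration under my own naming (`count`, `confidence`, `MIN_CONFIDENCE` and `rules` are not part of any library), with each rule written so that its two sides share no items. It uses `Fraction` so that the comparison against 80% is exact.

```python
from fractions import Fraction

# Baskets from Table 1, stored as sets of single-letter items.
transactions = [set("MONKEY"), set("DONKEY"), set("MAKE"), set("MUCKY"), set("COOKIE")]

MIN_CONFIDENCE = Fraction(80, 100)  # the 80% threshold chosen above


def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for basket in transactions if items <= basket)


def confidence(antecedent, consequent):
    """conf(X => Y): transactions with both X and Y, divided by those with X."""
    both = set(antecedent) | set(consequent)
    return Fraction(count(both), count(antecedent))


# (antecedent, consequent) pairs for the five rules worked through above.
rules = [("E", "KO"), ("K", "EO"), ("O", "KE"), ("O", "K"), ("OE", "K")]
accepted = [(x, y) for x, y in rules if confidence(x, y) >= MIN_CONFIDENCE]

for x, y in rules:
    print(f"{x} => {y}: {float(confidence(x, y)):.0%}")
print("accepted:", accepted)
```

Only the three rules with O on the left-hand side end up in `accepted`, which matches the hand calculation.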
That concludes the explanation of the Apriori algorithm. Please comment below if anything is unclear.