ENBIS-16 in Sheffield

11 – 15 September 2016; Sheffield
Abstract submission: 20 March – 4 July 2016

Association Rules and Compositional Data Analysis: An Odd Couple?

13 September 2016, 16:00 – 16:20


Submitted by
Josep Antoni Martín-Fernández
Josep Antoni Martín-Fernández (University of Girona), Marina Vives-Mestres (University of Girona), Ron S. Kenett (KPA Group, Raanana, Israel; University of Torino, Torino, Italy; NYU Center for Risk Engineering, New York, USA)
Many modern organizations generate large amounts of transaction data on a daily basis. Association rule (AR) mining is a powerful semantic data analytic technique used for extracting information from transaction databases. AR mining was originally developed for basket analysis, where the combinations of items in a shopping basket are evaluated. An itemset is a set of two or more items. To generate ARs, we first detect the set of the most frequent itemsets. Then, as a second step, all possible association rules are generated from each itemset. Any AR that does not satisfy a minimum confidence threshold is removed. Typically, too many ARs are found and, after initial filtering, one has to rank the rules using additional measures of interest. The R package “arules” provides a broad variety (more than a dozen) of interest measures for ARs. In this work we exploit the fact that an AR can be expressed as a contingency table with compositional data (CoDa) structure, so that AR and CoDa are not “an odd couple”. We present the properties of AR-related compositional measures. Then we show how to confirm the significance of an AR and provide an interpretation of the effects between the itemsets. We contrast CoDa visualization techniques with classical examples to show how this approach proves helpful in analyzing a transaction database.
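The two-step procedure described above, and the view of a rule as a 2×2 contingency table, can be illustrated with a minimal Python sketch. It is not the authors' method and not the "arules" implementation: the toy transactions, the thresholds, and all function names are assumptions made for illustration, and the itemset search is a brute-force enumeration rather than an efficient algorithm such as Apriori.

```python
from itertools import combinations

# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support, max_size=3):
    """Step 1: brute-force search for itemsets (size >= 2) with enough support."""
    items = sorted(set().union(*transactions))
    found = {}
    for size in range(2, max_size + 1):
        for combo in combinations(items, size):
            s = support(combo, transactions)
            if s >= min_support:
                found[frozenset(combo)] = s
    return found


def rules(found, transactions, min_confidence):
    """Step 2: split each frequent itemset into lhs -> rhs, filter by confidence."""
    kept = []
    for itemset, supp in found.items():
        for k in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), k):
                lhs = frozenset(lhs)
                rhs = itemset - lhs
                conf = supp / support(lhs, transactions)
                if conf >= min_confidence:
                    kept.append((lhs, rhs, supp, conf))
    return kept


def contingency_composition(lhs, rhs, transactions):
    """Relative frequencies of the 2x2 table for lhs -> rhs, in the order
    (both, lhs only, rhs only, neither). The four parts sum to 1."""
    n = len(transactions)
    both = sum(lhs <= t and rhs <= t for t in transactions)
    lhs_only = sum(lhs <= t and not rhs <= t for t in transactions)
    rhs_only = sum(not lhs <= t and rhs <= t for t in transactions)
    neither = n - both - lhs_only - rhs_only
    return tuple(c / n for c in (both, lhs_only, rhs_only, neither))


found = frequent_itemsets(transactions, min_support=0.4)
kept = rules(found, transactions, min_confidence=0.7)
for lhs, rhs, supp, conf in kept:
    parts = contingency_composition(lhs, rhs, transactions)
    print(sorted(lhs), "->", sorted(rhs), round(supp, 2), round(conf, 2), parts)
```

Because the four cell proportions are non-negative and sum to one, each rule's table can be treated as a four-part composition. Cells equal to zero, which do occur in this toy data, would need special handling before any log-ratio measure is applied; that step is outside this sketch.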
