Learning Robust Association Rules from Data


Example of an Association Rule

"People who buy diapers tend to buy beer." [Grocery data]

Examples of Robust Association Rules

"People who buy baby products tend to buy controlled substances, use credit cards and make purchases on Saturdays." [Grocery data]

"Democrats born in the 1960s and registered in the 1980s tend to be female." [Cambridge, MA Voter List]

"White Republicans who own a home tend to be females with no children." [Pittsburgh, PA Voter List for ZIP 15213]


Learning semantically useful association rules across all attributes of a relational table requires:

  1. more rigorous learning than afforded by traditional approaches.
    See GenTree

  2. the invention of knowledge ratings for learned rules, not just statistical measures.
    See GenTree with knowledge ratings


Traditional algorithms began by learning rules over one attribute, expressed in the domain values of that attribute (Srikant and Agrawal, 1995). "People who buy diapers tend to buy beer" is an example from grocery purchases. In the second generation, Srikant and Agrawal (1996) introduced a hierarchy whose base values are those originally represented in the data and whose higher-level values represent increasingly general concepts of the base values. Rules learned over the attribute using the hierarchy are termed generalized association rules (or cross-level rules). "People who buy baby products tend to buy controlled substances" is an example of a generalized association rule.

The work reported herein continues the evolution in the expressiveness of association rules to its broadest application: semantically rated rules learned from a large relational table having many attributes. Each attribute has an associated hierarchy. We find the rules that are formed by combining mixed levels of generalization across all attributes and that convey the maximum expression of information supported by the attribute hierarchies, parameter settings and data tuples. We term these robust rules. An example of a robust rule is "People who buy baby products tend to buy controlled substances, use credit cards and make purchases on Saturdays."
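The progression from base-level rules to generalized and robust rules can be illustrated with a small sketch. This is not GenTree and does not compute knowledge ratings; it only shows how matching a rule at different hierarchy levels changes the conventional statistical measures of support and confidence. The table, the attribute names (`bought`, `also_bought`, `payment`, `day`), and the hierarchies are all hypothetical and invented for illustration.

```python
# Hypothetical value hierarchies: each base value maps to its more general parent.
ITEMS = {
    "diapers": "baby products",
    "baby wipes": "baby products",
    "beer": "controlled substances",
    "cigarettes": "controlled substances",
}
PARENT = {
    "bought": ITEMS,
    "also_bought": ITEMS,
    "payment": {"visa": "credit card", "mastercard": "credit card"},
    "day": {},  # flat attribute: no generalization defined in this toy example
}

# Hypothetical relational table: one tuple per purchase.
ROWS = [
    {"bought": "diapers", "also_bought": "beer", "payment": "visa", "day": "Saturday"},
    {"bought": "baby wipes", "also_bought": "cigarettes", "payment": "mastercard", "day": "Saturday"},
    {"bought": "diapers", "also_bought": "beer", "payment": "cash", "day": "Tuesday"},
    {"bought": "beer", "also_bought": "cigarettes", "payment": "visa", "day": "Saturday"},
]


def ancestors(attr, value):
    """Return the value followed by all of its generalizations for this attribute."""
    chain = [value]
    while chain[-1] in PARENT[attr]:
        chain.append(PARENT[attr][chain[-1]])
    return chain


def matches(row, pattern):
    """A row matches if, for every attribute in the pattern, the pattern value
    is the row's value or one of its generalizations."""
    return all(pattern[attr] in ancestors(attr, row[attr]) for attr in pattern)


def support(pattern):
    """Fraction of rows that match the pattern."""
    return sum(matches(row, pattern) for row in ROWS) / len(ROWS)


def confidence(antecedent, consequent):
    """Fraction of rows matching the antecedent that also match the consequent."""
    return support({**antecedent, **consequent}) / support(antecedent)


# Base-level rule over domain values.
base = confidence({"bought": "diapers"}, {"also_bought": "beer"})

# Generalized rule: the same attributes, one level up the hierarchy.
generalized = confidence(
    {"bought": "baby products"}, {"also_bought": "controlled substances"}
)

# Rule mixing generalization levels across several attributes.
robust_consequent = {
    "also_bought": "controlled substances",
    "payment": "credit card",
    "day": "Saturday",
}
robust = confidence({"bought": "baby products"}, robust_consequent)

print(base, generalized, robust)
```

In this toy table the multi-attribute rule says more than the generalized rule but holds for fewer tuples, which is the trade-off between expressiveness and statistical strength that a search over mixed generalization levels has to navigate.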


Keywords: association rules, classification problems, data mining, knowledge acquisition, rule learning, hierarchical learning



Copyright © 2011. President and Fellows Harvard University.