Vikram Pudi's PhD Thesis
Title: Efficient Discovery of Concise Association Rules from Large Databases
Advisor: Prof. Jayant R Haritsa
Date of publication: April 2003
This thesis was awarded the Prof BG Raghavendra Memorial Medal for the Best PhD Thesis (in the area of Operations Research and Allied Areas) in the year 2003-04 by the institute.
Abstract:
Association rules are interesting correlations among attributes in a database. These rules
have many applications in areas ranging from e-commerce to sports to census analysis
to medical diagnosis. Discovering association rules is computationally very expensive,
so it is imperative to have fast, scalable algorithms for mining
these rules. In this thesis, we present efficient techniques for discovering association rules
from large databases and for removing redundancy from these rules so as to improve the
quality of output. We also handle growing databases.
Specifically, we present three new algorithms: (1) ARMOR: This algorithm discovers
association rules from databases and requires at most two database scans. We empirically
show its performance to be within a factor of two of an unachievable lower bound. (2)
g-ARMOR: This is an extension to ARMOR that is designed to remove redundancy from
association rules during the mining process. This is especially important because the
number of association rules generated in typical mining operations runs into the tens
of thousands. g-ARMOR reduces the number of rules by orders of magnitude,
thereby making the mining output comprehensible to end users. (3) DELTA: This
algorithm incrementally mines evolving databases. It utilizes previous mining results to
efficiently mine the current database after it has been updated with fresh data. It also
handles situations where the mining specifications over the current database differ
from those used over the original database, a common occurrence in practice.
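To make the terms used in the abstract concrete, here is a minimal sketch of what an association rule is and why mining is expensive. It is not ARMOR, g-ARMOR, or DELTA. It is a brute-force enumeration over a toy dataset, and it assumes the standard support/confidence formulation of rule quality, which the abstract itself does not spell out. The names support, mine_rules, and toy are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, min_support, min_confidence):
    """Brute-force rule enumeration (illustrative only, not ARMOR).

    Every itemset of size >= 2 is a candidate, so the work grows
    exponentially with the number of distinct items. That blow-up is
    the cost that efficient mining algorithms are designed to avoid.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s == 0 or s < min_support:
                continue  # infrequent itemset: no rules generated from it
            # Split the frequent itemset into antecedent -> consequent.
            for k in range(1, size):
                for antecedent in combinations(candidate, k):
                    conf = s / support(antecedent, transactions)
                    if conf >= min_confidence:
                        consequent = tuple(
                            i for i in candidate if i not in antecedent
                        )
                        rules.append((antecedent, consequent, s, conf))
    return rules


# Toy market-basket data (made up for this example).
toy = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rules = mine_rules(toy, min_support=0.5, min_confidence=0.6)
for antecedent, consequent, s, conf in rules:
    print(f"{antecedent} -> {consequent}  support={s:.2f}  confidence={conf:.2f}")
```

Even on this tiny input several rules pass the thresholds, which hints at how the output can grow to tens of thousands of rules on realistic data and why removing redundant rules matters.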
Download Entire Thesis: pdf format (1.1MB)
Vikram Pudi
New Boys Hostel 134,
International Institute of Information Technology (IIIT)
Gachibowli, Hyderabad 500019
India