Vikram Pudi's PhD Thesis

Title: Efficient Discovery of Concise Association Rules from Large Databases

Advisor: Prof. Jayant R Haritsa

University/Institute: Indian Institute of Science, Bangalore, India

Date of publication: April 2003

This thesis was awarded the Prof BG Raghavendra Memorial Medal for the Best PhD Thesis (in the area of Operations Research and Allied Areas) in the year 2003-04 by the institute.

Abstract:

Association rules are interesting correlations among attributes in a database. These rules have applications in areas ranging from e-commerce and sports to census analysis and medical diagnosis. Discovering association rules is computationally very expensive, so fast, scalable algorithms for mining them are imperative. In this thesis, we present efficient techniques for discovering association rules from large databases and for removing redundancy from these rules so as to improve the quality of the output. We also handle growing databases. Specifically, we present three new algorithms:

(1) ARMOR: This algorithm discovers association rules from databases and requires at most two database scans. We empirically show its performance to be within a factor of two of an unachievable lower bound.

(2) g-ARMOR: This is an extension of ARMOR that is designed to remove redundancy from association rules during the mining process. This is especially important because the number of association rules generated in typical mining operations runs into the tens of thousands. g-ARMOR reduces the number of rules by orders of magnitude, thereby making the mining output comprehensible to end users.

(3) DELTA: This algorithm incrementally mines evolving databases. It utilizes previous mining results to efficiently mine the current database after it has been updated with fresh data. It also handles situations where the mining specifications over the current database differ from those used over the original database, a common occurrence in practice.
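For readers new to the terminology, the following is a minimal illustrative sketch of what an association rule measures. It assumes the standard support and confidence definitions, which the abstract itself does not spell out. It is a naive computation over a tiny in-memory dataset, not ARMOR, g-ARMOR or DELTA; the transactions and the function names (support_count, support, confidence) are hypothetical and chosen only for illustration.

```python
# Hypothetical toy market-basket data; each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return support_count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Fraction of transactions containing antecedent that also contain consequent."""
    both = set(antecedent) | set(consequent)
    return support_count(both, transactions) / support_count(antecedent, transactions)


# Rule under inspection: {diapers} => {beer}
rule_support = support({"diapers", "beer"}, transactions)
rule_confidence = confidence({"diapers"}, {"beer"}, transactions)
print(rule_support, rule_confidence)
```

A rule is usually reported only when both values exceed user-chosen thresholds. Evaluating every candidate itemset this way requires a pass over the data per candidate, which is why algorithms that bound the number of database scans matter for large databases.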

Download Entire Thesis: pdf format (1.1MB)


Vikram Pudi

New Boys Hostel 134,
International Institute of Information Technology (IIIT)
Gachibowli, Hyderabad 500019
India