An Efficient Apriori algorithm for Frequent Pattern Mining using MapReduce in Healthcare Data

Seifedine Kadry

Abstract


Data mining technology is developing rapidly in healthcare, where knowledge discovery has become essential to the medical sector. Healthcare organizations generate and gather large quantities of information every day. Information technology makes it possible to automate the mining of this data and to surface interesting patterns, replacing manual tasks and simple data extraction from electronic records. Electronic data transfer of this kind secures medical records, saves lives, cuts the cost of medical care, and enables early detection of infectious diseases. This paper presents an improved Apriori algorithm for the healthcare industry, named Enhanced Parallel and Distributed Apriori (EPDA), built on the scalable Hadoop MapReduce environment. The main aim of the proposed work is to reduce the heavy resource demands and the communication overhead of frequent pattern extraction, by generating split-frequent itemsets locally and removing infrequent items early. The paper reports test results that evaluate EPDA in terms of execution time and number of rules generated, using a healthcare database and different minimum support values.
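The idea of mining each split locally and pruning infrequent items early can be illustrated with a small single-process sketch. This is not the EPDA implementation from the paper. It is a generic two-phase, partition-based scheme in which each split plays the role of one map task. The function names (`local_frequent_itemsets`, `mine`), the `max_size` cap, the round-robin split strategy, and the sample records are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def local_frequent_itemsets(split, min_support, max_size=3):
    """Map-side step (hypothetical): itemsets frequent within one split."""
    threshold = min_support * len(split)
    # Early pruning: drop items that are infrequent in this split
    # before any larger candidates are built.
    item_counts = Counter(item for t in split for item in set(t))
    frequent_items = {i for i, c in item_counts.items() if c >= threshold}
    pruned = [frozenset(t) & frequent_items for t in split]

    result = {frozenset([i]) for i in frequent_items}
    current = set(result)
    for size in range(2, max_size + 1):
        # Join step: combine frequent (size-1)-itemsets into candidates.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == size}
        # Apriori property: every (size-1)-subset must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, size - 1))}
        counts = Counter(c for t in pruned for c in candidates if c <= t)
        current = {c for c, n in counts.items() if n >= threshold}
        if not current:
            break
        result |= current
    return result


def mine(transactions, min_support, num_splits=2):
    """Two-phase driver (hypothetical): local mining, then a global count."""
    splits = [transactions[i::num_splits] for i in range(num_splits)]
    # Phase 1: the union of locally frequent itemsets forms the global
    # candidate set. Any globally frequent itemset is frequent in at
    # least one split, so no true result is lost here.
    candidates = set()
    for split in splits:
        candidates |= local_frequent_itemsets(split, min_support)
    # Phase 2 (reduce-side): count only those candidates over all data.
    counts = Counter()
    for split in splits:
        for t in split:
            items = frozenset(t)
            counts.update(c for c in candidates if c <= items)
    threshold = min_support * len(transactions)
    return {c: n for c, n in counts.items() if n >= threshold}


# Made-up sample records, for illustration only.
records = [
    ["diabetes", "hypertension", "obesity"],
    ["diabetes", "hypertension"],
    ["asthma", "obesity"],
    ["diabetes", "hypertension", "asthma"],
    ["hypertension", "obesity"],
    ["diabetes", "obesity"],
]
frequent = mine(records, min_support=0.5)
```

In this sketch only the locally frequent itemsets cross the boundary between the two phases, which is the kind of communication saving the abstract describes. Lower `min_support` values enlarge the candidate set and therefore the cost of the second pass.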



