Arrhythmia is a problem with the rate or rhythm of the heartbeat. During an arrhythmia, the heart can beat too fast, too slow, or with an irregular rhythm. Almost all arrhythmias have distinct profiles in the electrocardiogram (ECG). In the initial stage of disease progression, arrhythmia often occurs sporadically and patients do not feel any symptoms or discomfort. Thus, to ascertain the presence of certain types of arrhythmia, technicians must collect ECG data continuously for 24 to 72 hours with a mobile ECG machine (Holter monitor). It is vital that we design an optimal automatic classification algorithm that diagnoses arrhythmia types from enormous amounts of ECG data with high accuracy.

In our study, we analyze the MIT-BIH Arrhythmia Database, which contains 48 half-hour, two-lead ambulatory ECG recordings from 47 subjects. Two or more cardiologists independently annotated each record to obtain the computer-readable reference annotations for each beat (approximately 110,000 annotations in all) included with the database. A variety of arrhythmia types are represented in this data set.

An optimal feature extraction algorithm for ECG data is the key to classification with high accuracy and tolerance to noise. We propose a time-domain partitioning of ECG signals, a novel partition-specific critical value detection, subsequent feature selection, and the transformation of the selected features into modeling variables. We will also employ the same methodology in the frequency and geometry domains.
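
As a rough illustration of time-domain partitioning followed by partition-specific critical value detection, the sketch below splits one beat into contiguous windows and records the extreme values and their positions in each window. It is only a minimal sketch under stated assumptions, not the authors' algorithm. The equal-width windows, the choice of maximum and minimum as "critical values", and the names partition_signal, critical_values and beat_features are all hypothetical.

```python
def partition_signal(samples, n_parts):
    """Split samples into n_parts contiguous, near-equal windows.

    Equal-width windows are an assumption made for illustration only.
    Returns a list of (start_index, window) pairs.
    """
    if n_parts < 1 or n_parts > len(samples):
        raise ValueError("n_parts must be between 1 and len(samples)")
    # Integer arithmetic keeps every window non-empty.
    bounds = [i * len(samples) // n_parts for i in range(n_parts + 1)]
    return [(bounds[i], samples[bounds[i]:bounds[i + 1]])
            for i in range(n_parts)]


def critical_values(start, window):
    """Return the maximum and minimum of one window with their positions.

    Positions are sample indices relative to the start of the beat.
    Ties resolve to the earliest sample.
    """
    hi = max(range(len(window)), key=window.__getitem__)
    lo = min(range(len(window)), key=window.__getitem__)
    return {
        "max": window[hi], "max_at": start + hi,
        "min": window[lo], "min_at": start + lo,
    }


def beat_features(samples, n_parts=3):
    """Flatten the per-window critical values into one feature vector."""
    features = []
    for start, window in partition_signal(samples, n_parts):
        cv = critical_values(start, window)
        features.extend([cv["max"], cv["max_at"], cv["min"], cv["min_at"]])
    return features


# Toy beat (made-up amplitudes, not taken from the MIT-BIH records).
toy_beat = [0.0, 0.1, 0.0, -0.2, 1.0, -0.3, 0.0, 0.2, 0.1]
toy_features = beat_features(toy_beat, n_parts=3)
```

A feature vector of this kind could then be passed to a feature selection step and to any of the classifiers under comparison. The same windowing idea would apply to a frequency-domain representation of the beat.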

We will evaluate and compare the performance of multinomial logistic regression models combined with automatic best-subset variable selection, recurrent neural networks, random forest classifiers, and conditional decision trees.


Author: Jianwei Zheng

Coauthor(s): Cyril Rakovski

Status: Work In Progress