MEASURING CLINICAL HEALTH DATABASE SIMILARITY USING CLUSTERING AND CLASSIFICATION

P.Vaishnavi, T.Nagavendar, Y.Shiva Reddy

Authors

P.Vaishnavi, T.Nagavendar, Y.Shiva Reddy, B.Tech, Computer Science and Engineering, CMR Engineering College, Medchal, T.S, India

Abstract

Clustering data derived from Electronic Health Record (EHR) systems is important to discover
relationships between the clinical profiles of patients and as a preprocessing step for analysis
tasks, such as classification. However, the heterogeneity of these data makes the application of
existing clustering methods difficult and calls for new clustering approaches. In this paper, we
propose the first approach for clustering a dataset in which each record contains a patient's
values in demographic attributes and their set of diagnosis codes. Our approach represents the
dataset in a binary form in which the features are selected demographic values, as well as
combinations (patterns) of frequent and correlated diagnosis codes. This representation enables
measuring similarity between records using cosine similarity, an effective measure for binary-represented
data, and finding compact, well-separated clusters through hierarchical clustering.
Our experiments on two publicly available EHR datasets, comprising over 26,000 and
52,000 records respectively, demonstrate that our approach constructs clusters with correlated
demographics and diagnosis codes, and that it is efficient and scalable.
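
The pipeline summarized in the abstract (binary encoding, cosine similarity, hierarchical clustering) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how demographic values and diagnosis-code patterns are selected, nor which linkage criterion is used, so the feature names, the toy records, the diagnosis codes, and the choice of average linkage below are all assumptions made purely for illustration.

```python
import math
from itertools import combinations

def binarize(records, features):
    """Encode each record (a set of feature names) as a 0/1 vector."""
    return [[1 if f in rec else 0 for f in features] for rec in records]

def cosine_similarity(u, v):
    """Cosine similarity of two binary vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(u))  # for 0/1 vectors the sum is the squared norm
    norm_v = math.sqrt(sum(v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def agglomerative(vectors, k):
    """Agglomerative clustering down to k clusters.

    Average linkage is an assumption; the abstract only says
    'hierarchical clustering'.
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        sims = [cosine_similarity(vectors[i], vectors[j]) for i in a for j in b]
        return sum(sims) / len(sims)

    while len(clusters) > k:
        # Merge the pair of clusters with the highest average similarity.
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # a < b, so index a is unaffected
    return clusters

# Hypothetical features: selected demographic values plus diagnosis-code
# patterns. The names and codes are placeholders, not taken from the paper.
features = ["sex:F", "sex:M", "age:60+", "dx:{250.00,401.9}", "dx:{493.90}"]
records = [
    {"sex:F", "age:60+", "dx:{250.00,401.9}"},
    {"sex:F", "age:60+", "dx:{250.00,401.9}"},
    {"sex:M", "dx:{493.90}"},
    {"sex:M", "age:60+", "dx:{493.90}"},
]

vectors = binarize(records, features)
clusters = agglomerative(vectors, k=2)
print(clusters)
```

In this toy input the first two records share every feature and the last two share a sex value and a diagnosis pattern, so the sketch groups them pairwise. A quadratic pairwise search like this one is only suitable for small examples; datasets of the size mentioned in the abstract would need a more efficient implementation.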
