Title :
Extraction of Interpretable Multivariate Patterns for Early Diagnostics
Author :
Ghalwash, Mohamed F. ; Radosavljevic, Vladan ; Obradovic, Z.
Author_Institution :
Center for Data Anal. & Biomed. Inf., Temple Univ., Philadelphia, PA, USA
Abstract :
Leveraging temporal observations to predict a patient's health state at a future period is a very challenging task. Providing such a prediction early and accurately allows for designing a more successful treatment that starts before a disease completely develops. Information for this kind of early diagnosis could be extracted by use of temporal data mining methods for handling complex multivariate time series. However, physicians usually prefer to use interpretable models that can be easily explained, rather than relying on more complex black-box approaches. In this study, a temporal data mining method is proposed for extracting interpretable patterns from multivariate time series data, which can be used to assist in providing interpretable early diagnosis. The problem is formulated as an optimization-based binary classification task addressed in three steps. First, the time series data is transformed into a binary matrix representation suitable for application of classification methods. Second, a novel convex-concave optimization problem is defined to extract multivariate patterns from the constructed binary matrix. Then, a mixed-integer discrete optimization formulation is provided to reduce the dimensionality and extract interpretable multivariate patterns. Finally, those interpretable multivariate patterns are used for early classification in challenging clinical applications. In the conducted experiments on two human viral infection datasets and a larger myocardial infarction dataset, the proposed method was more accurate and provided classifications earlier than three alternative state-of-the-art methods.
Keywords :
concave programming; convex programming; data mining; data reduction; diseases; integer programming; matrix algebra; medical computing; patient diagnosis; pattern classification; time series; binary matrix representation; complex black-box approach; complex multivariate time series data; convex-concave optimization problem; dimensionality reduction; early classification methods; early diagnostics; human viral infection datasets; interpretable multivariate pattern extraction; mixed integer discrete optimization formulation; myocardial infarction dataset; optimization based binary classification task; patient health state prediction; temporal data mining methods; temporal observations; Data mining; Diseases; Logistics; Optimization; Time series analysis; Vectors; early classification; early diagnosis; interpretability; multivariate time series; pattern extraction;
Conference_Title :
Data Mining (ICDM), 2013 IEEE 13th International Conference on
Conference_Location :
Dallas, TX
DOI :
10.1109/ICDM.2013.19