Aggregated Indexing of Biomedical Time Series Data

Author

Woodbridge, Jonathan ; Mortazavi, Bobak ; Sarrafzadeh, Majid ; Bui, Alex A T

Author_Institution

Comput. Sci. Dept., Univ. of California, Los Angeles, Los Angeles, CA, USA

fYear

2012

fDate

27-28 Sept. 2012

Firstpage

Lastpage

Abstract

Remote and wearable medical sensing has the potential to create very large and high dimensional datasets. Medical time series databases must be able to efficiently store, index, and mine these datasets to enable medical professionals to effectively analyze data collected from their patients. Conventional high dimensional indexing methods follow a two stage process. First, a superset of the true matches is efficiently extracted from the database. Second, the superset is pruned by comparing each of its objects to the query object and rejecting any object that falls outside a predetermined radius. This pruning stage heavily dominates the computational complexity of most conventional search algorithms. Therefore, indexing algorithms can be significantly improved by reducing the amount of pruning. This paper presents an online algorithm that aggregates biomedical time series data to significantly reduce the search space (index size) without compromising the quality of search results. The algorithm is built on the observation that biomedical time series signals are composed of cyclical and often similar patterns. It takes in a stream of segments and groups them into highly concentrated collections. Locality Sensitive Hashing (LSH) is used to reduce the overall complexity of the algorithm, allowing it to run online. The output of this aggregation is used to populate an index. The proposed algorithm yields logarithmic growth of the index (with respect to the total number of objects) while keeping sensitivity and specificity simultaneously above 98%. Both memory and runtime complexities of time series search are improved when using aggregated indexes. In addition, data mining tasks, such as clustering, exhibit runtimes that are orders of magnitude faster when run on aggregated indexes.
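The aggregation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `AggregatedIndex`, the sign-based random projection hash, and the parameters `radius` and `num_bits` are all assumptions chosen for illustration. The sketch only shows how hashing each incoming segment to a bucket lets a stream be grouped online, and how a query then follows the two stage pattern of candidate extraction followed by radius pruning.

```python
import math
import random


class AggregatedIndex:
    """Toy online aggregator: one representative per group of similar segments.

    Hypothetical illustration only; not the algorithm from the paper.
    """

    def __init__(self, dim, radius, num_bits=8, seed=0):
        rng = random.Random(seed)
        # Random hyperplanes for a sign-based (random projection) LSH.
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                       for _ in range(num_bits)]
        self.radius = radius
        self.buckets = {}  # hash key -> list of [representative, count]

    def _key(self, segment):
        # One bit per hyperplane: which side of the plane the segment lies on.
        return tuple(
            sum(p * x for p, x in zip(plane, segment)) >= 0.0
            for plane in self.planes
        )

    def insert(self, segment):
        """Merge the segment into a nearby group, or start a new group."""
        groups = self.buckets.setdefault(self._key(segment), [])
        for group in groups:
            if math.dist(group[0], segment) <= self.radius:
                group[1] += 1
                return False  # merged: the index did not grow
        groups.append([list(segment), 1])
        return True  # a new representative was added

    def query(self, segment):
        """Stage 1: fetch the candidate bucket. Stage 2: prune by radius."""
        candidates = self.buckets.get(self._key(segment), [])
        return [g for g in candidates
                if math.dist(g[0], segment) <= self.radius]

    def size(self):
        """Number of representatives currently stored in the index."""
        return sum(len(groups) for groups in self.buckets.values())


# A repeated "beat" is stored once, however often it appears in the stream.
index = AggregatedIndex(dim=4, radius=0.5)
beat = [0.0, 1.0, 0.0, -1.0]
for _ in range(100):
    index.insert(beat)
index.insert([-x for x in beat])  # a clearly different segment
```

Because pruning only touches the representatives in one bucket, the cost of the second stage depends on the number of groups rather than the number of raw segments. A single hash table, as used here, can place two similar segments in different buckets; practical LSH schemes typically use several tables to reduce that risk.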

Keywords

computational complexity; data mining; database indexing; electrocardiography; medical signal processing; query processing; search problems; time series; very large databases; wearable computers; LSH; algorithm complexity reduction; biomedical time series signals; biomedical time series data aggregation; cyclical patterns; data storage; high dimensional datasets; high dimensional indexing method; locality sensitive hashing; medical time series databases; memory complexity; query object; remote medical sensing; runtime complexity; search algorithms; search results; search space reduction; supersets; time series search; very large datasets; wearable medical sensing; Clustering algorithms; Complexity theory; Indexing; Time series analysis; Time series signals

fLanguage

English

Publisher

IEEE

Conference_Title

Healthcare Informatics, Imaging and Systems Biology (HISB), 2012 IEEE Second International Conference on

Conference_Location

San Diego, CA

Print_ISBN

978-1-4673-4803-4

Type

conf

DOI

10.1109/HISB.2012.13

Filename

6366184

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=579461