• DocumentCode
    110037
  • Title

    Using Common Table Expressions to Build a Scalable Boolean Query Generator for Clinical Data Warehouses

  • Author

    Harris, Daniel R. ; Henderson, Darren W. ; Kavuluru, Ramakanth ; Stromberg, Arnold J. ; Johnson, Todd R.

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Kentucky, Lexington, KY, USA
  • Volume
    18
  • Issue
    5
  • fYear
    2014
  • fDate
    Sept. 2014
  • Firstpage
    1607
  • Lastpage
    1613
  • Abstract
    We present a custom Boolean query generator that uses common table expressions (CTEs) and scales to large datasets. The generator maps user-defined Boolean queries, such as those created interactively in clinical-research and general-purpose healthcare tools, into SQL. We demonstrate its effectiveness by integrating it into the Informatics for Integrating Biology and the Bedside (i2b2) query tool, where it replaces and outperforms the default query generator found within the Clinical Research Chart cell. In our experiments, 16 different types of i2b2 queries were identified by varying four constraints: date, frequency, exclusion criteria, and whether selected concepts occurred in the same encounter. We generated nontrivial, random Boolean queries based on these 16 types and compared the execution times of the corresponding SQL queries produced by both generators. The CTE-based solution significantly outperformed the default query generator and provided a much more consistent response time across all query types (M = 2.03 s, SD = 6.64 s versus M = 75.82 s, SD = 238.88 s). We thus provide a scalable, CTE-based solution that requires no costly hardware upgrades and shows very promising empirical performance gains. The evaluation methodology used in this study also provides a means of profiling clinical data warehouse performance.
  • Keywords
    Boolean functions; SQL; data warehouses; health care; medical information systems; query processing; CTE; Clinical Research Chart cell; Informatics for Integrating Biology and the Bedside query tool; SQL query; clinical data warehouses; common table expression; default query generator; healthcare tools; i2b2 query tool; nontrivial random Boolean query; scalable Boolean query generator; Educational institutions; Generators; Informatics; Servers; Testing; Time factors; Biomedical computing; biomedical informatics; data systems; health information management; large-scale systems
  • fLanguage
    English
  • Journal_Title
    Biomedical and Health Informatics, IEEE Journal of
  • Publisher
    IEEE
  • ISSN
    2168-2194
  • Type

    jour

  • DOI
    10.1109/JBHI.2013.2292591
  • Filename
    6674997
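
The abstract describes mapping a user-defined Boolean query into SQL built from common table expressions. The following is a minimal sketch of that general idea only, not the authors' generator: it assumes a heavily simplified `observation_fact` table with just `patient_num` and `concept_cd` columns, the function name `build_cte_query` and the sample concept codes are hypothetical, and it covers only the exclusion constraint (date, frequency, and same-encounter constraints are not modeled). Each panel of OR-ed concepts becomes one CTE, and the panels are AND-ed in the final SELECT.

```python
import sqlite3


def build_cte_query(panels):
    """Build (sql, params) for a list of panels (hypothetical helper).

    Each panel is (concept_codes, exclude): the codes are OR-ed inside the
    panel, panels are AND-ed together, and exclude=True negates the panel.
    Every panel is assumed to contain at least one concept code.
    """
    ctes, params = [], []
    for i, (codes, _exclude) in enumerate(panels):
        placeholders = ", ".join("?" for _ in codes)
        # One CTE per panel: the patients with any of the panel's concepts.
        ctes.append(
            f"panel_{i} AS (SELECT DISTINCT patient_num "
            f"FROM observation_fact WHERE concept_cd IN ({placeholders}))"
        )
        params.extend(codes)
    # AND the panels together; excluded panels use NOT IN.
    conditions = [
        f"patient_num {'NOT IN' if exclude else 'IN'} "
        f"(SELECT patient_num FROM panel_{i})"
        for i, (_codes, exclude) in enumerate(panels)
    ]
    sql = (
        "WITH " + ",\n     ".join(ctes) + "\n"
        "SELECT DISTINCT patient_num FROM observation_fact\n"
        "WHERE " + "\n  AND ".join(conditions) + "\n"
        "ORDER BY patient_num"
    )
    return sql, params


def demo():
    """Run the sketch against a tiny in-memory table of made-up facts."""
    conn = sqlite3.connect(":memory:")
    # Assumed, simplified schema; a real warehouse table has more columns.
    conn.execute(
        "CREATE TABLE observation_fact (patient_num INTEGER, concept_cd TEXT)"
    )
    conn.executemany(
        "INSERT INTO observation_fact VALUES (?, ?)",
        [
            (1, "DX:diabetes"), (1, "RX:metformin"),
            (2, "DX:diabetes"), (2, "RX:insulin"),
            (3, "DX:asthma"), (3, "RX:metformin"),
            (4, "DX:diabetes"), (4, "RX:metformin"), (4, "DX:ckd"),
        ],
    )
    # diabetes AND (metformin OR insulin) AND NOT ckd
    panels = [
        (["DX:diabetes"], False),
        (["RX:metformin", "RX:insulin"], False),
        (["DX:ckd"], True),
    ]
    sql, params = build_cte_query(panels)
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(demo())
```

With the sample rows above, `demo()` returns patients 1 and 2: patient 3 lacks the diabetes concept and patient 4 is removed by the exclusion panel. Concept codes are passed as bound parameters rather than interpolated into the SQL string, which keeps the generated text independent of user input.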