DocumentCode :
3717173
Title :
Brown Dog: Leveraging everything towards autocuration
Author :
Smruti Padhy;Greg Jansen;Jay Alameda;Edgar Black;Liana Diesendruck;Mike Dietze;Praveen Kumar;Rob Kooper;Jong Lee;Rui Liu;Richard Marciano;Luigi Marini;Dave Mattson;Barbara Minsker;Chris Navarro;Marcus Slavenas;William Sullivan;Jason Votava;Inna Zharnitsky
Author_Institution :
National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
Year :
2015
Firstpage :
493
Lastpage :
500
Abstract :
We present Brown Dog, two highly extensible services that aim to leverage any existing pieces of code, libraries, services, or standalone software (past or present) to provide users with a simple-to-use, programmable means of automated aid in the curation and indexing of distributed collections of uncurated and/or unstructured data. Such collections, which encompass large varieties of data in addition to large amounts of data, pose a significant challenge within modern-day "Big Data" efforts. The two services, the Data Access Proxy (DAP) and the Data Tilling Service (DTS), focus on format conversions and content-based analysis/extraction, respectively. They wrap relevant conversion and extraction operations within arbitrary software, manage their deployment in an elastic manner, and manage job execution from behind a deliberately compact REST API. We describe the motivation and need/scientific drivers for such services, the constituent components that allow arbitrary software/code to be used and managed, and lastly an evaluation of the system's capabilities and scalability.
Keywords :
"Data mining","Metadata","Software","Libraries","Big data","Indexing"
Publisher :
IEEE
Conference_Title :
Big Data (Big Data), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/BigData.2015.7363791
Filename :
7363791