DocumentCode
2458695
Title
AutoDict: Automated Dictionary Discovery
Author
Chiang, Fei ; Andritsos, Periklis ; Zhu, Erkang ; Miller, Renée J.
Author_Institution
Dept. of Comput. Sci., Univ. of Toronto, Toronto, ON, Canada
fYear
2012
fDate
1-5 April 2012
Firstpage
1277
Lastpage
1280
Abstract
An attribute dictionary is a set of attributes together with a set of common values for each attribute. Such dictionaries are valuable for understanding unstructured or loosely structured textual descriptions of entity collections, such as product catalogs. Dictionaries provide the supervised data for learning product or entity descriptions. In this demonstration, we will present AutoDict, a system that analyzes input data records and discovers high-quality dictionaries using information-theoretic techniques. To the best of our knowledge, AutoDict is the first end-to-end system for building attribute dictionaries. Our demonstration will showcase the different information analysis and extraction features within AutoDict and highlight the process of generating high-quality attribute dictionaries.
Keywords
cataloguing; dictionaries; information retrieval; text analysis; AutoDict; attribute dictionary; automated dictionary discovery; data record; end-to-end system; entity collection; entity description; high quality dictionaries; information analysis; information extraction; information theoretic technique; learning product; loosely structured textual description; product catalog; unstructured textual description; Data mining; Data models; Dictionaries; Frequency measurement; Hidden Markov models; TV; Tagging;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering (ICDE), 2012 IEEE 28th International Conference on
Conference_Location
Washington, DC
ISSN
1063-6382
Print_ISBN
978-1-4673-0042-1
Type
conf
DOI
10.1109/ICDE.2012.126
Filename
6228187