Title :
An Ensemble-Based Named Entity Recognition Solution for Detecting Consumer Products
Author_Institution :
Fac. of Math., Inf. & Mech., Univ. of Warsaw, Warsaw, Poland
Abstract :
This paper presents a technical description of a solution for the International Conference on Data Mining 2012 Contest - Consumer Products number 1. The Contest provided a dataset including thousands of text items, a product catalog with over fifteen million products, and hundreds of manually annotated product mentions to support data-driven approaches. The task was to identify product mentions within a large user-generated web-based textual corpus and disambiguate the mentions against the large product catalog. The solution consists of an ensemble-based algorithm for processing textual content. It uses Conditional Random Fields and a special approach that recognizes product mentions. The solution finished in third place in the contest.
Keywords :
consumer products; data mining; electronic commerce; Web based textual corpus; conditional random fields; data driven approaches; detecting consumer products; ensemble based named entity recognition solution; product catalog; textual content processing; Algorithm design and analysis; Catalogs; Conferences; Lead; Prediction algorithms; Training data; Conditional Random Field; ICDM Contest; Named Entity Recognition; Sequence Tagging
Conference_Title :
Data Mining Workshops (ICDMW), 2012 IEEE 12th International Conference on
Conference_Location :
Brussels
Print_ISBN :
978-1-4673-5164-5
DOI :
10.1109/ICDMW.2012.58