Title :
Robust recognition of complex entities in text exploiting enterprise data and NLP-techniques
Author :
Brauer, Falk ; Schramm, Marcus ; Barczynski, Wojciech ; Löser, Alexander ; Do, Hong-Hai
Author_Institution :
SAP Res., SAP AG, Dresden
Abstract :
Data transactions between business partners often include unstructured data such as invoices or purchase orders. To process such data automatically, complex business entities need to be identified. Examples of complex entities are products, business partners and purchase orders, which are stored in a supplier relationship management system. Both structured records in the enterprise system and text data describe these complex entities. A major challenge is to correctly associate entities recognized in unstructured data with entities stored in structured data, e.g. enterprise databases. We address this problem and propose a robust process methodology comprising three phases: candidate extraction from unstructured text, generation of initial mappings to structured data, and disambiguation of the mappings by exploiting relationships among the entities in the enterprise data and the documents' structure. We describe each step in detail, propose a common architecture, and introduce our data model and algorithms.
Keywords :
business data processing; database management systems; text analysis; NLP-techniques; data transactions; enterprise data; enterprise databases; robust recognition; supplier relationship management system; text data; Costs; Current supplies; Data mining; Data models; Databases; Identity management systems; Robustness; Supply chain management; Supply chains; Text recognition
Conference_Title :
Digital Information Management, 2008. ICDIM 2008. Third International Conference on
Conference_Location :
London
Print_ISBN :
978-1-4244-2916-5
Electronic_ISBN :
978-1-4244-2917-2
DOI :
10.1109/ICDIM.2008.4746780