Title :
An Incremental Knowledge Acquisition Method for Improving Duplicate Invoices Detection
Author :
Van Hai Ho ; Compton, Paul ; Benatallah, Boualem ; Vayssiere, J. ; Menzel, Lucio ; Vogler, Hartmut
Author_Institution :
Sch. of Comput. Sci. & Eng., Univ. of New South Wales, Kensington, NSW
Date :
March 29 2009-April 2 2009
Abstract :
Duplicate records are a major problem, and duplicate invoices are a specific example of it. Detecting duplicate invoices is a critical issue for businesses, since a duplicate invoice can result in a company paying more than once for goods or services ordered. Past experience has shown that generic duplicate record detection techniques are not very useful when applied to invoices: the rate of false positives can be so high that invoice clerks are discouraged from using the system. This is because such approaches do not take the business context into account, e.g. what types of goods were ordered and the past relationship with the vendor. In this paper, we discuss applying ripple down rules (RDR), an approach for incremental and end-user-centred knowledge acquisition, to the problem of classifying pairs of potential duplicate invoices. We describe how we built a prototype on top of the SAP ERP product and evaluated it on a real data set that had previously been independently audited for duplicates. The preliminary results highlight the significant potential of this approach for assisting invoice clerks in processing potential duplicate invoices. We observed a drop in the rate of false positives from 92% to 18.66% compared with traditional approaches that do not take the business context into account. We suggest that incremental development of domain-specific knowledge may have more general application to the problem of handling duplicate records.
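The abstract does not give the rule structure used in the prototype, so the following is only a minimal sketch of how a single-classification ripple down rules tree might classify a pair of invoices and be refined incrementally. The feature names (`recurring_vendor`, `days_apart`), the labels, and the helpers `classify` and `refine` are hypothetical and chosen for illustration; they are not taken from the paper or from SAP ERP.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical case format: features describing one pair of invoices.
Case = dict


@dataclass
class Rule:
    condition: Callable[[Case], bool]
    conclusion: str
    if_true: Optional["Rule"] = None   # exception to this rule
    if_false: Optional["Rule"] = None  # alternative tried when this rule fails


def classify(root: Rule, case: Case) -> Optional[str]:
    """Return the conclusion of the last rule that fired on the path."""
    conclusion = None
    node = root
    while node is not None:
        if node.condition(case):
            conclusion = node.conclusion
            node = node.if_true
        else:
            node = node.if_false
    return conclusion


def refine(root: Rule, case: Case,
           condition: Callable[[Case], bool], conclusion: str) -> None:
    """Attach a corrective rule where evaluation of `case` left the tree."""
    node = root
    while True:
        branch = "if_true" if node.condition(case) else "if_false"
        child = getattr(node, branch)
        if child is None:
            setattr(node, branch, Rule(condition, conclusion))
            return
        node = child


# Default rule: every candidate pair is flagged, as a generic detector might.
root = Rule(lambda c: True, "duplicate")

# A clerk sees a monthly bill flagged wrongly and supplies a correction.
monthly_bill = {"recurring_vendor": True, "days_apart": 30}
before = classify(root, monthly_bill)
refine(root, monthly_bill,
       lambda c: c.get("recurring_vendor", False) and c.get("days_apart", 0) >= 28,
       "not duplicate")
after = classify(root, monthly_bill)

# An unrelated pair still falls through to the default conclusion.
resubmitted = {"recurring_vendor": False, "days_apart": 1}
other = classify(root, resubmitted)
```

In this sketch each correction is stored as an exception to the rule that gave the wrong answer, so earlier rules stay untouched. This is one way business context, such as the relationship with a vendor, could be added case by case by the end user.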
Keywords :
knowledge acquisition; user centred design; duplicate invoices detection; end-user-centred knowledge acquisition; incremental development; incremental knowledge acquisition method; ripple down rules; Australia; Companies; Computer science; Data engineering; Databases; Knowledge acquisition; Knowledge engineering; Prototypes; Support vector machines; USA Councils; Duplicate Detection; Knowledge Acquisition; Rule-based System
Conference_Title :
Data Engineering, 2009. ICDE '09. IEEE 25th International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-3422-0
Electronic_ISBN :
1084-4627
DOI :
10.1109/ICDE.2009.38