Title :
Improving the Efficiency of Legal E-Discovery Services Using Text Mining Techniques
Author :
Joshi, Sachindra ; Deshpande, Prasad M. ; Hampp, Thomas
Author_Institution :
IBM Res., India
Date :
March 29 2011-April 2 2011
Abstract :
E-discovery review is a type of legal service that aims to find relevant electronically stored information (ESI) in a legal case. It requires legal analysts to manually review a large number of documents, which incurs huge costs. In this paper, we investigate the use of IT, specifically text mining techniques, to improve the efficiency and quality of the e-discovery review service. We employ near-duplicate detection and automatic classification techniques to create coherent groups of documents. Since each group characterizes a syntactic or semantic theme, all the documents in a group can be reviewed together, which leads to a faster and more consistent review. Our experimental results on the publicly available Enron email corpus show that we can achieve high precision and recall in identifying the syntactic and semantic groups. We also conduct a user study that demonstrates an 80% reduction in review time and improved consistency in the review results, leading to better service quality.
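Illustrative note (not part of the original record): the abstract does not say which near-duplicate detection algorithm the authors use. The sketch below shows one common approach, word shingling with Jaccard similarity, purely as an assumption to make the idea of "syntactic groups" concrete. The function names, the shingle size k=3, and the threshold 0.5 are all hypothetical choices, not values from the paper.

```python
import re
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles of a document (lowercased)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Short documents: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_near_duplicates(docs, threshold=0.5, k=3):
    """Group document indices whose shingle sets have Jaccard >= threshold.

    Uses union-find, so similarity is applied transitively. The pairwise
    comparison is O(n^2) and only suitable for small illustrative inputs.
    """
    parent = list(range(len(docs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    sets = [shingles(d, k) for d in docs]
    for i, j in combinations(range(len(docs)), 2):
        if jaccard(sets[i], sets[j]) >= threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


emails = [
    "Please review the attached contract before Friday",
    "Please review the attached contract before Monday",
    "Lunch menu for the cafeteria this week",
]
print(group_near_duplicates(emails))
```

In this toy input the first two emails share four of six distinct shingles, so they fall into one group that a reviewer could handle together, while the third stays on its own. A corpus the size of the Enron collection would need a scalable variant (for example, hashing-based candidate selection) rather than all-pairs comparison.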
Keywords :
data mining; pattern classification; public administration; text analysis; Enron email corpus; automatic classification techniques; electronically stored information; legal case; legal e-discovery services; near duplicate detection; text mining techniques; Electronic mail; Indexes; Law; Manuals; Semantics; Syntactics
Conference_Title :
SRII Global Conference (SRII), 2011 Annual
Conference_Location :
San Jose, CA
Print_ISBN :
978-1-61284-415-2
Electronic_ISBN :
978-0-7695-4371-0
DOI :
10.1109/SRII.2011.97