• DocumentCode
    2501150
  • Title
    Information extraction from legal documents
  • Author
    Cheng, Tin Tin ; Cua, Jeffrey Leonard ; Tan, Mark Davies ; Yao, Kenneth Gerard ; Roxas, Rachel Edita
  • Author_Institution
    BS Comput. Sci., De La Salle Univ., Manila, Philippines
  • fYear
    2009
  • fDate
    20-22 Oct. 2009
  • Firstpage
    157
  • Lastpage
    162
  • Abstract
    Legal TRUTHS (Turning Unstructured Texts to Helpful Structure) is a system that extracts relevant information from Philippine Supreme Court decisions, specifically those on criminal cases. We describe the processes involved in developing Legal TRUTHS, focusing on issues relating to the domain and the geographical setting of the source documents, and present performance evaluation results. The pertinent information to be extracted for criminal cases, such as the crime, the date and time of commission, the plaintiff, and the penalty, was determined from a sample set of documents. Sections of these documents were identified for initial segmentation of the data. Automatic filtering of the data was then used to draw out relevant information from the texts. When the same 25 documents were used for both training and testing, overall precision was 91.7%, recall 99.5%, and F-measure 95.6%. Testing on another 50 documents gave overall precision of 84.3%, recall of 95.8%, and F-measure of 91.0%.
  • Keywords
    criminal law; document handling; information retrieval; Philippine Supreme Court decisions; automatic data filtering; criminal cases; helpful structure; information extraction; legal TRUTHS; legal documents; turning unstructured texts; Data mining; Information filtering; Information filters; Law; Legal factors; Natural language processing; Tagging; Testing; Tin; Turning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing, 2009. SNLP '09. Eighth International Symposium on
  • Conference_Location
    Bangkok
  • Print_ISBN
    978-1-4244-4138-9
  • Electronic_ISBN
    978-1-4244-4139-6
  • Type
    conf
  • DOI
    10.1109/SNLP.2009.5340925
  • Filename
    5340925