• DocumentCode
    2664405
  • Title
    Automatic extraction of translation patterns from bilingual legal corpus
  • Author
    Ohara, Makoto ; Matsubara, Shigeki ; Inagaki, Y.
  • Author_Institution
    Graduate Sch. of Eng., Nagoya Univ., Japan
  • fYear
    2003
  • fDate
    26-29 Oct. 2003
  • Firstpage
    150
  • Lastpage
    157
  • Abstract
    Multilingual versions of legal documents are desirable for promoting the internationalization of society. Because choosing proper terms is vital when translating legal documents, which contain technical terms and characteristic patterns, it is desirable to compile bilingual dictionaries for each legal domain. Compiling basic bilingual dictionaries for legal documents, however, is difficult because of the great range of legal documents. We describe a method for automatically extracting translation patterns for legal document translation from legal documents and their translations. The proposed method extracts translation patterns in Japanese bunsetsu-level units from legal sentences and their translated sentences, which have been properly aligned with each other. It uses three indexes for pattern extraction: bilingual dictionaries, statistical co-occurrence information from the parallel corpus, and syntactic information based on dependency grammar. We extracted translation patterns from the Japanese civil code and its translation. The extraction achieved 80.5% precision and 49.1% recall, and the extracted translation patterns will be useful for translating legal documents and for helping to construct a Japanese-English legal dictionary.
  • Keywords
    computational linguistics; dictionaries; document handling; grammars; language translation; law administration; pattern recognition; Japanese bunsetsu-level units; Japanese civil code; Japanese-English legal dictionary; automatic translation pattern extraction; bilingual dictionaries; bilingual legal corpus; dependency grammar; legal documents; parallel corpus; pattern extraction; statistical co-occurrence information; syntactic information; Costs; Data mining; Dictionaries; Information science; Information technology; Large-scale systems; Law; Legal factors; Natural languages; Stochastic processes;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003 International Conference on
  • Conference_Location
    Beijing, China
  • Print_ISBN
    0-7803-7902-0
  • Type
    conf
  • DOI
    10.1109/NLPKE.2003.1275886
  • Filename
    1275886