  • DocumentCode
    2463063
  • Title
    A Multilingual Patent Text-Mining Approach for Computing Relatedness Evaluation of Patent Documents
  • Author
    Lee, Chung-Hong ; Yang, Hsin-Chang ; Wu, Chih-Hong ; Li, Yi-Ju
  • Author_Institution
    Dept. of Electr. Eng., Nat. Kaohsiung Univ. of Appl. Sci., Kaohsiung, Taiwan
  • fYear
    2009
  • fDate
    12-14 Sept. 2009
  • Firstpage
    612
  • Lastpage
    615
  • Abstract
    This paper describes our work on developing a language-independent technique for discovering implicit knowledge about patents from multilingual patent information sources. Traditional techniques for multilingual and cross-language patent retrieval are mostly based on translation. A major problem with these techniques is that it is difficult to find related patents from other countries in a stand-alone patent information system. In this paper, we present a novel system platform that supports locating similar and relevant multilingual patent documents. The platform is built on a multilingual vector space based on the latent semantic indexing (LSI) model, and the system model is trained on collected professional Chinese-English parallel corpora. Multilingual patent documents can then be mapped into the semantic vector space, where their similarity is evaluated by means of text clustering techniques. The preliminary results show that our platform framework has potential for retrieval and relatedness evaluation of multilingual patent documents.
  • Keywords
    data mining; indexing; information retrieval; patents; pattern clustering; text analysis; language-independent technique; latent semantic indexing model; multilingual patent documents; multilingual patent information sources; multilingual patent text-mining approach; multilingual vector space; text clustering techniques; Dictionaries; Indexing; Information management; Information processing; Information retrieval; Information systems; Large scale integration; Multimedia computing; Signal processing; Text mining; Latent semantic indexing; Multilingual patent retrieval; Neural networks; Text clustering; Text mining;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Information Hiding and Multimedia Signal Processing, 2009. IIH-MSP '09. Fifth International Conference on
  • Conference_Location
    Kyoto
  • Print_ISBN
    978-1-4244-4717-6
  • Electronic_ISBN
    978-0-7695-3762-7
  • Type
    conf
  • DOI
    10.1109/IIH-MSP.2009.162
  • Filename
    5337401
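
The abstract's core idea, training an LSI space on parallel Chinese-English text and then mapping monolingual documents into it, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the function names (train_lsi, fold_in, cosine), the raw term-count weighting, the pre-segmented Chinese tokens, and the use of cosine similarity in place of the paper's text clustering step are all assumptions made for illustration.

```python
import numpy as np

def build_vocab(docs):
    """Map each whitespace-separated token to a row index."""
    vocab = sorted({tok for doc in docs for tok in doc.split()})
    return {tok: i for i, tok in enumerate(vocab)}

def to_vector(doc, vocab):
    """Raw term-count vector; tokens outside the vocabulary are ignored."""
    vec = np.zeros(len(vocab))
    for tok in doc.split():
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec

def train_lsi(parallel_pairs, k):
    """Hypothetical helper: each training document is an English text
    concatenated with its Chinese counterpart, so terms from both
    languages end up in one shared term-document matrix."""
    merged = [en + " " + zh for en, zh in parallel_pairs]
    vocab = build_vocab(merged)
    A = np.column_stack([to_vector(d, vocab) for d in merged])  # terms x docs
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    return vocab, U[:, :k], s[:k]  # keep the k largest singular triplets

def fold_in(doc, vocab, U_k, s_k):
    """Project a monolingual document into the k-dimensional LSI space."""
    return (to_vector(doc, vocab) @ U_k) / s_k

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Toy parallel corpus (invented for this sketch; Chinese is pre-segmented).
pairs = [
    ("battery electrode lithium", "电池 电极 锂"),
    ("battery charging circuit", "电池 充电 电路"),
    ("image sensor pixel", "图像 传感器 像素"),
    ("image compression codec", "图像 压缩 编解码器"),
]
vocab, U_k, s_k = train_lsi(pairs, k=2)

en_query = fold_in("battery electrode", vocab, U_k, s_k)
zh_related = fold_in("电池 电极", vocab, U_k, s_k)
zh_unrelated = fold_in("图像 像素", vocab, U_k, s_k)

sim_related = cosine(en_query, zh_related)
sim_unrelated = cosine(en_query, zh_unrelated)
print(sim_related, sim_unrelated)
```

In this toy setting the English query lands closer to the Chinese text on the same topic than to the unrelated one, with no translation step at query time. A real system would need proper Chinese word segmentation, term weighting, and a far larger parallel corpus.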