• DocumentCode
    2565080
  • Title
    Development of a multilingual text mining approach for knowledge discovery in patents
  • Author
    Lee, Chung-Hong ; Yang, Hsin-Chang ; Li, Yi-Ju
  • Author_Institution
    Dept. of Electr. Eng., Nat. Kaohsiung Univ. of Appl. Sci., Kaohsiung, Taiwan
  • fYear
    2009
  • fDate
    11-14 Oct. 2009
  • Firstpage
    2265
  • Lastpage
    2269
  • Abstract
    In this paper we describe our work on developing a novel technique for discovery of implicit knowledge about patents from multilingual patent information sources. In this work we developed a system platform to support locating similar and relevant multilingual patent documents. The platform was implemented using a multilingual vector space based on the latent semantic indexing (LSI) model, and utilizing collected professional Chinese-English parallel corpora for training the system model. These multilingual patent documents could then be mapped into the semantic vector space for evaluating their similarity by means of text clustering techniques. The preliminary results show that our platform framework has potential for retrieval and relatedness evaluation of multilingual patent documents.
  • Keywords
    data mining; indexing; information retrieval; patents; text analysis; document retrieval; knowledge discovery; latent semantic indexing; multilingual patent information sources; multilingual text mining; multilingual vector space; professional Chinese-English parallel corpora; relatedness evaluation; text clustering; Cybernetics; Dictionaries; Indexing; Information analysis; Information management; Information systems; Large scale integration; Terminology; Text mining; USA Councils; Document clustering; Latent semantic indexing; Multilingual patent retrieval; Patent retrieval
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2009. SMC 2009. IEEE International Conference on
  • Conference_Location
    San Antonio, TX
  • ISSN
    1062-922X
  • Print_ISBN
    978-1-4244-2793-2
  • Electronic_ISBN
    1062-922X
  • Type
    conf
  • DOI
    10.1109/ICSMC.2009.5345953
  • Filename
    5345953