• DocumentCode
    3197598
  • Title
    A Neural Principal Component Analysis for text based documents keywords extraction
  • Author
    Heni, Saber ; Ejbali, Ridha ; Zaied, Mourad ; Ben Amar, Chokri
  • Author_Institution
    Higher Inst. of Comput. & Multimedia of Gabes, Zrig - Gabes, Tunisia
  • fYear
    2011
  • fDate
    18-20 Dec. 2011
  • Firstpage
    112
  • Lastpage
    115
  • Abstract
    Users of information retrieval systems, such as those operating on the web, usually use the text modality to look not only for textual information but also for multimedia content. To satisfy the users' requirements, information retrieval systems should prepare a short representation of the content of each document in the corpus, called an index. Often, this index does not reflect the intended meaning of the document it represents. In this paper, we propose an approach based on a Neural Principal Component Analysis that expresses the maximum variance of the data and extracts the principal component from it, by calculating the correlation between the words of each document, to determine the keywords that reveal the fields of interest of each document's content.
  • Keywords
    data structures; feature extraction; indexing; information retrieval systems; multimedia systems; neural nets; principal component analysis; text analysis; content representation; document content; document index; information retrieval system; maximum data variance; multimedia content; neural principal component analysis; text based document keyword extraction; text modality; textual information; user requirement; Covariance matrix; Eigenvalues and eigenfunctions; Indexing; Principal component analysis; Speech; Vectors; Information retrieval; Normalized Hebbian Algorithm; Principal Component Analysis; data analysis; keywords extraction; neural networks; text based indexing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Next Generation Networks and Services (NGNS), 2011 3rd International Conference on
  • Conference_Location
    Hammamet
  • Print_ISBN
    978-1-4673-0138-1
  • Type
    conf
  • DOI
    10.1109/NGNS.2011.6142550
  • Filename
    6142550
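
The approach summarized in the Abstract field can be illustrated with a minimal sketch. This record does not give the paper's actual algorithm, so everything here is an assumption: Oja's rule stands in as one well-known normalized Hebbian update for estimating the first principal component, text segments stand in for the samples over which word correlations are computed, and the function names (term_matrix, center, oja_first_component, top_keywords) are hypothetical. It is not the authors' implementation.

```python
import math
import random
from collections import Counter


def term_matrix(segments):
    """Build a count matrix: one row per text segment, one column per word."""
    vocab = sorted({w for s in segments for w in s.lower().split()})
    rows = []
    for s in segments:
        counts = Counter(s.lower().split())
        rows.append([float(counts[w]) for w in vocab])
    return vocab, rows


def center(rows):
    """Subtract each column mean so every word has zero mean across segments."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def oja_first_component(rows, lr=0.01, epochs=500, seed=0):
    """Estimate the first principal direction with Oja's rule (assumed stand-in).

    A single linear neuron y = w . x is trained with the update
    w <- w + lr * y * (x - y * w), which keeps the weight norm close to 1.
    """
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in rows[0]]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]  # start from a unit-length weight vector
    for _ in range(epochs):
        for x in rows:
            y = sum(wi * xi for wi, xi in zip(w, x))
            w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w


def top_keywords(segments, k=2):
    """Rank words by the absolute value of their loading on the component."""
    vocab, rows = term_matrix(segments)
    w = oja_first_component(center(rows))
    ranked = sorted(zip(vocab, w), key=lambda p: abs(p[1]), reverse=True)
    return [word for word, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy input (made up for illustration): two kinds of segments, repeated.
    toy = ["the neural neural neural network network network", "the index"] * 2
    print(top_keywords(toy, k=2))
```

In this sketch, words whose counts vary together across segments receive large loadings on the learned direction, while a word that appears equally in every segment has zero variance after centering and a loading near zero. The learning rate and epoch count are illustrative values chosen for small count data, not settings from the paper.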