Regional Information Center for Science and Technology - Keyword and keyphrase extraction from single Hindi document using statistical approach

DocumentCode :

2553220

Title :

Keyword and keyphrase extraction from single Hindi document using statistical approach

Author :

Siddiqi, Sifatullah ; Sharan, Aditi

Author_Institution :

Sch. of Comput. & Syst. Sci., Jawaharlal Nehru Univ., New Delhi, India

fYear :

2015

fDate :

19-20 Feb. 2015

Firstpage :

713

Lastpage :

718

Abstract :

In this paper we propose an unsupervised, domain-independent, and corpus-independent approach for automatic keyword extraction. In the second part of the paper we suggest an extension of the approach to extract keyphrases from the document. The approach is general and can be applied to any language; however, we have tested it on Hindi. Our approach combines the information contained in the frequency and the spatial distribution of a word in order to extract keywords from a document. Our work is especially significant in that it has been implemented and tested on Hindi, which is a resource-poor and underrepresented language.
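
The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea of combining word frequency with spatial distribution, not the authors' method. It assumes a well-known statistical heuristic: a word is a keyword candidate if it occurs often enough and its occurrences are clustered, measured here by the standard deviation of the gaps between successive occurrences divided by the mean gap. The function names `keyword_scores` and `top_keywords`, the `min_count` threshold, and the exact normalization are all illustrative assumptions.

```python
import statistics
from collections import defaultdict


def keyword_scores(tokens, min_count=3):
    """Score each word by how unevenly its occurrences are spread.

    Assumption (not taken from the paper): the score is the population
    standard deviation of the gaps between successive occurrences,
    divided by the mean gap. Evenly spread words score near 0;
    clustered words score higher.
    """
    # Record the position of every occurrence of every word.
    positions = defaultdict(list)
    for index, token in enumerate(tokens):
        positions[token].append(index)

    scores = {}
    for word, pos in positions.items():
        # Frequency filter: rare words give unreliable gap statistics.
        if len(pos) < min_count:
            continue
        gaps = [later - earlier for earlier, later in zip(pos, pos[1:])]
        mean_gap = statistics.mean(gaps)
        spread = statistics.pstdev(gaps)
        scores[word] = spread / mean_gap
    return scores


def top_keywords(tokens, k=5, min_count=3):
    """Return up to k words with the highest scores (ties broken alphabetically)."""
    scores = keyword_scores(tokens, min_count)
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

Because the sketch needs only the token sequence of one document, it uses no training data and no external corpus, which matches the unsupervised, corpus-independent setting described above. Tokenization of the Hindi text is left to the caller.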

Keywords :

natural language processing; statistical analysis; text analysis; Hindi document; Hindi language; automatic keyphrase extraction; automatic keyword extraction; corpus independent approach; frequency distribution; spatial distribution; unsupervised domain independent approach; Algorithm design and analysis; Data mining; Distribution functions; Graphical models; Signal processing; Signal processing algorithms; Standards; Hindi; Keyphrase; Keyword; Standard Deviation; Statistical;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Signal Processing and Integrated Networks (SPIN), 2015 2nd International Conference on

Conference_Location :

Noida

Print_ISBN :

978-1-4799-5990-7

Type :

conf

DOI :

10.1109/SPIN.2015.7095377

Filename :

7095377

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2553220