Record number:
147813
Article title (translated from Persian):
A method for abstracting keywords from hypertext documents
Title in another language:
Abstracting keywords from hypertext documents
Creators:
Baolin Li and Ben Choi (authors); Farshid Danesh (translator)
Holdings information:
Quarterly, year 1384 (Solar Hijri)
Journal rank:
Scientific-promotional
Number of pages:
6
From page:
169
To page:
174
Keywords:
hypertext documents, keyword abstraction, Web mining, Information retrieval, hypertext, keyword extraction
English abstract:
This paper presents a process for abstracting keywords from hypertext or text documents. The abstracted keywords, like keywords listed in a paper, identify the contents of a document. Our proposed process can be used, for example, to identify the contents of HTML documents returned from a search engine, to allow users to quickly find their needed information. The proposed process not only considers the occurrence frequency of a word in a document, like other related works, but also considers the occurrence of its synonyms. It also considers key phrases consisting of two or three words. To increase the accuracy of frequency count of words, a stemming algorithm is used to remove suffixes. Our tests show that the stemming algorithm consumed on average 56.7% of the total computation time, and that the proposed process can on average abstract 52% of the keywords provided by the authors of the tested documents.
Publication year:
1384 (Solar Hijri)
Journal title:
National Studies on Librarianship and Information Organization (مطالعات ملي كتابداري و سازماندهي اطلاعات)