DocumentCode :
2034368
Title :
Annotation based classification of the PDF document for semantic web
Author :
Shukla, Archana
Author_Institution :
Comput. Sci. & Eng. Dept., Motilal Nehru Nat. Inst. of Technol., Allahabad, India
Volume :
1
fYear :
2011
fDate :
8-10 April 2011
Firstpage :
370
Lastpage :
376
Abstract :
The main aim of research scholars is to produce and communicate new knowledge and to find innovative applications of existing knowledge that make a significant impact at the national or international level. Arguably the most difficult part of a master's or doctoral degree course is understanding research papers while identifying a problem area. It is therefore vital that students start by brainstorming interesting research paper ideas and finding good research papers related to their work. Students in these programs are required to carry out research, but most of the time they have difficulty identifying their problem area. Defending the research goes more smoothly if the paper is clear in its aim of resolving an argument. In this paper, I present an application that provides a user-friendly interface for research activity in the context of a research academic degree program. One motivation is that the viewpoints, summaries, notes, and observations that authors write on a PDF document are often helpful to readers. My application automatically extracts metadata (such as title, keywords, date and time, author, and summary) and annotations from a PDF document, and classifies the document on the basis of either the number of comments or the number of authors who commented on it. It can also classify the PDF document based on feedback, in the form of scores that research students assign within a set range after reviewing the comments available on the document. This helps research students decide whether a PDF document is relevant to their problem area and judge its quality, in a setting where researchers or students download many PDF documents from the World Wide Web using software agents such as Google to identify their research problem area. This metadata defines the semantics of a document. I developed my application using the PDFBox Java API. My work is motivated by the desire to have a knowledge base of metadata and annotations about PDF documents that research students can use when deciding on their problem area.
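The two classification criteria named in the abstract (number of comments, and number of distinct authors who commented) can be illustrated with a minimal sketch. It assumes the annotations have already been extracted from each PDF, for example with PDFBox, and covers only the ranking step. The class `AnnotationRanker`, the types `Comment` and `Doc`, and the sample data are hypothetical names chosen for illustration; they are not taken from the paper, and the author's actual implementation may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AnnotationRanker {

    /** One comment found on a PDF (hypothetical type, not from the paper). */
    public static final class Comment {
        public final String author;
        public final String text;

        public Comment(String author, String text) {
            this.author = author;
            this.text = text;
        }
    }

    /** A PDF title plus the comments already extracted from it (hypothetical type). */
    public static final class Doc {
        public final String title;
        public final List<Comment> comments;

        public Doc(String title, List<Comment> comments) {
            this.title = title;
            this.comments = comments;
        }

        public int commentCount() {
            return comments.size();
        }

        public int distinctAuthorCount() {
            // A set keeps each author name once, however many comments they wrote.
            Set<String> authors = new HashSet<>();
            for (Comment c : comments) {
                authors.add(c.author);
            }
            return authors.size();
        }
    }

    /** Returns a new list ordered from most to fewest comments. */
    public static List<Doc> rankByCommentCount(List<Doc> docs) {
        Comparator<Doc> byCount = Comparator.comparingInt(Doc::commentCount);
        List<Doc> ranked = new ArrayList<>(docs);
        ranked.sort(byCount.reversed());
        return ranked;
    }

    /** Returns a new list ordered from most to fewest distinct commenting authors. */
    public static List<Doc> rankByDistinctAuthors(List<Doc> docs) {
        Comparator<Doc> byAuthors = Comparator.comparingInt(Doc::distinctAuthorCount);
        List<Doc> ranked = new ArrayList<>(docs);
        ranked.sort(byAuthors.reversed());
        return ranked;
    }

    public static void main(String[] args) {
        // Invented sample data: one heavily commented paper, one with more reviewers.
        Doc a = new Doc("paper-a.pdf", Arrays.asList(
                new Comment("alice", "Clear problem statement"),
                new Comment("alice", "Section 3 is hard to follow"),
                new Comment("alice", "Good related work")));
        Doc b = new Doc("paper-b.pdf", Arrays.asList(
                new Comment("bob", "Relevant to my topic"),
                new Comment("carol", "Weak evaluation")));

        for (Doc d : rankByCommentCount(Arrays.asList(a, b))) {
            System.out.println(d.title + " comments=" + d.commentCount());
        }
        for (Doc d : rankByDistinctAuthors(Arrays.asList(a, b))) {
            System.out.println(d.title + " authors=" + d.distinctAuthorCount());
        }
    }
}
```

The two orderings can disagree, as in the sample data: a document annotated many times by one reader ranks first by comment count but below a document reviewed by several readers when ranked by distinct authors.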
Keywords :
Java; application program interfaces; document handling; pattern classification; search engines; semantic Web; user interfaces; Google; PDF Box Java API; PDF document classification; annotation based classification; application program interface; doctoral degree course; masters degree course; metadata; research academic degree program; semantic Web; software agents; user friendly interface; Arrays; Context; Data mining; Meteorology; Portals; Relational databases; Annotation; Classification; Metadata; Search Engines; Semantic Web; World Wide Web;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Electronics Computer Technology (ICECT), 2011 3rd International Conference on
Conference_Location :
Kanyakumari
Print_ISBN :
978-1-4244-8678-6
Electronic_ISBN :
978-1-4244-8679-3
Type :
conf
DOI :
10.1109/ICECTECH.2011.5941625
Filename :
5941625