DocumentCode :
2012399
Title :
A Strategy for Automatically Extracting References from PDF Documents
Author :
Alves, Neide Ferreira ; Lins, Rafael Dueire ; Lencastre, Maria
Author_Institution :
Univ. do Estado do Amazonas, Manaus, Brazil
fYear :
2012
fDate :
27-29 March 2012
Firstpage :
435
Lastpage :
439
Abstract :
Every day the number of citations an author receives is becoming more important than the size of his list of publications. The automatic extraction of bibliographic references from scientific articles is still a difficult problem in Document Engineering, even if the document is originally in digital form. This paper presents a strategy for extracting references from scientific documents in PDF format. The proposed scheme was validated on the LiveMemory platform, which was developed to generate digital libraries of proceedings of technical events.
Keywords :
bibliographic systems; digital libraries; document image processing; image retrieval; scientific information systems; LiveMemory platform; PDF document; automatic bibliographic reference extraction; digital document; document engineering; scientific articles; scientific documents; Accuracy; Classification algorithms; Data mining; Portable document format; Proposals; Support vector machine classification; Training; bibliographic references; document processing; information extraction; learning; regular expression
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis Systems (DAS), 2012 10th IAPR International Workshop on
Conference_Location :
Gold Coast, QLD
Print_ISBN :
978-1-4673-0868-7
Type :
conf
DOI :
10.1109/DAS.2012.12
Filename :
6195409