DocumentCode :
2144597
Title :
Conversion of PDF Books in ePub Format
Author :
Marinai, Simone ; Marino, Emanuele ; Soda, Giovanni
Author_Institution :
Dipt. di Sist. e Inf., Univ. di Firenze, Florence, Italy
fYear :
2011
fDate :
18-21 Sept. 2011
Firstpage :
478
Lastpage :
482
Abstract :
In recent years, interest in e-book readers has grown significantly. Two main document formats are supported by most devices: PDF and ePub. The PDF format is widely used to share documents because it offers cross-platform readability. However, it is not ideal for comfortable reading on small screens. In contrast, the ePub format is reflowable and well suited to e-book readers. In this paper we describe a system for converting PDF books to the ePub format that aims to invert the text formatting applied during pagination. To this end, layout analysis techniques are used to identify the book's table of contents and the main functional regions such as chapters, paragraphs, and notes.
Keywords :
character recognition; document image processing; electronic publishing; PDF book conversion; chapter identification; cross-platform readability; document format; e-book readers; ePub format; functional region identification; layout analysis technique; note identification; paragraph identification; table of content identification; Electronic publishing; Feature extraction; Layout; Nickel; Portable document format; Sections; Text analysis; Ebook; Layout Analysis; PDF; ePub
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location :
Beijing
ISSN :
1520-5363
Print_ISBN :
978-1-4577-1350-7
Electronic_ISBN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2011.102
Filename :
6065357