Title of article :
Predictive Analysis for Optimal Text Visibility: A Comprehensive Study on Frame-of-Interest Prediction in Book Digitization Videos
Author/Authors :
Buddhawar, G. (Sardar Vallabhbhai National Institute of Technology); Dave, D. (Pimpri Chinchwad College of Engineering); Jariwala, K. N. (Sardar Vallabhbhai National Institute of Technology); Chattopadhyay, C. (School Computing and Data Sciences - FLAME University)
From page :
2256
To page :
2267
Abstract :
This paper addresses an important challenge in book digitization: accurately predicting the frames in which text visibility is optimal. Existing models often suffer from high computational complexity, which results in inefficient automation and reduced accuracy. In contrast, our proposed models offer lower complexity and higher accuracy. Leveraging a diverse dataset of book flipping videos, we introduce three novel models: the Regular CNN LeNet-5 Model, the Custom LSTM Model, and the 3D CNN Model. Evaluation reveals that our 3D CNN Model achieves an accuracy of 99.01% with 377,921 parameters. These models deliver a significant gain in accuracy with significantly fewer parameters, and the proposed approach thereby improves the identification of frames of interest. Our findings highlight the transformative potential of these models in streamlining book digitization workflows and improving accessibility to digitized textual content. This study contributes valuable insights at the intersection of computer vision, machine learning, and digitization efforts, offering a promising avenue for enhancing the usability of digitized textual resources.
Keywords :
Book Flipping Videos, Frame of Interest, Book Digitization, Predictive Analysis
Journal title :
International Journal of Engineering
Record number :
2777018