Automatic Subtitles Localization through Speaker Identification in Multimedia System

Author

Park, Seung-Bo ; Oh, Kyung-Jin ; Kim, Heung-Nam ; Jo, Geun-Sik

Author_Institution

Dept. of Comput. & Inf. Eng., Inha Univ., Incheon

fYear

2008

fDate

10-11 July 2008

Firstpage

166

Lastpage

172

Abstract

With the increasing popularity of online video, efficient captioning and display of the captioned text (subtitles) have also become accessibility issues. In most cases, however, subtitles are shown on a separate display below the screen, so some viewers lose condensed information about the content of the video. To improve readability and visibility for viewers, this paper presents a framework for displaying synchronized text around a speaker in a video. The proposed approach first identifies speakers using face detection technologies and subsequently detects a subtitle region. In addition, we adapt DFXP, the interoperable timed text format of the W3C, to support interchange with existing legacy systems. To achieve smooth playback of multimedia presentations such as SMIL and DFXP, a prototype system named MoNaPlayer has been implemented. Our case studies show that the proposed system is feasible for several multimedia applications.

Keywords

face recognition; multimedia systems; speaker recognition; text analysis; DFXP; SMIL; W3C; automatic subtitles localization; face detection technologies; multimedia system; online video; speaker identification; synchronized text; text format; Application software; Auditory system; Computer applications; Computer displays; Conferences; Face detection; Prototypes; Timing; Watches; face detect; timed text

fLanguage

English

Publisher

IEEE

Conference_Titel

Semantic Computing and Applications, 2008. IWSCA '08. IEEE International Workshop on

Conference_Location

Incheon

Print_ISBN

978-0-7695-3317-9

Electronic_ISBN

978-0-7695-3317-9

Type

conf

DOI

10.1109/IWSCA.2008.28

Filename

4573173