Title :
Assessing Box Office Performance Using Movie Scripts: A Kernel-Based Approach
Author :
Eliashberg, Jehoshua ; Hui, Sam K. ; Zhang, Zhongwei Jake
Author_Institution :
Wharton School, University of Pennsylvania, Philadelphia, PA, USA
Abstract :
We develop a methodology to predict the box office performance of a movie at the point of green-lighting, when only its script and estimated production budget are available. We extract three levels of textual features (genre and content, semantics, and bag-of-words) from scripts using screenwriting domain knowledge, human input, and natural language processing techniques. These textual variables define a distance metric across scripts, which is then used as an input for a kernel-based approach to assess box office performance. We show that our proposed methodology predicts box office revenues more accurately than benchmark methods, with a 29 percent lower mean squared error (MSE).
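As a rough illustration of the kernel-based idea described in the abstract, the sketch below predicts a new script's revenue as a distance-weighted average of the revenues of known scripts. It is a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian kernel, the Nadaraya-Watson form, the weighted Euclidean distance, and the names script_distance, kernel_predict, and bandwidth are all hypothetical choices made here for illustration, and the toy numbers are invented.

```python
import numpy as np

def script_distance(x, z, weights):
    """Weighted Euclidean distance between two script feature vectors.

    Assumption: the abstract only says the textual variables define a
    distance metric; this particular form is illustrative.
    """
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return float(np.sqrt(np.sum(np.asarray(weights, dtype=float) * diff ** 2)))

def kernel_predict(dist_to_train, train_revenue, bandwidth=1.0):
    """Nadaraya-Watson estimate with a Gaussian kernel (an assumed choice).

    Scripts closer to the new script receive larger weights, so the
    prediction is dominated by the most similar known scripts.
    """
    d = np.asarray(dist_to_train, dtype=float)
    y = np.asarray(train_revenue, dtype=float)
    w = np.exp(-(d / bandwidth) ** 2)
    return float(np.sum(w * y) / np.sum(w))

# Toy, made-up data: three known scripts with two features each.
train_features = [[0.0, 1.0], [1.0, 0.0], [5.0, 5.0]]
train_revenue = [10.0, 20.0, 100.0]
feature_weights = [1.0, 1.0]  # hypothetical feature weights

new_script = [0.5, 0.5]
distances = [script_distance(new_script, f, feature_weights) for f in train_features]
prediction = kernel_predict(distances, train_revenue, bandwidth=1.0)
```

In this sketch the bandwidth controls how quickly a script's influence decays with distance; in practice it, along with any feature weights, would be tuned on held-out data, for example by minimizing MSE.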
Keywords :
feature extraction; humanities; natural language processing; text analysis; MSE; bag-of-words; benchmark methods; box office performance assessment; distance metric; estimated production budget; green-lighting; human input; kernel-based approach; mean squared error; movie scripts; natural language processing techniques; screenwriting domain knowledge; textual feature extraction; textual variables; Benchmark testing; Educational institutions; Measurement; Motion pictures; Portfolios; Production; Entertainment industry; kernel approach; movie production; text mining
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2014.2306681