Title :
A Practical Approach to Scalable Big Data Computing for the Personalization of Services at Samsung
Abstract :
Recent advances in big data computing have empowered the personalization of services, including model-based services such as speech recognition, face recognition, and context-aware services. Various sources of user logs can be used to remodel, adapt, and personalize pretrained models to improve the quality of service. We propose a system that stores, retrieves, and processes such data in a scalable manner on top of Samsung's big data infrastructure. Automatic speech recognition (ASR) services such as Samsung's S-Voice and Apple's Siri are representative examples. Recent advances in ASR, combined with big data technologies, drive more personalized services in many areas. Speaker adaptation is now a well-accepted technology, but creating a personalized acoustic model and a corresponding language model for several billion Samsung product users requires a huge computation cost. We implement a personalized and scalable ASR system powered by the big data infrastructure, which brings data-driven personalization opportunities to voice-enabled services such as voice-to-text transcription and voice-enabled web search at petabyte scale. We verify the feasibility of speaker adaptation on recordings from 107 testers and obtain an improvement of about 10% in recognition accuracy. We also suggest a set of performance optimizations, including workflow compaction, file compression, and selection of the best file system among several distributed file systems.
Keywords :
"Adaptation models","Speech","Acoustics","Computational modeling","Engines","Big data","Speech recognition"
Conference_Title :
Big Data Computing (BDC), 2014 IEEE/ACM International Symposium on
DOI :
10.1109/BDC.2014.11