Title :
Modified weighting method in TF∗IDF algorithm for extracting user topic based on email and social media in Integrated Digital Assistant
Author :
Pramono, Luthfan Hadi ; Rohman, Arief Syaichu ; Hindersah, Dan Hilwadi
Author_Institution :
Electr. Eng. Dept., Bandung Inst. of Technol., Bandung, Indonesia
Abstract :
Integrated Digital Assistant (IDA) is a system designed to act as a full-time "personal secretary" for the user. IDA stays active while the user is relaxing at home, working at the office, or traveling and doing other outside activities. IDA aims to minimize the interaction required between user and system. The system finds the outside information that the user needs by searching for the user's topics in email and social media data. To search for and extract user interests or topics from this data, IDA uses a modified TF*IDF weighting algorithm named TF*IDF*DF, which extends the TF*IDF method. The modified weighting is expected to yield topics that are more representative and better matched to the information the user needs. With TF*IDF*DF extraction, the number of terms (words) with a document frequency (df) greater than one increases. On the other hand, the computational load also increases because of the additional df multiplier. The news retrieved from topics extracted with TF*IDF*DF increased and became more diverse. The extracted terms still contain noisy text that does not follow proper written grammar and needs to be corrected so that the resulting terms are more accurate.
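The abstract names the weighting only as TF*IDF with an extra df multiplier and does not give the formula. The following is a minimal sketch of one plausible reading, weight = tf × idf × df, and is not taken from the paper. The function name `term_weights`, the `use_df` flag, the whitespace tokenizer, the natural-log idf, and the sample documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_weights(docs, use_df=True):
    """Weight each term per document as tf * idf, times df when use_df is True.

    Assumptions (not from the paper): raw term counts for tf,
    idf = ln(N / df), and lowercase whitespace tokenization.
    """
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]

    # df: number of documents in which each term appears at least once
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        doc_weights = {}
        for term, count in tf.items():
            idf = math.log(n_docs / df[term])
            weight = count * idf
            if use_df:
                # Assumed modification: multiply by document frequency
                weight *= df[term]
            doc_weights[term] = weight
        weights.append(doc_weights)
    return weights


# Hypothetical sample messages standing in for email or social media posts
docs = [
    "flood warning jakarta",
    "flood relief jakarta",
    "football match tonight",
    "traffic update bandung",
    "concert tickets bandung",
]

plain = term_weights(docs, use_df=False)
modified = term_weights(docs, use_df=True)

# Compare plain TF*IDF with the df-multiplied weight for the first document
for term in sorted(modified[0]):
    print(term, round(plain[0][term], 3), round(modified[0][term], 3))
```

Under these assumptions, a term that occurs in two documents has its weight doubled relative to plain TF*IDF, while a term that occurs in only one document is unchanged. This matches the abstract's observation that more terms with df greater than one are extracted. The extra multiplication per term is the added computational cost the abstract mentions.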
Keywords :
Internet; electronic mail; information retrieval; social networking (online); text analysis; IDA; TF*IDF*DF algorithm; document frequency; email; grammar writing; integrated digital assistant; modified weighting method; personal secretary; social media; user topic extraction; Algorithm design and analysis; Conferences; Data mining; Electronic mail; Feature extraction; Media; Twitter; TF∗IDF; feature selection; topic extraction; topic model; user topic;
Conference_Title :
Rural Information & Communication Technology and Electric-Vehicle Technology (rICT & ICeV-T), 2013 Joint International Conference on
Conference_Location :
Bandung
Print_ISBN :
978-1-4799-3363-1
DOI :
10.1109/rICT-ICeVT.2013.6741547