Title :
Information extractor for small medium enterprise aggregator
Author :
Oktavino, H. Fabrian ; Maulidevi, Nur Ulfa
Author_Institution :
Comput. Sci. / Inf., Inst. Teknol. Bandung, Bandung, Indonesia
Abstract :
Indonesia has a very large number of small and medium enterprises (SMEs), but their revenue is very low. One way to increase revenue is to use the Internet. Some SMEs have already developed websites, but these sites do not share a common navigation scheme, which confuses potential buyers. A website aggregator is therefore essential. The aggregator is built so that SME owners do not need to register their websites; it automatically shows content from websites that already exist. Two stages are required for this purpose: first, finding relevant SME websites, and second, extracting information from them automatically. This paper focuses on the information extractor, which extracts information from SME e-commerce websites with or without a shopping cart feature and is used to build an automatic SME aggregator and a prototype database. Learning algorithms are needed to recognize the information to be extracted. The research addresses how to preprocess website pages and which algorithm is best for automatic information extraction. The system compares three algorithms: Naïve Bayes, Decision Tree, and Support Vector Machine. The algorithm with the best accuracy is used for the system's model. Support Vector Machine proved to be the best algorithm. SMOTE, a method that addresses the imbalanced data set problem by oversampling the minority class, is the best filter for the system's training model. The system performs best when extracting information from SME e-commerce websites that have a shopping cart feature.
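The abstract describes SMOTE only as oversampling the minority class. As a rough illustration of that idea, the sketch below creates synthetic minority samples by interpolating between a sample and one of its nearest minority-class neighbours. It is not the authors' implementation: the function name `smote_oversample`, its parameters, and the toy feature vectors are all hypothetical and chosen for illustration.

```python
import math
import random


def smote_oversample(minority, n_synthetic, k=3, seed=0):
    """Sketch of SMOTE-style oversampling (hypothetical helper, not from the paper).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Interpolate a random fraction of the way towards the neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Assumed toy data: 2-D feature vectors for a rare class.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_oversample(minority, n_synthetic=6)
```

Because every synthetic point is an interpolation between two existing minority samples, the new points stay inside the region already occupied by that class. In practice the synthetic samples would be added to the training set only, before fitting a classifier such as the ones compared in the abstract.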
Keywords :
Bayes methods; Web sites; decision trees; electronic commerce; small-to-medium enterprises; support vector machines; Internet; SME; SMOTE; Web site aggregator; decision tree; e-commerce Web site; imbalanced data set problem; information extractor; learning algorithm; naïve Bayes; revenue; shopping cart feature; small medium enterprise aggregator; support vector machine; Accuracy; Classification algorithms; Data mining; Decision trees; Feature extraction; Support vector machines; Training; Information Extractor; SME; SMOTE; Support Vector Machine;
Conference_Title :
Data and Software Engineering (ICODSE), 2014 International Conference on
Print_ISBN :
978-1-4799-8175-5
DOI :
10.1109/ICODSE.2014.7062659