DocumentCode :
3660769
Title :
Correlation Analysis of Big Data to Support Machine Learning
Author :
Rajiv Pandey;Manoj Dhoundiyal;Amrendra Kumar
Author_Institution :
Amity Inst. of Inf. Technol., Amity Univ., Lucknow, India
fYear :
2015
fDate :
4/1/2015
Firstpage :
996
Lastpage :
999
Abstract :
The size and complexity of Big Data datasets call for specialized statistical tools, and we use R for the correlation analysis of our data set. This paper explores correlation analysis of quantitative variables through best-fit linear regression, demonstrated with scatter plots and the regression best-fit line. The analysis demonstrated in this paper scales to Big Data in any other context where the quantitative variables are clearly delineated. R provides many techniques for statistical analysis and inference; this paper, however, focuses on the correlation between quantitative variables, establishing the extent of dependence between them using R functions. The R correlation and best-fit line functions, cor() and abline(lmout) respectively, are explored in detail.
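The two quantities the abstract refers to, the correlation coefficient and the best-fit line, can be illustrated with a minimal sketch. It is written in plain Python as a stand-in for R's cor() and for the line that abline(lmout) draws. It is not the authors' code: the function names pearson_r and best_fit_line and the sample data are hypothetical, chosen only for illustration.

```python
import math


def _centered_sums(x, y):
    """Return (mean_x, mean_y, sxy, sxx, syy) for two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return mx, my, sxy, sxx, syy


def pearson_r(x, y):
    """Pearson correlation coefficient (the default method of R's cor())."""
    _, _, sxy, sxx, syy = _centered_sums(x, y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def best_fit_line(x, y):
    """Least-squares (intercept, slope) of y on x, as fitted by lm(y ~ x) in R."""
    mx, my, sxy, sxx, _ = _centered_sums(x, y)
    if sxx == 0:
        raise ValueError("slope is undefined when x is constant")
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


# Hypothetical data: two quantitative variables, not taken from the paper.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]

r = pearson_r(hours, score)
intercept, slope = best_fit_line(hours, score)
print(f"r = {r:.4f}, fitted line: y = {intercept:.2f} + {slope:.2f} * x")
```

A value of r close to +1 or -1 indicates a strong linear relationship between the two variables, and the fitted intercept and slope define the line that would be overlaid on the scatter plot.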
Keywords :
"Correlation","Big data","Education","Linear regression","Data mining","Data analysis","Complexity theory"
Publisher :
IEEE
Conference_Title :
Communication Systems and Network Technologies (CSNT), 2015 Fifth International Conference on
Type :
conf
DOI :
10.1109/CSNT.2015.32
Filename :
7280068