Abstract:
In this era of rapid technological innovation, massive amounts of data are being generated at almost every level of application and in almost every discipline. Extracting interesting knowledge from raw data, or data mining in a broad sense, has become an indispensable task. Nevertheless, data collected from complex phenomena often represent the integrated result of several interrelated variables, and these variables themselves are frequently only imprecisely defined. A basic principle of data mining is to determine which variables are related to which, and how they are related. In many situations, the digitized information is gathered and stored as a data matrix. It is often the case, or at least assumed, that the exogenous variables depend linearly on the endogenous variables. Retrieving "useful" information can therefore often be characterized as finding a "suitable" matrix factorization. From this perspective, this paper offers a synopsis of how linear algebra techniques can help carry out the task of data mining. Examples from factor analysis, cluster analysis, latent semantic indexing, and link analysis are used to demonstrate how matrix factorization helps to uncover hidden connections and to speed up computation. Low-rank matrix approximation plays a fundamental role in cleaning and compressing the data. Other types of constraints, such as nonnegativity, will also be briefly discussed.
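As a minimal illustration of the low-rank approximation idea mentioned above (not part of the paper itself, and using an arbitrary, hypothetical data matrix), the following Python/NumPy sketch truncates the singular value decomposition of a small matrix A, keeping only the k dominant singular values; by the Eckart-Young theorem the result is the best rank-k approximation of A, which is the basic mechanism behind the data cleaning and compression discussed in the paper.

    import numpy as np

    # Hypothetical 4 x 4 data matrix (illustrative values only).
    A = np.array([[3., 0., 1., 0.],
                  [0., 2., 0., 1.],
                  [1., 0., 2., 0.],
                  [0., 1., 0., 3.]])

    # Singular value decomposition A = U diag(s) V^T.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    # Keep the k largest singular values: this gives the best rank-k
    # approximation of A in the 2-norm and Frobenius norm (Eckart-Young).
    k = 2
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

    print("rank-2 approximation error:", np.linalg.norm(A - A_k))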
Keywords:
approximation theory; cluster analysis; data analysis; data cleaning; data compression; data mining; factor analysis; information retrieval; knowledge extraction; linear algebra; linear model; link analysis; low rank matrix approximation; matrix factorization