DocumentCode
2127568
Title
A New Technology for Combining Small Samples Based on Clustering and Its Applications
Author
Zonglei, Lu ; Jiandong, Wang ; Yunfeng, Zai
Author_Institution
Coll. of Inf. Sci. & Technol., Nanjing Univ. of Aeronaut. & Astronaut., Nanjing
fYear
2008
fDate
21-22 Dec. 2008
Firstpage
735
Lastpage
740
Abstract
Samples are important research objects in data mining. The underlying theory of data mining requires that a sample not be too small, yet in some applications it is difficult to collect enough data. Sometimes strict requirements for sample collection lead to many small sample sets with similar characteristics. If the constraints on data collection are relaxed, similar samples can be combined into a larger sample set. Combining small samples is essentially a clustering process, since both involve grouping data by similarity. This paper presents a new clustering algorithm that is independent of the similarity. With this algorithm, 1516 samples of flight records are reduced to 4 large sample sets. The experiments show that combining the samples helps determine their probability distribution, which is useful for a flight delay early warning system.
Keywords
data mining; learning (artificial intelligence); pattern clustering; probability; data mining; intelligent information processing; machine learning; probability distribution; sample clustering algorithm; sample collection; Aircraft; Clustering algorithms; Data mining; Delay effects; Large-scale systems; Learning systems; Machine learning; Probability distribution; Space technology; Statistical distributions; Clustering; Data Mining; Flights Delay; Sample Combining;
fLanguage
English
Publisher
IEEE
Conference_Titel
Knowledge Acquisition and Modeling, 2008. KAM '08. International Symposium on
Conference_Location
Wuhan
Print_ISBN
978-0-7695-3488-6
Type
conf
DOI
10.1109/KAM.2008.8
Filename
4732925