DocumentCode :
3126981
Title :
Modeling Unreliable Data and Sensors: Using F-measure Attribute Performance with Test Samples from Low-Cost Sensors
Author :
Iyer, Vasanth ; Iyengar, S. Sitharama
Author_Institution :
Int. Inst. of Inf. Technol., Hyderabad, India
fYear :
2011
fDate :
11-11 Dec. 2011
Firstpage :
15
Lastpage :
22
Abstract :
Building a high-performance classifier requires training with labeled data, which is supervised and allows generalizing the classifier's decision boundary. In practice, however, most data is unlabeled, so newer algorithms need to learn by knowledge discovery. Sufficient training data are collected in the form of empirical evidence, with labeled positive and negative samples, to build the hypothesis. The hypothesis is constructed as a conjunction of the attributes, which can be learned by a machine learning algorithm. In this paper, we work with two forms of ranking weights, precision and relevance, which help in finding hidden patterns and predicting future events. Gathering empirical evidence for weather patterns and for tracking a phenomenon requires accurately extracting the attributes and labeling the training samples, which is a laborious and time-consuming effort. Automated weather prediction algorithms, which are trained by supervised learning, need to be generalized so that they can be tested with unreliable and noisy weather data from low-cost sensors. We use training data from previous forest fire events; the datasets, containing all the attributes, are labeled using manual data logs for a given geographical area. The labeled original dataset is mapped to the data collected from online sensors, which further improves the accuracy of the training set. As some of the classes, which are related to the peak fire seasons, have very few samples, domain-specific knowledge is added through sensor measurements and the Fire Weather Index (FWI) to help model the events accurately. We show that the training accuracy of the small forest fire classifier using attributes from manual logs is enhanced by 30% by using sensor data. The rare and hard-to-classify large forest fires are classified with 95% accuracy by using the new FWI. We also show that our framework is more robust to outliers from noisy sensor measurements by accounting for them in the model parameters. The model allows further generalization for linearly and non-linearly separable datasets by estimating the parameters (1-δ) and the minimum allowable error ϵ for hypothesis, sampling accuracy and cross validation.
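Note: the abstract names the F-measure (in the title), a hypothesis built as a conjunction of attributes, a confidence (1-δ) and an allowable error ϵ, but gives no formulas. The following is a minimal sketch of the textbook definitions only, not the authors' implementation. It assumes the standard F-beta score computed from precision and recall, and the standard PAC sample bound for a consistent learner over conjunctions of n boolean attributes, where the hypothesis space has size 3^n. The function names and the boolean-attribute simplification are illustrative assumptions; the paper's weather attributes are likely numeric.

```python
import math


def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Standard F-beta score from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def pac_sample_bound(n_attributes: int, epsilon: float, delta: float) -> int:
    """Number of samples sufficient for a consistent learner to reach error
    at most epsilon with probability at least (1 - delta).

    Assumption: hypotheses are conjunctions of n boolean attributes, so
    |H| = 3**n and m >= (1/epsilon) * (n * ln 3 + ln(1/delta)).
    """
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil((n_attributes * math.log(3) + math.log(1 / delta)) / epsilon)
```

For example, under these assumptions `pac_sample_bound(10, 0.1, 0.05)` gives the number of samples sufficient for 10 boolean attributes at 95% confidence and 10% allowable error; tightening ϵ or δ increases the bound.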
Keywords :
computerised instrumentation; data mining; fires; forestry; learning (artificial intelligence); pattern classification; sampling methods; sensors; weather forecasting; FWI; automated weather prediction algorithm; domain specific knowledge; fire weather index; forest fire classifier; forest fire event; future event prediction; geographical area; hidden pattern; high performance classifier; knowledge discovery; labeled data; low cost sensors; machine learning algorithm; manual data logs; noisy sensor measurements; noisy weather data; online sensors; parameter estimation; peak fire seasons; supervised learning; training data; unlabeled data; weather pattern; weight ranking; Accuracy; Barium; Indexes; Mathematical model; Meteorology; Sensor phenomena and characterization; Data mining; Event Modeling; FWI; Forest fires; IR; Machine Learning; Ranking functions; Sampling sensors; Sensor Networks; Temporal Patterns;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
978-1-4673-0005-6
Type :
conf
DOI :
10.1109/ICDMW.2011.124
Filename :
6137355