Authors:
Ghaseminejad, Rezvan (University of Tehran - College of Engineering - School of Surveying and Geospatial Engineering), Abbaspour, Rahim Ali (University of Tehran - College of Engineering - School of Surveying and Geospatial Engineering), Mojarab, Masoud (University of Tehran - College of Engineering - School of Mining Engineering)
Keywords:
Particle swarm optimization, Gustafson Kessel, Fuzzy c-means, Seismic analyses, Seismic patterns, Fuzzy clustering
Persian abstract:
Identifying patterns in seismic data through clustering, one of the most common data mining methods, extracts very important information from a large volume of data. Owing to the nature of seismic data, fuzzy clustering algorithms give more realistic results. Although many algorithms have been proposed for this purpose, sensitivity to initial values and entrapment in locally optimal solutions are among the problems of the existing clustering methods. In this paper, metaheuristic algorithms are therefore proposed as efficient optimization methods to overcome the problems of clustering methods. The paper combines the particle swarm optimization algorithm with two fuzzy clustering algorithms, Gustafson Kessel and Fuzzy c-means, to present two approaches for clustering seismic data. The two algorithms, called PSO-GK and PSO-FCM respectively, were applied to synthetic seismic data and to seismic data of Iran. To evaluate the results of the two algorithms, three fuzzy clustering validity measures were used: FHV, average partition density, and partition density. The FHV value of PSO-GK is lower (better) than that of PSO-FCM by 0.4272 for the synthetic data and by 0.0941 for the seismic data of Iran. The other two measures also take better values for PSO-GK on both the synthetic data and the seismic data of Iran, which indicates that the algorithm based on Gustafson Kessel performs better than the algorithm based on Fuzzy c-means for the analysis of seismic data.
English abstract:
As one of the most common data mining methods, clustering can reveal patterns in seismic data and so extract valuable information from large databases. Categorization of clustering algorithms is neither straightforward nor canonical. Clustering algorithms can be divided into four broad classes: hierarchical, density-based, grid-based, and partitioning methods. Which of these methods applies depends on the kind and nature of the problem. From the point of view of labeling and assignment, clustering algorithms can be divided into hard and soft methods. In hard clustering, each data point belongs to one and only one cluster, while in soft (or fuzzy) clustering each data point belongs to several clusters with different degrees of membership. In seismology, and in hazard analysis in particular, it is essential to divide an area into regions with more or less similar seismological characteristics, which calls for clustering algorithms. Data mining and cluster analysis of seismic catalogs must take several issues into account. An active seismic area contains regions with different rates of seismicity, so the density and number of events differ between regions or seismotectonic provinces. Earthquake events are mainly distributed along different segments of major faults. An area also contains several seismotectonic regions, so seismic characteristics vary gradually and show no abrupt changes. Thus, it may be more appropriate to partition earthquakes with fuzzy clustering methods, which tend to give more realistic results. Although many clustering algorithms have been proposed so far, they are very sensitive to initial conditions and often get trapped in locally optimal solutions, so they may fail to find the real clusters in the problem space.
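The difference between hard and fuzzy assignment can be illustrated with a minimal sketch. It assumes the standard Fuzzy c-means membership formula with Euclidean distance and a fuzzifier of m = 2; the abstract does not state the formula or the parameter value, and the centroids and the event location below are made-up values for illustration only.

```python
import math

def fuzzy_memberships(point, centroids, m=2.0):
    """Fuzzy c-means membership degrees of one point in each cluster.

    Assumes the standard FCM update with Euclidean distance and fuzzifier m > 1.
    """
    dists = [math.dist(point, c) for c in centroids]
    # A point lying exactly on a centroid belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

def hard_label(point, centroids):
    """Hard clustering: index of the single nearest centroid."""
    dists = [math.dist(point, c) for c in centroids]
    return dists.index(min(dists))

centroids = [(0.0, 0.0), (4.0, 0.0)]  # hypothetical cluster centres
event = (1.0, 0.0)                    # hypothetical epicentre between them

u = fuzzy_memberships(event, centroids)  # degrees of membership, summing to 1
label = hard_label(event, centroids)     # a single cluster index
```

Here the hard method assigns the event entirely to the first cluster, while the fuzzy method keeps a small membership in the second cluster, which is the kind of gradual transition the abstract argues for.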
Therefore, a global search algorithm should be used to find globally optimal clusters. The clustering problem may in general be considered an optimization problem. Metaheuristics are widely recognized as efficient approaches to many hard optimization problems, including cluster analysis. A metaheuristic uses an iterative search strategy to find an approximately optimal solution with limited computational resources, such as computing power and computation time. The present paper therefore combines a metaheuristic, particle swarm optimization (PSO), with two fuzzy clustering algorithms, Gustafson Kessel and Fuzzy c-means, to address the problems of these clustering algorithms. The two resulting algorithms, called PSO-GK and PSO-FCM, respectively, are applied to synthetic seismic data as well as real seismic data acquired across Iran, and the results are validated with the clustering validity indexes fuzzy hypervolume (FHV), average partition density (APD), and partition density (PD). These indexes reflect clear separation between clusters, minimal cluster volume, and maximal concentration of data points in the vicinity of the cluster centroid. Ideally, a low FHV and high APD and PD indicate a good partition. The FHV of PSO-GK is lower (better) than that of PSO-FCM by 0.4272 for the synthetic seismic data and by 0.0941 for the real seismic data acquired across Iran. PSO-GK also achieves better values of the other two indexes than PSO-FCM. Based on these comparisons, the proposed Gustafson-Kessel-based algorithm was found to be more appropriate for the analysis of seismic data.
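The three validity indexes can be sketched as follows. The abstract does not give their formulas, so this sketch assumes the commonly used definitions: FHV is the sum over clusters of the square root of the determinant of the fuzzy covariance matrix, and the partition densities count the memberships of "central" points whose squared Mahalanobis distance to the centroid is below 1. The weighting of the covariance by memberships raised to m = 2, the restriction to 2-D points with non-degenerate clusters, and the toy data are further assumptions made only for illustration.

```python
import math

def validity_indices(points, centroids, U, m=2.0):
    """Return (FHV, APD, PD) for 2-D points under the assumed definitions.

    U[i][k] is the membership of point k in cluster i.
    """
    volumes, central_sums = [], []
    for v, u_row in zip(centroids, U):
        w = [u ** m for u in u_row]
        total = sum(w)
        dx = [p[0] - v[0] for p in points]
        dy = [p[1] - v[1] for p in points]
        # Entries of the 2x2 fuzzy covariance matrix [[a, b], [b, d]].
        a = sum(wk * x * x for wk, x in zip(w, dx)) / total
        b = sum(wk * x * y for wk, x, y in zip(w, dx, dy)) / total
        d = sum(wk * y * y for wk, y in zip(w, dy)) / total
        det = a * d - b * b  # assumed positive (non-degenerate cluster)
        volumes.append(math.sqrt(det))
        # Sum the memberships of points with squared Mahalanobis distance < 1.
        s = 0.0
        for u, x, y in zip(u_row, dx, dy):
            maha = (d * x * x - 2.0 * b * x * y + a * y * y) / det
            if maha < 1.0:
                s += u
        central_sums.append(s)
    fhv = sum(volumes)
    apd = sum(s / vol for s, vol in zip(central_sums, volumes)) / len(volumes)
    pdens = sum(central_sums) / fhv
    return fhv, apd, pdens

# Toy data: two well-separated groups, each a unit square plus its centre.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5),
          (5, 5), (6, 5), (5, 6), (6, 6), (5.5, 5.5)]
centroids = [(0.5, 0.5), (5.5, 5.5)]
U = [[1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]]

fhv, apd, pdens = validity_indices(points, centroids, U)
```

Comparing two partitions of the same data with such a function, the partition with the lower FHV and the higher APD and PD would be preferred, which is the criterion the abstract uses to compare PSO-GK with PSO-FCM.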