Abstract
Microarray image analysis is an algorithmic and analytical approach to gene expression profiling, used in the study of biological and zoological taxonomy. Gene expression data form a structured matrix in which gene data correspond to specific conditions. The intricacy of biological networks, together with the large volume of imprecise and outlier-containing data, makes such data challenging to analyze. Clustering algorithms are therefore crucial for identifying patterns in massive genetic datasets. This paper presents microarray spot segmentation using a Hybrid Symbolic Fuzzy C-Means (HSFCM) clustering method, which handles uncertainty and imprecision effectively. Although microarray spot data points are similar, they are not identical and exhibit some dissimilarity. To capture this disparity, the HSFCM method uses symbolic interval-valued objects instead of crisp data points to resolve the imprecision present in the spot data. The proposed approach fuzzifies the input data and then converts it to interval-based symbolic objects, which are fed to the clustering model. To assess the performance of the proposed method, experiments are carried out on the Lymphoma Leukemia Molecular Profiling Project microarray dataset, and the results are evaluated using cluster validity indices and Mean Squared Error (MSE). The proposed method achieved 0.975, 0.098, 43.127 and 385.739 (Vpc, Vpe, Vxb and Vfb, respectively) on the dataset of 9216 images, and 0.972, 0.094, 40.421 and 392.63 (Vpc, Vpe, Vxb and Vfb, respectively) on the dataset of 18432 microarray spot images. It further achieved an MSE of 0.025 and 0.402 on the 9216-image and 18432-image sets, respectively. The results indicate that the proposed approach outperforms contemporary clustering techniques.
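As an illustration of the pipeline summarized above (crisp values turned into interval-valued objects, then fuzzy clustering, then validity scoring), the following is a minimal Python sketch and not the HSFCM method itself. It rests on several assumptions that the abstract does not state: `to_interval` wraps each intensity in a symmetric interval of a hypothetical `spread`, distances between intervals are squared Euclidean distances on the lower and upper bounds, the update rules are those of standard fuzzy c-means, and Vpc and Vpe are taken to be the usual partition coefficient and partition entropy. All function names are illustrative.

```python
import math
import random


def to_interval(value, spread):
    """Wrap a crisp intensity in a symmetric interval (assumed encoding)."""
    return (value - spread, value + spread)


def sq_dist(x, v):
    """Squared Euclidean distance between two intervals, bound by bound."""
    return (x[0] - v[0]) ** 2 + (x[1] - v[1]) ** 2


def interval_fcm(data, c=2, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means applied to interval-valued points.

    Returns (memberships, prototypes); each prototype is itself an interval.
    """
    rng = random.Random(seed)
    n = len(data)
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([r / total for r in row])

    centers = []
    for _ in range(iters):
        # Prototype update: membership-weighted mean of each bound.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            lo = sum(w[i] * data[i][0] for i in range(n)) / total
            hi = sum(w[i] * data[i][1] for i in range(n)) / total
            centers.append((lo, hi))

        # Membership update from the squared distances to each prototype.
        shift = 0.0
        for i in range(n):
            d = [sq_dist(data[i], v) for v in centers]
            if min(d) == 0.0:
                # Point coincides with a prototype: assign it fully there.
                new = [1.0 if dk == 0.0 else 0.0 for dk in d]
            else:
                new = [dk ** (-1.0 / (m - 1.0)) for dk in d]
            total = sum(new)
            new = [x / total for x in new]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new
        if shift < tol:
            break
    return u, centers


def partition_coefficient(u):
    """Mean of squared memberships; approaches 1 for a crisp partition."""
    return sum(x * x for row in u for x in row) / len(u)


def partition_entropy(u):
    """Mean membership entropy; approaches 0 for a crisp partition."""
    return -sum(x * math.log(x) for row in u for x in row if x > 0.0) / len(u)
```

Under these assumptions, calling `interval_fcm` on a list of intervals built with `to_interval` returns one membership row per spot value and one interval prototype per cluster, and the two index functions score how crisp the resulting partition is.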