Research Info

Title Cancer Gene Expression Classification Based on RNA Sequence Data Using Machine Learning: An Integrative Approach with Feature Selection and Data Balancing Techniques
Type Refereeing
Keywords Cancer Gene Expression, Data Balancing, Feature Selection, Machine Learning, RNA Sequence
Abstract Research on gene expression profiling and its role in diseases, notably cancer, has become increasingly popular. RNA sequencing (RNA-Seq) has recently become one of the most powerful tools for gene expression profiling. Accurate classification of tumor types based on gene expression profiles can significantly improve diagnosis, prognosis, and the development of targeted therapies. The proposed method has four main steps. First, the data are preprocessed by standardization. Second, the most important features are selected using the ExtraTree method. Because several classes are imbalanced, classifiers may not perform as well as desired; third, therefore, the SMOTE balancing technique is applied to address this issue. Finally, a Support Vector Machine classifies the samples using the selected features. The effectiveness of the proposed method is demonstrated on the PANCAN dataset, constructed from RNA-Seq data from 801 cancer samples with 20,531 features in five tumor classes: BRCA, KIRC, COAD, LUAD, and PRAD. In this study, only seven important attributes were selected for analysis. Performance was assessed using accuracy, precision, recall, and F1-measure. The result is a method that integrates RNA-Seq data with machine-learning techniques to improve the classification of tumor types. This combination of feature selection, oversampling, and Support Vector Machine classification gave promising results, demonstrating the ability to differentiate between cancer types. The method could assist physicians in tailoring treatment to a patient's type of cancer.
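The four steps described in the abstract (standardization, ExtraTree feature selection, SMOTE balancing, SVM classification) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic (generated with scikit-learn's `make_classification`, using 200 features instead of 20,531 to keep it fast), the class weights and all hyperparameters are arbitrary, and `smote_oversample` is a simplified, hand-written stand-in for SMOTE rather than a library routine. Only the sample count, the five classes, and the choice of seven features come from the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def smote_oversample(X, y, k=5, seed=0):
    """Simplified SMOTE (hypothetical helper): grow each class to the majority
    size by interpolating between a sample and one of its k nearest same-class
    neighbors. Assumes every class has at least two samples."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for cls, count in zip(classes, counts):
        n_new = target - count
        if n_new == 0:
            continue
        X_cls = X[y == cls]
        k_eff = min(k, len(X_cls) - 1)
        nn = NearestNeighbors(n_neighbors=k_eff + 1).fit(X_cls)
        # Column 0 is normally the query point itself, so drop it.
        neighbors = nn.kneighbors(X_cls, return_distance=False)[:, 1:]
        base = rng.integers(0, len(X_cls), size=n_new)
        picked = neighbors[base, rng.integers(0, k_eff, size=n_new)]
        gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
        X_parts.append(X_cls[base] + gap * (X_cls[picked] - X_cls[base]))
        y_parts.append(np.full(n_new, cls))
    return np.vstack(X_parts), np.concatenate(y_parts)


# Synthetic stand-in for the expression matrix (assumption, not PANCAN data).
X, y = make_classification(
    n_samples=801, n_features=200, n_informative=20, n_classes=5,
    n_clusters_per_class=1, weights=[0.4, 0.2, 0.1, 0.15, 0.15],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 1: standardize, fitting the scaler on the training split only.
scaler = StandardScaler().fit(X_train)
X_train_std, X_test_std = scaler.transform(X_train), scaler.transform(X_test)

# Step 2: rank features with extremely randomized trees and keep the top 7.
forest = ExtraTreesClassifier(n_estimators=200, random_state=0)
forest.fit(X_train_std, y_train)
top7 = np.argsort(forest.feature_importances_)[::-1][:7]
X_train_sel, X_test_sel = X_train_std[:, top7], X_test_std[:, top7]

# Step 3: balance the training classes only; the test split stays untouched.
X_bal, y_bal = smote_oversample(X_train_sel, y_train)

# Step 4: train an SVM and report the four metrics named in the abstract.
clf = SVC(kernel="rbf").fit(X_bal, y_bal)
y_pred = clf.predict(X_test_sel)
accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="macro", zero_division=0
)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Oversampling is applied after the train/test split so that synthetic samples never leak into the evaluation data; whether the original study ordered the steps this way is not stated in the abstract.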
Researchers Seyed Alireza Bashiri Mosavi (Referee)