Research System of the Buin Zahra Higher Education Center for Technology and Engineering | Hyperparameter Optimization for Problem-Based Custom CNN Architectures Using a Smart Grid Search Method

Title	Hyperparameter Optimization for Problem-Based Custom CNN Architectures Using a Smart Grid Search Method
Research type	Refereeing and supervision of research activities
Keywords	Convolutional Neural Network, Hyperparameter Optimization, Grid Search Algorithm
Abstract	Recent advances in deep learning architectures have made it possible to solve a wide range of classification problems in industrial applications. However, many CNN architectures are complex, with large model sizes and long inference times, which makes them less suitable for deployment in resource-constrained environments. This study proposes an improved grid-search-based hyperparameter optimization method for designing lightweight CNN architectures for task-specific classification problems. To test the performance of the proposed algorithm, a dataset of ripe and unripe pistachios was used. To classify the ripe and unripe pistachios with a small model that achieves high test accuracy, the hyperparameters of a two-layer CNN architecture were optimized with the proposed algorithm. Running the proposed algorithm took 14 hours and 17 minutes, during which it evaluated 173 CNN architectures with different hyperparameters. Among these 173 architectures, the smallest model whose test accuracy exceeded that reported in the literature had a size of 4.74 MB and achieved 98.70% test accuracy. Compared with the literature, this model shows a 0.26% improvement in test accuracy and is 36.6 times smaller than the AlexNet model. The proposed method was also compared with Bayesian optimization using a similar search space. While Bayesian optimization achieved the same highest test accuracy (99.22%), it produced a model that was 3.7 times larger. Moreover, when small CNN architectures with identical accuracies (98.19%) were compared, the proposed method produced a model 2.5 times smaller than the Bayesian result. These results demonstrate that the method efficiently discovers lightweight, high-accuracy CNN architectures.
Researchers	Seyed Alireza Bashiri Mosavi (referee)
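
The abstract does not say how the grid search is made "smart", so the following is only a rough sketch of the selection rule it describes: among the architectures whose test accuracy exceeds a baseline, keep the smallest one. Everything here is an assumption made for illustration and not the paper's algorithm: the layer layout in `cnn_param_count`, the float32 size estimate in `size_mb`, the pruning step that skips training any candidate no smaller than the current best, and the stand-in `toy_evaluate`, which replaces real training on the pistachio dataset.

```python
from itertools import product


def cnn_param_count(filters1, filters2, kernel, dense_units,
                    in_channels=3, image_size=64, num_classes=2):
    """Parameter count of a hypothetical two-conv-layer CNN.

    Assumed layout: conv -> 2x2 pool -> conv -> 2x2 pool -> dense -> output,
    with 'same' padding and a bias term in every layer.
    """
    conv1 = (kernel * kernel * in_channels + 1) * filters1
    conv2 = (kernel * kernel * filters1 + 1) * filters2
    side = image_size // 4                      # two 2x2 poolings
    flat = side * side * filters2
    dense = (flat + 1) * dense_units
    out = (dense_units + 1) * num_classes
    return conv1 + conv2 + dense + out


def size_mb(params, bytes_per_param=4):
    """Approximate model size in MB, assuming float32 weights only."""
    return params * bytes_per_param / (1024 ** 2)


def grid_search_smallest(space, evaluate, min_accuracy):
    """Return the smallest configuration whose accuracy exceeds min_accuracy.

    `space` maps hyperparameter names to candidate values; `evaluate` trains
    and tests one configuration and returns its test accuracy.
    """
    best = None
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        config = dict(zip(keys, values))
        mb = size_mb(cnn_param_count(**config))
        # Assumed pruning: a candidate that is not smaller than the current
        # best cannot win on size, so it is skipped without being trained.
        if best is not None and mb >= best["size_mb"]:
            continue
        accuracy = evaluate(config)
        if accuracy > min_accuracy:
            best = {"config": config, "size_mb": mb, "accuracy": accuracy}
    return best


def toy_evaluate(config):
    """Stand-in for real training: pretends wider first layers are accurate."""
    return 0.99 if config["filters1"] >= 16 else 0.90
```

A call such as `grid_search_smallest({"filters1": [8, 16, 32], "filters2": [16, 32], "kernel": [3], "dense_units": [32, 64]}, toy_evaluate, min_accuracy=0.98)` walks the grid in order and returns the first sufficiently accurate configuration that no smaller candidate beats. In practice `evaluate` would train each network, which is where nearly all of the running time goes, so skipping candidates by size before training is the step that saves work.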

Research details