Machine Learning for Varietal Binary Classification of Soybean (Glycine max (L.) Merrill) Seeds Based on Shape and Size Attributes


Food Analytical Methods, vol.15, no.8, pp.2260-2273, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 15 Issue: 8
  • Publication Date: 2022
  • Doi Number: 10.1007/s12161-022-02286-3
  • Journal Name: Food Analytical Methods
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Agricultural & Environmental Science Database, CAB Abstracts, Compendex, Food Science & Technology Abstracts, INSPEC, Veterinary Science Database
  • Page Numbers: pp.2260-2273
  • Keywords: Soybean, Feature selection, Variety classification, Multilayer perceptron, Random forest, AERODYNAMIC PROPERTIES, MECHANICAL-PROPERTIES, BEANS, DISCRIMINATION, RECOGNITION, PARAMETERS, ORIGIN, FRUITS, VOLUME, AREA
  • Kayseri University Affiliated: No


© 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.

Shape, size, and mass are the principal quality attributes of seeds, and these parameters play a critical role in the design of classifying and grading machines. This study was conducted to develop classification models for distinguishing soybean seeds based on shape, size, and mass attributes. The seeds of the soybean varieties Bravo, Ceyhan, Çevik, İlksoy, and Traksoy were classified in pairs. Four machine learning algorithms (random forest, RF; support vector machine, SVM; Naïve Bayes, NB; and multilayer perceptron, MLP) were used to evaluate classification performance. Across all pairs, seeds of the Ceyhan and Traksoy varieties were classified with the greatest accuracy: 90.00% with the RF classifier and 89.00% with MLP. The next most accurately classified pairs were Çevik and İlksoy (88.00%, MLP) and Çevik and Traksoy (87.50%, RF). The highest mass (0.19 g), volume (155.02 mm³), geometric mean diameter (6.65 mm), and projected area (34.80 mm²) values were obtained for the Traksoy variety. The Pillai trace and Wilks’ lambda results revealed that the differences in physical attributes among the soybean varieties were significant (p < 0.01). According to the Wilks’ lambda statistic, 23.0% of the differences between the groups remained unexplained. The Traksoy and Çevik varieties, which had the highest Mahalanobis distances, had similar attributes. The present findings show that MLP and RF could potentially be used for the classification of soybean varieties.
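The pairwise setup described in the abstract can be illustrated with a minimal sketch. It enumerates every unordered pair of the five varieties (one binary task per pair) and trains a from-scratch Gaussian Naïve Bayes classifier on one pair. This is not the authors' implementation: the function names, the three features (length, width, thickness), the class centres, the sample sizes, and the 150/50 split are all assumptions, and the seed measurements are synthetic placeholders rather than data from the study.

```python
import random
from itertools import combinations
from math import log, pi
from statistics import mean, pvariance

VARIETIES = ["Bravo", "Ceyhan", "Çevik", "İlksoy", "Traksoy"]


def variety_pairs(varieties):
    """Return every unordered pair of varieties (one binary task per pair)."""
    return list(combinations(varieties, 2))


def fit_gaussian_nb(rows, labels):
    """Estimate per-class priors, feature means, and feature variances."""
    model = {}
    for cls in set(labels):
        cls_rows = [r for r, y in zip(rows, labels) if y == cls]
        columns = list(zip(*cls_rows))  # one tuple per feature
        model[cls] = {
            "prior": len(cls_rows) / len(rows),
            "means": [mean(c) for c in columns],
            # A small floor keeps every variance strictly positive.
            "vars": [pvariance(c) + 1e-9 for c in columns],
        }
    return model


def predict(model, row):
    """Return the class with the highest Gaussian log-posterior for one row."""
    def log_posterior(params):
        score = log(params["prior"])
        for x, mu, var in zip(row, params["means"], params["vars"]):
            score += -0.5 * log(2 * pi * var) - (x - mu) ** 2 / (2 * var)
        return score

    return max(model, key=lambda cls: log_posterior(model[cls]))


def accuracy(model, rows, labels):
    """Fraction of rows whose predicted class matches the true label."""
    hits = sum(predict(model, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


def synthetic_seeds(rng, centre, n):
    """Placeholder (length, width, thickness) rows in mm; NOT measured data."""
    return [[rng.gauss(c, 0.2) for c in centre] for _ in range(n)]


rng = random.Random(0)
pairs = variety_pairs(VARIETIES)
a, b = pairs[0]

# Hypothetical class centres, chosen only so the two classes are separable.
rows = synthetic_seeds(rng, (7.0, 6.0, 5.0), 100) + synthetic_seeds(rng, (8.0, 7.0, 6.0), 100)
labels = [a] * 100 + [b] * 100

# Shuffled hold-out split: 150 rows for training, 50 for testing.
idx = list(range(len(rows)))
rng.shuffle(idx)
train, test = idx[:150], idx[150:]

model = fit_gaussian_nb([rows[i] for i in train], [labels[i] for i in train])
acc = accuracy(model, [rows[i] for i in test], [labels[i] for i in test])
print(f"{len(pairs)} pairs; {a} vs {b} hold-out accuracy: {acc:.2f}")
```

Repeating the fit and evaluation for each entry of `pairs`, and for each candidate algorithm, would give a per-pair accuracy comparison of the kind the abstract reports. The accuracy printed here reflects only the synthetic placeholders.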