Color image thresholding is a well-known approach to image segmentation. An objective function based on image entropy is defined in terms of the number of thresholds and their locations in the color histogram. With multilevel thresholding, an image can be partitioned into multiple classes. The main difficulty with thresholding techniques is choosing the threshold color values for each image. It is very hard for a human operator to determine the specific threshold values of the images to be segmented. From this perspective, multilevel color image thresholding can be treated as an optimization problem in which an algorithm determines the optimum threshold values that yield a well-segmented image. In recent years, metaheuristic algorithms have become popular in several fields, including image thresholding, owing to their flexible structure. The motivation of this study is that using the same algorithm for every problem does not guarantee the best results, as stated by the no-free-lunch theorem in optimization. Therefore, six novel nature-inspired algorithms, namely equilibrium optimization (EO), political optimizer (PO), turbulent flow of water-based optimization (TFWO), Henry gas solubility optimization (HGSO), marine predators algorithm (MPA), and slime mould algorithm (SMA), are compared on the task of determining multilevel color image threshold values. These algorithms, which are applied to this problem for the first time, are also compared statistically through extensive experiments. Aerial test images captured by drones are used in the experiments. The metaheuristic algorithms maximize two objectives: Kapur’s entropy and between-class variance (Otsu’s method).
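To make the two objectives concrete, the following is a minimal sketch of how Kapur's entropy and the between-class variance might be evaluated for a candidate set of thresholds on a single color channel. It is not the authors' implementation: the function names, the 256-bin histogram, and the convention that a threshold t places levels below t in the lower class are all assumptions made for illustration. For a color image, one plausible choice is to evaluate each channel separately.

```python
import numpy as np

def _class_edges(thresholds, levels):
    # Assumed convention: sorted thresholds split [0, levels) into
    # consecutive half-open ranges [lo, hi), one range per class.
    t = sorted(int(x) for x in thresholds)
    return [0] + t + [levels]

def otsu_objective(hist, thresholds):
    """Between-class variance of one channel for the given thresholds."""
    p = np.asarray(hist, dtype=float)
    p = p / p.sum()                      # normalize counts to probabilities
    levels = np.arange(len(p))
    mu_total = (p * levels).sum()        # global mean intensity
    edges = _class_edges(thresholds, len(p))
    variance = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = p[lo:hi].sum()               # class probability
        if w > 0:
            mu = (p[lo:hi] * levels[lo:hi]).sum() / w   # class mean
            variance += w * (mu - mu_total) ** 2
    return variance

def kapur_objective(hist, thresholds):
    """Sum of the Shannon entropies of the classes (Kapur's criterion)."""
    p = np.asarray(hist, dtype=float)
    p = p / p.sum()
    edges = _class_edges(thresholds, len(p))
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = p[lo:hi].sum()
        if w > 0:
            q = p[lo:hi] / w             # distribution within the class
            q = q[q > 0]                 # skip empty bins (0 * log 0 = 0)
            total += -(q * np.log(q)).sum()
    return total
```

A metaheuristic would treat the threshold vector as a candidate solution and either function as the fitness to maximize; the search procedure itself is outside the scope of this sketch.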
Experimental results are evaluated with structural similarity (SSIM), peak signal-to-noise ratio (PSNR), blind/referenceless image spatial quality evaluator (BRISQUE), perception-based image quality evaluator (PIQE), natural image quality evaluator (NIQE), and the CPU time consumed by the algorithms. Another problem in thresholding is how to compare the results objectively, because different image quality metrics may give incompatible results. Correlation analysis of the average ranking results shows that the image quality metrics used in this study are compatible with each other. Extensive experiments show that MPA and TFWO outperform SMA, EO, PO, and HGSO in terms of PSNR, SSIM, BRISQUE, PIQE, NIQE, and CPU time consumption for multilevel color aerial image thresholding.
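As an illustration of the evaluation step, the sketch below computes PSNR between an original and a thresholded image and correlates two vectors of average algorithm rankings. The abstract does not specify which correlation coefficient is used; the Pearson correlation of rank vectors (which coincides with Spearman's rho when the inputs are already tie-free ranks) is an assumption here, as are the function names and the 8-bit peak value.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB, assuming 8-bit data by default."""
    ref = np.asarray(reference, dtype=float)
    tst = np.asarray(test, dtype=float)
    mse = np.mean((ref - tst) ** 2)      # mean squared error over all pixels
    if mse == 0:
        return float("inf")              # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

def rank_correlation(ranks_a, ranks_b):
    """Pearson correlation between two rank vectors (assumed measure)."""
    a = np.asarray(ranks_a, dtype=float)
    b = np.asarray(ranks_b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])
```

Under these assumptions, a coefficient near 1 between the rankings produced by two metrics would indicate that the metrics order the algorithms in a similar way.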