Journal of Energy Storage, vol. 136, 2025 (SCI-Expanded, Scopus)
State-of-Charge (SoC) estimation of batteries is critical for electric vehicle health management. However, achieving high accuracy under heterogeneous operating conditions while preserving model transparency remains challenging. This paper proposes a novel descriptive proximity-based learning framework for accurate and explainable SoC estimation. Battery discharge data are mapped into a descriptive feature space, Φ(x)={current, voltage, capacity, energy, cycle count, temperature}, where local proximity reflects similar operational states. An adaptive ɛ-proximity threshold is derived using robust Median Absolute Deviation (MAD) statistics. This threshold guides a density-based clustering algorithm (DBSCAN) to identify intrinsic regimes without manual tuning. We then define fuzzy proximity relations between clusters and rigorously prove that they satisfy the reflexivity, symmetry, and transitivity-like axioms of proximity. A gradient-boosted tree model (LightGBM) is trained using the cluster-informed features, yielding near-perfect SoC predictions (e.g., cross-validation RMSE ≈ 0.13 and R² ≈ 1.00). Shapley Additive Explanations (SHAP) reveal that the model's predictions align with physical expectations by distributing influence across meaningful features such as capacity and voltage, thus improving interpretability and trust. The proposed framework demonstrates a powerful integration of proximity-based modeling and explainable machine learning, advancing the state-of-the-art in reliable SoC estimation across varying conditions. Additional subsampling and noise robustness analyses confirm that the framework remains stable under reduced data availability and sensor perturbations, ensuring practical reliability.
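
The abstract does not state the exact rule used to derive the adaptive ɛ-proximity threshold from MAD statistics, so the following is only a minimal sketch of one plausible reading. It assumes ɛ is the median k-th nearest-neighbour distance in the standardised feature space plus a multiple of the scaled MAD of those distances. The function name `mad_epsilon`, the parameters `k` and `scale`, and the synthetic data are all hypothetical; 1.4826 is the usual constant that makes MAD comparable to a standard deviation under normality.

```python
import numpy as np


def mad_epsilon(features, k=5, scale=3.0):
    """Hypothetical rule (not taken from the paper):
    eps = median(d_k) + scale * 1.4826 * MAD(d_k),
    where d_k is each sample's distance to its k-th nearest neighbour."""
    x = np.asarray(features, dtype=float)
    # Standardise each column so no single unit (volts, cycles, ...) dominates.
    std = x.std(axis=0)
    std[std == 0] = 1.0
    z = (x - x.mean(axis=0)) / std
    # Pairwise Euclidean distances (O(n^2) memory; acceptable for a sketch).
    diff = z[:, None, :] - z[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the zero self-distance,
    # so column k holds the distance to the k-th nearest neighbour.
    d_k = np.sort(dist, axis=1)[:, k]
    med = np.median(d_k)
    mad = np.median(np.abs(d_k - med))
    return med + scale * 1.4826 * mad


rng = np.random.default_rng(0)
# Synthetic stand-in for the six descriptive features in Φ(x); not real battery data.
phi = rng.normal(size=(200, 6))
eps = mad_epsilon(phi)
print(f"adaptive eps = {eps:.3f}")
```

Under these assumptions, the resulting value could be passed as the `eps` argument of a density-based clusterer such as scikit-learn's `DBSCAN`, with the cluster labels then appended as an extra feature for the downstream regressor.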