Precise Identification of Multi-Regional Relative Poverty: A Two-Stage Knowledge-Distilled Adaptive Framework
DOI:
https://doi.org/10.71204/hefd5f67
Keywords:
Relative Poverty Prediction, Gradient-Boosted Trees, Knowledge Distillation, Spatial Heterogeneity, Extremely Imbalanced Classification, Bayesian Optimization
Abstract
As poverty reduction strategies shift toward alleviating relative poverty, precisely identifying populations in multidimensional relative poverty has become critical for social governance. However, existing data-driven models often face algorithmic bottlenecks, such as spatial heterogeneity, regional sample sparsity, and extreme class imbalance, when applied to vast territories with significant regional disparities. To address this, this study proposes a two-stage knowledge-distilled adaptive gradient boosting framework (TKDAF). First, in the prior knowledge extraction stage (Stage I), a base structure-regularized gradient-boosted tree model is constructed; combined with SHAP game-theoretic attribution, this stage quantifies and extracts the global objective weights of poverty-inducing features across regions. Second, in the spatial adaptive enhancement stage (Stage II), the core prediction model, S-DAGB (Spatial-Adaptive Distilled Gradient Boosting), is introduced. It integrates multiple regularization and feature enhancement mechanisms: nonlinear reconstruction of the feature space based on prior knowledge, a dynamic class weighting mechanism based on the effective number of samples (ENS), and spatially adaptive optimization using the Tree-structured Parzen Estimator (TPE) Bayesian optimization algorithm. Empirical results on the 2020 China Family Panel Studies (CFPS) multidimensional dataset show that the S-DAGB model not only overcomes the generalization bottleneck of deep tree models in the sample-constrained Northeast region (achieving 93.52% accuracy), but also significantly improves precision in regions with highly heterogeneous features and extreme class imbalance, such as central and western China. This reduces the wasteful allocation of poverty alleviation resources caused by false positives.
This study provides an algorithmic solution that combines high accuracy with interpretability for precise identification of relative poverty in complex data distribution scenarios.
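The dynamic class weighting idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the widely used effective-number formulation, E_n = (1 - beta^n) / (1 - beta), with class weights proportional to 1 / E_n. The function name `ens_class_weights`, the value beta = 0.999, and the toy label counts are illustrative assumptions, not details taken from the study.

```python
import numpy as np

def ens_class_weights(labels, beta=0.999):
    """Per-class weights from the effective number of samples.

    Assumed formulation: E_n = (1 - beta**n) / (1 - beta), weight ~ 1 / E_n.
    beta is a hypothetical default, not a value reported by the study.
    """
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    effective_num = (1.0 - np.power(beta, counts)) / (1.0 - beta)
    weights = 1.0 / effective_num
    # Normalise so the weights sum to the number of classes.
    weights = weights * len(classes) / weights.sum()
    return dict(zip(classes.tolist(), weights.tolist()))

# Hypothetical imbalanced labels: 950 non-poor (0) and 50 poor (1) households.
y = np.array([0] * 950 + [1] * 50)
weights = ens_class_weights(y)

# Expand the class weights to one weight per training sample.
sample_weight = np.array([weights[c] for c in y])
```

The minority class receives the larger weight. Gradient boosting libraries generally accept per-sample weights at fit time, so an array like `sample_weight` could be supplied there; whether the study applies the weights this way is not stated in the abstract.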
License
Copyright (c) 2026 Guanghuang Liu (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in this journal are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author(s) and source are properly credited. Authors retain copyright of their work, and readers are free to copy, share, adapt, and build upon the material for any purpose, including commercial use, as long as appropriate attribution is given.