The pet prediction problem solved based on the Random Forest-ARIMA ensemble algorithm
DOI: https://doi.org/10.71451/1jdjtv33
Keywords: Python; Random Forest; XGBoost; ARIMA; pet cats and dogs
Abstract
As consumer awareness grows, the pet industry market has expanded steadily and become a leading emerging sector. As more people treat pets as family members, demand for pet-related products and services continues to rise, making accurate forecasting of market trends essential for industry growth. This paper explores the future development trends of the pet industry and formulates sustainable planning strategies using flexible prediction methods based on historical data. First, Python-based web scraping is used to collect relevant industry data and compile an "initial dataset". Missing values in this dataset are filled by mode imputation. The built-in find function in MATLAB is then applied to traverse the dataset files and confirm that no outliers are present. After preprocessing, correlations between the indicators are examined and relevant features are extracted for further analysis. To predict the number of pet cats and dogs in China over the next three years, a Random Forest-Multiple Linear Regression-ARIMA integrated model is constructed. The model predicts pet cat populations of approximately 65 million, 57 million, and 90 million over those three years, and pet dog populations of around 52 million, 54 million, and 52 million. These predictions offer insight into the future of the pet industry and can help businesses and policymakers plan for sustainable growth.
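The two steps named above, mode imputation and the combination of several models' forecasts, can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names `impute_mode` and `combine_forecasts` are hypothetical, and the weighted average is only an assumed way of integrating the three models, since the abstract does not state how they are combined. All numbers are toy placeholders, not data from the study.

```python
from collections import Counter


def impute_mode(values):
    """Replace None entries with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    # most_common(1) returns the single most frequent value; on ties the
    # value seen first in the column wins.
    mode_value = Counter(observed).most_common(1)[0][0]
    return [mode_value if v is None else v for v in values]


def combine_forecasts(forecasts, weights=None):
    """Weighted average of per-model forecasts over a shared horizon.

    `forecasts` maps a model name to a list of predictions, one per step.
    Equal weights are an assumption; the paper's scheme may differ.
    """
    names = list(forecasts)
    if weights is None:
        weights = {name: 1.0 for name in names}
    horizon = len(forecasts[names[0]])
    if any(len(forecasts[name]) != horizon for name in names):
        raise ValueError("all models must forecast the same number of steps")
    total_weight = sum(weights[name] for name in names)
    return [
        sum(weights[name] * forecasts[name][step] for name in names) / total_weight
        for step in range(horizon)
    ]


# Toy column with two missing entries (placeholder values only).
column = [3, None, 3, 5, None]
filled = impute_mode(column)

# Toy three-step forecasts from three hypothetical fitted models.
toy_forecasts = {
    "random_forest": [1.0, 2.0, 3.0],
    "linear_regression": [3.0, 4.0, 5.0],
    "arima": [5.0, 6.0, 7.0],
}
combined = combine_forecasts(toy_forecasts)
```

In practice the per-model forecasts would come from fitted models, and the weights could be chosen from validation error instead of being set equal.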
License
Copyright (c) 2025 International Scientific Technical and Economic Research

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).