A Deep Reinforcement Learning Signal Control Algorithm for Traffic Carbon Emission Optimization

Hanyu Xu1
1 Department of Architecture and Civil Engineering, City University of Hong Kong, Hong Kong, China
International Scientific Technical and Economic Research 2026, Vol. 4, No. 1, pp. 200-221
DOI: 10.71451/ISTAER2610
Received: 10 January 2026; Revised: 25 February 2026; Accepted: 23 March 2026; Published: 29 March 2026
Abstract

Urban traffic congestion causes frequent vehicle start-stop events and prolonged low-speed operation, making it one of the primary drivers of growth in transport carbon emissions. To address the multi-objective conflict, training instability, and inadequate carbon emission modeling in existing traffic signal control methods for emission optimization, this paper proposes a deep reinforcement learning signal control algorithm for carbon emission optimization. The method constructs a carbon-emission-aware dynamic reward mechanism that, through adaptive weight adjustment, jointly optimizes traffic efficiency and emission reduction objectives. The Lagrange multiplier method is introduced to embed the carbon emission threshold as an explicit constraint in policy learning, ensuring that emission levels remain within an acceptable range. For multi-intersection scenarios, a distributed cooperative control framework based on parameter sharing and neighborhood information exchange is designed to strengthen the model's perception of the spatial propagation characteristics of traffic flow. Experimental validation is conducted on the SUMO simulation platform in three scenarios: a single intersection, a 4×4 grid network, and a real-world urban road network. The results show that, compared with the PPO algorithm, the proposed method reduces average carbon emissions by 11.3% to 12.8%, reduces average delay by 15.7%, increases average speed by 9.6%, and improves the comprehensive performance index by 12.2%. During training, policy fluctuation is reduced by about 50%, and the degradation of generalization performance is 34.2% smaller than that of the comparison methods. This study provides an effective intelligent solution for low-carbon-oriented urban traffic signal control.
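The two core mechanisms named in the abstract, an adaptively weighted reward over efficiency and emissions and a Lagrangian penalty that enforces an emission cap via dual ascent, can be sketched as below. All numeric values (the emission threshold, multiplier step size, and the linear weighting scheme) are illustrative assumptions, not the paper's exact formulation.

```python
class CarbonAwareReward:
    """Sketch of a carbon-emission-aware reward with a Lagrangian constraint.

    Assumed scheme: a congestion-dependent weight trades off delay against
    emissions, and a Lagrange multiplier penalizes emissions above a cap.
    """

    def __init__(self, emission_threshold: float = 100.0, lagrange_lr: float = 0.01):
        self.threshold = emission_threshold  # acceptable per-step emission level
        self.lagrange_lr = lagrange_lr       # dual-ascent step size
        self.lam = 0.0                       # Lagrange multiplier (dual variable)

    def weight(self, congestion: float) -> float:
        # Adaptive weight (assumed linear form): emphasize traffic efficiency
        # under heavy congestion, emission reduction otherwise.
        return max(0.0, min(1.0, congestion))

    def reward(self, delay: float, emission: float, congestion: float) -> float:
        w = self.weight(congestion)
        base = -(w * delay + (1.0 - w) * emission)
        # Lagrangian term embeds the emission cap as an explicit constraint.
        return base - self.lam * (emission - self.threshold)

    def update_multiplier(self, avg_emission: float) -> None:
        # Dual ascent: grow the multiplier while average emissions exceed
        # the cap; project back to zero once the constraint is satisfied.
        self.lam = max(0.0, self.lam + self.lagrange_lr * (avg_emission - self.threshold))
```

In this sketch the multiplier update would typically run once per training episode on the episode's average emissions, so that the penalty strengthens only as long as the constraint is violated.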

Keywords
Deep reinforcement learning; Traffic signal control; Carbon emission optimization; Multi-objective optimization; Constrained reinforcement learning