To address the problems of weak structural expression, difficulty in maintaining style consistency, and limited multi-condition generation capability in interface icon generation, this paper proposes IconDiff, a structure-aware intelligent icon generation method based on a diffusion model. Building on the classical diffusion framework, the method introduces a structure-guided branch and a multimodal condition fusion mechanism to jointly model text semantics, style features, and attribute information, and it enhances boundary clarity and semantic identifiability through an icon-specific loss function. In addition, a multidimensionally annotated dataset of 268,000 icon samples is constructed, and a task-specific evaluation metric system for icon generation is designed. Under a unified experimental setup, compared with mainstream generation methods, the proposed method reduces FID by approximately 25.2%, improves structural clarity by about 6.0%, enhances identifiability by about 6.8%, and increases style consistency by about 7.8%. Ablation experiments verify the effectiveness of the key modules, and generalization and robustness analyses show that the model maintains stable performance even when semantic and style conditions are absent. The results demonstrate that the proposed method significantly improves generation quality and controllability, providing an effective solution for the automatic design of interface icons.
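The abstract mentions an icon-specific loss function that promotes boundary clarity alongside the standard diffusion objective. The paper's exact formulation is not given here, so the following is only a minimal illustrative sketch, assuming an epsilon-prediction MSE diffusion term plus a hypothetical edge-clarity penalty; the function names, the finite-difference edge term, and the weight `lam` are all assumptions for illustration, not the authors' definitions.

```python
import numpy as np

def diffusion_loss(pred_noise, true_noise):
    # Standard epsilon-prediction MSE used in DDPM-style training.
    return float(np.mean((pred_noise - true_noise) ** 2))

def edge_clarity_loss(img):
    # Hypothetical boundary term: finite-difference gradients measure
    # edge strength; the loss is small when edges are crisp.
    gx = np.diff(img, axis=1)
    gy = np.diff(img, axis=0)
    edge_energy = np.mean(np.abs(gx)) + np.mean(np.abs(gy))
    return float(np.exp(-edge_energy))

def icon_loss(pred_noise, true_noise, img, lam=0.1):
    # Weighted combination; lam is an illustrative hyperparameter.
    return diffusion_loss(pred_noise, true_noise) + lam * edge_clarity_loss(img)
```

In this sketch, a high-contrast icon (strong gradients) incurs a lower edge-clarity penalty than a flat, blurry one, which is the qualitative behavior such a boundary-aware term would provide.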
This research was funded by Guangzhou Institute of Science and Technology, Project No.: 2025gip010.