Research on Image Representation Learning Method Based on Self-Supervised Learning

Juanpeng Zhang1
1 Department of Electrical Engineering, Cheongju University, Cheongju, Republic of Korea
International Scientific Technical and Economic Research 2026, Vol. 4, No. 2, pp. 78-97
DOI: 10.71451/ISTAER2616
Received: 20 January 2026; Revised: 27 February 2026; Accepted: 3 April 2026; Published: 12 April 2026
Abstract

To address negative-sample dependence, representation collapse, and insufficient cross-scale modeling in self-supervised image representation learning, this paper proposes a self-supervised framework that combines multi-view consistency learning with cross-scale feature fusion. The method builds a multi-branch collaborative structure and introduces a negative-sample-free optimization strategy together with a feature-distribution constraint mechanism, enabling efficient mining and stable expression of image semantic information. On the ImageNet dataset, linear-evaluation accuracy reaches 77.8%, exceeding SimCLR and SwAV by 8.5% and 2.5%, respectively. In downstream tasks, object-detection mAP improves by about 2.5% and semantic-segmentation mIoU by about 2.5%; under noise perturbation, accuracy improves by 7.5%, demonstrating stronger robustness. The experimental results show that the method outperforms existing mainstream approaches in representation quality, generalization ability, and training stability, and has good application potential.
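The negative-sample-free objective and feature-distribution constraint described above can be illustrated with a minimal sketch. This is not the paper's implementation; it assumes a VICReg-style formulation in which an invariance term pulls paired view embeddings together while a variance hinge on each embedding dimension guards against representation collapse. The function name `consistency_loss` and the weights `lam` and `mu` are illustrative placeholders.

```python
import numpy as np

def consistency_loss(z1, z2, std_target=1.0, lam=25.0, mu=25.0):
    """Sketch of a negative-sample-free loss (assumed formulation).

    z1, z2 : (batch, dim) embeddings of two augmented views.
    The invariance term aligns the two views; the variance hinge
    penalizes embedding dimensions whose spread collapses below
    std_target, acting as a feature-distribution constraint.
    """
    # Invariance: mean squared distance between paired view embeddings.
    inv = np.mean((z1 - z2) ** 2)

    # Variance constraint: hinge on the per-dimension standard deviation.
    def var_term(z):
        std = np.sqrt(z.var(axis=0) + 1e-4)  # epsilon for stability
        return np.mean(np.maximum(0.0, std_target - std))

    var = var_term(z1) + var_term(z2)
    return lam * inv + mu * var

# Toy usage: well-spread embeddings of two slightly perturbed views
# incur a much smaller loss than fully collapsed (constant) embeddings.
rng = np.random.default_rng(0)
z1 = rng.normal(size=(256, 128))
z2 = z1 + 0.1 * rng.normal(size=(256, 128))
spread_loss = consistency_loss(z1, z2)
collapsed_loss = consistency_loss(np.zeros((256, 128)), np.zeros((256, 128)))
```

Because the variance hinge alone rules out the trivial constant solution, no negative pairs are needed; this is the property the abstract's "negative-sample-free optimization strategy" refers to, though the paper's exact loss may differ.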

Keywords
Self-supervised learning; Image representation learning; Cross-scale feature fusion; Negative-sample-free learning; Deep learning