2025, No. 01, Vol. 55, pp. 63-74
SN-CLPGAN: A Spectral Normalization-Based Style Transfer Method for Traditional Chinese Landscape Paintings
基金项目(Foundation): National Natural Science Foundation of China (62471390, 62306237); Key Research and Development Program of Shaanxi Province (2024GX-YBXM-149); Graduate Innovation Program of Northwest University (CX2024204, CX2024206)
邮箱(Email): pxl@nwu.edu.cn
DOI: 10.16152/j.cnki.xdxbzr.2025-01-005
摘要(Abstract):

Style transfer for traditional Chinese landscape paintings offers a new path for the digital preservation and inheritance of cultural heritage. In recent years, deep learning has enabled style transfer between different images with lifelike results. Style transfer for traditional Chinese landscape paintings aims to carry on the distinctive painting techniques of ancient Chinese painters, but three shortcomings remain: (1) the lack of a high-quality image dataset of traditional Chinese landscape paintings; (2) the neglect of the techniques and ink-and-brush details unique to traditional Chinese landscape painting; (3) a gap between style transfer results and real landscape paintings. To remedy these shortcomings, this work first builds STCLP, a Chinese landscape painting dataset for style transfer containing 4,281 high-quality Chinese landscape paintings and natural scenery images, and proposes SN-CLPGAN, a spectral normalization-based style transfer method for Chinese landscape paintings. Second, residual-in-residual dense blocks (RRDB) fused with reflection padding layers are used in the generator to learn the distinctive brushstrokes and techniques of Chinese landscape painting. Third, a multi-scale structural similarity index measure (MS-SSIM) loss is introduced to reduce pixel-level differences between the two images, bringing generated images closer to the colors and pigments of traditional paintings. Finally, a U-Net discriminator fused with spectral normalization (SN) is adopted to enhance image texture details and to keep model training stable. Extensive experiments validate the effectiveness and superiority of the proposed method on the style transfer task for traditional Chinese landscape paintings.

Abstract:

Style transfer for traditional Chinese landscape paintings offers new avenues for the digital preservation and inheritance of cultural heritage. In recent years, deep learning technologies have enabled style transfer between different images, achieving lifelike effects. Style transfer for Chinese landscape paintings aims to preserve the unique painting skills of ancient Chinese painters, but faces three main challenges: (1) the lack of high-quality datasets of traditional Chinese landscape paintings; (2) the oversight of the unique techniques and ink details specific to traditional Chinese landscape paintings; (3) the gap between style transfer outcomes and real landscape paintings. To address these deficiencies, this paper first introduces STCLP, a Chinese landscape painting dataset for style transfer that contains 4,281 high-quality images of Chinese landscape paintings and natural landscapes, and proposes SN-CLPGAN, a generative adversarial network for style transfer in Chinese landscape painting based on spectral normalization. Additionally, it introduces residual-in-residual dense blocks (RRDB) with reflection padding layers in the generator to learn the distinctive brushstrokes and techniques of Chinese landscape paintings. Furthermore, it employs the multi-scale structural similarity index measure (MS-SSIM) loss to minimize pixel-level differences between images, thereby producing images closer to traditional paintings in terms of color and pigmentation. Finally, a U-Net discriminator fused with SN is utilized to enhance the textural details of images and to ensure the stability of the model training process. Extensive experiments validate the effectiveness and superiority of the proposed method on the task of style transfer for Chinese landscape paintings.
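
To make the architectural description concrete, the sketch below shows one plausible PyTorch implementation of the generator's core block: a residual-in-residual dense block (RRDB, in the spirit of ESRGAN) in which reflection padding replaces the usual zero padding before every 3×3 convolution, so that image borders keep brush-stroke texture rather than being darkened by zeros. This is a minimal sketch under assumed defaults; the channel width (64), growth rate (32), and 0.2 residual scaling follow common ESRGAN conventions rather than the authors' published configuration.

```python
import torch
import torch.nn as nn


class DenseBlock(nn.Module):
    """Five 3x3 convolutions with dense connections; reflection padding
    replaces zero padding so borders keep ink and brush-stroke texture."""

    def __init__(self, channels=64, growth=32):
        super().__init__()

        def conv(in_ch, out_ch):
            return nn.Sequential(
                nn.ReflectionPad2d(1),
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=0))

        self.conv1 = conv(channels, growth)
        self.conv2 = conv(channels + growth, growth)
        self.conv3 = conv(channels + 2 * growth, growth)
        self.conv4 = conv(channels + 3 * growth, growth)
        self.conv5 = conv(channels + 4 * growth, channels)
        self.act = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        c1 = self.act(self.conv1(x))
        c2 = self.act(self.conv2(torch.cat([x, c1], dim=1)))
        c3 = self.act(self.conv3(torch.cat([x, c1, c2], dim=1)))
        c4 = self.act(self.conv4(torch.cat([x, c1, c2, c3], dim=1)))
        c5 = self.conv5(torch.cat([x, c1, c2, c3, c4], dim=1))
        return x + 0.2 * c5  # scaled residual, as in ESRGAN


class RRDB(nn.Module):
    """Residual-in-residual dense block: three dense blocks inside an outer residual."""

    def __init__(self, channels=64, growth=32):
        super().__init__()
        self.blocks = nn.Sequential(
            DenseBlock(channels, growth),
            DenseBlock(channels, growth),
            DenseBlock(channels, growth))

    def forward(self, x):
        return x + 0.2 * self.blocks(x)
```

Blocks of this kind keep the spatial resolution unchanged, so a style-transfer generator would typically chain several of them between a downsampling encoder and an upsampling decoder.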

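The loss and discriminator side can be sketched in the same hedged way. The snippet below pairs an MS-SSIM based reconstruction loss, here mixed with an L1 term (a common combination, not necessarily the paper's exact formulation), with a spectral-normalized 3×3 convolution of the kind a U-Net discriminator would use in every layer. The third-party pytorch_msssim package, the 0.84 mixing weight, and the layer widths are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm
from pytorch_msssim import ms_ssim  # third-party: pip install pytorch-msssim


def ms_ssim_l1_loss(fake, real, alpha=0.84):
    """Reconstruction loss mixing (1 - MS-SSIM) with L1.

    Inputs are expected in [0, 1] and large enough for the five MS-SSIM
    scales (roughly > 160 px per side with the default 11x11 window).
    """
    ms_ssim_term = 1.0 - ms_ssim(fake, real, data_range=1.0, size_average=True)
    l1_term = torch.mean(torch.abs(fake - real))
    return alpha * ms_ssim_term + (1.0 - alpha) * l1_term


def sn_conv(in_ch, out_ch, stride=1):
    """3x3 convolution wrapped with spectral normalization; applying this to
    every discriminator convolution bounds its Lipschitz constant and keeps
    adversarial training stable."""
    return spectral_norm(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1))


class SNDownBlock(nn.Module):
    """One encoder stage of a spectral-normalized U-Net discriminator."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            sn_conv(in_ch, out_ch, stride=2),
            nn.LeakyReLU(0.2, inplace=True))

    def forward(self, x):
        return self.body(x)
```

A full U-Net discriminator would mirror several such down blocks with upsampling blocks and skip connections, giving both an image-level realism score and per-pixel decisions; as the abstract states, this sharpens texture detail while spectral normalization keeps training stable.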

基本信息(Basic information):

中图分类号(CLC number): J212; TP391.41; TP18

引用信息(Citation):

HU Q Y, LIU Q L, PENG X L, et al. SN-CLPGAN: A spectral normalization-based style transfer method for traditional Chinese landscape paintings[J]. Journal of Northwest University (Natural Science Edition), 2025, 55(01): 63-74. DOI: 10.16152/j.cnki.xdxbzr.2025-01-005.
