| Downloads | Citations | Reads |
| 122 | 0 | 48 |
Abstract: Restoring the antemortem facial appearance from an unknown skull is an important research direction in archaeology, forensics, and criminal investigation, but existing computer-aided 3D restoration is cumbersome and time-consuming. To address the distortion, warping, and lack of smoothness that existing models exhibit when generating face-skin images (the facial surface without texture, hair, etc.) from skulls, this paper proposes a skull-to-skin image generation method that combines a generative adversarial network with a multi-level bottleneck attention module. The generator consists of six AdaResBlock layers and bottleneck attention modules, which guide it to focus on the more important regions along both the channel and spatial dimensions and adaptively adjust the normalization according to the features. To address the generator's large model size, blueprint separable convolutions are introduced to reduce its volume. In addition, the discriminator is split into two parts: its first few layers serve as the encoder, eliminating the separate encoder module of traditional networks and making the model more compact, while its later layers adopt a multi-scale discrimination strategy that classifies images at different levels to improve accuracy. Experimental results show that on the skull-to-skin image generation task the proposed method produces skin images of higher quality than existing methods, achieving the best scores in both visual quality and image quality, with more realistic restorations: PSNR and SSIM improve by 1.115 and 0.017 on average, LPIPS decreases by 0.026 on average, and the average facial similarity reaches 0.855.
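The parameter saving from blueprint separable convolutions mentioned above can be illustrated with a quick calculation. This is a rough sketch, not the authors' code: it compares a standard k×k convolution against the BSConv-U decomposition (a 1×1 pointwise convolution followed by a k×k depthwise convolution), with example channel counts chosen arbitrarily for illustration.

```python
def conv_params(c_in, c_out, k):
    """Parameter count of a standard k x k convolution (no bias)."""
    return c_in * c_out * k * k

def bsconv_params(c_in, c_out, k):
    """Blueprint separable convolution (BSConv-U): a 1 x 1 pointwise
    convolution followed by a k x k depthwise convolution, so each
    output channel's kernel is a scaled 'blueprint'."""
    return c_in * c_out + c_out * k * k

# Example: a 3x3 layer with 256 input and 256 output channels
std = conv_params(256, 256, 3)    # 589824 parameters
bs = bsconv_params(256, 256, 3)   # 67840 parameters
print(std, bs, round(std / bs, 1))
```

For these example sizes the decomposition cuts the layer's parameters by a factor of roughly 8.7, which is why swapping it into the generator shrinks the model without changing the layer's input/output shapes.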
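Of the quantitative metrics reported above, PSNR is the simplest to state precisely. As a minimal sketch (assuming 8-bit images and synthetic noise, not the paper's data), it can be computed as:

```python
import numpy as np

def psnr(img_a, img_b, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two images (higher is better)."""
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Synthetic example: a random "clean" image plus Gaussian noise
rng = np.random.default_rng(0)
clean = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
noisy = np.clip(clean + rng.normal(0, 5, size=clean.shape), 0, 255)
print(round(psnr(clean, noisy), 1))  # roughly 34 dB for sigma = 5 noise
```

Because PSNR is a log-scale measure of mean squared error, the reported average gain of 1.115 dB corresponds to a consistent reduction in pixel-wise reconstruction error across the test set.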
Basic information:
DOI: 10.16152/j.cnki.xdxbzr.2025-01-017
CLC number: D919.6; TP391.41
Citation:
[1] WANG J, JIANG W K, JIANG J Q, et al. Skull-to-skin generation method based on a multi-level bottleneck attention module [J]. Journal of Northwest University (Natural Science Edition), 2025, 55(01): 201-212. DOI: 10.16152/j.cnki.xdxbzr.2025-01-017.
Funding:
National Natural Science Foundation of China (62271393); Key Research and Development Program of Shaanxi Province (2021GY-028)