A Transformer-Enhanced Iterative Unrolling Network for Sparse-View CT Image Reconstruction
Abstract:
The radiation dose in computed tomography (CT) can be reduced by decreasing the number of projections. However, CT images reconstructed from sparse-view projections with the filtered back projection (FBP) algorithm often contain severe streak artifacts that impair clinical diagnosis. To address this issue, this paper proposes TransitNet, an iterative unrolling deep neural network that combines model-driven data consistency, which acts as a physical prior constraint, with the feature extraction capabilities of deep learning. TransitNet employs a novel iterative architecture that implements flexible physical constraints through learnable data consistency operations, uses the Transformer's self-attention mechanism to model long-range dependencies among image features, and introduces a linear attention mechanism that reduces the computational complexity of self-attention from quadratic to linear in the number of tokens. Extensive experiments show that the method offers clear advantages in both reconstruction quality and computational efficiency, effectively suppressing streak artifacts while preserving image structures and details.
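The reduction from quadratic to linear complexity mentioned in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify which linear attention variant TransitNet uses, so this sketch assumes a kernel-based form with the feature map elu(x) + 1; the function names and toy sizes are illustrative only. The two functions compute the same kernel attention (which approximates, but is not equal to, softmax attention): one builds the explicit N x N weight matrix, the other regroups the product so that no N x N matrix is ever formed.

```python
import numpy as np

def feature_map(x):
    """elu(x) + 1: a positive feature map (an assumed choice, not from the paper)."""
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """Cost grows linearly with the number of tokens N."""
    Qf, Kf = feature_map(Q), feature_map(K)
    kv = Kf.T @ V              # (d, d_v) summary of keys and values, size independent of N
    z = Qf @ Kf.sum(axis=0)    # (N,) per-query normaliser
    return (Qf @ kv) / z[:, None]

def quadratic_reference(Q, K, V):
    """Same kernel attention via the explicit N x N weight matrix."""
    Qf, Kf = feature_map(Q), feature_map(K)
    weights = Qf @ Kf.T        # (N, N): this is the quadratic-cost step
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
N, d = 64, 8                   # toy sizes: 64 tokens, 8 channels
Q, K, V = (rng.standard_normal((N, d)) for _ in range(3))

out_linear = linear_attention(Q, K, V)
out_quadratic = quadratic_reference(Q, K, V)
print(np.allclose(out_linear, out_quadratic))
```

The saving comes purely from associativity: multiplying the keys with the values first yields a small d x d_v matrix, so the work per query no longer depends on how many tokens the image contains.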
Keywords: sparse-view CT; iterative unrolling; Transformer; linear attention; data consistency
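The idea of unrolling iterations around a data consistency step can be sketched as follows. This is a minimal illustration under stated assumptions, not TransitNet itself: the abstract does not give the update rule, so the sketch uses a common choice in unrolled networks, a gradient step on the data-fidelity term 0.5 * ||A x - y||^2. The random matrix `A` stands in for the CT projection operator, `enhance` is a placeholder for the learned enhancement network, and the per-iteration step sizes, fixed here, are the quantities that a learnable variant would train.

```python
import numpy as np

def data_consistency(x, A, y, step):
    """One gradient step on 0.5 * ||A x - y||^2 with step size `step`."""
    return x - step * (A.T @ (A @ x - y))

def unrolled_reconstruction(y, A, steps, enhance=lambda x: x):
    """Run len(steps) unrolled iterations: enhance, then data consistency."""
    x = A.T @ y                                # crude back-projection start (assumption)
    for step in steps:
        x = enhance(x)                         # placeholder for the learned network
        x = data_consistency(x, A, y, step)    # `step` would be a trainable scalar
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))              # toy operator: fewer measurements than unknowns
x_true = rng.standard_normal(50)
y = A @ x_true                                 # noiseless toy measurements

# A step of 1 / ||A||_2^2 keeps the gradient step stable.
safe_step = 1.0 / np.linalg.norm(A, 2) ** 2
x_rec = unrolled_reconstruction(y, A, steps=[safe_step] * 4)

residual_start = np.linalg.norm(A @ (A.T @ y) - y)
residual_end = np.linalg.norm(A @ x_rec - y)
print(residual_start, residual_end)
```

With the identity placeholder, each iteration can only shrink the measurement residual. In a trained network, the enhancement stage removes artifacts and the data consistency stage pulls the result back toward the measured projections.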
Table 1 Results of Various Reconstruction Algorithms

| Algorithms | RMSE | PSNR | SSIM |
| --- | --- | --- | --- |
| FBP | 0.0455 | 26.8402 | 0.6003 |
| RED-CNN | 0.0222 | 33.0535 | 0.8613 |
| FBPConvNet | 0.0151 | 36.4081 | 0.8922 |
| SwinIR | 0.0138 | 37.2184 | 0.9187 |
| Uformer | 0.0124 | 38.1376 | 0.9423 |
| TV | 0.0198 | 34.0534 | 0.9976 |
| RTV | 0.0163 | 35.7495 | 0.9983 |
| ItNet | 0.0077 | 42.2334 | 0.9018 |
| TransitNet | 0.0051 | 45.8486 | 0.9590 |

Table 2 Results of Different Image Enhance Net

| Image Enhance Net | Training Time (min/epoch) | #Params | FLOPs | Memory (GB) | RMSE | PSNR | SSIM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| U-Net Only | 7.2 | 30 M | 96.47 G | 4.6 | 0.0079 | 42.2568 | 0.8636 |
| Normal Transformer | 34.2 | 48.3 M | 178.94 G | 28.4 | 0.0041 | 48.1445 | 0.9642 |
| Transformer with Linear Attention | 22.6 | 39.6 M | 124.86 G | 13.3 | 0.0051 | 45.8486 | 0.9590 |

Table 3 Results of Fixed and Learnable Data Consistency

| Data Consistency | RMSE | SSIM | PSNR |
| --- | --- | --- | --- |
| Fixed | 0.0067 | 0.9188 | 43.8670 |
| Learnable | 0.0051 | 0.9590 | 45.8486 |

Table 4 Results with Different Iteration Counts

| Iteration Counts | Training Time (min/epoch) | RMSE | SSIM | PSNR |
| --- | --- | --- | --- | --- |
| 1 | 11.2 | 0.0115 | 0.8387 | 38.7860 |
| 2 | 15.2 | 0.0072 | 0.8636 | 42.8534 |
| 3 | 18.8 | 0.0057 | 0.9341 | 44.8825 |
| 4 | 22.6 | 0.0051 | 0.9590 | 45.8486 |
| 5 | 26.3 | 0.0048 | 0.9715 | 46.3752 |
| 6 | 29.8 | 0.0046 | 0.9734 | 46.7448 |
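The RMSE and PSNR columns reported in the tables appear to be linked by the usual relation PSNR = 20 log10(peak / RMSE) with a peak value of 1, which would correspond to images normalised to [0, 1]. That normalisation is an inference from the numbers, not something the text states. The short check below recomputes PSNR from three RMSE entries of Table 1; small differences are expected because the tabulated RMSE is rounded to four decimals.

```python
import math

def psnr_from_rmse(rmse, peak=1.0):
    """PSNR in dB from RMSE, assuming the given peak intensity (peak=1 is an assumption)."""
    return 20.0 * math.log10(peak / rmse)

# (RMSE, reported PSNR) pairs copied from Table 1.
table1 = {
    "FBP": (0.0455, 26.8402),
    "ItNet": (0.0077, 42.2334),
    "TransitNet": (0.0051, 45.8486),
}

gaps = {}
for name, (rmse, reported) in table1.items():
    computed = psnr_from_rmse(rmse)
    gaps[name] = abs(computed - reported)
    print(f"{name}: computed {computed:.2f} dB, reported {reported:.2f} dB")
```

Because of this relation, RMSE and PSNR carry the same information in these tables, and SSIM is the only independent quality measure.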