PanGraphRNA: An efficient and flexible bioinformatics platform for graph pangenome-based RNA-seq data analysis

doi:10.1111/jipb.70231

Yifan Bu^1,2,3, Zhixu Qiu^3, Wen Sun^1,2, Yishui Han^1,2,3, Yifan Liu^1, Jing Yang^1,2,3, Minggui Song^1,2,3, Zenglin Li^1,2, Songyu Liu^1,2, Yuzhou Zhang^1 and Chuang Ma^1,2,3*

1. State Key Laboratory for Crop Stress Resistance and High‐Efficiency Production, College of Life Sciences, Northwest A&F University, Yangling 712100, China

2. Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Yangling 712100, China

3. Center of Bioinformatics, Northwest A&F University, Yangling 712100, China

^*Correspondence: Chuang Ma (cma@nwafu.edu.cn)

Received: 2025-06-17 Accepted: 2026-02-26 Online: 2026-03-19
Supported by:
This project was supported by the National Natural Science Foundation of China (32170681; 32470717), the Sub‐project of the National Key Research and Development Program (2024YFD1201301‐2), and the Chinese Universities Scientific Fund (Z1090224030; Z1090125001).

Abstract

Transcriptome deep sequencing (RNA-seq) data analysis is often affected by reference bias introduced by the use of a single linear reference (SLR) genome. Graph-based pangenomes can mitigate this bias by integrating the SLR genome with complex genetic variations within a species; however, their application remains limited owing to a lack of dedicated analytical tools. Here, we present PanGraphRNA, an integrated bioinformatics platform for RNA-seq data analysis using a graph pangenome as reference. Built on the Galaxy web-based framework, PanGraphRNA provides functional modules for constructing, evaluating, and applying graph pangenomes across different population scales, thus enabling accessibility, traceability, and reproducibility throughout the analysis. Applied to both real and simulated RNA-seq data sets from Arabidopsis (Arabidopsis thaliana), PanGraphRNA outperformed the SLR approach, achieving higher read alignment accuracy and more precise gene expression quantification. PanGraphRNA enabled the identification of drought stress-induced genes and flowering time-related quantitative trait loci that were previously missed with the conventional SLR approach. Furthermore, we successfully applied PanGraphRNA to process RNA-seq data sets from rice (Oryza sativa) and maize (Zea mays). By providing standardized, containerized workflows, PanGraphRNA will facilitate transcriptomic research in key plant species, including Arabidopsis, rice, and maize.

Key words: Galaxy, graph pangenome, population, reference bias, RNA sequencing

Yifan Bu, Zhixu Qiu, Wen Sun, Yishui Han, Yifan Liu, Jing Yang, Minggui Song, Zenglin Li, Songyu Liu, Yuzhou Zhang, Chuang Ma. PanGraphRNA: An efficient and flexible bioinformatics platform for graph pangenome-based RNA-seq data analysis[J]. J Integr Plant Biol., DOI: 10.1111/jipb.70231.

