J Integr Plant Biol.


PanGraphRNA: An efficient and flexible bioinformatics platform for graph pangenome-based RNA-seq data analysis

Yifan Bu1,2,3, Zhixu Qiu3, Wen Sun1,2, Yishui Han1,2,3, Yifan Liu1, Jing Yang1,2,3, Minggui Song1,2,3, Zenglin Li1,2, Songyu Liu1,2, Yuzhou Zhang1 and Chuang Ma1,2,3*   

  1. State Key Laboratory for Crop Stress Resistance and High‐Efficiency Production, College of Life Sciences, Northwest A&F University, Yangling 712100, China

  2. Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Yangling 712100, China

  3. Center of Bioinformatics, Northwest A&F University, Yangling 712100, China

    *Correspondence: Chuang Ma (cma@nwafu.edu.cn)

  • Received: 2025-06-17 Accepted: 2026-02-26 Online: 2026-03-19
  • Supported by:
    This project was supported by the National Natural Science Foundation of China (32170681; 32470717), the Sub‐project of National Key Research and Development Program (2024YFD1201301‐2), and the Chinese Universities Scientific Fund (Z1090224030; Z1090125001).

Abstract: Transcriptome deep sequencing (RNA-seq) data analysis is often affected by reference bias introduced by the use of a single linear reference (SLR) genome. Graph-based pangenomes can mitigate this bias by integrating the SLR genome with complex genetic variations within a species; however, their application remains limited owing to a lack of dedicated analytical tools. Here, we present PanGraphRNA, an integrated bioinformatics platform for RNA-seq data analysis using a graph pangenome as reference. Built on the Galaxy web-based framework, PanGraphRNA provides functional modules for constructing, evaluating, and applying graph pangenomes across different population scales, thus enabling accessibility, traceability, and reproducibility throughout the analysis. Applied to both real and simulated RNA-seq data sets from Arabidopsis (Arabidopsis thaliana), PanGraphRNA outperformed the SLR approach, achieving higher read alignment accuracy and more precise gene expression quantification. PanGraphRNA enabled the identification of drought stress-induced genes and flowering time-related quantitative trait loci that were previously missed with the conventional SLR approach. Furthermore, we successfully applied PanGraphRNA to process RNA-seq data sets from rice (Oryza sativa) and maize (Zea mays). By providing standardized, containerized workflows, PanGraphRNA will facilitate transcriptomic research in key plant species, including Arabidopsis, rice, and maize.

Key words: Galaxy, graph pangenome, population, reference bias, RNA sequencing

Copyright © 2026 by the Institute of Botany, the Chinese Academy of Sciences
Online ISSN: 1744-7909 Print ISSN: 1672-9072 CN: 11-5067/Q