1
Xu J, Gao Y, Lu Q, Zhang R, Gui J, Liu X, Yue Z. RiceSNP-BST: a deep learning framework for predicting biotic stress-associated SNPs in rice. Brief Bioinform 2024; 25:bbae599. [PMID: 39562160 PMCID: PMC11576077 DOI: 10.1093/bib/bbae599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2024] [Revised: 10/07/2024] [Accepted: 11/04/2024] [Indexed: 11/21/2024] Open
Abstract
Rice consistently faces significant threats from biotic stresses, such as fungi, bacteria, pests, and viruses. Consequently, accurately and rapidly identifying previously unknown single-nucleotide polymorphisms (SNPs) in the rice genome is a critical challenge for rice research and the development of resistant varieties. However, the limited availability of high-quality rice genotype data has hindered this research. Deep learning has transformed biological research by facilitating the prediction and analysis of SNPs in biological sequence data. Convolutional neural networks are especially effective in extracting structural and local features from DNA sequences, leading to significant advancements in genomics. Meanwhile, the expanding catalog of genome-wide association studies provides valuable biological insights for rice research. Building on these developments, we introduce RiceSNP-BST, an automatic architecture search framework designed to predict SNPs associated with rice biotic stress traits (BST-associated SNPs) by integrating multidimensional features. Notably, the model successfully innovates the datasets, offering more precision than state-of-the-art methods while demonstrating good performance on an independent test set and cross-species datasets. Additionally, we extracted features from the original DNA sequences and employed causal inference to enhance the biological interpretability of the model. This study highlights the potential of RiceSNP-BST in advancing genome prediction in rice. Furthermore, a user-friendly web server for RiceSNP-BST (http://rice-snp-bst.aielab.cc) has been developed to support broader genome research.
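As an illustration of how DNA sequence around a SNP is commonly prepared for a convolutional network, the sketch below one-hot encodes a fixed-width window centred on a variant position. It is not taken from RiceSNP-BST: the function names `snp_window` and `one_hot_encode`, the flank size, and the choice to pad with `N` are all assumptions made for this example.

```python
import numpy as np

BASES = "ACGT"  # column order of the one-hot matrix (an assumption of this sketch)


def snp_window(genome: str, pos: int, flank: int) -> str:
    """Return the 2*flank+1 bases centred on 0-based `pos`, padded with N at the ends."""
    left = max(0, pos - flank)
    right = min(len(genome), pos + flank + 1)
    pad_left = "N" * (flank - (pos - left))
    pad_right = "N" * (flank - (right - pos - 1))
    return pad_left + genome[left:right] + pad_right


def one_hot_encode(seq: str) -> np.ndarray:
    """Encode a DNA string as a (len(seq), 4) matrix; unknown bases become all-zero rows."""
    index = {base: i for i, base in enumerate(BASES)}
    out = np.zeros((len(seq), len(BASES)), dtype=np.float32)
    for row, base in enumerate(seq.upper()):
        col = index.get(base)
        if col is not None:
            out[row, col] = 1.0
    return out


# Hypothetical usage: a toy sequence with a SNP at position 1 and 3 flanking bases.
window = snp_window("ACGTACGT", pos=1, flank=3)
features = one_hot_encode(window)
```

The resulting matrix has one row per base and one column per nucleotide, which is the usual input layout for a 1D convolution over sequence positions.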
Affiliation(s)
- Jiajun Xu
- School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
- Yujia Gao
- School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
- Quan Lu
- School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
- Renyi Zhang
- School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
- Jianfeng Gui
- School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
- Xiaoshuang Liu
- Research Center for Biological Breeding Technology, Advance Academy, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
- Zhenyu Yue
- School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
- Research Center for Biological Breeding Technology, Advance Academy, Anhui Agricultural University, 130, Changjiang West Road, Hefei, Anhui Province 230036, China
2
Anilkumar C, Muhammed Azharudheen TP, Sah RP, Sunitha NC, Devanna BN, Marndi BC, Patra BC. Gene based markers improve precision of genome-wide association studies and accuracy of genomic predictions in rice breeding. Heredity (Edinb) 2023; 130:335-345. [PMID: 36792661 PMCID: PMC10163052 DOI: 10.1038/s41437-023-00599-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/18/2022] [Revised: 02/02/2023] [Accepted: 02/03/2023] [Indexed: 02/17/2023] Open
Abstract
It is hypothesized that genome-wide genic markers may increase the prediction accuracy of genomic selection for quantitative traits. To test this hypothesis, a set of candidate gene-based markers for yield and grain traits-related genes cloned across the rice genome were custom-designed. A multi-model, multi-locus genome-wide association study (GWAS) was performed using the newly developed genic markers to test their effectiveness for gene discovery. Two multi-locus models, FarmCPU and mrMLM, along with a single-locus mixed linear model (MLM), identified 28 significant marker-trait associations. These associations revealed novel causative alleles for grain weight and pleiotropic associations with other traits. For instance, the marker YD91 derived from the gene OsAAP3 on chromosome 1 was consistently associated with grain weight, while the gene has a significant effect on grain yield. Furthermore, nine genomic selection methods, including regression-based and machine learning-based models, were used to predict grain weight using a leave-one-out five-fold cross-validation approach to optimize the genomic selection model with genic markers. Among the nine prediction models, reproducing kernel Hilbert space (RKHS) regression performed best among regression-based models, and random forest regression (RFR) performed best among machine learning-based models. Genomic prediction accuracies with and without GWAS-significant markers were compared to assess the effectiveness of the markers. The rapid decrease in prediction accuracy upon dropping GWAS-significant markers indicates the effectiveness of the new genic markers in genomic selection. In addition, the candidate gene-based markers were found to be more effective in genomic selection programs for better accuracy.
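The comparison of prediction accuracy with and without GWAS-significant markers can be sketched as follows. This is a minimal illustration on simulated data, not the authors' pipeline: ridge regression stands in for the nine models mentioned in the abstract, plain five-fold cross-validation is assumed, accuracy is taken as the correlation between predicted and observed phenotypes, and the names `ridge_fit_predict`, `kfold_accuracy`, and `demo` are hypothetical.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, lam=1.0):
    """Fit ridge regression on centred training data and predict the test phenotypes."""
    mu_x = X_train.mean(axis=0)
    mu_y = y_train.mean()
    Xc = X_train - mu_x
    n_markers = Xc.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_markers), Xc.T @ (y_train - mu_y))
    return (X_test - mu_x) @ beta + mu_y


def kfold_accuracy(X, y, k=5, lam=1.0, seed=0):
    """Correlation between out-of-fold predictions and observed phenotypes."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    preds = np.empty(len(y), dtype=float)
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        preds[test_idx] = ridge_fit_predict(X[train_mask], y[train_mask], X[test_idx], lam)
    return float(np.corrcoef(preds, y)[0, 1])


def demo(seed=1):
    """Simulate 100 lines x 20 markers where only the first three markers affect the trait."""
    rng = np.random.default_rng(seed)
    n_lines, n_markers = 100, 20
    X = rng.integers(0, 3, size=(n_lines, n_markers)).astype(float)  # genotypes coded 0/1/2
    effects = np.zeros(n_markers)
    effects[:3] = [1.0, 0.8, 0.6]  # simulated "significant" markers
    y = X @ effects + rng.normal(0.0, 0.3, size=n_lines)
    acc_all = kfold_accuracy(X, y)
    acc_dropped = kfold_accuracy(X[:, 3:], y)  # same analysis without the causal markers
    return acc_all, acc_dropped
```

Calling `demo()` returns the two accuracies; in this simulation the drop after removing the three causal markers mirrors the kind of comparison the abstract describes, though real marker panels are correlated and the effect is usually less extreme.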
3
Sah RP, Nayak AK, Chandrappa A, Behera S, Azharudheen Tp M, Lavanya GR. cgSSR marker-based genome-wide association study identified genomic regions for panicle characters and yield in rice (Oryza sativa L.). Journal of the Science of Food and Agriculture 2023; 103:720-728. [PMID: 36054367 DOI: 10.1002/jsfa.12183] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/15/2021] [Revised: 08/03/2022] [Accepted: 08/21/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND: To improve production efficiency, positive alleles corresponding to yield-related attributes must be accumulated in a single elite background. In this study, we designed and used cgSSR markers, which are superior to random SSR markers in genome-wide association studies, to identify genomic regions that contribute to panicle characters and grain yield. RESULTS: As evidenced by the high polymorphic information content value and gene diversity coefficient, the new cgSSR markers were determined to be highly informative. These cgSSR markers were employed to generate genotype data for an association panel evaluated for four panicle characters and grain yield over three seasons. For five traits, 17 significant marker-trait associations on six chromosomes were discovered. The percentage of phenotypic variance that could be explained ranged from 4% to 13%. Unrelated gene-derived markers had a strong association with target traits as well. CONCLUSION: Trait-associated cgSSR markers derived from corresponding or related genes ensure their utility in direct allele selection, while other linked markers aid in allele selection indirectly by altering the phenotype of interest. Through a marker-assisted breeding approach, these marker-trait associations can be leveraged to accumulate favourable alleles for yield enhancement in rice. © 2022 Society of Chemical Industry.
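The two informativeness measures named in this abstract can be computed from allele frequencies at a marker. The sketch below uses the standard definitions, gene diversity H = 1 − Σ pᵢ² and polymorphism information content PIC = H − Σ_{i<j} 2 pᵢ² pⱼ²; whether the authors used exactly these estimators is an assumption, and the function names are chosen for this example only.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Relative frequency of each distinct allele call in a list of calls."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return [count / total for count in counts.values()]


def gene_diversity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content for one marker."""
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return gene_diversity(freqs) - cross


# Hypothetical marker scored on eight lines with two equally frequent alleles (sizes in bp).
freqs = allele_frequencies([201, 201, 201, 201, 210, 210, 210, 210])
marker_h = gene_diversity(freqs)
marker_pic = pic(freqs)
```

For a marker with two equally frequent alleles this gives a gene diversity of 0.5 and a PIC of 0.375, the maximum attainable for a biallelic marker; higher values require more alleles.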
Affiliation(s)
- Rameswar Prasad Sah
- Crop Improvement Division, ICAR - National Rice Research Institute, Cuttack, India
- Amrit Kumar Nayak
- Department of Genetics and Plant breeding, Naini Agricultural Institute, Sam Higginbottom University of Agriculture, Technology and Sciences (SHUATS), Prayagraj, India
- Anilkumar Chandrappa
- Crop Improvement Division, ICAR - National Rice Research Institute, Cuttack, India
- Sasmita Behera
- Crop Improvement Division, ICAR - National Rice Research Institute, Cuttack, India
- G Roopa Lavanya
- Department of Genetics and Plant breeding, Naini Agricultural Institute, Sam Higginbottom University of Agriculture, Technology and Sciences (SHUATS), Prayagraj, India
4
Anilkumar C, Sunitha NC, Devate NB, Ramesh S. Advances in integrated genomic selection for rapid genetic gain in crop improvement: a review. Planta 2022; 256:87. [PMID: 36149531 DOI: 10.1007/s00425-022-03996-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/20/2021] [Accepted: 09/11/2022] [Indexed: 06/16/2023]
Abstract
This review covers genomic selection (GS) and its importance in crop breeding, the integration of GS with new breeding tools, and the development of a standard operating procedure (SOP) for GS to achieve maximum genetic gain at low cost and in less time. Conventional breeding approaches alone are not sufficient to meet the demand of a growing population for nutritious food and other plant-based products, and marker-assisted selection (MAS) is not efficient in capturing all the favorable alleles responsible for economic traits in the process of crop improvement. GS, developed in livestock breeding and then adapted to plant breeding, promised to overcome the drawbacks of MAS and significantly improve complex traits controlled by genes/QTL with small effects. Large-scale deployment of GS in important crops, as well as simulation studies in a variety of contexts, has addressed G × E interaction effects and non-additive effects while lowering breeding costs and time. The current study provides a complete overview of genomic selection, its process, and its importance in modern plant breeding, along with insights into its application. GS has been implemented in the improvement of complex traits, including tolerance to biotic and abiotic stresses. Furthermore, this review hypothesises that using GS in conjunction with other crop improvement platforms accelerates the breeding process to increase genetic gain. The objective of this review is to highlight the development of an appropriate GS model, the global open source network for GS, and trans-disciplinary approaches for effective accelerated crop improvement. The review also focuses on the application of data science, including machine learning and deep learning tools, to enhance the accuracy of prediction models. It emphasizes developing plant breeding strategies centered on GS combined with routine conventional breeding principles by developing a GS-SOP to achieve enhanced genetic gain.
Affiliation(s)
- C Anilkumar
- ICAR-National Rice Research Institute, Cuttack, India
- N C Sunitha
- University of Agricultural Sciences, Bangalore, India
- S Ramesh
- University of Agricultural Sciences, Bangalore, India
5
Tp MA, Kumar A, Anilkumar C, Sah RP, Behera S, Marndi BC. Understanding natural genetic variation for grain phytic acid content and functional marker development for phytic acid-related genes in rice. BMC Plant Biology 2022; 22:446. [PMID: 36114452 PMCID: PMC9482188 DOI: 10.1186/s12870-022-03831-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/08/2022] [Accepted: 08/30/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND: The nutritional value of rice can be improved by developing varieties with optimum levels of grain phytic acid (PA). Artificial low-PA mutants with impaired PA biosynthesis have been developed in rice through induced mutagenesis. However, low-PA mutant stocks with drastically reduced grain PA content have poor breeding potential, and their use in rice breeding is restricted due to their detrimental pleiotropic effects, which include decreased seed viability, low grain weight, and low seed yield. Therefore, it is necessary to take advantage of the natural variation in grain PA content in order to reduce the PA content to an ideal level without compromising the crop's agronomic performance. Natural genetic diversity in grain PA content has not been thoroughly examined among elite genetic stocks. Additionally, given that grain PA content is a quantitative trait driven by polygenes, DNA marker-assisted selection may be required for manipulation of such a trait; however, informative DNA markers for PA content have not yet been identified in rice. Here we investigated and dissected natural genetic variation and genetic variability components for grain PA content in rice varieties cultivated in Eastern and North-Eastern India during the last 50 years. We developed novel gene-based markers for the low-PA-related candidate genes in rice germplasm, and their allelic diversity and association with natural variation in grain PA content were studied. RESULTS: A wide (0.3-2.8%), significant variation for grain PA content, with decade-wise and ecology-wise differences, was observed among rice varieties. Significant genotype × environment interaction suggested polygenic inheritance. The novel candidate gene-based markers detected 43 alleles in the rice varieties. The new markers were found to be highly informative, as indicated by PIC values (0.11-0.65; average: 0.34) and coverage of total diversity. Marker alleles developed from two putative transporter genes, viz., SPDT and OsPT8, were significantly associated with grain PA variation assayed on the panel. A 201 bp allele at the 3' UTR of the SPDT gene was negatively associated with grain PA content and explained 7.84% of the phenotypic variation. A rare allele in the coding sequence of the OsPT8 gene was positively associated with grain PA content and explained 18.49% of the phenotypic variation. CONCLUSION: Natural variation in grain PA content is substantial and is mostly controlled by genetic factors. The unique DNA markers linked with PA content have significant potential as genomic resources for the development of low-PA rice varieties through genomics-assisted breeding procedures.
Affiliation(s)
- Awadhesh Kumar
- Crop Physiology and Biochemistry Division, ICAR-National Rice Research Institute, Cuttack, India
- Chandrappa Anilkumar
- Crop Improvement Division, ICAR-National Rice Research Institute, Cuttack, India
- Rameswar Prasad Sah
- Crop Improvement Division, ICAR-National Rice Research Institute, Cuttack, India
- Sasmita Behera
- Crop Improvement Division, ICAR-National Rice Research Institute, Cuttack, India
- Bishnu Charan Marndi
- Crop Improvement Division, ICAR-National Rice Research Institute, Cuttack, India
6
Anilkumar C, Sah RP, Muhammed Azharudheen TP, Behera S, Singh N, Prakash NR, Sunitha NC, Devanna BN, Marndi BC, Patra BC, Nair SK. Understanding complex genetic architecture of rice grain weight through QTL-meta analysis and candidate gene identification. Sci Rep 2022; 12:13832. [PMID: 35974066 PMCID: PMC9381546 DOI: 10.1038/s41598-022-17402-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/02/2022] [Accepted: 07/25/2022] [Indexed: 11/17/2022] Open
Abstract
Quantitative trait loci (QTL) for rice grain weight identified using bi-parental populations in various environments have been found to be inconsistent and have played only a modest role in marker-assisted breeding and map-based cloning programs. Thus, the identification of a consistent consensus QTL region across populations is critical for deployment in marker-aided breeding programs. Using the QTL meta-analysis technique, we collated rice grain weight QTL information from numerous studies done across populations and in diverse environments to find constitutive QTL for grain weight. Using information from 114 original QTL in the meta-analysis, we discovered three significant Meta-QTL (MQTL) for grain weight on chromosome 3. According to gene ontology, these three MQTL contain 179 genes, 25 of which have roles in developmental functions. Amino acid sequence BLAST of these genes indicated their orthologue conservation among core cereals with similar functions. MQTL3.1 includes the OsAPX1, PDIL, SAUR, and OsASN1 genes, which are involved in grain development and have been found to play a key role in asparagine biosynthesis and metabolism, which is crucial for source-sink regulation. Five potential candidate genes were identified, and their expression analysis indicated a significant role in early grain development. The gene sequence information retrieved from the 3K rice genome project revealed that the deletion of six bases coding for serine and alanine in the last exon of OsASN1 led to an interruption in the synthesis of the α-helix of the protein, which negatively affected the asparagine biosynthesis pathway in the low grain weight genotypes. Further, MQTL3.1 was validated using the linked marker RM7197 on a set of genotypes with extreme phenotypes. The MQTL identified and validated in our study have significant scope in MAS breeding and map-based cloning programs for improving rice grain weight.
Affiliation(s)
- C Anilkumar
- ICAR-National Rice Research Institute, Cuttack, India
- Namita Singh
- Indira Gandhi Krishi Vishwavidyalaya, Raipur, India
- Nitish Ranjan Prakash
- ICAR-Central Soil Salinity Research Institute, Regional Research Station, Canning Town, India
- N C Sunitha
- University of Agricultural Sciences, Bangalore, India
- B N Devanna
- ICAR-National Rice Research Institute, Cuttack, India
- B C Marndi
- ICAR-National Rice Research Institute, Cuttack, India
- B C Patra
- ICAR-National Rice Research Institute, Cuttack, India
7
Sah RP, Behera S, Dash SK, Azharudheen TPM, Meher J, Kumar A, Marndi BC, Kar MK, Subudhi HN, Anilkumar C. Unravelling genetic architecture and development of core set from elite rice lines using yield-related candidate gene markers. Physiology and Molecular Biology of Plants: An International Journal of Functional Plant Biology 2022; 28:1217-1232. [PMID: 35910441 PMCID: PMC9334483 DOI: 10.1007/s12298-022-01190-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/23/2021] [Revised: 05/08/2022] [Accepted: 05/17/2022] [Indexed: 06/03/2023]
Abstract
Assessing genetic diversity and developing a core set of elite breeding lines is a prerequisite for selective hybridization programmes intended to improve the yield potential in rice. In the present study, the genetic diversity of newly developed elite lines derived from indica × tropical japonica and indica × indica crosses was estimated using 38 reported molecular markers. The markers used in the study consist of 24 gene-based and 14 random markers related to grain yield-related QTLs distributed across the rice genome. Genotypic characterization was carried out to determine the genetic similarities between the elite lines. In total, 75 alleles were found using 38 polymorphic markers, with polymorphism information content (PIC) ranging from 0.10 to 0.51 with an average of 0.35. The genotypes were divided into three groups based on cluster and structure analyses and were dispersed across the quadrants of the PCA plot, whereas nitrogen-responsive lines clustered in one quadrant. Seven markers (GS3_RGS1, GS3_RGS2, GS5_Indel1, Ghd 7_05SNP, RM 12289, RM 23065 and RM 25457) exhibited PIC values ≥ 0.50, indicating that they were effective in detecting genetic relationships among elite rice lines. Additionally, a core set of 11 elite lines was made from 96 lines in order to downsize the diversity of the original population into a small set for parental selection. In general, the genetic information collected in this work will aid in the study of grain yield traits at the molecular level for other sets of rice genotypes and in selecting diverse elite lines to develop a strong crossing programme in rice. Supplementary Information: The online version contains supplementary material available at 10.1007/s12298-022-01190-8.
Affiliation(s)
- Rameswar Prasad Sah
- Crop Improvement Division, ICAR-National Rice Research Institute, Cuttack, Odisha, India
- Sasmita Behera
- Crop Improvement Division, ICAR-National Rice Research Institute, Cuttack, Odisha, India
- Sushant Kumar Dash
- Crop Improvement Division, ICAR-National Rice Research Institute, Cuttack, Odisha, India
- Jitendriya Meher
- Crop Improvement Division, ICAR-National Rice Research Institute, Cuttack, Odisha, India
- Awadhesh Kumar
- Crop Physiology and Biochemistry, ICAR-National Rice Research Institute, Cuttack, Odisha, India
- Bishnu Charan Marndi
- Crop Improvement Division, ICAR-National Rice Research Institute, Cuttack, Odisha, India
- Meera Kumari Kar
- Crop Improvement Division, ICAR-National Rice Research Institute, Cuttack, Odisha, India
- H. N. Subudhi
- Crop Improvement Division, ICAR-National Rice Research Institute, Cuttack, Odisha, India
- C. Anilkumar
- Crop Improvement Division, ICAR-National Rice Research Institute, Cuttack, Odisha, India