Muhammad Jaffer Ali,Guangnan Xing,Jianbo He,Tuanjie Zhao,Junyi Gai,*
aSoybean Research Institute,Nanjing Agricultural University,Nanjing 210095,Jiangsu,China
bMARA National Center for Soybean Improvement,Nanjing Agricultural University,Nanjing 210095,Jiangsu,China
cMARA Key Laboratory of Biology and Genetic Improvement of Soybean(General),Nanjing Agricultural University,Nanjing 210095,Jiangsu,China
dState Key Laboratory for Crop Genetics and Germplasm Enhancement,Nanjing Agricultural University,Nanjing 210095,Jiangsu,China
eJiangsu Collaborative Innovation Center for Modern Crop Production,Nanjing Agricultural University,Nanjing 210095,Jiangsu,China
ABSTRACT
Food insecurity is a major threat posed by climate change in the 21st century. The food and feed sectors depend on nutrient-rich crops such as soybean, which contains 40% protein and 20% oil. Because it fixes its own nitrogen, soybean is also a lucrative crop for farmers.
Among the devastating abiotic stresses,flooding reduces crop growth and yield.Soybean growing areas usually receive 60%–70% of their annual precipitation during the period of soybean growth and development.Waterlogging during vegetative stages reduces seed yields by 17%–40%[1]owing to root and shoot damage[2].
Soybean fields may be flooded either by heavy rain in lowland areas or upland due to crop rotation with paddy rice[3].Flooding may be by waterlogging,if limited to roots,or submergence if the entire plant is underwater[4].Flooding damage to the crop depends upon the growth stage when flooding occurs.Final yield,nodule dry weight,and respiration rate were greatly reduced when waterlogging occurred at vegetative stages[5,6].Several plant growth-related physiological disturbances reduced the growth,dry matter,pod formation,seeds per pod,and hundred-seed weight[7].For soybean waterlogging tolerance,evaluation procedures[8],quantitative trait locus(QTL)mapping[1,2,9–11],genomewide association mapping[12],transcription factor expression[13],and proteomic approaches to identify stress-response mechanisms[14]have been studied.
The germination and emergence of soybean can be hindered by excessive water resulting from continuous rainfall [15]. Excessive water during germination (pre-germination or seed flooding) causes seed damage and decay, leading to decreased field emergence, subsequent growth, and yield. Southern China experiences seed-flooding stress due to rainfall just after field sowing of soybean. Flooding injury to seeds has been measured by the electrical conductivity (EC) of steep water, which contains cellular substances that leak from the submerged seeds [16]. Seed soaking for 4 days at 25 °C was used to evaluate seed-flooding tolerance with germination rate (GR) as indicator, with black-seeded cultivars showing higher tolerance than yellow-seeded cultivars [17]. Sayama et al. [18] adopted GR and EC as evaluation criteria and added normal seedling rate (NSR) as a further seed-flooding tolerance indicator. Water absorption speed (WAS) has also been used to identify seed-flooding tolerant and sensitive cultivars, with tolerant cultivars showing lower WAS than sensitive ones [19], but WAS is difficult to measure. In our group, relative seedling length (RSL) in paper rolls after 48 h of seed flooding is used as a standard seed-flooding testing procedure for evaluating breeding materials [20]. Flooding for 48 h was identified by germination testing as the most appropriate stress duration, with moderate seed decay allowing efficient evaluation. A germination paper roll supplied moisture to seeds and permitted healthier growth of seedlings than the paper towel and Petri dish methods. RSL was shown by Ali et al. [20] to be flooding-responsive and was chosen as the major seed-flooding indicator because of its higher heritability ($h^2$), genotypic coefficient of variation (GCV), and correlation with other indicators, and its easier measurement, in comparison with previously used indicators and all newly identified vigor traits.
Little work has been done on mapping QTL conferring tolerance to pre-germination flooding stress, possibly owing to the sensitivity of the stage and the complexity of the stress. Sayama et al. [18] mapped 24 QTL, which were combined into four seed-flooding tolerance (SFT) QTL named Sft1, Sft2, Sft3, and Sft4 on chromosomes 12 and 8. Recently, a genome-wide association study identified quantitative-trait nucleotides associated with seed-flooding tolerance [21], while transcriptome analysis identified differentially expressed genes in two genotypes with contrasting seed-flooding tolerance [22].
The genetic architecture of complex traits has been investigated in bi-parental populations, which have been effective for identifying loci with larger effects. However, high-resolution mapping of QTL requires high recombination, which is limited in bi-parental populations, reducing their ability to resolve large numbers of QTL [23]. Joint linkage mapping based on multiple genetically connected populations represents an approach affording higher recombination [24]. A maize nested association mapping (NAM) population composed of 25 RIL populations having one parent in common was used to detect QTL for several important traits by joint linkage mapping [25–28]. Mapping in small-scale NAM populations is not only possible but also more effective than mapping in single populations. For example, in Arabidopsis, higher efficiency of QTL detection was observed in joint analysis of a NAM population composed of three RIL populations than in separate analysis of each RIL population [29]. A NAM population with many lines derived from crosses using multiple parents was considered to be more promising for genetic dissection of complex traits because it harbors multiple alleles per locus. Multi-parental populations increase diversity and QTL resolution by combining alleles from multiple sources [29].
Complex traits have been genetically dissected via high-throughput technologies developed for genome sequencing. Thorough and accurate detection of QTL-alleles in germplasm populations has been advanced by the restricted two-stage multi-locus genome-wide association study (RTM-GWAS) procedure [30]. This approach can lower the risk of both overflowing and missing heritability and can correct population bias due to admixture and inbreeding in a germplasm population [30]. Multiple alleles per locus are detected by grouping SNPs into SNP linkage disequilibrium blocks (SNPLDBs), which generate multiple haplotypes or alleles per locus. The genetic architecture of the population and its full genetic information are presented in the form of a QTL-allele matrix, which is a compact form of the core information in a population genetics study [31]. RTM-GWAS comprehensively detected the QTL-allele systems of 100-seed weight and seed isoflavone content in a soybean landrace population [32,33]. For 100-seed weight, 263 alleles at 55 QTL were detected, contributing 79.57% of phenotypic variance (PV) [32], while seed isoflavone content was under the control of 44 QTL with 199 alleles contributing 72.2% of PV [33]. Li et al. [34] suggested RTM-GWAS as an efficient method for identifying QTL for days to flowering in a NAM population using the SNPLDB marker system. QTL detected by RTM-GWAS included almost all of the QTL detected by four other methods (CIM, MCIM, JICIM, and MLM-GWAS) [34]. Recently, Khan et al. [35] used RTM-GWAS to identify a QTL system conferring drought tolerance in a soybean NAM population, fitting a G×E model with a set of main-effect QTL-alleles and G×E interaction (GEI) QTL-alleles.
The objectives of the present study were to map seed-flooding tolerance QTL with RSL as indicator, to establish the QTL-allele matrix of a NAM population constructed with three parents in order to predict the breeding potential of the population for seed-flooding tolerance, and to investigate the RSL/SFT gene system by gene annotation and gene ontology (GO) enrichment analysis.
Two RIL populations from the crosses Linhe×M8206(LM,104 RILs)and Zhengyang×M8206(ZM,126 RILs)were selected to compose a NAM population(designated as the LZM NAM population)which resulted in a total of 230 F2:7-derived RILs developed by single-seed descent.Because M8206 was the common parent of the two RIL populations,the NAM was also a half-sib population.In preliminary experiments,M8206 showed higher seed flooding tolerance than Linhe and Zhengyang,and Zhengyang was less susceptible than Linhe[20].
The RILs and their respective parents were planted for seed increase in a uniform environment in a randomized complete block design(RCB)experiment with three replications of each line,0.8 m×0.8 m hill plots at the Jiangpu Experimental Station of Nanjing Agricultural University,Nanjing,China in 2013 and 2014,respectively.At the third-node stage thinning was performed,retaining eight plants in each hill plot.Fully mature seeds were dried to constant weight at 35–40°C.
Intact seeds from two replications in each year were sampled for evaluation of seed-flooding tolerance.Two samples of 40 seeds for each line in each of the two replications in each year were used for seed-flooding evaluation under treatment and control experiment.The seeds were soaked by complete immersion in distilled water for 48 h in an incubator at 25°C and then dried on filter paper for 6 h and transferred to a rectangle of germination paper(Hangzhou Whatman-Xinhua Filter Paper Co.,Ltd.,Hangzhou,Zhejiang,China),which was then rolled to enclose the seeds as described by Ali et al.[20].The experiment was arranged in a completely randomized design with two replications in two years under continuous light(25,000 lx)at 25 °C.Five days after the transfer to paper rolls,seedling growth was recorded.Seedling length was measured with a common ruler and the relative seedling length(RSL)of each inbred line was calculated as the value under the treatment(flooding stress)divided by that under the corresponding control(normal condition).
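The RSL calculation described above can be sketched as follows. This is a minimal illustration rather than the authors' script: the function name and the sample seedling lengths are hypothetical, and averaging the seedlings of a sample before taking the ratio is an assumption.

```python
from statistics import mean

def relative_seedling_length(treated_cm, control_cm):
    """RSL = mean seedling length under flooding / mean length under control.

    Averaging within each sample before dividing is an assumption; the text
    only states that the treatment value is divided by the control value.
    """
    control_mean = mean(control_cm)
    if control_mean <= 0:
        raise ValueError("control seedling length must be positive")
    return mean(treated_cm) / control_mean

# Hypothetical seedling lengths (cm) for one line in one replication
treated = [9.0, 10.0, 11.0]    # after 48 h of seed flooding
control = [12.0, 12.5, 13.0]   # normal condition
rsl = relative_seedling_length(treated, control)
print(f"RSL = {rsl:.2f}")
```

An RSL close to 1 means that flooding barely reduced seedling growth, while values well below 1 indicate sensitivity.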
Following the procedure that Khan et al. [35] applied to drought indicators, the data were subjected to analysis of variance (ANOVA) under a completely randomized design across the two years (environments) using PROC GLM under the random model of SAS/STAT software [36]. Heritability was estimated as $h^2 = \sigma_g^2 / (\sigma_g^2 + \sigma_{ge}^2/n + \sigma_e^2/(nr))$, where $\sigma_g^2$, $\sigma_{ge}^2$, and $\sigma_e^2$ are the estimated variances of genotype, genotype × environment interaction, and random error, respectively, and $n$ and $r$ are the numbers of environments and replications, respectively [37]. The heritability of genotype-by-environment interaction (GEI) was calculated as $h_{ge}^2 = (\sigma_{ge}^2/n) / (\sigma_g^2 + \sigma_{ge}^2/n + \sigma_e^2/(nr))$. The variance components were estimated using the REML method of PROC VARCOMP in SAS/STAT software [36].
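The two heritability formulas above can be evaluated with a few lines of code. The sketch below only restates the formulas; the function name and the variance components passed to it are hypothetical and are not the estimates obtained in this study.

```python
def heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Return (h2, h2_ge) computed from variance components.

    h2    = var_g / (var_g + var_ge/n + var_e/(n*r))
    h2_ge = (var_ge/n) / (var_g + var_ge/n + var_e/(n*r))
    """
    denom = var_g + var_ge / n_env + var_e / (n_env * n_rep)
    if denom <= 0:
        raise ValueError("phenotypic variance must be positive")
    return var_g / denom, (var_ge / n_env) / denom

# Hypothetical variance components, two environments and two replications
h2, h2_ge = heritability(var_g=0.010, var_ge=0.006, var_e=0.016,
                         n_env=2, n_rep=2)
print(f"h2 = {h2:.3f}, h2_ge = {h2_ge:.3f}")
```

Both ratios share the same denominator, so their sum is the proportion of phenotypic variance of line means attributable to genotype plus genotype × environment interaction.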
In the present study,SNP genotyping of the NAM population was performed by restriction-site-associated DNA sequencing(RAD-seq).The method was the same as described by Khan et al.[35]in mapping QTL for drought tolerance but another NAM population was used in the present study.The major points are briefly described here.The leaves of soybean seedlings were used for genomic DNA extraction using the CTAB method[38].The NAM lines were sequenced by multiplexed shotgun genotyping using Illumina HiSeq2000[39].The resulting DNA fragments were between 400 and 600 bp,totaling 1144.56 million paired-end reads of 90 bp(including a 6-bp index)in length(in total 110.87 Gb of sequence),with depth 3.86 and coverage 4.57%.All the sequences were aligned against the genome of Williams 82[40]with SOAP2 software and filtered at a rate of the missing and heterozygous allele calls≤0.3 and minor allele frequency(MAF)≥0.01,and then imputed with fastPHASE[41–43].In all,55,936 SNPs in total were identified in the NAM population.
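The SNP filtering criteria stated above (missing plus heterozygous call rate ≤ 0.3 and MAF ≥ 0.01) can be illustrated with a small sketch. The genotype coding ('N' for missing, 'H' for heterozygous) and the function name are hypothetical, and computing MAF among the homozygous calls only is an assumption; the actual pipeline may differ.

```python
from collections import Counter

def passes_filter(calls, max_missing_het=0.3, min_maf=0.01):
    """Decide whether one SNP passes the filter.

    calls: genotype codes of all lines at this SNP; 'N' = missing,
    'H' = heterozygous, any other string = a homozygous allele
    (hypothetical coding).
    """
    n = len(calls)
    if n == 0:
        return False
    bad = sum(1 for c in calls if c in ("N", "H"))
    if bad / n > max_missing_het:
        return False
    counts = Counter(c for c in calls if c not in ("N", "H"))
    if len(counts) < 2:
        return False  # monomorphic among the called lines
    # Minor allele = second most frequent allele among homozygous calls
    minor = counts.most_common(2)[1][1]
    return minor / sum(counts.values()) >= min_maf

snp_ok = ["A"] * 60 + ["B"] * 30 + ["N"] * 10
snp_missing = ["A"] * 40 + ["B"] * 20 + ["N"] * 25 + ["H"] * 15
snp_rare = ["A"] * 199 + ["B"] * 1
print(passes_filter(snp_ok), passes_filter(snp_missing), passes_filter(snp_rare))
```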
To establish a marker system with multiple haplotypes/alleles,SNP linkage disequilibrium blocks(SNPLDB)were established by the confidence interval method,adopting the default configuration of Haploview software[44]except for maximum distance and minimum MAF,which were set to 200 kb and 0.01,respectively[44].A SNPLDB may include multiple SNPs with multiple haplotypes/alleles but also may include only a single SNP with two haplotypes/alleles.If the allele frequency of a haplotype is less than 1%,it is replaced with an approximate haplotype with highest frequency.Finally,a total of 6137 SNPLDBs were identified in the NAM population[34].
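The handling of rare haplotypes can be sketched as below. The text says that a haplotype with frequency below 1% is replaced with an approximate haplotype of highest frequency; the sketch interprets this as the most similar common haplotype, with ties broken in favor of the more frequent one. That interpretation, the function name, and the toy data are assumptions, not the RTM-GWAS implementation.

```python
from collections import Counter

def merge_rare_haplotypes(haplotypes, min_freq=0.01):
    """Replace haplotypes rarer than min_freq within one SNPLDB.

    haplotypes: one equal-length allele string per line.
    """
    counts = Counter(haplotypes)
    n = len(haplotypes)
    common = [h for h, c in counts.items() if c / n >= min_freq]
    if not common:
        return list(haplotypes)

    def mismatches(a, b):
        return sum(x != y for x, y in zip(a, b))

    replacement = {}
    for h, c in counts.items():
        if c / n < min_freq:
            # Nearest common haplotype; ties go to the more frequent one
            replacement[h] = min(
                common, key=lambda k: (mismatches(h, k), -counts[k])
            )
    return [replacement.get(h, h) for h in haplotypes]

# Toy block in 230 lines: "AAC" occurs once (about 0.4%) and is merged
block = ["AAG"] * 150 + ["TCG"] * 79 + ["AAC"] * 1
merged = merge_rare_haplotypes(block)
print(Counter(merged))
```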
The RTM-GWAS strategy was applied to detect the seed-flooding tolerance QTL and their alleles using the RTM-GWAS software, publicly accessible at https://github.com/njau-sri/rtm-gwas [30]. The main-effect plus QEI model used by Khan et al. [35] was adopted. At the first stage of RTM-GWAS, 524 markers were preselected using a single-locus model; at the second stage, a multi-locus model was built by applying stepwise regression to the preselected markers. For population structure correction, the top 10 eigenvectors of the genetic similarity coefficient matrix built on SNPLDBs were incorporated as covariates. A significance threshold of 0.05 was used for both marker preselection at the first stage and QTL identification at the second stage. Significant main-effect and QEI-effect markers/QTL were identified along with their P-values and corresponding allele effects. QTL-allele matrices for main and QEI effects were established for both the NAM population and the three parents. QTL were named following a popular nomenclature [45] and ordered by their positions on chromosomes. The genetic variation associated with unmapped minor QTL was calculated as $h^2 - R^2_{\mathrm{Total}}$, where $R^2_{\mathrm{Total}}$ is the total contribution of the detected main-effect QTL to phenotypic variance.
The best RSL/SFT genotype was predicted from the parental main-effect QTL-allele matrix, excluding QEI effects because environment (year or climate) was a random rather than a fixed factor. The 26,335 possible crosses among the 230 lines were simulated. The QTL-allele matrix was used to generate 2000 homozygous progeny for each cross by single-seed descent starting from individual F2 plants. The genotypic values of the progenies were calculated for all possible crosses under the independent assortment and linkage models, respectively, following He et al. [30]. The predicted 95th percentile genotypic value of each cross was used as the indicator for ranking crosses, and the 10 best crosses were identified.
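The cross prediction can be illustrated with a simplified simulation. The sketch below covers only the independent assortment case, in which each locus of a homozygous progeny receives the allele of either parent with probability 0.5; the linkage model is not implemented. The function names, the toy allele effects, and the nearest-rank percentile definition are assumptions and do not reproduce the RTM-GWAS software.

```python
import math
import random

def n_single_crosses(n_lines):
    """Number of distinct two-way crosses among n_lines parents."""
    return n_lines * (n_lines - 1) // 2

def simulate_cross(p1, p2, overall_mean, n_progeny=2000, seed=0):
    """Genotypic values of homozygous progeny under independent assortment.

    p1, p2: allele effects of the two inbred parents, one value per QTL.
    """
    if len(p1) != len(p2):
        raise ValueError("parents must be described at the same QTL")
    rng = random.Random(seed)
    values = []
    for _ in range(n_progeny):
        effects = (a if rng.random() < 0.5 else b for a, b in zip(p1, p2))
        values.append(overall_mean + sum(effects))
    return values

def percentile_95(values):
    """95th percentile by the nearest-rank method."""
    if not values:
        raise ValueError("no values")
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]

print(n_single_crosses(230))  # crosses among the 230 NAM lines

# Toy parents with three QTL (hypothetical allele effects)
vals = simulate_cross([0.10, -0.05, 0.02], [-0.10, 0.05, -0.02],
                      overall_mean=0.82, seed=1)
p95 = percentile_95(vals)
```

Ranking all crosses by this percentile, rather than by the mid-parent value, favors crosses whose parents carry complementary positive alleles.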
The gene system inferred from the detected QTL system for seed-flooding tolerance was established as follows: (1) annotated genes were searched within the intervals of the associated SNPLDBs, or within their flanking SNPLDBs if no gene was present inside the detected SNPLDB; (2) the association between SNPLDBs and SNPs in annotated genes was tested by chi-square test to identify candidate genes among the annotated genes. An annotated gene was recognized as a candidate gene if all SNPs in an SNPLDB were associated with those in the annotated gene. For candidate genes, Gene Ontology classification was performed based on SoyBase (https://www.soybase.org/goslimgraphic_v2/dashboard.php).
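The chi-square test of association in step (2) can be sketched as a Pearson test of independence between the SNPLDB haplotype and the genic SNP allele carried by each line. The function name and toy data are hypothetical; the sketch returns only the statistic and its degrees of freedom, and a statistics library would be needed to convert them to a P-value.

```python
from collections import Counter

def chi_square_association(x, y):
    """Pearson chi-square statistic and degrees of freedom for two
    categorical vectors observed on the same lines."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    joint = Counter(zip(x, y))
    row, col = Counter(x), Counter(y)
    stat = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n
            observed = joint.get((r, c), 0)
            stat += (observed - expected) ** 2 / expected
    df = (len(row) - 1) * (len(col) - 1)
    return stat, df

# Toy data: the haplotype predicts the genic SNP allele perfectly
haplotype = ["h1"] * 50 + ["h2"] * 50
gene_snp = ["A"] * 50 + ["G"] * 50
stat, df = chi_square_association(haplotype, gene_snp)
print(stat, df)
```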
The evaluation results of RSL for the three parents confirmed the higher tolerance of M8206(with RSL value of 0.96)in comparison to Linhe and Zhengyang(with RSL value of 0.58 and 0.68,respectively)(Table 1).The NAM population showed continuous phenotypic variation from 0.30 to 1.34 with a mean of 0.82(Table 1).The smooth distribution of the segregating population indicates that the multiple-locus system has a prominent effect on relative seedling length under seed-flooding stress.The population showed significant RSL variation among lines as well as among line×year interactions under seed-flooding stress(Table S1).The heritability and GCV values were estimated by ANOVA to be 55.80%and 9.43%,respectively.
Fig. 1 shows the average linkage-based hierarchical cluster analysis and pairwise genetic similarity coefficients between lines using SNPLDB markers in the half-sib (NAM) LZM population. The genetic structure shows that the NAM population corresponded exactly to the two RIL populations, LM and ZM, without evident population bias. Using the RTM-GWAS method, 33 SNPLDBs were identified as being associated with RSL (Table 2; Fig. 2), distributed throughout the genome except on chromosomes 3, 16, and 18, with chromosome 13 carrying the most (6) SNPLDBs associated with RSL/SFT. The positions of the SNPLDBs in base pairs are listed in Table 2. Among the 33 loci, 26 were significant main-effect QTL explaining 50.949% of phenotypic variation (PV), while 12 were significant QEI QTL explaining 14.794% of PV (Tables 2 and 3). The genetic contributions of individual main-effect QTL varied from 0.537% to 3.996%. QTL contributing more than 2% of PV were designated large-contribution (LC) major QTL and those contributing less than 2% small-contribution (SC) major QTL; the significant QTL are summarized in Table 3. The heritability estimate showed that 55.80% of PV was accounted for by genetic variation. The mapping results suggested that 28.417% of PV was contributed by the 10 LC major QTL with $R^2$ ranging from 2.055% to 3.996% (Tables 2 and 3); similarly, 16 SC major QTL explained 22.532% of PV with $R^2$ ranging from 0.537% to 1.916%; the remaining 55.80% - 28.417% - 22.532% = 4.85% was attributed to unmapped minor QTL. The heritability of QEI was calculated from ANOVA (Table S1) as 17.67%. Among the 12 significant QEI QTL, one was an LC QEI QTL with $R^2$ of 3.019% and the remaining 11 were SC QTL with $R^2$ ranging from 0.509% to 1.993% (11.775% in total), while the remaining 17.67% - 3.019% - 11.775% = 2.88% was attributed to unmapped minor QEI QTL. Among the 26 main-effect QTL and 12 QEI QTL, five (RSL-a-01-1, RSL-a-05-3, RSL-a-07-1, RSL-a-08-1, and RSL-a-10-2) showed both significant main and QEI effects, with single-locus joint contributions to PV of 1.046%–5.074%, indicating their contributions to both main effects and Q×E interaction (Table 2).
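The partition of phenotypic variation reported above can be checked with simple arithmetic. The figures below are taken from the text; the variable names are illustrative only.

```python
# Main-effect QTL (percent of phenotypic variation)
h2 = 55.80           # heritability from ANOVA
lc_main = 28.417     # 10 large-contribution QTL
sc_main = 22.532     # 16 small-contribution QTL
mapped_main = lc_main + sc_main
unmapped_main = h2 - mapped_main

# QEI QTL (percent of phenotypic variation)
h2_qei = 17.67       # QEI heritability from ANOVA
lc_qei = 3.019       # 1 large-contribution QEI QTL
sc_qei = 11.775      # 11 small-contribution QEI QTL
mapped_qei = lc_qei + sc_qei
unmapped_qei = h2_qei - mapped_qei

# Five loci are shared between the main-effect and QEI sets
n_loci = 26 + 12 - 5

print(f"main: mapped {mapped_main:.3f}%, unmapped {unmapped_main:.2f}%")
print(f"QEI:  mapped {mapped_qei:.3f}%, unmapped {unmapped_qei:.2f}%")
print(f"distinct loci: {n_loci}")
```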
Table 1–Frequency distribution and descriptive statistics of RSL in the LZM NAM population.
Fig.1–Average linkage-based hierarchical cluster analysis and pairwise genetic similarity coefficients between lines using SNPLDB markers in the half-sib(NAM)LZM population.The color bars on the horizontal and vertical axes represent the source recombinant inbred lines(red and blue represent ZM and LM,respectively)arranged according to the results from cluster analysis.The colors in the matrix represent pairwise genetic similarity coefficients in the half-sib LZM population.
For each QTL, 2 to 3 alleles were detected, with a mean of 2.36 alleles per locus and a total of 78 alleles for the 33 loci. The 63 alleles of the 26 significant main-effect QTL showed allele effects ranging from -0.2166 to 0.1847, including 35 positive alleles with effects ranging from 0.0028 to 0.1847 and 28 negative alleles with effects ranging from -0.2166 to -0.0028. The details of the allele effects associated with the significant QTL are described in Table 3 and Fig. 3A. The 27 QEI-allele effects of the 12 significant QEI loci were further categorized into 14 positive (ranging from 0.0019 to 0.1005) and 13 negative ones (ranging from -0.1005 to -0.0003) (Table 3; Fig. 3B). Five QTL, RSL-a-01-1, RSL-a-05-3, RSL-a-07-1, RSL-a-08-1, and RSL-a-10-2, with 12 alleles were significant for both main and QEI effects. The main-effect and QEI allele effects, along with the genotypes of the NAM population, were further organized into a 26 × 230 (or 63 × 230 by allele) main-effect QTL-allele matrix and a 12 × 230 (or 27 × 230 by allele) QEI-effect QTL-allele matrix (Fig. 4). These QTL-allele matrices contain information about the population genetic structure of RSL and in fact represent the genetic constitution of the NAM population. No line carried all of the negative or all of the positive alleles (Fig. 4). The number of positive alleles increased with phenotypic value, revealing the genetic basis of why a line performed well under seed-flooding conditions. For example, 181 of 260 alleles were positive in the group of 10 highly tolerant (RSL > 0.94) lines (Z003, Z002, Z103, Z107, L085, Z128, Z054, Z097, L051, and Z054), with an average of 18.1 positive alleles per line (Table S2).
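A QTL-allele matrix of the kind described above can be represented as a table with one row per QTL and one column per line, each cell holding the effect of the allele that the line carries. The sketch below uses a hypothetical 3 × 4 toy matrix (not data from this study) to count positive alleles per line and to sum allele effects into genotypic values; using the population mean RSL of 0.82 as the intercept is an assumption.

```python
def positive_allele_counts(matrix):
    """Number of positive-effect alleles carried by each line.

    matrix[q][j] = effect of the allele carried by line j at QTL q.
    """
    if not matrix:
        return []
    n_lines = len(matrix[0])
    return [sum(1 for row in matrix if row[j] > 0) for j in range(n_lines)]

def genotypic_values(matrix, overall_mean):
    """Overall mean plus the sum of allele effects of each line."""
    n_lines = len(matrix[0])
    return [overall_mean + sum(row[j] for row in matrix)
            for j in range(n_lines)]

# Hypothetical matrix: 3 QTL (rows) x 4 lines (columns)
toy = [
    [0.05, -0.05, 0.05, -0.05],
    [0.08, 0.08, -0.10, -0.10],
    [0.03, -0.03, -0.03, 0.03],
]
counts = positive_allele_counts(toy)
values = genotypic_values(toy, overall_mean=0.82)
print(counts, [round(v, 2) for v in values])

# Consistency check of the figures quoted in the text:
# 10 lines x 26 main-effect loci = 260 alleles, 181 positive
print(10 * 26, 181 / 10)
```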
Each allele at a locus is contributed either by a single parent or both parents.The parental lines carried both positive and negative alleles at the detected loci,indicating that they carried both favorable and deleterious alleles regardless of their phenotypes.This finding further supports the potential for accumulating favorable alleles in progeny using the QTL-allele matrix obtained from RTM-GWAS(Fig.4).From the main-effect QTL and their allele effects,a QTL-allele matrix was also established for the three parents as shown in Table 4.
Because the environmental factor ‘Year’ was random and its exact effects in a growth chamber experiment could not be defined, the following analysis focuses on the main-effect QTL system rather than on the QEI QTL system; that is, on the population genetic constitution under average-year conditions rather than in individual years.
As shown in Table 4, the three parents, like the lines in the LZM population, carry both positive and negative alleles. When the best allele at each locus is identified, the best genotype can be predicted as the combination of all of them. The highest genotypic value attainable from the genetic constitution of the three parents was predicted to be 1.924, much higher than those of the three parents, which fell in the range 0.652–1.069. These values are genotypic rather than phenotypic values and were calculated as parental allele effects added to the overall mean (Table 4).
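The prediction of the best genotype amounts to adding the largest allele effect at each locus to the overall mean. A minimal sketch is given below; the function name and the toy allele effects are hypothetical and are not the values in Table 4.

```python
def best_genotype_value(overall_mean, allele_effects_by_locus):
    """Overall mean plus the best (largest) allele effect at every locus.

    allele_effects_by_locus: one list of allele effects per QTL, covering
    the alleles available among the parents.
    """
    return overall_mean + sum(max(effects) for effects in allele_effects_by_locus)

# Hypothetical allele effects at three loci
loci = [
    [-0.05, 0.05],
    [-0.10, 0.02, 0.08],
    [-0.03, 0.03],
]
best = best_genotype_value(0.82, loci)
print(f"predicted best genotypic value = {best:.2f}")
```

Because the best alleles are usually spread over different parents, the predicted best value can exceed the value of every parent, which is the basis of transgressive segregation.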
Table 2–SNPLDB associated with seed-flooding tolerance using RSL as indicator.
Fig.2–Manhattan(A)and quantile-quantile plots(B)of the genome wide association study of RSL in LZM NAM population.The red line in the Manhattan plot represents the significance threshold of P≤0.05 or-log10P≥1.3.P,the model-based joint probability value in RTM-GWAS.
Table 3–Summary of the significant QTL-allele system that governs RSL in the LZM NAM population.
Although the best genotype may be predicted from the cross of three parents,the probability of the best genotype occurring naturally is very low.An alternative way to obtain a best genotype is to select the best progeny from the best crosses among the 230 NAM lines.To predict the optimal crosses(two-way or simple cross)from the NAM population,the computer program RTM-GWAS under the independent assortment and linkage models was used for simulation comparisons with the 95th percentile of the progeny distribution as indicator,assuming each cross to generate 2000 progeny.Thus 26,335 possible single crosses among the 230 lines in the NAM population were inspected for their 95th percentile genotypic values.From these,the 10 top crosses under the independent-assortment model and 10 under the linkage model were identified and are presented in Table 5 with their 95th percentile genotypic RSL values ranging from 1.494 to 1.562 and 1.826 to 1.914,respectively.The predicted values from the linkage model are higher than those from the independent assortment model and the 95th percentile genotypic value 1.914 is near the highest genotype value of 1.924.
From the 33 RSL QTL,90 possible genes were identified,among which 33 candidate genes lying in 24 QTL regions were identified by chi-square tests for SNP association between the candidate genes and the QTL(Table S3).GO analysis grouped the 33 candidate genes into six categories(Table S4;Fig.5).These were biosynthetic pathway(I),with 6 candidate genes contributing 10.44% to PV;transport activity(II),with 4 candidate genes contributing 4.37%to PV;signal transduction(III),with 3 candidate genes contributing 4.51% to PV;metabolic process(IV),with 5 candidate genes contributing 11.74% to PV;response to stress(V),with 10 candidate genes contributing 16.22% to PV;and unknown(VI)function with 5 candidate genes contributing 9.58%to PV(Table S4;Fig.5).The 33 candidate genes were scattered throughout the genome except on chromosomes 3,6,16,18,and 20.The candidate gene annotation indicated that the RSL/SFT gene system involves multiple biological-related processes with diverse functional genes comprising a gene network rather than only a few genes.
To our knowledge, the 33 SFT QTL have not been reported previously for soybean, in particular with main-effect QTL distinguished from QEI QTL. Among the main-effect QTL, RSL-a-06-1 overlapped a QTL for young-plant waterlogging tolerance reported by Ahmed et al. [46]. RSL-a-10-2 (a main-effect and QEI QTL) and RSL-a-19-1 (a main-effect QTL) mapped here were near QTL detected at an early vegetative growth stage by Githiri et al. [10]. Some QTL overlapped with those for flooding traits or flooding-associated traits, such as seed weight (Table S4). These findings suggest that mapping RSL in the LZM NAM population with RTM-GWAS covered the range of tolerance loci more efficiently than the indicators, materials, and mapping procedures used in previous flooding-related QTL mapping.
The QTL identified were main-effect or QEI QTL,or both.In addition to the main-effect QTL,the QEI QTL accounted for a certain part of the total QTL-alleles(12 QTL with 27 alleles contributing 17.67% phenotypic variation for QEI effect vs.26 QTL with 63 alleles contributing 55.80% phenotypic variation for main effect in the present study).Because the environmental change was mainly change between years or climate,seed flooding tolerance was largely influenced by climate conditions.The most devastating change in the current era is climate shift,which has increased the importance of QEI QTL detection and its utilization in plant adaptation to varying environmental conditions[47].The combination of maineffect and QEI-effect QTL-alleles could be exploited to increase genotypic tolerance to stress in a specific environment[48].Like other abiotic stresses,QEI has great influence on flooding tolerance breeding schemes,as the extent of flooding damage is variable depending upon other environmental factors such as temperature.The detection and exploitation of both main and QEI effects of the QTL may lead to success of breeding programs for increasing flooding tolerance in genotypes because stable QTL across different environments are crucially important for successful marker-assisted breeding as compared to environment-specific QTL[49,50].
Fig.3–The allele effects of the main-effect RSL/SFT QTL(A)and QTL×Env.interaction QTL(B)in the NAM population.Alternating brown and blue bars indicate QTL,with each bar representing an allele of a QTL with positive or negative effect.Vertical gray lines(dash dots)separate chromosomes and gray lines(squared dots)separate QTL on chromosomes.The allele effects at each locus are shown in increasing order.
Based on the detected QTL system,33 candidate genes were identified from 24 of the 33 QTL,and were either directly or indirectly associated with seed flooding tolerance involved in 6 biological processes.The diversity of the biological processes confirmed the complex nature of flooding tolerance.Very few candidate genes conferring seed flooding tolerance have been reported.The candidate gene Glyma10g34490 in the QTL RSL-a-10-2 was found to be directly involved in tolerance to flooding in soybean[51].Glyma01g32090.1 encodes alanine glyoxylate aminotransferase,which is involved in fermentation under flooding conditions[52].Roots and shoots under flooding conditions will increase fermentation reactions for generating ATP as a tolerance mechanism[52].The enzyme catalyzes the final step of glycolysis.A single pyruvate molecule and one ATP molecule are generated when the enzyme transfers the phosphate group from phosphoenolpyruvate(PEP)to adenosine diphosphate(ADP).Glyma0411211 associated with a VQ motif was found[53]to be involved in soybean growth,development and nodulation.
In summary,soybean seed flooding tolerance depends on a complex and integrated QTL/gene network involving multiple biological pathways.
The RSL/SFT QTL alleles were organized into a compact matrix showing the complete RSL/SFT genetic architecture of the three parents as well as their NAM population. Using this matrix, the performance of possible crosses or the genetic potential of the population can be predicted. This matrix was made possible by the detection of most of the QTL along with their alleles by the RTM-GWAS procedure. The results indicated that linkage among alleles in the population promoted the accumulation of elite RSL alleles and facilitated recombination. Thus, the QTL-allele matrix provides an approach for genotyping lines and predicting the genetic potential among them. Accordingly, breeders can predict the probability of creating the most favorable allele combinations or segregants to obtain genotypes tolerant to seed flooding. The experimental error was not small in the present study (Table S1). Even so, the RTM-GWAS procedure identified many main-effect and QEI-effect QTL that together explained almost all of the genetic variation in RSL/SFT in the NAM population (Table 3). If experimental precision can be further increased, more QTL alleles (mainly those with contributions lower than those of the detected QTL) may be identified.
The high power and efficiency of RTM-GWAS derive from its built-in advantages. The first is that SNPLDB markers with multiple haplotypes fit a population with multiple alleles, making mapping more powerful. The second is that RTM-GWAS performs two mapping stages. At the first stage, a subset of markers is preselected (524 of the 6137 in this study) to reduce noise and make the final mapping more precise. The third is that at the second stage, a multi-locus multi-allele model is fitted by stepwise regression. This keeps the total contribution of the selected QTL within the experimental heritability value, thereby controlling both false positives and false negatives. Because the multi-locus model tests all the involved loci jointly, there is no need for additional multiple-testing corrections such as the Bonferroni correction. The fourth is that no additional population adjustment is needed, because the NAM population was constructed from multiple bi-parental populations and the genetic similarity coefficient matrix is a sufficient population structure adjustment [30,32–34].
Fig. 4–Main-effect (A) and QTL × environment interaction (B) RSL/SFT QTL-allele matrices in the LZM NAM population. The horizontal axis shows RSL values, with green lines denoting ZM and pink lines denoting LM RILs. The vertical axis shows the QTL for RSL; each cell in a row indicates an allele. Cells with warm colors represent positive alleles and those with cool colors negative alleles, and the depth of the color corresponds to the size of the allele effect.
Based on this relatively thorough identification, the QTL-allele matrix provides the genetic information on the population needed to design and carry out various kinds of genetic analysis and breeding plans. In addition, knowledge of the genetic architecture of cultivars carrying both positive and negative alleles enables a breeder to predict and optimize transgressive segregation and best genotypes. Predictions using the linkage and independent assortment models result in the identification of the best crosses [33]. Accordingly, the approach will help breeders make decisions without missing potential sources of tolerance alleles important for stress management.
By use of RTM-GWAS with relative seedling length(RSL)as an indicator of seed flooding tolerance to identify a QTL-allele system in a NAM population composed of two RIL populations with one common parent,26 main-effect and 12 QEI in a total of 33 QTL with 78 alleles were detected,showing the complex genetic architecture of the RSL/SFT genetic system.From the QTL-allele matrices of the three parents and their NAM population,the best genotype and optimized crosses were predicted,showing the high potential for obtaining transgressive recombinants for RSL/SFT.The candidate genes identified in this study belong to diverse categories of biological process,cellular component,and molecular function,showing the involvement of diverse mechanisms in the RSL/SFT genetic control system.Thus,the results may provide a way to match the breeding by design concept.
Table 4–The genetic constitution of RSL of the three parents in the LZM NAM population(mean over two environments).
Table 5–Ten top combinations for RSL/SFT predicted from the possible crosses among 230 lines in the LZM NAM population.
CRediT authorship contribution statement
Muhammad Jaffer Ali and Junyi Gai designed the experiments;Guangnan Xing and Tuanjie Zhao prepared the plant materials;Muhammad Jaffer Ali performed data collection;Muhammad Jaffer Ali,Guangnan Xing,Jianbo He,and Junyi Gai analyzed the data;Muhammad Jaffer Ali and Guangnan Xing prepared the tables and figures and wrote the manuscript;Junyi Gai revised and finalized the manuscript.
Declaration of competing interest
The authors declare there are no conflicts of interest.
Acknowledgments
This work was financially supported by the National Key Research and Development Program of China(2016YFD0100201,2017YFD0101500,2017YFD0102002),the National Natural Science Foundation of China(31571694,31671718,31571695),the MOE 111 Project(B08025),the MOE Program for Changjiang Scholars and Innovative Research Team in University(PCSIRT_17R55),the MARA CARS-04 Program,the Jiangsu Higher Education PAPD Program,the Fundamental Research Funds for the Central Universities(KYT201801)and the Jiangsu JCIC-MCP.The funders had no role in work design,data collection and analysis,and decision and preparation of the manuscript.
Fig.5–Gene ontology classification of potential genes associated with seed-flooding tolerance in the LZM NAM population for RSL.
Appendix A.Supplementary data
Supplementary data for this article can be found online at https://doi.org/10.1016/j.cj.2020.06.008.