TIAN Yu,YANG Lei,LU Hong-feng,ZHANG Bo,LI Yan-fei,LIU Chen,4,GE Tian-li,LIU Yu-lin,HAN Jia-nan,LI Ying-hui,QlU Li-juan
1 National Key Facility for Gene Resources and Genetic Improvement/Key Laboratory of Crop Germplasm Utilization,Ministry of Agriculture and Rural Affairs/Institute of Crop Sciences,Chinese Academy of Agricultural Science,Beijing 100081,P.R.China
2 Novogene Bioinformatics Institute,Beijing 100015,P.R.China
3 School of Plant and Environmental Sciences,Virginia Polytechnic Institute and State University,Blacksburg,VA 24060,USA
4 School of Life Sciences,Liaoning Normal University,Dalian 116081,P.R.China
5 College of Forestry,Northwest A&F University,Yangling 712100,P.R.China
Abstract Plant height is an important agronomic trait,which is governed by multiple genes with major or minor effects.Of numerous QTLs for plant height reported in soybean,most are in large genomic regions,which results in a still unknown molecular mechanism for plant height.Increasing the density of molecular markers in genetic maps will significantly improve the efficiency and accuracy of QTL mapping.This study constructed a high-density genetic map using 4 011 recombination bin markers developed from whole genome re-sequencing of 241 recombinant inbred lines (RILs) and their bi-parents,Zhonghuang 13 (ZH) and Zhongpin 03-5373 (ZP).The total genetic distance of this bin map was 3 139.15 cM,with an average interval of 0.78 cM between adjacent bin markers.Comparative genomic analysis indicated that this genetic map showed a high collinearity with the soybean reference genome.Based on this bin map,nine QTLs for plant height were detected across six environments,including three novel loci (qPH-b_11,qPH-b_17 and qPH-b_18).Of them,two environmentally stable QTLs qPH-b_13 and qPH-b_19-1 played a major role in plant height,which explained 10.56-32.7% of the phenotypic variance.They were fine-mapped to 440.12 and 237.06 kb region,covering 54 and 28 annotated genes,respectively.Via the function of homologous genes in Arabidopsis and expression analysis,two genes of them were preferentially predicted as candidate genes for further study.
Keywords:soybean,plant height,whole genome re-sequencing,bin map,QTL
Soybean (Glycinemax(L.) Merr.) (G.max) is one of the most essential oilseed crops worldwide and a leading source of plant protein and oil (Zhanget al.2004;Liet al.2017).Plant height (PH) is one of important agronomic traits associated with seed yield for erect soybean plants(Zhanget al.2018).Plant height is a quantitative trait controlled by multiple genes (Pantheeet al.2007;Liuet al.2013;Leeet al.2015).Therefore,quantitative trait loci (QTL) mapping is an effective strategy to understand the genetic basis of plant height in soybean.
In previous studies,numerous QTLs for plant height were reported in soybean (Leeet al.1996;Mansuret al.1996;Spechtet al.2001;Chapmanet al.2003;Wanget al.2004;Zhanget al.2004;Kimet al.2012;Liuet al.2013;Leeet al.2015).However,most QTLs were mapped within large genomic regions due to the relatively low to moderate density of genetic maps and subsequently,limiting the cloned genes for plant height in soybean due to the limited recombination events in the mapping populations.Compared to traditional highly time-consuming and laborious markers such as simple sequence repeats (SSRs),amplified fragment length polymorphisms (AFLPs) and restriction fragment length polymorphisms (RFLPs),single nucleotide polymorphisms(SNPs) show the most abundant and stable genetic variations in genome (Stoeltinget al.2013).A highdensity genetic linkage map based on high throughput-SNP genotyping platform is vital to improve the accuracy of QTLs mapping and explore the genetic basis of important agronomic traits (Liet al.2014).
With the development of next-generation sequencing technology,several whole genome sequencing technologies have become powerful tools for detecting SNPs and genotyping materials in large mapping populations,including whole genome re-sequencing(WGR) (Huanget al.2009;Liet al.2013;Zhouet al.2015) and reduced-representation genome sequencing(restriction-site-associated DNA sequencing,genotypingby-sequencing and specific-locus amplified fragment sequencing,etc.) (Elshireet al.2011;Liet al.2017;Caiet al.2018;Jiet al.2018).As a result,SNPs have become the favored molecular marker in many plant breeding and genetics studies (Songet al.2013).In addition,recombination bin markers using SNPs developed from whole genome re-sequencing have become novel and efficient molecular markers,which have been used to construct high-density genetic maps (Huanget al.2009).Bin markers can cover more variations and provide good simplicity,which not only greatly reduce the impact of single SNP error on genotyping,but also greatly improve the accuracy of genotyping and mapping(Xieet al.2010;Chenet al.2014).The bin map has been demonstrated to be more powerful for fine mapping QTLs related to important agronomic traits than traditional genetic map (Huanget al.2009;Yuet al.2011;Xuet al.2013;Chenet al.2014;Liuet al.2015;Wanget al.2016).For example,The rice“green revolution”gene for plant height was fine-mapped within 100 kb genomic region using a high-density bin genetic map developed from resequencing 150 recombinant inbred lines (RILs) at~0.02×depth (Huanget al.2009).A major QTL for root-knot nematode resistance in soybean was fine-mapped to 29.7 kb genomic region with an average of 0.19× depth for 246 RILs (Xuet al.2013).A major QTL for maize leaf width,qLW4,was greatly reduced to a 270-kb genomic region by re-sequencing 670 RILs (Wanget al.2018).
Liuet al.(2013) have identified 11 QTLs for plant height using 508 molecular markers (313 SNPs,167 SSRs,four EST-SSRs and 24 InDels) and 254 RILs derived from Zhonghuang 13 (ZH,as male parent)×Zhongpin 03-5373(ZP,as female parent),but most QTL regions (0.23-15.26 Mb)were too wide due to relatively low-density of the markers on the genetic map.In the current study,the objectives were to construct a high-density genetic map using 4 011 bin markers developed by whole genome re-sequencing and to improve the efficiency and resolution of QTLs mapping for plant height in soybean.The results will contribute to the analysis of genetic basis of plant height and marker-assisted polygene pyramid breeding in soybean.
A RIL population which consists of 254 lines derived from a cross between ZP×ZH was developed and described in a previous study (Liuet al.2013).The parent ZP was an advanced breeding line developed and conserved by the Institute of Crop Sciences,the Chinese Academy of Agricultural Sciences (ICS-CAAS),Beijing,China.The other parent ZH (ZDD23876) was an elite cultivar selected by Prof.Wang Lianzheng (ICS-CAAS),and the voucher specimen was deposited in the National Crop Gene Bank,Chinese Academy of Agricultural Sciences.Two parents and RILs were planted in six environments(Liuet al.2013),including Changping (40.2°N,116.2°E)in Beijing from 2009 to 2011 (CP09,CP10 and CP11),Sanya (18.2°N,109.5°E) in Hainan Province from 2009 to 2010 (SYa09 and SYa10),and Shunyi (40.1°N,116.7°E)in Beijing in 2010 (SYi10).The parents ZP and ZH with determinate growth habit exhibited significant phenotypic variation for plant height,but no significant difference(P=0.57) for maturity time.Descriptive statistics of PH trait used in the study are summarized in Appendix A.The average plant height of ZP (72.6 cm) was consistently higher than that of ZH (64.6 cm) in Beijing (CP09,CP10,CP11,and SYi10).Similarly,the average plant height of ZP (22.1 cm) was also higher than that of ZH (20.3 cm)in Hainan (SYa09 and SYa10).For RILs,the average value of plant height in different environments ranged from 19.3 to 79.3 cm,and the min.and max.values varied from 12.21 to 55.5 cm and from 31.13 to 122.5 cm,respectively.Skewness and kurtosis values were less than 1.0 in all planted environments.Field trials were described in a previous study by Liuet al.(2013).Briefly,a complete randomized design was used for every environment with three replications.Each plot comprised a 1.5-m row,with 0.55 m apart between rows and a distance of 0.10 m between adjacent plants.The plant height of 10 randomly selected plants in the middle of each row was measured when the plants reached maturity.Plant height was defined as the distance from cotyledon node to the peak of the main stem.Best linear unbiased predictors (BLUP)(Henderson 1975) were performed for the plant height data across six environments.
Approximately 1.0 μg genomic DNA was extracted from young leaf of each sample using the CTAB method (Doyle and Doyle 1990) and prepared in libraries for sequencing using a TruSeq Nano DNA HT sample preparation kit (Illumina,USA) following the manufacturer’s recommendations.Briefly,the genomic DNA samples were fragmented by sonication to a size of~350 bp,then end-polished,A-tailed,and ligated with the fulllength adapters for Illumina sequencing with further PCR amplification.Index codes were added to facilitate the differentiation of sequences from each sample.The PCR products were purified (AMPure XP Bead System;Beckman Coulter,USA) and the libraries were analyzed for their size distribution using an Agilent 2100 Bioanalyzer (Agilent Technologies,USA) and quantified using real-time PCR.
The libraries were sequenced using the Illumina HiSeq X platform (Illumina).To ensure reliable reads without artificial bias,some reads were removed including the low-quality paired reads.The remaining high-quality paired-end reads were mapped against the soybean reference genome (Wm82.a2.v1,https://phytozome.jgi.doe.gov) using Burrows-Wheeler Aligner Software with the command“mem -t 4 -k 32 -M”(Li and Durbin 2009).To reduce mismatches generated by the PCR amplification before sequencing,SAMtools (v0.1.19) (Liet al.2009a)were used to remove the duplicated reads.After the alignment,SNP calling was performed on a population scale using a Bayesian approach by Genome Analysis Toolkit (GATK) Software (Mckennaet al.2010).The SNP annotation was then performed using the ANNOVAR package (version 20130520).
Of 254 lines sequenced,241 lines were utilized to construct the bin genetic map using the sliding window method (Huanget al.2009),while 13 lines were removed because they missed more than 20% of the sequencing data.Genotypic data were scanned with a window size of 15-SNPs and a step size of 1.For each individual,the ratio of SNPs from either ZH or ZP within the window was calculated.Windows containing 11 or more SNPs from either parent were defined as homozygous,while those with less were defined as heterozygous.Adjacent windows with the same genotypes were combined into blocks and the recombinant breakpoints were assumed as the boundary of adjacent blocks with different genotypes.A bin map was constructed by aligning and comparing the genotypic maps of individual RILs over 100 kb intervals.Consecutive 100 kb intervals lacking a recombination event within the entire population were joined into bins.The high-density genetic map was constructed using the Kosambi mapping function of the IciMapping3.1 (Liet al.2008).Pearson correlation coefficient was performed for the co-linearity analysis of bin makers between 20 linkage groups and reference genome using SAS 9.3 (http://www.sas.com/).
Inclusive composite interval mapping (ICIM) was used to detect QTLs for plant height in six environments and their BLUP value using QTL IciMapping3.1 (Liet al.2008),with 1 cM walk speed and theP-values for markers entering into variables (PIN)=0.001.A logarithm of the odds (LOD)score of 2.5 was used as the threshold to declare the presence of a putative QTL.In addition,ICIM-EPI method was also used to detect epistatic interactions between markers with the LOD threshold of 4.0.The name of QTLs identified in different environments of this study was a composite of PH and a letter“b”to be distinguished from a previous study by Liuet al.(2013),followed by the chromosome number.If the QTL had the same or overlapping marker intervals,they were designated as the same QTL.
To analyze the expression pattern of candidate genes for plant height,two parents and eight RILs with extreme plant height were selected to construct dwarf and tall plant bulks.Fresh leaf and stem tissues at three developmental stages,V1 (fully developed leaves at unifoliolate nodes),V3 (three nodes on the main steam with fully developed leaves beginning with the unifoliolate nodes),and R1 (one open flower at any node on the main stem),were collected from two parents and four shortest RILs and four tallest RILs,respectively.Then all samples were immediately frozen in liquid nitrogen,and preserved at -80°C for RNA extraction.Total RNA for all samples was extracted using an RNA Prep Pure Plant kit (Tiangen Co.,Beijing,China).Two dwarf and tall RIL bulks were constructed using equal quality RNA of every RILs.For reverse transcription,the first-strand cDNA was synthesized using FastKing RT Kit(with gDNase) (TransGen Co.,Beijing,China).Realtime PCR was performed on an ABI 7300 Real-Time PCR System using a SYBR PremixEx Taq? kit (TaKaRa,Japan) and three replications were performed for each sample.The primers for expression analysis were designed using Primer 5.0,and blasted on the Phytozome database (https://phytozome.jgi.doe.gov) to ensure specific amplification.The soybean geneGmactin11(Glyma.18G290800) was utilized as the internal control for normalization.The relative gene expression levels were calculated using the 2-ΔΔCtmethod (Livak and Schmittgen 2001).All primers for expression analysis are listed in Appendix B.
The parental lines (ZH and ZP),and derived RILs were re-sequenced using the Illumina HiSeq X platform.In total,this study sequenced 415.93 Gb raw data,including 35.07 Gb for parental lines and 380.86 Gb for RILs.After removing the low-quality paired reads,this study generated 34.86 Gb high quality paired-end reads for parental lines and 376.09 Gb for 241 RILs,with the average sequencing depth of approximate 16.99×for parental lines and 2.09× for RILs,respectively.All sequence read data in this study have been deposited into Sequence Read Archive (SRA) under BioProject accession PRJNA685296.A total of 125 747 328 reads for ZH and 107 868 279 reads for ZP were aligned with the soybean reference genome (Wm82.a2.v1,https://phytozome.jgi.doe.gov).A total of 3 597 884 and 3 894 277 SNPs were identified in ZH and ZP,respectively.This study identified 2 218 849 SNPs in parents ZH and ZP(Appendix C),which were classified into eight segregation patterns (nn×np,ab×cd,ab×cc,aa×bb,lm×ll,ef×eg,hk×hk,and cc×ab) by a genotype encoding rule (Appendix D).A total of 2 181 697 749 reads for RILs were aligned with reference genome,with an average of 9 052 687.76 reads per individual.Since the two parents (ZH and ZP) are homozygous inbred lines carrying aa and bb genotypes,826 749 SNPs from“aa×bb”segregation patterns were identified in the RILs for linkage analysis (Appendix E).These SNPs covered 946.96 Mb soybean genome,and the number of SNPs ranged from 13 965 on chromosome 04 (chr04) to 101 548 on chr15,with a mean of 41 337 SNPs per chromosome and approximate 1.15 kb per SNP throughout 20 chromosomes.Large-effect SNPs were also detected including 25 103 non-synonymous SNPs and 951 SNPs that resulted in gain (574)/loss (84) of stop codon,and splicing sites (293) with the change of the junction sites of exon and intron (Appendix E).
A total of 4 281 recombination bin markers with an average physical length of 221.2 kb were detected using 826 749 SNPs in the entire RILs population (Fig.1).Of them,3 218 (75.17%) bin markers were less than 200 kb in length,128 bin markers were more than 1 Mb in length,and one of them (mk4477:15 402 031-25 930 390 bp) on chr20 was more than 10 Mb in length on pericentromeric region defined by Songet al.(2016) (Fig.2),which indicated that no-recombination event in the interval was identified in the entire RILs population of this study.This study further filtered out the bin markers that were unsuitable for genetic map construction such as markers with a high heterozygous ratio (>5%),especially on chr18 (Fig.1).Finally,4 011 bin markers were used to construct a high-density bin map.The Chi-square test was used to detect the segregation distortion markers.A total of 689 (17.18%) of 4 011 bin markers for the genetic map showed segregation distortion at the 0.05 level of significance.Segregation distortion markers were filtered out in some studies (Liet al.2017;Yanget al.2017),which might reduce the coverage of the genome and miss the QTLs located in segregation distortion regions (SDRs).In the current study,segregation distortion markers were retained to construct the genetic map and QTL mapping.
Fig.2 Distribution of the physical lengths of the bin markers in RILs population derived from Zhonghuang 13 (ZH,as male parent)×Zhongpin 03-5373 (ZP,as female parent).
The total genetic distance of the genetic map constructed using 4 011 bin markers was 3 139.15 cM,with an average interval of 0.78 cM between adjacent bin markers (Table 1).The length of different chromosomes ranged from 103.56 cM of chr16 to 222.35 cM of chr02.The intervals of 3 942 (98.28%) adjacent bin markers were less than 5 cM.Forty-four gaps were from 5 to 10 cM and 25 gaps were more than 10 cM.The largest gap was 33.13 cM on chr11.To compare the order of the bin markers,a collinear graph was plotted using the physical position of each bin marker on the soybean reference genome (Wm82.a2.v1,https://phytozome.jgi.doe.gov) against its genetic position on the genetic map.A relatively high collinearity was observed between 20 linkage groups and reference genome (Fig.3).Pearson correlation coefficient for 20 linkage groups varied from 0.792 to 0.983,indicating that it was easy to examine candidate genes by comparative mapping.
Fig.3 The collinearity of bin markers on the genetic map and the soybean reference genome.Red lines represent the linearity order of bin markers on the linkage map,and the blue lines represent the linearity order of physical position in the soybean reference genome.
Using the high-density bin genetic map and BLUP analysis (Fig.4),nine QTLs for plant height were identified and explained 3.01-32.7%of the phenotypic variance across six environments (Table 2;Appendix F).These QTLs distributed on eight chromosomes (2,11,12,13,16,17,18,and 19) ranging from 159.68 to 935.78 kb genomic region according to the soybean reference genome(Wm82.a2.v1,https://phytozome.jgi.doe.gov).Five QTLs (qPH-b_2,qPH-b_11,qPH-b_13,qPH-b_17,andqPH-b_18) showed a positive additive effect,and alleles of highparent ZP could increase the plant height.However,four QTLs(qPH-b_12,qPH-b_16,qPH-b_19-1andqPH-b_19-2) exhibited a negative additive effect.Alleles of dwarfparent ZH could increase plant height,which explained the transgressive segregation for plant height in the RILs population.Of nine QTLs,three novel loci (qPH-b_11,qPH-b_17andqPH-b_18) were identified in the current study,but not detected in a previous study which just used around one thousand molecular markers (Liuet al.2013).
Fig.4 High-density bin genetic map and distribution of nine QTLs for plant height in soybean.Bin markers are distributed on the 20 linkage groups (chr01-chr20).The black lines in each linkage groups represent the genetic position (centimorgans,cM) of the bin markers.Nine QTLs for plant height and flanking bin markers are also shown on the right side of each linkage group,and four stable QTLs are showed in red color.
Particularly,four QTLs (qPH-b_12,qPH-b_13,qPH-b_17,andqPH-b_19-1) were detected in at least two environments and could be considered as environmentally stable QTLs,explaining 3.31-32.7%of the phenotypic variance for plant height.Of them,qPH-b_12was identified in two environments (SYa09 and CP10),explaining 6.26 and 3.39% of the phenotypic variance for plant height,respectively (Table 2).In a previous study by Liuet al.(2013),qPH-12was mapped to 15.24 Mb of genomic regions.In the current study,itwas reduced to a 539.14-kb of genomic region,namedqPH-b_12.qPH-b_17was a minor but stable QTL covering 935.78 kb of genomic region,explaining 3.31-4.02% of the phenotypic variance for plant height,which was not observed in the previous study (Liuet al.2013).As reported in previous studies (Mansuret al.1993;Josie and Kassem 2007;Leeet al.2015;Okiet al.2018),qPH-b_13andqPH-b_19-1were two major QTLs.The former explained 19.93-32.7% of the phenotypic variance across six environments and BLUP analysis,and the latter explained 10.56% (CP09)-15.79% (CP11) of the phenotypic variance for plant height (Table 2).
Table 1 Characteristics of the high-density genetic map based on bin markers
Table 2 QTLs identified for plant height (PH) across six environments and BLUP analysis
Epistatic interactions were evaluated using a full twodimensional genome scan.Nine epistatic QTLs were detected using LOD values of 4.0 as the threshold,explaining 5.92-17.69% of the phenotypic variance for plant height (Appendix G).EI-PH-b_2,EI-PH-b_3 and EIPH-b_3 were three major epistatic interactions between markers.In addition,two environmentally stable QTLsqPH-b_13andqPH-b_19-1weren’t involved in epistatic interactions across six environments,indicating that the soybean plant height could be regulated using the additive effects from these two major QTLsqPH-b_13andqPH-b_19-1.
Environmentally stable and major QTLqPH-b_13was located on chr13 and flanked by markers mk2810 and mk2813.Compared to the 850 kb genomic region previously identified using around one thousand molecular markers in this RILs population (Liuet al.2013),the genomic region forqPH-b_13was narrowed down to 440.12 kb in the current study.According to the soybean reference genome (Wm82.a2.v1,https://phytozome.jgi.doe.gov),54 annotated genes were observed inqPH-b_13genomic region (Appendix H).Based on the function of homologous genes inArabidopsis,of them,Glyma.13G291700encoded a trichome birefringencelike protein whose family was involved in the synthesis and deposition of secondary wall cellulose,and mutants showed a dwarfed phenotype inArabidopsis(Bischoffet al.2010).Glyma.13G292500encoded TCP transcription factor which was a key regulator of cell proliferation,and some TCP mutants exhibited dwarfism and reduced responsiveness to gibberellin (GA) action inArabidopsis(Davièreet al.2014).Glyma.13G292600was annotated by the KEGG Orthology (KO) (K19038:E3 ubiquitin-protein ligase ATL41) which was related to plant height by regulating GA responses inArabidopsis(McGinniset al.2003).This study further analyzed expression levels of these candidate genes in the stem and leaf tissues of two parents and dwarf and tall RIL bulks at the three different developmental stages.These genes differentially expressed in stems and leaves of two parents and dwarf and tall RIL bulks at a significance level ofP=0.05 (Fig.5).The expression levels of genesGlyma.13G291700,Glyma.13G292500andGlyma.13G292600were significantly different between dwarf parent ZH and tall parent ZP at different developmental stages (V1,V3 or R1) in stem or leaf tissues,respectively (Fig.5-A-C).Differential expression of these genes observed between parents may not be totally associated with plant height.Therefore,this study analyzed the expression patterns of these candidate genes in dwarf and tall RIL bulks.Of them,expression level of geneGlyma.13G292600in tall parent and tall RIL bulk was consistently higher than that in dwarf parent and dwarf RIL bulk,respectively,except leaf tissue of the R1 stage.Especially in stems,the expression level ofGlyma.13G292600in tall RIL bulk was significantly higher than that in dwarf RIL bulk at three developmental stages.The result indicated thatGlyma.13G292600may be preferentially considered as the underlying gene forqPH-b_13.
Fig.5 Expression of five candidate genes in stem and leaf tissues of two parents and dwarf and tall RIL bulks at three developmental stages (V1,V3 and R1) detected by qRT-PCR. The y axis represents the expression levels of candidate genes (A-E) relative to expression of Gmactin11.ZH,Zhonghuang 13;ZP,Zhongpin03-5373.Error bars indicate SD (n=3).Asterisks indicate significant difference determined by Student’s t-test (*,P<0.05;**,P<0.01).
Another environmentally stable and major,qPH-b_19-1,was reduced to 237.06 kb genomic region covering 28 annotated genes (Appendix H).Of them,Glyma.19G194300,a known determinate growth habit gene (Tianet al.2010),was also related to the difference of plant height withDt1anddt1genotypes (Fanget al.2017).However,it couldn’t be the candidate gene forqPH-b_19-1because ZH and ZP showed the determinate growth habit phenotypes,and carried Gmtfl1-ta and Gmtfl1-bb alleles ofdt1genotype,respectively.This study also further screened the targe SNP ofdt1locus from extremely dwarf and tall RILs by the whole-resequencing data,which showed these RILs all carrieddt1genotype with determinate growth habit phenotype (Appendix I).Therefore,there is a novel gene for plant height inqPH-b_19-1region.Glyma.19G194100encoded little zipper 3 protein that was involved in proper functioning of stem cells in the shoot apical meristem(SAM) and its mutant had notably reduced growth inArabidopsis(Kimet al.2008).Glyma.19G195200was annotated by the KO (K14488:SAUR family protein)which was related to cell expansion through modulation of auxin transport,regulating hypocotyl length and leaf size inArabidopsis(Spartzet al.2012).Glyma.19G195200was involved in plant hormone signal transduction path to control cell enlargement,which may be related to stem development.Further,the expression levels of both genesGlyma.19G194100andGlyma.19G195200in stem and leaf tissues were significantly different between dwarf parent ZH and tall parent ZP at different stages,respectively (Fig.5-D and E).However,expression pattern ofGlyma.19G194100,rather than that ofGlyma.19G195200,was consistent with that in parents.The expression level ofGlyma.19G194100was significantly different between dwarf RIL bulk and tall RIL bulk at V1 and R1 stages.Therefore,the different expression levels of these candidate genes for QTLsqPH-b_13andqPH-b_19-1provided a strong evidence for their function as the underlying genes for plant height.
The homolog genes of these candidate genes in this study were obtained by the blast analysis using Phytozome data,and to our knowledge,the function of these homolog genes haven’t been published in soybean.Interestingly,this study found thatGlyma.12G208600,close paralog of candidate genesGlyma.13G292600forqPH-b_13,was also detected and located in the QTLqPH-b_12genomic interval.Similarly,Glyma.03G194300,close paralog of candidate genesGlyma.19G194100forqPH-b_19-1,was located in the reported QTLPlantheight 33-1(Kimet al.2012),Plantheight34-1(Kimet al.2012)andPlantheight26-17(Sunet al.2006).The results indicated that homolog gene pairs in these QTL regions may regulate the plant height together.
Segregation distortion,which distorts the frequency of alleles from the expected Mendelian ratio,was observed in many crops (Luet al.2002;Liet al.2010).It is important to identify segregation distortion regions (SDRs)because they may contain vital QTLs/genes for plant breeding (Liet al.2010).For example,15 SDRs were identified from three F2populations in potato,and a genear1underlying the abnormal rooting was cloned from SDRs located in chr01 (Zhanget al.2019).A major QTL underlying resistance to crown rot in barley was identified in SDR on chr3H (Liet al.2009b).The current study found that two QTLs for plant height,qPH-b_19-1andqPH-b_19-2,were mapped in SDR on chr19 (41 983 457-46 342 806 bp),explaining 10.56-15.79% and 4.74% of the phenotypic variance across different environments,respectively.The results indicated that segregation distortion marker was useful for detecting QTL/genes located in SDRs.Gametic and zygotic selections were common genetic factors associated with a distorted segregation ratio in many crops (Luet al.2002;Liuet al.2015;Zhanget al.2019).In the current study,some known sterility loci were also identified in SDRs.For example,st_A06-2/6(Baumbachet al.2012;Spethet al.2015) located in chr01 SDR (737 009-4 953 024 bp),st4(Spethet al.2015) located in chr01 SDR (49 450 001-50 232 424 bp) andst7(Slatteryet al.2011;Spethet al.2015) located in chr02 SDR (46 697 769-48 571 857 bp).However,no reported sterility loci were identified inqPH-b_19-1andqPH-b_19-2regions.Recently,it was also reported that inadvertent selection for stable QTLs were also a potential factor related to segregation distortion in the RIL population of maize (Liuet al.2015).For example,two QTLsqSyn10-FT-17andqSyn10-FT-18related to early flowering were located in SDRs due to inadvertent selection.In the current study,the genotype of stable QTLqPH-b_19-1favored to parent ZP with allele of decreased plant height,indicating that selection for dwarf might be a factor for segregation distortion inqPH-b_19-1.
A genetic map was fundamental to understand the genetic basis of soybean genome and identify genes related to important agronomic traits (Xieet al.2010;Liet al.2014).Insufficiency of molecular markers and the lack of high throughput-genotyping platform limit map-based cloning in soybean (Xuet al.2013).In the current study,4 011 bin markers developed from the whole genome re-sequencing were used to construct high-density genetic map covering 3 139.15 cM,with an average interval of 0.78 cM between adjacent bin markers (Table 1).Compared to the average interval of 5.4 cM between adjacent markers in previous SSR/InDel/single SNP map with the RILs population derived from ZH×ZP (Liuet al.2013),the marker density of bin map was largely increased,and the interval of QTLs was greatly reduced.For example,a major QTL for plant height,qPH-13,identified in other researches within a relatively larger region (Josie and Kassem 2007;Leeet al.2015;Okiet al.2018) was also identified on chr13 by the marker interval Map-2496 and Satt554 (850 kb) using this RILs population,explaining 28.43-39.93% of the phenotypic variance (Liuet al.2013).Based on this high-density bin genetic map,qPH-b_13was reduced to 440.12 kb genomic region including 54 annotated genes according to the soybean reference genome (Wm82.a2.v1,https://phytozome.jgi.doe.gov).Another major QTL for plant height,qPH19-1,was identified on chr19 by Map-3972 and Map-3990(690 kb),explaining 5.14-19.42% of the phenotypic variance for plant height (Liuet al.2013).In the current study,qPH-b_19-1was narrowed down to 237.06 kb interval.In a previous study by Liuet al.(2013),qPH-12was only identified in one environment (SYa09) with 15.24 Mb interval,explaining 7.85% of the phenotypic variance for plant height.In addition,qPH-b_12was identified in two environments (SYa09 and CP10),explaining 6.26 and 3.39% of the phenotypic variance for plant height,respectively (Table 2).The interval ofqPH-b_12was also mapped to 539.14 kb.Similarly,qPH-16was identified on chr16 between Map-3130 and Map-3141 (960 kb) and explained 6.7% of plant height variance in SYa10 (Liuet al.2013).Based on the high-density bin genetic map,qPH-b_16explained 3.34% of the phenotypic variance for plant height within a narrowed 199.50 kb genomic region including 20 annotated genes according to the soybean reference genome (Wm82.a2.v1,https://phytozome.jgi.doe.gov) (Table 2).Three novel QTLs (qPH-b_11,qPH-b_17andqPH-b_18) were also identified and explained 3.01-4.02% of the phenotypic variance for plant height,and the size of these QTLs ranged from 141.14 to 935.78 kb.Of these,qPH-b_17,a novel and stable QTL,was identified in two environments (CP10 and SYi10),explaining 4.02 and 3.31% of the phenotypic variance for plant height,respectively (Table 2).
ZH is a wide-adaptable,high-yielding and semi-dwarf cultivar with a total planting area of more than 100 million acres in China.The genome of“ZH”was assembled using single-molecule real-time sequencing,chromosome conformation capture sequencing and optical mapping data (Shenet al.2018).It is of great significance to elucidate the genetic basis of important agronomic traits in ZH.qPH-13,a major QTL for plant height,was identified on chr13,explaining 28.43-47.60% of the phenotypic variance in a previous study by Liuet al.(2013).The alleleqPH-13in ZH can decrease plant height,which is useful to select and develop dwarf soybean varieties.qPH-13was also identified as a major QTL for plant height in other genetic background with similar genomic region.Josieet al.(2007) identified a major QTLqPLHGon chr13 flanked by the marker Satt554 using 100 RILs derived from Essex×Forrest,explaining 25% of the phenotypic variance for plant height (Leeet al.2015).Plant height 38-2,a major QTL for plant height,was identified on chr13 between markers of BARC-045235-08913 and BARC-038355-10050 (36 064 524-40 719 029 bp,4.65 Mb)using 91 RILs derived from Wyandot×PI567301B,explaining 18.2 and 19.4% of the phenotypic variance for plant height in 2011 and 2012,respectively (Leeet al.2015).qSI13-1for plant height and average inter-node length was also identified on chr13 between Sat_375 and Satt657 (35 072 147-39 735 531 bp,4.66 Mb) using a F2population derived from Fukuyutaka×Y2,explaining 40% of the phenotypic variance for plant height,and the allele of Y2 with short plant height was partially dominant in a set of near-isogenic lines with the only difference ofqSI13-1allele (Okiet al.2018).Although these loci were identified in previous studies,the QTL was not fine mapped,and the genes located in those QTLs were not discovered due to the limited population size and marker density.Based on whole genome re-sequencing for two parents and our RILs population,the current study narrowed down the interval ofqPH-b_13to 440.12 kb(38875989-39316110 bp) including 54 annotated genes according to the soybean reference genome(Wm82.a2.v1,https://phytozome.jgi.doe.gov),which covered smaller region than reported QTLs (Josie and Kassem 2007;Liuet al.2013;Leeet al.2015;Okiet al.2018).Another environmentally stable and major QTL,qPH-b_19-1,was reduced to 237.06 kb region including 28 annotated genes,which covered determinate growth habitDt1locus.The plant height of lines carryingDt1was significantly higher than that of carryingdt1,while the plant height of lines carrying the different alleles ofdt1genotype wasn’t significantly different (Mirandaet al.2020).By comparing the phenotype and genotype of parents inDt1,the result indicated thatqPH-b_19-1carried novel genes for plant height.According to the function of homologous genes inArabidopsis,five genes of them were predicted as candidate genes for further study which could be a valuable resource for gene clone and understanding the genetic basis of plant height.
The current study constructed a high-density genetic map by re-sequencing two soybean lines ZH and ZP,and the derived RILs population.The results demonstrated that this high-density genetic map is efficient and accurate for QTL detection.Based on this high-density genetic map,this study identified nine QTLs including three novel loci for plant height,and the interval sizes of them were greatly reduced as compared with previous studies in which the number of markers or population size was smaller.These QTLs will facilitate the identification of candidate genes and the analysis of genetic basis for plant height.
This research was supported by the National Key R&D Program of China (2016YFD0100201) and the Agricultural Science and Technology Innovation Program (ASTIP) of Chinese Academy of Agricultural Sciences.
The authors declare that they have no conflict of interest.
Appendicesassociated with this paper are available on http://www.ChinaAgriSci.com/V2/En/appendix.htm
Journal of Integrative Agriculture2022年4期