Wenlong Gong, Lin Ma, Qiu Gao, Bao Wei, Jiangui Zhang, Xiqiang Liu, Pan Gong, Zan Wang, Guiqin Zhao
a College of Pratacultural Science, Gansu Agricultural University, Lanzhou 730070, Gansu, China
b College of Grassland Science and Technology, China Agricultural University, Beijing 100193, China
c Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
d National Animal Husbandry Service, Beijing 100125, China
Keywords: Erect milkvetch; Genetic map; Flowering-related traits; QTL mapping; SLAF-seq
ABSTRACT Erect milkvetch (Astragalus adsurgens) is a perennial legume forage crop with economic and ecological value in livestock grazing and soil-erosion control in arid and semiarid areas worldwide. Genomic information and molecular tools to support breeding and research in the species are limited. The objectives of this investigation were to map its genome using DNA markers and to identify quantitative trait loci (QTL) in the species. An F1 mapping population of 250 plants was developed from a cross between two parents with differing flowering-related traits. A high-density genetic linkage map containing 4821 markers on eight linkage groups (LGs) with a total genetic length of 1395 cM and a mean interval of 0.29 cM between adjacent markers was constructed with SLAF-seq technology. Comparative genomic analyses revealed the highest genome sequence similarity (8.71%) between erect milkvetch and Medicago truncatula, followed by Glycine max (7.65%), Cicer arietinum (7.53%), and Lupinus angustifolius (5.21%). A total of 64 significant QTL for flowering-related traits on six LGs were detected, accounting for 9.38% to 19.1% of the associated phenotypic variation. Five and 48 key candidate genes for floret number and inflorescence length, respectively, were identified based on the Glycyrrhiza uralensis genome. These candidate genes were involved in ubiquitination/degradation, pollen development, cell division, cytokinin biosynthetic process, and plant flowering. These findings shed light on the regulation of flowering traits in erect milkvetch and provide genomic resources for future molecular breeding of the crop.
Erect milkvetch (Astragalus adsurgens Pall., 2n = 2x = 16) is a perennial forage species that originated in Eurasia and North America [1]. Compared to its wild relatives, cultivated erect milkvetch has larger leaves, a later flowering habit, and higher biomass [2]. Because of its robust primary roots and large aboveground biomass, erect milkvetch has a strong ability to withstand wind and hold sand. In this manner, this leguminous crop improves the soil, prevents soil erosion, and protects the environment [3].
Erect milkvetch is distributed primarily in arid and semiarid areas of northern China, and generally shows better adaptability to harsh environments than alfalfa (Medicago sativa L.) in these cold and dry areas. Although the crop plays valuable roles in animal husbandry and ecological services, it is difficult to expand its planting area and improve forage yield because of its low seed yield and harvest efficiency caused by wide variation in inflorescence length and flowering time [4]. Basic genomic information and molecular tools for genetic improvement and the development of new cultivars of the species are not available.
High-density genetic maps have been widely used in QTL mapping, marker-assisted selection (MAS), and comparative genomic studies, and also provide essential tools for the analysis of genome structure and evolution in plant species [5,6]. High-density linkage and QTL mapping for key agronomic traits can accelerate germplasm improvement and have been extensively applied to crops and livestock [7-10]. Several DNA marker systems have been used for genetic linkage map construction, including amplified fragment length polymorphism, restriction fragment length polymorphism, random amplified polymorphic DNA, simple sequence repeat, and single-nucleotide polymorphism (SNP) [11]. SNP markers are considered the best choice for high-density map construction because they are the most abundant and are codominant in plant genomes [12,13]. The rapid development of next-generation sequencing technologies has made it possible to construct high-density SNP genetic maps. However, it is still cost-prohibitive to sequence large populations, limiting the discovery of SNP markers via de novo sequencing.
Specific-locus amplified fragment sequencing (SLAF-seq), a modified reduced-representation sequencing technique, has several advantages, including reduced sequencing cost and increased sequencing depth [14]. It can be used in species that lack a reference genome sequence, offering high potential for de novo SNP discovery in herbaceous perennials with a large and highly heterozygous genome. SLAF-seq has been successfully applied to genetic linkage map construction in many plant species [15-25]. Zhang et al. [17] constructed a high-density genetic map for Agropyron Gaertn. with 1023 SNP markers mapped to seven linkage groups (LGs) that spanned 908 cM. Leaf shape is an essential trait in forage breeding. QTL mapping for leaf shape in pea (Pisum sativum L.) was conducted in an F2 population. Two QTL were mapped to LG7 and explained 7.16% and 6.56% of phenotypic variance [18]. Zhang et al. [19] constructed a high-density genetic linkage map of 14 LGs with 1971 markers using SLAF-seq and identified six consistent QTL associated with seed shattering in Siberian wildrye (Elymus sibiricus L.). A genetic map spanning 3781 cM with 8078 SNP markers across 20 chromosomes of soybean (Glycine max (L.) Merr.) was constructed, and 23 QTL associated with drought tolerance were identified that could be used to accelerate the breeding of cultivars for drought tolerance via MAS [20].
Flowering accompanies the transition of plants from vegetative to reproductive growth [26]. In crops, seed maturity and yield are influenced by flowering traits such as flowering time, floret number, and inflorescence length [4]. Harvesting forage crops at the proper stage balances forage quality and yield while maintaining healthy stubble for the following stand, making flowering traits important [27]. The genetic and genomic bases of flowering time have been investigated extensively in major crops, whereas such information is scarce for herbaceous perennials, except for some QTL in orchard grass (Dactylis glomerata L.) [28], switchgrass (Panicum virgatum L.) [29], and alfalfa [27].
QTL mapping of flowering traits is still lacking in erect milkvetch. To develop genomic resources and elucidate the genetic structure of flowering-related traits in this species, we aimed to develop a high-density genetic linkage map, identify flowering-related QTL, and annotate candidate genes. These findings contribute basic genomic knowledge to the information pool of an important forage species and lay the foundation for further genetic improvement of flowering-related traits in erect milkvetch.
A mapping population of 250 F1 erect milkvetch plants was developed from a cross between an erect genotype, 33-15 (CF019650), with late flowering (female parent) and a prostrate genotype, 12-2 (CF020070), with early flowering (male parent) (Fig. S1). The parents were provided by the National Herbage Germplasm Resource Conservation Center of China (Beijing). Accession 12-2 is a wild type from Dongsheng City, Inner Mongolia, China, and 33-15 is a genotype of the accession "Zhongsha 1". Owing to differences in their improvement status and geographical origins, the parents differed in flowering time, floret number, and inflorescence length. They were planted in the autumn of 2015 and crossed in the spring of 2016. The F1 seeds were sown in the autumn of 2016. Young leaves were taken from each F1 plant and the parents in April 2017, immersed in liquid nitrogen, and stored at -80 °C. Genomic DNA was extracted from young leaves using a New Plant Genomic DNA Extraction Kit (Qiagen, Germantown, MD, USA). DNA quality was examined by 1% agarose gel electrophoresis and Multiscan Spectrum (Spectra Max i3, Beijing, China). Template DNA was diluted with ddH2O to a working concentration of 50 ng μL⁻¹ and stored at -20 °C.
The F1 population and the parents were grown in the field at Machikou Town, Changping, Beijing, China (elevation 68 m, mean annual precipitation 550.3 mm, mean annual sunshine duration 2684 h, mean annual temperature 6.7 °C, and frost-free period 200 days) to collect phenotypic data for QTL mapping. In each row, 25 plants were planted, and the row spacing and individual spacing were 1 m. Flowering time (FT) was recorded as the date when 50% of the first inflorescence of each plant bloomed and was converted into the number of days from regreening to flowering time [28]. Inflorescence length (IL) and floret number (FN) were determined from 10 randomly selected intact fresh inflorescences per plant in the flowering period. Inflorescence length was measured with a vernier caliper. Floret number was determined directly on each inflorescence. Inflorescence length and floret number were recorded for all genotypes over two consecutive years (2017 and 2018), whereas FT was evaluated only in 2017.
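The conversion of a recorded flowering date into days from regreening can be sketched as follows. This is a minimal illustration only: the function name and both dates are hypothetical and are not taken from the study's field records.

```python
from datetime import date


def days_to_flowering(regreening: date, flowering: date) -> int:
    """Return flowering time as the number of days from regreening to flowering."""
    if flowering < regreening:
        raise ValueError("flowering date precedes regreening date")
    return (flowering - regreening).days


# Hypothetical dates for a single plant, used only to show the conversion.
ft_days = days_to_flowering(date(2017, 3, 20), date(2017, 6, 28))
print(ft_days)
```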
Construction of a SLAF library was performed as described previously [19]. The Glycine max genome was selected as the reference to predict the enzyme digestion scheme. Enzyme digestion was performed with genomic DNA of the mapping population after the optimum endonuclease combination was determined [30]. A poly-A tail was added to the 3′ end of the digested fragments, followed by ligation of dual-index sequencing adapters, PCR amplification, purification, sample mixing, and selection of target fragments, which were those that passed the quality test for paired-end sequencing on the Illumina HiSeq 4000 system (Illumina, San Diego, CA, USA). The reads of each sample were obtained from raw sequencing data based on the duplex barcodes.
The quantity and quality of the sequencing reads were evaluated after adapter removal. To assess the reliability of the library construction process, Oryza sativa L. japonica cv. Nipponbare was used as a control and underwent the same process of library construction and sequencing. The efficiency of enzyme digestion was evaluated by comparing the matching efficiency of the control data to judge the accuracy and effectiveness of the test process. Library construction and sequencing were performed by Biomarker Technologies, Beijing, China.
Development of SLAF markers was performed as described by Sun et al. [31]. Low-quality reads with a quality score <Q30 were removed. According to the duplex barcodes, the remaining reads were assigned to each mapping individual, and the barcode and the terminal 5-base region of each read were trimmed off to obtain clean reads. All the clean reads were clustered according to sequence similarity, and sequences with more than 90% identity were defined as SLAF loci.
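The idea of grouping reads at more than 90% identity can be illustrated with a deliberately simplified sketch. It assumes equal-length reads and an ungapped, position-by-position identity, and it uses a greedy first-match rule; the actual pipeline of Sun et al. [31] relies on alignment-based similarity, so the function names and the clustering rule here are illustrative assumptions, not the published method.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if not a or len(a) != len(b):
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def cluster_reads(reads, threshold=0.90):
    """Greedy clustering: a read joins the first cluster whose representative
    it matches with identity strictly above the threshold; otherwise it
    starts a new cluster and becomes that cluster's representative."""
    clusters = []  # list of (representative, members)
    for read in reads:
        for representative, members in clusters:
            if identity(read, representative) > threshold:
                members.append(read)
                break
        else:
            clusters.append((read, [read]))
    return [members for _, members in clusters]


# Toy reads: the first two differ at one of 20 positions (95% identity).
toy_reads = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGA",
    "TTTTTGGGGGCCCCCAAAAA",
]
toy_clusters = cluster_reads(toy_reads)
```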
The minor-allele frequency was evaluated for each allele at each SLAF locus. Three types of SLAF markers were obtained according to the number of alleles and the differences between gene sequences: polymorphic (two to four alleles), nonpolymorphic (fewer than two alleles), and repetitive (more than four alleles) [31]. The Genome Analysis Toolkit (GATK) (https://software.broadinstitute.org/gatk/documentation/) was used for SNP calling among parents and F1 plants [32]. Polymorphic SLAF markers were encoded by biallelic coding rules (Table 1). Given that erect milkvetch is a cross-pollinated species and the mapping population in our study was an F1 population, it was necessary to genotype each locus that was heterozygous in the parental genotypes and polymorphic between the parents. SLAF markers with any segregation pattern other than aa × bb were used for genetic map construction. To ensure the quality of the genetic map, polymorphic SLAF markers were removed if they met any of the following criteria: (1) sequencing depth in the parents ≤10; (2) number of SNPs >5; (3) completeness ≤60%; (4) completely homozygous parents; or (5) segregation distortion (chi-square (χ²) test, P < 0.001).
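The segregation-distortion filter can be sketched as a chi-square goodness-of-fit test against the Mendelian ratio expected for each segregation pattern. The sketch below is an assumption-laden illustration: the expected ratios are the standard ones for a cross-pollinated F1 population, the offspring counts are hypothetical, and the function names are invented for this example rather than taken from the software used in the study.

```python
import math

# Standard expected offspring ratios per segregation pattern in a
# cross-pollinated F1 population (assumed here, not quoted from Table 1).
EXPECTED_RATIOS = {
    "lm x ll": {"lm": 1, "ll": 1},
    "nn x np": {"nn": 1, "np": 1},
    "hk x hk": {"hh": 1, "hk": 2, "kk": 1},
    "ef x eg": {"ee": 1, "ef": 1, "eg": 1, "fg": 1},
    "ab x cd": {"ac": 1, "ad": 1, "bc": 1, "bd": 1},
}


def chi_square_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    extra = sum(
        half ** (i - 0.5) / math.gamma(i + 0.5)
        for i in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * extra


def segregation_test(pattern: str, counts: dict) -> tuple:
    """Return (chi-square statistic, P value) for observed genotype counts."""
    ratios = EXPECTED_RATIOS[pattern]
    total = sum(counts.get(g, 0) for g in ratios)
    if total == 0:
        raise ValueError("no genotyped offspring for this marker")
    ratio_sum = sum(ratios.values())
    chi2 = 0.0
    for genotype, weight in ratios.items():
        expected = total * weight / ratio_sum
        chi2 += (counts.get(genotype, 0) - expected) ** 2 / expected
    return chi2, chi_square_sf(chi2, len(ratios) - 1)


# Hypothetical marker: 150 lm and 100 ll offspring among 250 F1 plants.
chi2, p_value = segregation_test("lm x ll", {"lm": 150, "ll": 100})
keep_marker = p_value >= 0.001  # filter threshold stated in the text
```

With these hypothetical counts the marker deviates from 1:1 at the 0.05 level but not at the 0.001 level, so a filter at P < 0.001 would retain it.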
Table 1 The rules for genotyping.
HighMap software [33] was used for genetic linkage map construction. Modified logarithm of odds (MLOD) values were calculated for every two markers based on the single-linkage clustering algorithm and used to assign markers to different LGs. Markers with MLOD values <6 were excluded, and the remaining markers were defined as mapping markers. The LG was used as the unit for constructing the genetic map using the maximum-likelihood method, which ordered markers and corrected genotyping errors within the LGs. The SMOOTH algorithm was used for error correction, the k-nearest neighbor algorithm was used for missing genotype imputation, and the Kosambi mapping function was used for estimating map distances. Finally, the quality of the genetic map was evaluated with a heat map and a haplotype map. Basic Local Alignment Search Tool (BLAST) (https://blast.ncbi.nlm.nih.gov/Blast.cgi) searches were conducted of all the SNP marker sequences in the genetic map of erect milkvetch against the genomes of four closely related species: Cicer arietinum L. (https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Carietinum_er), G. max (https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Gmax), Lupinus angustifolius L. (https://www.ncbi.nlm.nih.gov/genome/?term=Lupinus+angustifolius), and M. truncatula (https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Mtruncatula). The greatest matching length was defined as the final match result of the marker. According to the physical positions of markers in each genome, colinear regions among different genomes were identified with MCScanX software [34].
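For reference, the Kosambi mapping function converts a recombination fraction r into a map distance d (in Morgans) as d = (1/4) ln[(1 + 2r)/(1 − 2r)]. A minimal sketch of the function and its inverse follows; the function names are chosen for this example and do not correspond to HighMap's internals.

```python
import math


def kosambi_cm(r: float) -> float:
    """Map distance in centimorgans for recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse Kosambi function: recombination fraction for a distance in cM."""
    return 0.5 * math.tanh(d_cm / 50.0)


# A recombination fraction of 0.10 corresponds to slightly more than 10 cM,
# because the function corrects for undetected double crossovers.
distance = kosambi_cm(0.10)
```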
QTL mapping was performed with a high-density genetic linkage map and phenotypic data for flowering-related traits. A permutation test was repeated 1000 times to establish a logarithm of odds (LOD) threshold, and a LOD score larger than the 5% cutoff value was used to identify significant QTL [35]. MapQTL 5.0 [36] was used to estimate the phenotypic variation of each significant QTL.
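The logic of a permutation-based LOD threshold can be sketched as below. This is not MapQTL's algorithm: it assumes a simple single-marker model in which the LOD compares per-genotype means with a single overall mean, and all names and the toy data are hypothetical. Phenotypes are shuffled to break any marker-trait association, the genome-wide maximum LOD is recorded for each shuffle, and the threshold is taken as the 95th percentile of those maxima.

```python
import math
import random


def lod_score(genotypes, phenotypes):
    """Single-marker LOD: per-genotype means versus one overall mean."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    groups = {}
    for g, y in zip(genotypes, phenotypes):
        groups.setdefault(g, []).append(y)
    rss_full = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        rss_full += sum((y - group_mean) ** 2 for y in values)
    if rss_null == 0.0:
        return 0.0
    if rss_full == 0.0:
        return math.inf
    return (n / 2.0) * math.log10(rss_null / rss_full)


def permutation_threshold(marker_matrix, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide LOD threshold from the permutation distribution of maxima."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(lod_score(m, shuffled) for m in marker_matrix))
    maxima.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return maxima[index]


# Toy data: 100 plants, 20 markers, with marker 0 affecting the trait.
_rng = random.Random(1)
toy_markers = [[_rng.choice(["lm", "ll"]) for _ in range(100)] for _ in range(20)]
toy_trait = [(2.0 if g == "lm" else 0.0) + _rng.gauss(0, 1) for g in toy_markers[0]]
toy_threshold = permutation_threshold(toy_markers, toy_trait, n_perm=200, seed=2)
```

A marker is then declared significant when its observed LOD exceeds the threshold; in the toy data only the simulated causal marker is expected to do so by a wide margin.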
To identify potential functions of the significant QTL regions for flowering-related traits and annotate candidate genes, BLAST searches were conducted for candidate markers in the significant QTL regions against the Glycyrrhiza uralensis Fisch. genome (http://ngs-dataarchive.psc.riken.jp/Gur-genome/download.pl.) to determine the physical location of the markers in the genome. The Gene Ontology (GO) (http://geneontology.org), Cluster of Orthologous Groups (COG) (https://www.ncbi.nlm.nih.gov/COG/) of proteins, Kyoto Encyclopedia of Genes and Genomes (KEGG) (http://www.genome.jp/kegg), Swiss-Prot (https://www.uniprot.org/), and Non-Redundant Protein Sequence Database (NR) (ftp://ftp.ncbi.nlm.nih.gov/blast/db) databases were used for functional annotation.
A descriptive statistical analysis of phenotypic data was conducted using SPSS software [37]. First, the range of phenotypic variation between parents and F1 plants and among F1 plants and years was characterized. Second, Pearson correlations were calculated to reveal relationships between flowering-related traits. A difference was assigned as statistically significant when P < 0.01. Kurtosis and skewness coefficients were calculated with the psych package in R [38]. A kurtosis coefficient less than 3 indicates that the measured data are relatively nonconcentrated and a skewness coefficient greater than 0 indicates a right skew [39]. Finally, normal distribution was tested with the ggplot2 package in R.
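The skewness, kurtosis, and coefficient of variation used to describe the trait distributions can be computed as sketched below. The sketch uses plain moment-based estimators, with kurtosis on the scale where a normal distribution equals 3; which estimator variant the R call in the study applied is not stated, so this choice is an assumption, and the sample values are hypothetical.

```python
import statistics


def central_moment(values, order):
    """Population central moment of the given order."""
    mean = statistics.fmean(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness; values above 0 indicate a right skew."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Moment-based kurtosis; a normal distribution gives 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.fmean(values)


# Hypothetical measurements: one symmetric sample and one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 10]
```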
After electronic digestion prediction, RsaI + HaeIII were selected as optimal endonucleases. The length and total number of SLAF markers were predicted to be 314-444 bp and 166,383, respectively. The percentages of the paired-end alignments of erect milkvetch and the control Oryza sativa L. japonica cv. Nipponbare digested by RsaI + HaeIII were 91.40% and 87.27%, respectively, indicating the high quality of SLAF library construction. A summary of the sequencing data, including the numbers of reads and bases, the mean percentage of Q30 (quality score of at least 30) bases, and the mean content of guanine and cytosine (GC), is shown in Table 2. A total of 215.51 Gb of raw data containing 1079.19 million reads were generated from all plants, and the mean percentages of Q30 and GC were 94.95% and 41.12%, respectively. A total of 0.18 Gb of raw data containing 915,453 reads were obtained for the control species, with mean percentages of Q30 and GC of 95.84% and 41.86%, respectively (Table 2). The number of reads was 19,117,777 for the male parent (12-2) and 24,175,434 for the female parent (33-15), while the mean number of reads across all F1 individuals was 4,143,599 (Table 2). A total of 482,222 SLAF markers were detected, of which 298,513 and 311,081 were identified in the male and female parents, respectively, whereas the mean number of SLAFs across all F1 plants was 173,987. The mean sequencing depths of the parents and F1 plants were 20.69 and 5.64, respectively (Table 3).
The SLAFs could be divided into three types according to the number of alleles and the differences between allelic sequences. There were 198,621 polymorphic SLAFs with a polymorphic ratio of 41.19%. The numbers of nonpolymorphic and repetitive SLAFs were 278,738 (57.80%) and 4863 (1.01%), respectively (Table S1). After removal of loci with missing parental information, 89,741 polymorphic SLAFs were successfully coded according to the genotype coding rule (Table 1; Fig. 1). The successfully coded polymorphic SLAFs were further grouped into eight segregation patterns (ab × cd, ef × eg, hk × hk, lm × ll, nn × np, aa × bb, ab × cc, and cc × ab), and the percentage of effective polymorphisms based on map construction was 11.87%. Removal of low-quality markers left 4850 mapping markers (Table S2).
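The three-way classification by allele number and the reported category shares can be reproduced with a short sketch. The counts are those given in the text; the function name is hypothetical and the allele-number cutoffs follow the definitions stated in the Methods.

```python
def classify_slaf(n_alleles: int) -> str:
    """Classify a SLAF locus by its number of alleles."""
    if n_alleles < 2:
        return "nonpolymorphic"
    if n_alleles <= 4:
        return "polymorphic"
    return "repetitive"


# SLAF counts per category as reported in the text.
slaf_counts = {
    "polymorphic": 198_621,
    "nonpolymorphic": 278_738,
    "repetitive": 4_863,
}
slaf_total = sum(slaf_counts.values())
slaf_shares = {
    name: round(100 * count / slaf_total, 2) for name, count in slaf_counts.items()
}
print(slaf_total, slaf_shares)
```

The three categories sum to the 482,222 SLAFs detected, and the rounded shares agree with the percentages quoted above.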
Table 2 Summary of sequencing data.
Table 3 Summary of SLAF marker information.
After calculating and filtering the MLOD values of the polymorphic SLAFs, 4821 SLAF markers were mapped, corresponding to a mapping rate of 99.40% (Table 4). A mean of 2.14 SNPs were found in each SLAF marker in the map. For each of the eight LGs, the linear arrangement of the mapped markers and the genetic distances between adjacent markers were estimated. Genetic maps of the two parents were constructed, with that of the male parent spanning 1199 cM with 3006 markers and that of the female parent spanning 1525 cM with 2667 markers (Table S3). The resulting high-quality integrated genetic map spanned 1395 cM with 4821 markers (Fig. S2; Table 4). LG3 contained the most markers (928), and LG8 the fewest (238). The longest map distance was found for LG7 (187 cM) and the shortest for LG6 (151 cM) (Table 4). The mean interval between adjacent markers was 0.29 cM. LG8 had the largest mean interval of 0.69 cM, whereas LG3 and LG6 showed the smallest mean interval of 0.20 cM (Table 4). The ratio of the number of gaps ≤5 cM to the total number of gaps has been used to represent map quality [40]. Gaps ≤5 cM accounted for 99.58% of all gaps, indicating that the distribution of markers on the map was relatively uniform (Table 4).
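The map-quality figures above follow from simple arithmetic on marker positions. The sketch below computes the mean interval, the largest gap, and the share of gaps ≤5 cM for one linkage group, and then checks the whole-map figures from the reported totals. The marker positions are hypothetical, the function name is invented for this example, and dividing total length by marker number is an approximation of the mean interval (the exact denominator would be the number of gaps).

```python
def gap_summary(positions_cm, max_gap_cm=5.0):
    """Summarize intervals between adjacent markers on one linkage group."""
    ordered = sorted(positions_cm)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        raise ValueError("need at least two markers")
    return {
        "mean_interval": sum(gaps) / len(gaps),
        "max_gap": max(gaps),
        "share_small_gaps": 100.0 * sum(g <= max_gap_cm for g in gaps) / len(gaps),
    }


# Hypothetical marker positions (cM) on a single linkage group.
lg_summary = gap_summary([0.0, 1.0, 3.0, 10.0])

# Whole-map figures derived from totals reported in the text.
map_mean_interval = 1395 / 4821      # total length (cM) / mapped markers
map_mapping_rate = 100 * 4821 / 4850  # mapped markers / mapping markers
```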
Fig. 1. Distribution of SLAF markers among eight segregation patterns.
A total of 10,322 SNPs were contained in the eight LGs, of which LG3 harbored the most SNPs (1992) and LG8 the fewest (484) (Table S4). Each SNP is expected to have four different variants with a ratio of transversions to transitions of 1:2 at most of the loci. The ratio of transversions to transitions in the eight LGs averaged between 0.5 and 0.6 (Table S4). Statistics on markers showing segregation distortion in each LG are shown in Table 4. A total of 466 markers showed significant (P < 0.05) segregation distortion, accounting for 9.67% of the total markers mapped. LG2 contained the most markers with segregation distortion (180), whereas LG3 contained no markers with segregation distortion.
Table 4 Description of basic characteristics of the eight linkage groups.
To evaluate the quality of the constructed genetic linkage map, haplotype maps and heat maps were used to assess marker integrity, haplotype source, and linkage relationships. Marker integrity was defined as the proportion of markers with determined genotypes among the total number of markers. The mean marker integrity across all individuals reached 99.99%, indicating genotyping accuracy (Fig. S3). Double exchanges may have been generated by genomic recombination events or genotyping errors caused by sequencing. Haplotype maps were generated for the F1 population and the two parents using 10,322 mapped SNP markers. The source of the largest haplotype segment in the F1 plants was consistent, suggesting negligible influence on genetic map quality. The linkage relationships between adjacent markers on the same LG were extremely strong. With increasing genetic distance, the linkage relationships between markers and between a specific marker and distant markers gradually changed from strong to weak, indicating that SNP markers were well ordered on most LGs.
Comparative genomic analysis was conducted for all SNP marker sequences between erect milkvetch and four closely related species (Fig. 2). A total of 363 (7.53%) matching sequences were identified for erect milkvetch and C. arietinum, and the number of orthologs ranged from 23 (chromosome 8) to 62 (chromosome 7), with a mean of 45.38 per chromosome. There were 369 (7.65%) sequences matching between erect milkvetch and G. max, and the number of orthologs ranged from 10 (chromosome 2) to 34 (chromosome 18), with a mean of 18.45. Erect milkvetch and L. angustifolius shared 251 (5.21%) matching sequences, and the number of orthologs ranged from 6 (chromosomes 7 and 18) to 18 (chromosome 15), with a mean of 12.55. A total of 420 (8.71%) matching sequences were detected for erect milkvetch and M. truncatula, and the number of orthologs ranged from 37 (chromosome 5) to 69 (chromosome 7), with a mean of 52.5. According to the numbers and proportions of sequences that were matched across the four species, the species with the closest genetic relationship to erect milkvetch was M. truncatula, and L. angustifolius was the most distant.
Fig. 2. Colinear relationships between erect milkvetch and four closely related species. (a) Cicer arietinum. (b) Glycine max. (c) Lupinus angustifolius. (d) Medicago truncatula.
Descriptive statistics for the phenotypic data from the mapping population are presented in Table 5. The coefficient of variation (CV) among the three traits ranged from 5.26% (FT2017) to 21.27% (IL2017) (Table 5). The mean CV among all the traits was 17.38%, indicating a large difference among plants. All traits showed transgressive inheritance. All but FN2018 were biased towards the male parent, implying possible male-driven genetic imprinting (Table 5). As shown in Table S5, the correlations between IL and FN in 2017 and 2018 were 0.56 and 0.49, respectively, indicating a significant positive correlation between IL and FN in the same year and suggesting less of an environmental effect on this correlation. There was no significant correlation between FT and IL or FN.
Table 5 Phenotypic variation of flowering-related traits in parents and F1 plants.
The kurtosis coefficients of all traits except IL2017, which was leptokurtic, were less than 3 (Table S5), indicating a larger difference within the F1 population. The three traits all showed right-skewed distributions, with the largest skewness of 2.16 for IL2017 and the smallest skewness of 0.15 for IL2018 (Table S5), indicating that the phenotypic data of the three flowering-related traits were distributed mainly in the middle and lower regions, with a tail toward higher values. As shown in Fig. 3, the fitted normal distribution curves of all traits were approximately symmetric, and the histograms also showed a trend of high frequencies in the middle and low frequencies on both sides. The results were broadly consistent with the kurtosis and skewness statistics, indicating that the flowering-related traits followed essentially a normal distribution.
A total of 64 significant QTL for the three flowering-related traits were detected by QTL mapping, distributed on LG1, LG2, LG4, LG5, LG7, and LG8 (Table 6; Fig. 4). The maximum LODs for significant marker-trait associations ranged from 5.01 to 9.71, and the proportion of phenotypic variation explained (PVE) by markers ranged from 9.38% to 19.1% (Table 6). There were 38 significant QTL detected for IL over two years, of which 12, 17, and nine QTL were detected on LG5, LG7, and LG8, respectively, and their maximum LODs and PVE values ranged from 5.01 to 9.71 and from 10.1% to 19.1%, respectively (Table 6; Fig. 4). QTL intervals associated with A3 and C9 were consistently detected in both years and encompassed 205 and 196 markers, with maximum PVE values of 18% and 17.4%, respectively (Table 6; Fig. 4). The C20 locus showed the maximum PVE (19.1%), while the A3 locus showed the maximum LOD value (9.71) (Table 6; Fig. 4). A total of 19 significant QTL distributed on LG2, LG4, LG7, and LG8 were detected for FN across two years, and their maximum LOD and PVE values ranged from 5.61 to 9.31 and from 11.1% to 19%, respectively. The sizeable overlapping fragments at loci B6 and D1, as well as loci B8 and D3, indicated that the effects of these QTL were relatively stable across different years (Table 6; Fig. 4). A total of seven significant QTL intervals for FT were detected on LG1 and LG2, with the number of markers at each locus ranging from 8 (E7) to 111 (E3). The maximum LOD and PVE values ranged from 6.01 to 8.69 and from 9.38% to 16.22%, respectively (Table 6; Fig. 4).
Fig. 3. Phenotype frequency distributions for flowering time, floret number, and inflorescence length in the F1 population of erect milkvetch.
Significant QTL detected for FN and IL were annotated with the reference G. uralensis genome to reveal genetic pathways for flowering traits in erect milkvetch. A total of 37 candidate genes were identified for the 19 significant QTL detected for FN (Table S6). Five key candidate genes were found by comparing the genes identified across two years (Table S7), of which the Glyur001214s00026368 and Glyur000014s00002548 genes are involved in translation elongation factor activity, while the Glyur000070s00008071 gene is involved in multiple biological processes including superoxide reaction, transcription factor activity, and regulation of various plant growth hormones, suggesting that these genes may be involved in the regulation of FN (Table S7).
Table 6 QTL analysis of flowering-related traits.
A total of 38 QTL were found for the IL trait (Table 6). Annotation identified 127 candidate genes (Table S8). After comparing the genes identified across two years, 48 key candidate genes remained (Table S9), and most genes were involved in ion transmembrane transporter activity, various hydrolysis processes, or transcription factor activity. The Glyur000108s00010437 gene is involved in the brassinosteroid biosynthetic process, and the Glyur000151s00010721 gene is involved in multiple flowering-related biological processes including photomorphogenesis, flower development, and flowering photoperiod (Table S9). Two genes, Glyur001887s00036747 and Glyur001887s00036746, which are involved in flowering and photoperiod, were found at 129.484 cM on LG5. The other two candidate genes identified, Glyur000315s00012762 and Glyur000053s00006470, are involved in photosynthesis regulation and the synthesis of abscisic acid, respectively, and may function in the regulation of IL (Table S9).
Fig. 4. QTL mapping for flowering time, floret number, and inflorescence length in erect milkvetch. Each panel shows the QTL distribution of a target trait. In each panel, the X-axis represents the linkage groups and the Y-axis displays the corresponding LOD value (blue line) and phenotypic contribution (red line).
The self-incompatibility of erect milkvetch is a challenge for constructing a genetic linkage map. The complex natural polyploidy also hinders SNP mining in this species. We performed SNP mining by SLAF-seq and constructed a linkage map in an F1 population by crossing two erect milkvetch accessions with large variations in flowering-related traits. The polymorphic SLAF marker rate in our study was higher than that reported for Juglans regia (31.97%) [41], peanut (2.54%) [24], and Siberian wildrye (26.29%) [19], indicating a wide difference between the two parental genotypes (Table S1). The mean genetic distance between neighbouring markers in our genetic map was lower than those reported for Juglans regia (0.95 cM) [41] and Siberian wildrye (1.66 cM) [19].
The number of SLAF markers on each LG varied from 238 (LG8) to 928 (LG3), and there were two large gaps, one each on LG4 (15.79 cM) and LG6 (14.17 cM) (Table 4). Similar results were observed in Zoysia japonica (four gaps greater than 10 cM) [5] and Jatropha curcas (two gaps greater than 15 cM) [6]. This phenomenon may result from the lack of marker polymorphisms and marker detection in these regions. Marker clustering on LG3 and LG6 may have been due to uneven recombination rates and nonrandom distributions of mapped markers on the linkage map between the mapping parents in these regions.
Segregation distortion is detected when observed genotypic ratios for markers deviate from expected Mendelian ratios [42]. High segregation distortion may affect the accuracy of genetic linkage map construction and QTL mapping [43]. Among the 4821 SLAF markers used for map construction, 9.7% (466) showed segregation distortion, an acceptable value for map construction (Table 4). The results of marker integrity and linkage relationship evaluation indicate the reliability of the constructed genetic linkage map.
Since flowering-related traits greatly influence the seed production and utilization of forage crops, many QTL have been identified for such traits in recent years [44,45]. In this study, Zhongsha 1 and 12-2 were used as the parents because of their different genetic backgrounds and flowering-related traits [2]. We conducted QTL mapping for three flowering-related traits, FT, IL, and FN, and significant QTL for IL and FN detected across two years overlapped over wide intervals, including loci A3 and C9 as well as B6 and D1, suggesting that these QTL are stably expressed.
Comparative genomic analysis showed that, among the four species compared, erect milkvetch was most closely related to M. truncatula. The colinearity between erect milkvetch and other species was much lower than that reported in other research [5], suggesting the possibility of independent chromosome rearrangement events in erect milkvetch and the selected species during evolution [19]. However, sufficient colinearity was observed in some chromosome segments and may be useful for candidate-gene annotation (Fig. 2).
We identified 127 and 37 potential candidate genes associated with IL and FN, respectively (Tables S6, S8). FN is usually closely associated with pollen development and plant male sterility as well as seed yield and quality [46]. Among the 37 candidate genes for FN, Glyur000060s00004758 encodes an SKP1-like protein that is highly expressed in inflorescences and floral primordium and is suggested [47] to be involved in many physiological and biochemical processes, including ubiquitination-degradation, plant flowering, and pollen development regulation. Glyur000353s00015541 encodes a callose synthase that functions in the development of pollen viability, suggesting its potential role in floret development in erect milkvetch [28]. Glyur000006s00001667, a MYB-like transcription factor that is generally accepted as a floret organ-formation regulator [48], was also identified as a candidate gene.
Because IL influences seed yield and stability, elucidating its molecular basis may improve the seed yield potential of erect milkvetch [49]. Only a few genes have been reported to participate in regulating IL in Arabidopsis thaliana [50], petunia (Petunia hybrida L.) [51], soybean [49], rice [52], and maize (Zea mays L.) [53]. In the present study, 127 candidate genes associated with IL in erect milkvetch were identified, of which 48 were identified consistently across two years. Among these genes, Glyur000284s00027072 encodes cytokinin riboside 7 (LOG7), which is involved in the cytokinin biosynthetic process and may affect IL in plants [54]. Two SBP-box transcription factors (Glyur000384s00022282 and Glyur000384s00022283) were also identified as candidate genes. As unique transcription factors in flowering plants, SBP-box proteins can bind to the squamosa promoter and function in floral organ development, and they are considered to be flowering-related genes in tea (Camellia sinensis (L.) O. Ktze.) [55], rice [56], maize [57], and wheat (Triticum aestivum L.) [58]. The diverse candidate genes identified in this study offer opportunities for developing molecular tools to accelerate the development of improved erect milkvetch cultivars.
Availability of data and materials
All the sequence data used in the study have been deposited in the Genome Sequence Archive under ID CRA003932 in Beijing Institute of Genomics (BIG) Data Center, Chinese Academy of Sciences (http://bigd.big.ac.cn/gsa). The other data generated in the study are included in this published article and its additional files.
CRediT authorship contribution statement
Wenlong Gong: Methodology, Investigation, Writing - original draft, Writing - review & editing. Lin Ma: Methodology, Investigation, Data curation, Writing - review & editing. Qiu Gao: Data curation, Writing - review & editing. Bao Wei: Investigation. Jiangui Zhang: Data curation. Xiqiang Liu: Investigation. Pan Gong: Investigation. Zan Wang: Conceptualization, Methodology, Writing - review & editing, Supervision, Project administration, Funding acquisition. Guiqin Zhao: Conceptualization, Methodology, Writing - review & editing, Supervision.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the Chinese Universities Scientific Fund (2021RC001), the China Agriculture Research System (CARS34), and the National Program for Forage Germplasm Conservation (2130135).
Appendix A. Supplementary data
Supplementary data for this article can be found online at https://doi.org/10.1016/j.cj.2022.01.008.