AN Feng ,ZHANG Kang ,ZHANG Ling-kui ,Ll Xing ,CHEN Shu-min ,WANG Hua-senCHENG Feng
1 Collaborative Innovation Center for Efficient and Green Production of Agriculture in Zhejiang Mountainous Areas,College of Horticulture Science,Zhejiang A&F University,Hangzhou 311300,P.R.China
2 Institute of Vegetables and Flowers,Chinese Academy of Agricultural Sciences,Beijing 100081,P.R.China
Abstract DNA methylation plays an important role in plant growth and development,and in regulating the activity of transposable elements (TEs). Research on DNA methylation-related (DMR) genes has been reported in Arabidopsis,but little research on DMR genes has been reported in Brassica rapa and Brassica oleracea,the genomes of which exhibit significant differences in TE content. In this study,we identified 78 and 77 DMR genes in Brassica rapa and Brassica oleracea,respectively. Detailed analysis revealed that the numbers of DMR genes in different DMR pathways varied in B.rapa and B.oleracea. The evolutionary selection pressure of DMR genes in B.rapa and B.oleracea was compared,and the DMR genes showed differential evolution between these two species. The nucleotide diversity (π) and selective sweep (Tajima’s D) revealed footprints of selection in the B.rapa and B.oleracea populations. Transcriptome analysis showed that most DMR genes exhibited similar expression characteristics in B.rapa and B.oleracea. This study dissects the evolutionary differences and genetic variations of the DMR genes in B.rapa and B.oleracea,and will provide valuable resources for future research on the divergent evolution of DNA methylation between B.rapa and B.oleracea.
Keywords:DNA methylation,Brassica rapa,Brassica oleracea,evolutionary selection,genetic variation,gene expression
DNA methylation is an epigenetic modification that controls multiple processes,such as gene expression,stress responses,and the expression of transposable elements(TEs) (Gallego-Bartolomé 2020). Many studies have demonstrated that the manipulation of DNA methylation provides an alternative method for crop improvement,thereby serving as an important target for manipulation in crop improvement (Springer 2013;Bloomfieldet al.2014;King 2015). DNA methylation is divided into three processes,includingde novomethylation,maintenance of methylation,and active DNA demethylation,and these three processes are catalyzed by various enzymes (Zhanget al.2018). In plants,de novomethylation refers to the RNA-directed DNA methylation (RdDM) pathway,which is employed by plant-specific RNA polymerases (Zhanget al.2018). The RdDM pathway relies on a specialized transcription machinery including RNA polymerase IV (Pol IV) and Pol V (Herret al.2005;Haag and Pikaard 2011;Peiet al.2019). The single-stranded RNAs,which are transcribed from RdDM target loci by Pol IV,are converted into double-stranded RNAs (dsRNAs) by RNA-dependent RNA polymerase 2 (RDR2) (Onoderaet al.2005;Haag and Pikaard 2011;Blevinset al.2015;Grimanelli and Ingouff 2020). Dicer-like protein 2 (DCL2),DCL3,andDCL4cleave these dsRNAs to yield small-interfering RNAs (siRNAs) (Xieet al.2004). Subsequently,the siRNAs are loaded onto Argonaute 4 (AGO4) orAGO6,and pair with scaffold RNAs transcribed by Pol V,recruiting domains rearranged methylase 2 (DRM2),which catalyzesde novoDNA methylation (Zilbermanet al.2003;Yeet al.2012;Zhonget al.2014;Zhanget al.2018). The mechanisms for the maintenance of DNA methylation (MDM) depend on the cytosine sequence context (CG,CHG,or CHH,H=T,C,A),and are catalyzed by different DNA methyltransferases (Cokuset al.2008;Chodavarapuet al.2010;Zhanget al.2018). In brief,CG cytosine methylation is maintained by methyltransferase 1(MET1). CHG cytosine methylation is maintained by chromomethylase 3 (CMT3) andCMT2(Lindrothet al.2001;Stroudet al.2014). CHH cytosine methylation is maintained byCMT2andDRM2(Jeddelohet al.1999;Zilbermanet al.2003). There are two forms of DNA demethylation,the passive DNA demethylation (PDD) and the active DNA demethylation (ADD). PDD depends on DNA replication (Zhu 2009;Liuet al.2018),while ADD is mediated by a family of DNA glycosylases including Demeter (DME),Demeter-like 2 (DML2),andDML3,as well as repressor of silencing 1 (ROS1) (Choiet al.2002;Pentermanet al.2007;Ortega-Galisteoet al.2008). In plants,ADD is a complex process,and the ADD pathway includes the participation of multiple protein complexes and transcription factors (Duanet al.2017;Frostet al.2018).
The genusBrassicabelongs to the plant family Brassicaceae (Warwicket al.2006;Chenget al.2017),and includes many important horticultural species.Among them,BrassicarapaandBrassicaoleraceaare two important representative vegetable crops. Recent research has proven thatBrassicaspecies have undergone additional whole-genome triplication (WGT)events since they diverged fromArabidopsis thaliana,followed by extensive gene loss and the divergence of three subgenomes (Wanget al.2011). Extensive gene loss resulted in various subgenome-dominant phenomena inB.rapaandB.oleracea,such as gene loss bias and expression dominance between homoeologous genes from different subgenomes (Zhanget al.2019). Among the three subgenomes,the dominant subgenome is referred to as the least fractionated subgenome (LF),while the other two recessive subgenomes are referred to as the medium fractionated subgenome (MF1) and the most fractionated subgenome (MF2) (Wanget al.2011;Chenget al.2012a;Liu S Yet al.2014). Recently,the impact of DNA methylation on polyploid genome evolution has been documented inB.rapaandB.oleracea(Chenet al.2015;Groveret al.2018). It was also found that variation in DNA methylation broadly exists among different plant species (Niederhuthet al.2016). Wholegenome DNA methylation,and small RNA and gene expression were systematically analyzed inB.rapa(Chenet al.2015;Takahashiet al.2018). The results showed that the genic regions of single-copy genes had significantly higher methylation than multi-copy genes.Generally,the level of gene expression was negatively correlated with the mean gene methylation content and depended on the copy number or subgenomes inB.rapa(Takahashiet al.2018). In addition,the differences in TEs betweenB.rapaandB.oleraceaare notable (Wanget al.2011;Zhaoet al.2013;Liu S Yet al.2014),and TEs are closely related to DNA methylation (Chenget al.2016a). These findings indicate the importance of studying the divergent evolution of DNA methylationrelated (DMR) genes that underlie the differences in TEs and their methylation betweenB.rapaandB.oleracea.However,systematic identification and characterization of DMR genes inB.rapaandB.oleraceaare still lacking.In recent years,with the rapid development of genomics,high-throughput sequencing technology has provided a large quantity of nucleic acid data for bothB.rapaandB.oleracea,which is very convenient for the study of DMR genes inB.rapaandB.oleracea.
In this study,we identified the DMR genes by comparative genomic analysis ofA.thaliana,B.rapa,andB.oleracea,and analyzed the retention and evolutionary differences in the DMR genes after WGT betweenB.rapaandB.oleracea. We compared the evolutionary selection pressures and expression differences of DMR genes betweenB.rapaandB.oleracea. In addition,we estimated the population diversity of DMR genes by calculating nucleotide diversity (π) and Tajima’s D in 199B.rapaand 119B.oleraceaaccessions. This study provides valuable resources for future research on the divergent evolution of DNA methylation between the genomes ofB.rapaandB.oleracea.
The sequences ofA.thalianagenes involved in the DMR pathways were retrieved from the TAIR database (https://www.arabidopsis.org/). TheB.rapa(v3.1) andB.oleraceagenomes (v2.0) were downloaded from BRAD (http://brassicadb.cn/) (Chenget al.2011). These genomes and gene datasets were used to identify the DMR genes inB.rapaandB.oleracea. We used two methods (BLASTP and SynOrths) to identify the DMR proteins inB.rapaandB.oleracea. In the BLASTP v2.10.0 Software (NCBIBLAST package,ftp://ftp.ncbi.nih.gov/blast/),we used the following parameters:E-value of ≤10?10,identity of ≥0.65,and coverage of ≥0.70. In the SynOrths v1.5 Software(Chenget al.2012b),we used the default parameters.
DMR genes were mapped to theA.thaliana,B.rapa,andB.oleraceachromosomes with MapChart v2.32 (Voorrips 2002) according to their positions in the chromosomes.SynOrths v1.5 was employed to analyze the synteny for DMR genes amongA.thaliana,B.rapa,andB.oleracea,and Circos v0.69 (Krzywinskiet al.2009) was used to draw the synteny relationships of the DMR genes. The domains in DMR protein sequences were predicted by Pfam (http://pfam.xfam.org/) with E-value ≤10?3.
DMR gene pairs were aligned by codons in MAFFT v7.467 Software (Katoh and Standley 2013). Non-synonymous(Ka) and synonymous (Ks) substitution rates andKa/Ksratios were determined using theKaKs_Calculator v2.0 program (Wanget al.2010). The frequency distribution of these data was visualized by a violin plot using the R package vioplot (https://github.com/TomKellyGenetics/vioplot). The significance was tested using theF-test.
By using the resequencing data from the 199B.rapaand 119B.oleraceaaccessions (Chenget al.2016b),genetic diversity measures were calculated. The π and Tajima’s D index values were estimated for the whole genome using the VCFtools v0.1.16 Software (Daneceket al.2011). The sliding window of VCFtools v0.1.16 was the middle distance between each gene and its two upper and lower genes. The π and Tajima’s D index of the DMR genes inB.rapaandB.oleraceawere then estimated by the same method. The frequency distribution of this data was visualized by a violin plot using the R package vioplot (https://github.com/TomKellyGenetics/vioplot). The significance was tested using theF-test.
The transcriptional patterns of the DMR genes in multiple tissues were obtainedviatranscriptional sequencing data from the DDBJ/EMBL/GenBank Sequence Read Archive(SRA) databases (codes GSE43245 and GSE42891) for bothB.rapaandB.oleracea,including root,stem,leaf,flower,and silique tissues. The clean data were mapped onto theB.rapaandB.oleraceagenomes using Hisat2 v2.1.0 (Kimet al.2019) with default parameters. The mapped output was processedviafeatureCounts v2.0.0 (Liaoet al.2014)and a Python script to obtain transcripts per million (TPM) for all the DMR genes in each sample. The TPM values of the DMR genes were transformed by log2(TPM+1). Finally,heat maps of the DMR genes were generated using pheatmap in the R package (pheatmap,https://cran.r-project.org/web/packages/pheatmap/index.html).
We systematically collected the genes involved in DMR that have been reported previously (Deleriset al.2016;Liu and Lang 2020). In total,50 genes inArabidopsiscorresponding to the three main pathways for DNA methylation were obtained,with the functional information listed in Appendix A. The resultant gene numbers in the RdDM,MDM,and ADD categories were 17,8,and 25,respectively.
We identified the orthologs (homeologs) of these 50A.thalianagenes in theB.rapaandB.oleraceagenomes using two methods (Guoet al.2014). The tool SynOrths was applied to determine the syntenic orthologs,while BLASTP was performed to detect those orthologs without synteny relationships (see Section 2.1). Ultimately,78 and 77 DMR genes were identified inB.rapaandB.oleracea,respectively (Table 1;Appendices B and C). We then named these genes according to their subgenome locations. For example,the syntenic orthologs of geneAbcwere located in subgenomes LF,MF1,and MF2,so these syntenic orthologs were named “Abc.1”,“Abc.2”,and “Abc.3”,respectively. If two syntenic genes were located in one subgenome LF,we further distinguished them as “Abc.1.a” and “Abc.1.b”. The non-syntenicorthologs were named after the numbering of these syntenic genes,such as “Abc.4”. Following these rules,a comprehensive annotation for these DMR genes was generated (Appendices B and C). With the subgenome information,we found that 34,12,and 18 DMR genes were located in subgenomes LF,MF1,and MF2 inB.rapa,respectively;while 32,15,and 21 DMR genes were located in subgenomes LF,MF1,and MF2,respectively,inB.oleracea. These data show that more syntenic orthologs are located in subgenome LF than in either subgenomes MF1 or MF2,which is consistent with the fact that genes in LF are generally over-retained inBrassicaspecies(Wanget al.2011;Chenget al.2018).
Table 1 The numbers of DNA methylation-related (DMR) genes in Brassica rapa and Brassica oleracea
To understand the distributions of DMR genes in theA.thaliana,B.rapa,andB.oleraceagenomes,we mapped the identified DMR genes onto theA.thaliana,B.rapa,andB.oleraceachromosomes.We drew a map of each DMR gene on theA.thaliana,B.rapa,andB.oleraceachromosomes. The resulting maps showed that most of the DMR genes were located in syntenic regions. Compared withA.thaliana,the uneven distribution of these genes inBrassicawas probably due to the extensive subgenome rearrangements (Appendix D). InA.thaliana,the RdDM and ADD genes are mainly distributed on chromosome 3,while MDM genes are mainly distributed on chromosome 5. InB.rapa,most of the DMR genes are localized on the first six chromosomes,and only three of the DMR genes are located on chromosome 8. InB.oleracea,none of the DMR genes were found on chromosome 6.
We also performed synteny analysis with the DMR genes amongA.thaliana,B.rapa,andB.oleracea.The syntenic relationships of these genes are listed in Appendix E. A total of 24 chromosomes(A.thaliana:5,B.rapa:10,andB.oleracea:9) with a total of 134 DMR genes (A.thaliana:43,B.rapa:64,andB.oleracea:69) were used to map the synteny relationships (Fig.1). The results showed that approximately 56% of DMR genes showed “oneto-one synteny” relationships betweenA.thalianaandB.rapa(B.oleracea),while approximately 44%of the DMR genes showed “one-to-many synteny”relationships,which revealed that some DMR genes were retained while some DMR genes were lost after a WGT event (Fig.1).
Fig.1 Syntenic analysis of DNA methylation-related (DMR) genes. Lines link the synteny genes from the Arabidopsis thaliana,Brassica rapa,and Brassica oleracea genomes. ChrX represents an A.thaliana chromosome,A0X represents a B.rapa chromosome,and C0X represents a B.oleracea chromosome. Blue lines represent the RNA-directed DNA methylation (RdDM)pathway,green lines represent the maintenances of DNA methylation (MDM) pathway,and orange lines represent the active DNA demethylation (ADD) pathway.
BrassicarapaandB.oleraceashowed differential retention of genes related to DNA methylation regulation.These syntenic orthologs to theA.thalianagenes were generated through a WGT event and retained through extensive homoeologous gene loss. Because of neofunctionalization,subfunctionalization,and nonfunctionalization (Lynchet al.2001),differences exist in the copy numbers of the genes.
We wondered whether there are differences in the functional domains of the DMR genes besides the presence-absence variation (PAV). Considering the importance of these DMR genes in the pathway and copy number variations,AGOsandMETswere selected as examples to illustrate the differences. The analysis revealed that five AGO4 proteins possessed six types of domains,while three AGO6 proteins possessed five types of domains (Fig.2-A). Compared with other AGO4s,the BrAGO4.3 protein lost all types of domains inB.rapa,while the BoAGO4.3 protein was lost directly inB.oleracea. The BoAGO4.4 protein lost the fifth domain,ArgoMid domain,which is a site-specific DNA-guided endoribonuclease. However,there is noBrAGO4.4gene inB.rapa. Similarly,compared with other AGO6s,BrAGO6.4 and BrAGO6.5 only retained the Piwi domain,and the other four domains were lost. Among the MET proteins,compared to the other MET1 proteins,BrMET1.1 and BoMET1.1 had one less copy of the dnMT1-RFD domain,which is part of a cytosine-specific DNA methyltransferase enzyme. In addition,there are noBrMET1.5inB.rapa(Fig.2-B). In summary,many DMR genes inB.rapaandB.oleraceaalso had partial or total loss of their protein domains,such asROSs,CLSY1,andPIE1(Appendix F). These results indicated that DMR genes had undergone functional differentiation during evolution betweenB.rapaandB.oleracea.
Fig.2 Conserved domains of argonaute (AGO) (A) and methyltransferase (MET) (B) proteins. Amino acid sequences are indicated by black lines,and a colored box represents a domain. The stars represent Arabidopsis thaliana genes,the triangles represent Brassica rapa genes,and the circles represent Brassica oleracea genes.
We counted the numbers of these single-copy and multiple-copy DMR genes and compared them to the numbers of whole genome-wide single-copy and multiplecopy paralogous genes (background),respectively,inB.rapa(1.77;14 531/8 191) andB.oleracea(3.59;33 251/9 261) (Table 2). The results show that the genes of the RdDM (4;12/3) and MDM (6;6/1) have a higher proportion than the background inB.rapa,indicating that these genes are more fragmented,but the ADD genes (1;10/10) are preferentially retained inB.rapa. In addition,the DMR genes have a lower proportion than the background inB.oleracea,suggesting that the DMR genes inB.oleraceaare preferentially retained (Table 2).These results reveal that the DMR genes show differential retention betweenB.rapaandB.oleracea.
Table 2 Numbers and ratios of single copy to multiple copy paralogs of Arabidopsis DNA methylation-related (DMR) genes in Brassica rapa and Brassica oleracea
At the same time,the DMR genes were described based on the differences in DMR genes inB.rapaandB.oleracea. Only a limited number of genes,such asAGO6,CLSY1,RDR6,PIE1,andROS3,were identified as multiple copies in one species and as single copy in another (Table 1). Most genes were identified as multiple copies both inB.rapaandB.oleracea. In addition,some genes were identified as single copy both inB.rapaandB.oleracea(Table 1).
In addition,we further explored the particular class of DMR genes in detail because these DMR genes have no copies inB.rapaorB.oleracea(Table 1;Appendice B and C). These genes can be further divided into those with the loss of all copies in bothB.rapaandB.oleracea(RRP6L1,ARP6,andDML2),and loss of all copies in eitherB.rapaorB.oleracea(SUVH9,CMT3,andDME). These genes play important roles inArabidopsis.The geneRRP6L1,which belongs to the RdDM group,controls genome-wide DNA methylation. The absence ofRRP6L1will cause the release of non-coding RNAs from the chromatin,resulting in DNA hypomethylation(Zhang and Zhu 2014). The geneARP6,component of the SWR1 chromatin-remodeling complex,belongs to the ADD group and can inhibit DNA hypermethylation and gene silencing (Nieet al.2019). The geneDML2,which belongs to the ADD group,inhibits DNA hypermethylation at specific genomic regions.DML2,one of the families of DNA glycogenesis,plays an extremely important role in DNA demethylation (Zhu 2009) (Appendix A). The results here showed that no orthologs of these genes were found inB.rapaorB.oleracea,which supported a specific evolution of DNA methylation inBrassicaspecies (Table 1;Appendices B and C).
The geneCMT3,which belongs to the MDM group,maintains CHG cytosine methylation (Barteeet al.2001).The results showed that no orthologs ofCMT3were found inB.oleracea,and only one copy ofCMT3was found inB.rapa(Table 1). Similarly,the geneSUVH9,which belongs to the RdDM group,can bind to methylated DNA at RdDM loci and recruit Pol V to produce noncoding RNAs (Liu Z Wet al.2014). We found no orthologs ofSUVH9inB.rapa,and only one copy ofSUVH9was found inB.oleracea. The geneDMEis from the family of DNA glycosylases,which belongs to the ADD group(Zhanget al.2019). There were no orthologs ofDMEfound inB.rapa,but one copy ofDMEwas found inB.oleracea. These results suggested differences in the evolution of DNA methylation inB.oleraceacompared to that inB.rapa.
To obtain a more thorough understanding of the evolutionary constraints that act on the DMR genes inA.thaliana,B.rapa,andB.oleracea,theKa,Ks,andKa/Ksratios were calculated for the DMR genes inA.thaliana-B.rapaandA.thaliana-B.oleraceapairs (Appendices G and H).
TheKavalues ranged from 0.01761 to 0.9154 with an average of 0.1738 inA.thaliana-B.rapapairs,and from 0.0180 to 1.1523 with an average of 0.1823 inA.thaliana-B.oleraceapairs. Studies have shown that MDM genes are more susceptible to natural selection. This was specifically reflected in the fact that inA.thaliana-B.rapapairs,the MDM pathway displayed the highest meanKaratio (0.2276),while inA.thaliana-B.oleraceapairs,the MDM pathway also displayed the highest meanKaratio(0.2764) (Fig.3-A). Similarly,theKa/Ksvalues ranged from 0.0573 to 0.5700 with an average of 0.2591 inA.thaliana-B.rapapairs,and from 0.0489 to 0.5689 with an average of 0.2550 inA.thaliana-B.oleraceapairs. InA.thaliana-B.rapaandA.thaliana-B.oleraceapairs,the MDM pathway displayed the highest meanKa/Ksratio.Nevertheless,there were some genes with higherKa/Ksvalues in the ADD pathways,including their respective copies,such asMBD7andROS3(Fig.3-B). In addition,theKa/Ksratios of RdDM genes exhibited different value ranges betweenB.rapaandB.oleracea,and the same was true of the MDM genes. TheKa/Ksratios of DMR genes in the same subgenomes showed different value ranges (Appendix I),such as in MF2,and the averageKa/Ksratios of DMR genes inA.thaliana-B.oleraceapairs were higher than inA.thaliana-B.rapapairs.
We investigated the genetic diversity of these DMR genes from the population genetic level in bothB.rapaandB.oleracea. Genes under different strengths of selection will show variations in genetic diversity,which can be observed on the population level of the corresponding genome. Previously,two different datasets forB.rapaandB.oleraceawere released (Chenget al.2016b),providing a good basis for estimating the selection pressure on these DMR genes in the two crop populations ofBrassica. Using the two datasets,we calculated π and selection sweep (Tajima’s D) of these individual DMR genes inB.rapaandB.oleracea.
The π values ranged from 0.000075 to 0.014117 with an average of 0.004558 inB.rapa,and from 0.000033 to 0.018084 with an average of 0.004352 inB.oleracea(Appendix J). In the same species,some different categories of DMR genes showed similar trends in nucleotide diversity. For example,the RdDM and MDM genes showed similar nucleotide diversity inB.rapa,while the MDM and ADD genes showed similar nucleotide diversity inB.oleracea. Furthermore,different categories of DMR genes showed different nucleotide diversities betweenB.rapaandB.oleracea. In general,this difference was mainly reflected in the different pathways (Fig.3-C),as well as different subgenomes(Appendix K). Among them,the RdDM genes and ADD genes inB.rapahad higher nucleotide diversity than inB.oleracea(Fig.3-C). Among the ADD genes,BrZDP.3andBoMFP2.2had the highest nucleotide diversity,whileBrACX4.2andBoMBD9.2had the lowest genetic diversity inB.rapaandB.oleracea,respectively (Appendix H).Similarly,in LF and MF2,the DMR genes had a higher nucleotide diversity inB.rapathan inB.oleracea(Appendix K). In addition,although the nucleotide diversity of MDM genes inB.rapawas lower than that ofB.oleracea,overall,the nucleotide diversity of DMR genes in most different categories ofB.rapawas higher than that ofB.oleracea(Fig.3-C;Appendix K).
At the same time,Tajima’s D values showed the conservation of DMR genes inB.rapaandB.oleracea(Fig.3-D). One of the DMR genes,BrNPX1.1a,underwent negative selection in all DMR genes,with a Tajima’s D value of ?0.1189. Other than this gene,all the others underwent positive selection (Appendix L). Furthermore,the different categories of DMR genes showed different trends in conservation betweenB.rapaandB.oleracea(Fig.3-D;Appendix K). More specifically,the RdDM,MDM,and ADD genes inB.rapahad higher Tajima’s D values than inB.oleracea,and in the LF and MF1 subgenomes,and the Tajima’s D value for DMR genes inB.rapawas higher than inB.oleracea(Appendix K).
Fig.3 The frequency distribution of Ka,Ka/Ks,π,and Tajima’s D. RdDM,MDM,and ADD represent the three different pathways of the DNA methylation-related (DMR) genes;ALL represents all genes of this species. Significance is indicated for P-values<0.001 (**).
To explore the expression characteristics of DMR genes in different tissues ofB.rapaandB.oleracea,we calculated the expression of DMR genes in five tissues,including roots,stems,leaves,flowers,and siliques (see Section 2.5). Overall,most of the DMR genes in different pathways showed similar expression characteristics betweenB.rapaandB.oleracea. However,some DRM genes are highly expressed in one species and either only minimally expressed,unexpressed,or absent in another.In the RdDM pathway,some DMR genes inB.oleraceawere significantly higher than inB.rapa,such asRDM1.1andSUVH9.4(Fig.4-A);while in the MDM pathway,some DMR genes inB.rapawere higher than inB.oleracea,such asDDM1.3a,MET1.4,andSUVH4.2(Fig.4-B).Notably,seven DMR genes were almost not expressed at all in any of the tissues inB.rapaandB.oleracea(Fig.4).Similarly,DMR gene members inB.rapaandB.oleraceahad diverse expression modes across different tissues(Fig.4). InB.rapa,some DMR genes in the RdDM and MDM pathways had higher expression levels in stems and flowers,while inB.oleracea,some DMR genes in the RdDM and MDM pathways had higher expression levels in leaf and silique tissues (Fig.4-A and B).
Fig.4 Expression profiles for DNA methylation-related (DMR) genes in different tissues. A,B,and C show the RNA-directed DNA methylation (RdDM),maintenances of DNA methylation (MDM) and active DNA demethylation (ADD) pathways,respectively. Colors correspond to log2(TPM+1) values;the deeper the red,the higher the expression,while gray represents the absence of the gene.TPM,transcripts per million.
DNA methylation is an important epigenetic marking mechanism that regulates many biological processes in plants,and is of special importance for genome evolution in polyploids (Baulcombe and Dean 2014;Diezet al.2014;Song and Chen 2015;Elhamamsy 2017;Itabashiet al.2018).
The WGD-derived multi-copy genes suffer from extensive gene loss,which is substantially associated with DNA methylation. Previous studies have suggested that DNA methylation levels are biased toward the more fractionated subgenomes,and also vary in the genes with different copy numbers (Diezet al.2014;Chenet al.2015). Chenet al.(2015) found that the LF subgenome exhibited the lowest DNA methylation levels,whereas the single-copy genes are saddled with higher methylation than multi-copy genes through the DNA methylation profile inB.rapa. These results indicated the potential role of the DNA methylation system in the evolution of the plant genomes that survived polyploidization,especially for those with subgenome dominance,such asB.rapaandB.oleracea. However,it is difficult to directly study the role of DMR genes in the formation of subgenome dominance,since these DMR genes and the subgenome dominance both went through longterm evolution after polyploidization. As for the mesohexaploidsB.rapaandB.oleracea,it was found thatB.oleraceacontained more TEs thanB.rapa,and thatB.oleraceaconsistently exhibited higher DNA methylation(CpG) thanB.rapa,implying differences in the strength of selection and the potential divergence in their DMR genes (Parkinet al.2014;Chenet al.2015). Therefore,in addition to the characterization of the genomic DNA methylation patterns in these species shown previously,the systematic identification of DMR genes,followed by a comparative genomics and evolutionary analysis of these genes betweenB.rapaandB.oleracea,would provide new clues for explaining their differences in terms of DNA methylation and TE contents,as well as the formation of subgenome dominance. These genes and related information are valuable resources for future gene function studies regarding the differential DNA methylation levels between the genomes or subgenomes ofBrassica.
In this study,a total of 78 and 77 DMR genes were identified inB.rapaandB.oleracea,respectively.Both species contain significantly more DMR genes thanA.thaliana,and almost all these DMR genes are located in the syntenic regions compared withA.thaliana(Fig.1;Appendix E). These findings indicate that theBrassica-specific WGT event played an important role in promoting the DMR gene expansion in these twoBrassicaspecies,similar to the results of a previous study focusing on leaf adaxial-abaxial patterning genes(Lianget al.2016). The gene loss following the WGT also greatly influenced the scenario of DMR genes,as evidenced by the fact that only seven DMR genes were retained as triplets inB.rapaandB.oleracea(Table 1;Appendices B and C). While the total number of identified DMR genes inB.rapaandB.oleraceawas similar,the gene copy numbers in different DMR pathways varied due to the biased DMR gene loss(Table 1). Furthermore,although most DMR genes were ubiquitously expressed in the examined tissues,some DMR genes in the RdDM pathways showed higher expression inB.oleraceathan inB.rapa(Fig.4).Given the critical role of DNA methylation in protecting the plant genomes from the impairment caused by the deleterious amplification of TEs,coupled with the fact thatB.oleraceacontains substantially more TEs thanB.rapa(Vitte and Panaud 2005;Hawkinset al.2006;Chenget al.2016a;Denizet al.2019),our results prompted us to hypothesize that biased DMR gene retention and DMR gene expressions,especially in the RdDM pathways,play critical roles in modulating the different TE contents in these two species.
During long-term evolution,the DMR genes may experience different fates,such as nonfunctionalization,subfunctionalization,and neofunctionalization (Lynch and Conery 2000). Gene structural analysis provides a promising approach for investigating the evolutionary history of genes (Zhouet al.2020;Mulekeet al.2021). Here,we found that some DMR genes such asBrAGO4.3andBoHDP1.2lost key functional domains(Fig.2;Appendix F),and these DMR genes had either no expression or relatively weak expression levels in different tissues (Fig.4),suggesting that they might have experienced nonfunctionalization after they diverged from the common ancestor withA.thaliana. In contrast,some DMR genes,includingBoROS1.1andBoDRM2.3,were found to have new functional domains,indicating that these genes likely underwent neofunctionalization (Kimet al.2014). In addition to the functional domain variation,the values of nucleotide diversity and selection sweep analysis revealed that the DMR genes showed variations in genetic diversity and were under different strengths of selection inB.rapaandB.oleracea,providing a new perspective for understanding the mechanism regulating the differences in methylation levels betweenB.rapaandB.oleracea(Fig.3).
This study systematically identified theB.rapaandB.oleraceaDMR genes at the whole-genome level. A total of 78 DMR genes inB.rapaand 77 DMR genes inB.oleraceawere identified. Functional and evolutionary analyses showed differential evolution of DMR genes betweenB.rapaandB.oleracea,and also demonstrated that the DMR genes inB.oleraceaare more conserved than those inB.rapa. The transcriptomic data revealed that most DMR genes show similar expression patterns inB.rapaandB.oleracea. These results enhance our understanding of the divergent evolution of DNA methylation betweenB.rapaandB.oleracea.
Acknowledgements
The work was supported by the National Natural Science Foundation of China (NSFC;31872105 and 31801862),the Science and Technology Innovation Program of the Chinese Academy of Agricultural Sciences,and the Key Laboratory of Biology and Genetic Improvement of Horticultural Crops,Ministry of Agriculture and Rural Affairs,China.
Declaration of competing interest
The authors declare that they have no conflict of interest.
Appendicesassociated with this paper are available on http://www.ChinaAgriSci.com/V2/En/appendix.htm
Journal of Integrative Agriculture2022年6期