Junxia SONG, Wanting WU, Hongbing QI*
1. School of Geographic Sciences, Lingnan Normal University, Zhanjiang 524048, China; 2. Life Science & Technology School, Lingnan Normal University, Zhanjiang 524048, China
Abstract [Objectives] This paper aimed to analyze and predict the structure and characteristics of the xylanase gene xynMF13-GH10 and its encoded protein. [Methods] The xynMF13-GH10 gene was analyzed using NCBI tools and various analysis tools on the ExPASy website, as well as SignalP 5.0, DNAMAN, TMHMM, SOPMA and SWISS-MODEL, and the structural characteristics and functions of the encoded protein were predicted. [Results] The gene is 1 332 bp in length, the coding region is 1-1 332 bp, and the gene encodes 443 amino acids. The xynMF13-GH10 gene has high homology with xylanase in many species, and the highest homology with Paraphaeosphaeria sporulosa endoxylanase-like protein, with an identity of 78.91% and an e-value of 2e-159. The secondary structure consists of 48.31% random coil, 30.25% α-helix, 16.70% extended strand and 4.74% β-turn. [Conclusions] The results provide a reference for revealing the physiological function and expression regulation mechanism of the xynMF13-GH10 gene in the future.
Key words Xylanase, Mangrove, Xylanase gene, Bioinformatics
Xylan is the main chemical component of plant hemicellulose, and it is often polymerized with small amounts of mannan, arabinoxylan, glucose, xyloglucan and galactomannan into branched polymer compounds[1]. As an important component of plant hemicellulose, xylan is the most abundant renewable biomass resource in nature after cellulose[2]. However, the recovery rate of agricultural waste resources rich in xylan is very low, and most of them are wasted. Developing and utilizing these resources can help alleviate environmental, resource and food crises. Over the past decade, the reuse of agricultural waste has been a research focus in many countries. Among the many applications, using xylan to produce high value-added functional oligosaccharides is an effective way to recover agricultural wastes rich in hemicellulose, with great potential economic and environmental benefits[3-5].
Xylanase is a group of enzymes that degrade xylan to oligosaccharides or xylose, mainly including endo-β-D-xylosidase, β-1,4-D-xylanase, α-L-arabinofuranosidase, α-D-glucuronidase, acetylxylan esterases and phenolic acid esterases[6]. Xylanase in a narrow sense is limited to β-1,4-endoxylanase. Xylanase comes from a wide range of sources. Studies have shown that xylanase-producing microorganisms mainly include Aspergillus, Trichoderma, Fusarium, Bacillus, Streptomyces, etc.[7-10]. Xylanase has been widely used in the fields of food, feed, pulping and papermaking, and bio-energy. In the food industry, xylanase can prolong the shelf life of bread by effectively improving the water holding capacity and stability of dough during production, increasing the volume of bread after baking, improving the texture of the bread crumb, and reducing the staling rate of bread[11-12]. In the pulping industry, it can be used for pulp bleaching to improve the whiteness of pulp, thus effectively reducing the dosage of chemical bleach[13-15]. In the feed industry, it can reduce the increased viscosity of chyme caused by water-soluble xylan. Meanwhile, xylanase can also act on non-starch polysaccharides to break the plant cell wall, finally releasing nutrients[16-17]. In the field of bio-energy, xylanase can improve the yield of xylose and increase the amount of total reducing sugar[18-19]. At present, xylanase research still focuses on the xylanases of glycoside hydrolase family 10 (GH10) and family 11 (GH11). The xylanases of family 10 are generally characterized by a low isoelectric point (pI) and a high molecular weight. In terms of substrate interaction, these xylanases are active on both xylan and low-molecular-weight cellulose, and especially on short-chain xylo-oligosaccharides, with smaller substrate binding sites and higher activity[20].
Mangrove, located between the sea and the land, is the only forest vegetation in the tropical and subtropical coastal and estuarine intertidal zones, and it forms a special kind of ecosystem. As a unique land-sea ecosystem, in addition to forming a special ecological landscape, mangrove also serves as a windbreak and reinforces dikes, protects fishing boats, farmland and villages, and helps to resist and alleviate disasters. It plays a special role in preventing the pollution of coastal waters, protecting the biodiversity of coastal areas and maintaining the stability and balance of estuary ecosystems. The mangrove ecosystem hosts many microorganisms, which are diverse, huge in number and able to reproduce rapidly, with the characteristics of genetic variation and metabolic diversity. Once the external environment is changed by organic pollutants, the special enzyme systems of these microorganisms respond quickly and may decompose or transform the organic pollutants. Previous studies suggest that bacteria and fungi account for 91% of the total microbial resources in mangrove soil[21]. The decomposers of organic matter such as litter in the natural mangrove ecosystem have great potential for the development of xylanase-producing microorganisms.
2.1 Sources and data of strains
The fungal strain xynMF13-GH10 from mangrove was identified as Phoma sp., belonging to Didymellaceae, Pleosporales, Dothideomycetes, Ascomycota[20].
The nucleotide sequence of the gene was queried in the nucleotide collection (nt) database by using Blastn on the NCBI website (https://blast.ncbi.nlm.nih.gov/Blast.cgi).
2.2 Methods
Using the basic local alignment search tool (BLAST, www.ncbi.nlm.nih.gov/blast/) of the National Center for Biotechnology Information (NCBI, www.ncbi.nlm.nih.gov), homology analysis of the gene and amino acid sequences was completed by Blastn and Blastx, respectively.
The open reading frame was searched and the encoded protein sequence was deduced using ORF Finder (http://www.ncbi.nlm.nih.gov/gorf/orfig.cgi). The physicochemical properties of the protein were analyzed using the online software ProtParam. The conserved domains were analyzed and predicted by the online software CD-Search (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi). The signal peptide of xynMF13-GH10 protein was predicted by the online software SignalP 5.0. The hydrophobicity of the xynMF13-GH10 amino acid sequence was analyzed by the software DNAMAN. The transmembrane domain of xynMF13-GH10 protein was analyzed by the online software TMHMM. The secondary structure of xynMF13-GH10 protein was predicted by the online software SOPMA. The three-dimensional structure of xynMF13-GH10 protein was predicted by the online software SWISS-MODEL.
3.1 Analysis of xynMF13-GH10 gene sequence
In order to study the function of the xynMF13-GH10 gene, the nucleotide sequence of the gene was used as the query to search the nucleotide collection (nt) database by using Blastn on the NCBI website (https://blast.ncbi.nlm.nih.gov/Blast.cgi). Blastn analysis showed that xynMF13-GH10 has high homology with Paraphaeosphaeria sporulosa endoxylanase-like protein, Alternaria alternata endoxylanase, A. namibiae glycoside hydrolase family 10 protein and Aureobasidium subglaciale EXF-2481 glycoside hydrolase family 10 protein. The highest homology was with P. sporulosa endoxylanase-like protein, with an identity of 78.91% and an e-value of 2e-159 (Fig.1). Therefore, the Blastn comparison suggests that the protein encoded by the xynMF13-GH10 gene may have the functions of xylanase or glycoside hydrolase.
Fig.1 Analysis of xynMF13-GH10 gene by Blastn
The xynMF13-GH10 gene was analyzed by ORF Finder and the amino acid sequence encoded by the open reading frame was obtained. The gene is 1 332 bp in length, and the coding region is 1-1 332 bp. The gene encodes 443 amino acids, with an initiation codon ATG and a termination codon TAA (Fig.2).
Fig.2 ORF nucleotide sequence and deduced amino acid sequence of xynMF13-GH10 gene
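The relationship between the reported coding-region length and the number of encoded residues can be checked with simple arithmetic. The following is a minimal Python sketch, not part of the original analysis; the function name `encoded_residues` is hypothetical, and it assumes that the coding region runs from the ATG start codon through the TAA stop codon and that the stop codon is not translated.

```python
def encoded_residues(cds_length_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by a coding region of the given length.

    Assumption: the region is in frame (length divisible by 3) and, when
    includes_stop is True, its last codon is a stop codon that adds no residue.
    """
    if cds_length_bp <= 0 or cds_length_bp % 3 != 0:
        raise ValueError("coding region length must be a positive multiple of 3")
    codons = cds_length_bp // 3
    return codons - 1 if includes_stop else codons


# Coding region of 1 332 bp as reported for xynMF13-GH10.
n_residues = encoded_residues(1332)
print(n_residues)
```

Under these assumptions, 1 332 bp corresponds to 444 codons, that is 443 amino acids plus the stop codon, which agrees with the values reported in the text.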
3.2 xynMF13-GH10 protein
3.2.1 Physicochemical properties. The physicochemical properties of xynMF13-GH10 protein were analyzed by the online software ProtParam. The results showed that xynMF13-GH10 protein consists of 443 amino acids, including 39 negatively charged residues (Asp+Glu) and 34 positively charged residues (Arg+Lys), and the molecular formula is C2042H3169N533O643S15. The major physicochemical properties are as follows: isoelectric point (pI) 5.51, total number of atoms 6 402, molecular weight 45.95 kDa, aliphatic index 76.07, instability index (II) 26.13, grand average of hydropathicity (GRAVY) -0.014. Therefore, the protein encoded by the xynMF13-GH10 gene was classified as a stable protein (instability index below 40), and its slightly negative GRAVY value indicates that it is weakly hydrophilic overall.
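The reported values can be cross-checked against the molecular formula. The sketch below is only an illustrative consistency check in Python, not the ProtParam procedure itself (ProtParam derives the molecular weight from the amino acid composition). The helper `parse_formula` and the table of average atomic masses are assumptions introduced here; the threshold of 40 for the instability index is the usual ProtParam convention.

```python
import re

# Assumption: standard average atomic masses (Da) for the five elements.
AVERAGE_ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def parse_formula(formula: str) -> dict:
    """Split a formula such as 'C2042H3169' into {'C': 2042, 'H': 3169}."""
    return {element: int(count)
            for element, count in re.findall(r"([A-Z][a-z]?)(\d+)", formula)}


composition = parse_formula("C2042H3169N533O643S15")

# Total number of atoms in the molecule.
total_atoms = sum(composition.values())

# Approximate molecular mass in Da from the elemental composition.
mass_da = sum(AVERAGE_ATOMIC_MASS[el] * n for el, n in composition.items())

# ProtParam convention: instability index below 40 predicts a stable protein.
instability_index = 26.13
is_stable = instability_index < 40

print(total_atoms, round(mass_da / 1000, 2), is_stable)
```

The atom count obtained this way matches the reported total of 6 402, and the mass computed from the formula is close to the reported 45.95 kDa. The excess of acidic residues (39) over basic residues (34) is also in line with an acidic pI of 5.51.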
3.2.2 Prediction of conserved domains. The conserved domains of xynMF13-GH10 protein were analyzed by the online software CD-Search (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi). The analysis results showed that the protein encoded by this gene contains a fungal-type cellulose-binding domain from the 27th to 59th positions, an envelope glycoprotein I conserved domain from the 55th to 158th positions, and a glycosyl hydrolase family 10 conserved domain from the 147th to 438th positions, as well as a carbohydrate binding module (CBM) (Fig.3).
Fig.3 Prediction of conserved domains
3.2.3 Prediction of signal peptide. The signal peptide of xynMF13-GH10 protein was predicted by the online software SignalP 5.0. The analysis results showed that the maximum cleavage site score, located between the amino acids at the 23rd and 24th positions, was 0.6723, and the probability of a signal peptide being present was 0.9772 (Fig.4). The first 23 amino acids of the N-terminus were predicted to be a signal peptide, and it was speculated that xynMF13-GH10 is a secreted protein carrying a signal peptide.
Fig.4 Prediction of signal peptide
3.2.4 Prediction of hydrophobicity and transmembrane domain of amino acids. The hydrophobicity of the amino acid sequence of xynMF13-GH10 was analyzed by the software DNAMAN. The results showed that there were many amino acid residues with high hydrophobicity. The hydrophobicity was the strongest at the 368th position (3.01), and the amino acid residue at the 22nd position was also strongly hydrophobic (2.93). In general, hydrophobic amino acids were distributed along the whole peptide chain, and the 24 amino acids at the N-terminus had strong hydrophobicity, which was speculated to be the signal peptide region (Fig.5). This is consistent with the signal peptide prediction and suggests that xynMF13-GH10 has a strongly hydrophobic N-terminal region.
Fig.5 Prediction of hydrophobicity of xynMF13-GH10 amino acids
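A hydrophobicity profile of this kind is typically obtained by averaging per-residue hydropathy values over a sliding window. The following Python sketch illustrates the idea with the Kyte-Doolittle scale, which is also the scale behind the GRAVY value reported by ProtParam. It is only an illustration: the scale and window size used by DNAMAN in this study are not stated in the text, the window of 9 is an assumption, and `toy_sequence` is a made-up sequence, not the xynMF13-GH10 protein.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean Kyte-Doolittle value."""
    return sum(KD[aa] for aa in seq) / len(seq)


def hydropathy_profile(seq: str, window: int = 9) -> list:
    """Return (1-based centre position, mean hydropathy) for each full window.

    Assumption: window is odd, so that each window has a central residue.
    """
    half = window // 2
    profile = []
    for centre in range(half, len(seq) - half):
        segment = seq[centre - half:centre + half + 1]
        profile.append((centre + 1, sum(KD[aa] for aa in segment) / window))
    return profile


# Hypothetical toy sequence: hydrophobic N-terminus followed by polar residues.
toy_sequence = "MKLLLLAVLAALASA" + "DEKRNQSTGDEKRNQ"

profile = hydropathy_profile(toy_sequence, window=9)
peak_position, peak_score = max(profile, key=lambda item: item[1])
print(peak_position, round(peak_score, 2), round(gravy(toy_sequence), 3))
```

In such a profile, a run of high positive values at the N-terminus is the pattern expected for a signal peptide, whereas the GRAVY value summarizes the whole chain and can be close to zero even when individual regions are strongly hydrophobic.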
The transmembrane domain of xynMF13-GH10 protein was analyzed by the online software TMHMM. The results showed that the 24 amino acids at the N-terminus of xynMF13-GH10 protein might be located on the surface of the cell membrane (Fig.6), suggesting that the N-terminus of xynMF13-GH10 is highly hydrophobic. This is consistent with the results of the signal peptide prediction and hydrophobicity analysis, so xynMF13-GH10 protein might be a secretory protein with a signal peptide at the N-terminus.
Fig.6 Prediction of transmembrane domain of xynMF13-GH10 protein
3.2.5 Prediction of secondary and three-dimensional structure of protein. The secondary structure of xynMF13-GH10 protein was predicted by the online software SOPMA. As shown in Fig.7, blue, red, green and pink represent α-helix, extended strand, β-turn and random coil, respectively. The results showed that the protein is composed of 48.31% random coil, 30.25% α-helix, 16.70% extended strand and 4.74% β-turn, with many random coils at the N-terminus and more α-helices at the C-terminus (Fig.7).
Note: Blue, α-helix; Red, extended strand; Green, β-turn; Pink, random coil.
Fig.7 Prediction of secondary structure of xynMF13-GH10 protein
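The reported percentages can be converted back into approximate residue counts as a sanity check. The short Python sketch below is an illustration added here and not SOPMA output; it assumes that the percentages refer to the full 443-residue protein and recovers the counts by rounding.

```python
PROTEIN_LENGTH = 443  # residues, as reported for xynMF13-GH10

# Secondary structure composition in percent, as reported in the text.
composition_percent = {
    "random coil": 48.31,
    "alpha-helix": 30.25,
    "extended strand": 16.70,
    "beta-turn": 4.74,
}

# Approximate number of residues in each state (rounded to whole residues).
residue_counts = {
    state: round(percent / 100 * PROTEIN_LENGTH)
    for state, percent in composition_percent.items()
}

total_percent = sum(composition_percent.values())
total_residues = sum(residue_counts.values())

for state, count in residue_counts.items():
    print(f"{state}: {count} residues")
print(round(total_percent, 2), total_residues)
```

Under this assumption the four states account for all 443 residues and the percentages sum to 100%, so the reported composition is internally consistent.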
The three-dimensional structure of xynMF13-GH10 protein was predicted by the online software SWISS-MODEL. The analysis results showed that all the templates aligned with xynMF13-GH10 were xylanase-like proteins, suggesting that the protein has xylanase activity. The best template was 1b30.1.A PROTEIN (XYLANASE), with an identity of 62.75% and a coverage of 0.67, and the matched region was amino acids 145-442. It was speculated that the protein might act as a monomer in organisms and probably has xylanase activity (Fig.8).
Fig.8 Prediction of three-dimensional structure of xynMF13-GH10 protein
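The coverage value can be related to the matched region by a simple ratio. The Python sketch below is an assumption-labeled illustration: the function `template_coverage` is hypothetical, and it takes coverage to be the fraction of the 443-residue query spanned by the matched region, which may differ slightly from how SWISS-MODEL computes its coverage value.

```python
def template_coverage(first: int, last: int, query_length: int) -> tuple:
    """Return (aligned residues, fraction of the query covered).

    Positions are 1-based and inclusive; the definition of coverage as
    aligned length divided by query length is an assumption.
    """
    if not 1 <= first <= last <= query_length:
        raise ValueError("matched region must lie within the query sequence")
    aligned = last - first + 1
    return aligned, aligned / query_length


# Matched region 145-442 on the 443-residue xynMF13-GH10 protein.
aligned_residues, coverage = template_coverage(145, 442, 443)
print(aligned_residues, round(coverage, 2))
```

With this definition the matched region of 298 residues gives a coverage of about 0.67, in agreement with the reported value. The region also corresponds closely to the glycosyl hydrolase family 10 domain predicted at positions 147-438, so the model covers the catalytic domain rather than the N-terminal binding domain.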
The homology of the amino acid sequence of xynMF13-GH10 was compared by NCBI Blastx (Table 1). The results showed that the amino acid sequence of xynMF13-GH10 had high homology with proteins of 17 different species, such as Curvularia kusanoi endo-1,4-beta-xylanase F3, Macroventuria anomochaeta glycoside hydrolase family 10, Ascochyta rabiei cellulose binding, Stemphylium lycopersici glycoside hydrolase family 10 and P. sporulosa endoxylanase-like protein. The similarity with C. kusanoi endo-1,4-beta-xylanase F3 reached 98.65%, with an e-value of 0, a total score of 806 and a coverage of 99%.
Table 1 Comparison of homology between xynMF13-GH10 protein and its homologous protein
The xynMF13-GH10 gene has high homology with xylanase in several species, and has the highest homology with P. sporulosa endoxylanase-like protein, with an identity of 78.91% and an e-value of 2e-159. The gene is 1 332 bp in length and the coding region is 1-1 332 bp. The gene encodes 443 amino acids, with an initiation codon ATG and a termination codon TAA. xynMF13-GH10 is a weakly hydrophilic protein overall (GRAVY -0.014) with conserved domains and a signal peptide region, and its N-terminal region is strongly hydrophobic. The 24 amino acids at the N-terminus of xynMF13-GH10 protein may be located on the surface of the cell membrane. Combined with the signal peptide prediction and hydrophobicity analysis results, it is speculated that xynMF13-GH10 may be a secretory protein. The protein is composed of 48.31% random coil, 30.25% α-helix, 16.70% extended strand and 4.74% β-turn. Thus, the structure and characteristics of the xynMF13-GH10 gene and its encoded protein have been predicted.
It has been shown that the xylanase gene xynMF13-GH10 can be heterologously expressed in Pichia pastoris, and TLC analysis shows that the enzymolysis products are mainly xylobiose and xylotriose, indicating that xynMF13-GH10 is an endoxylanase[20]. With the wide application of xylo-oligosaccharides in many fields, producing xylo-oligosaccharides by xylanase enzymolysis is more environmentally friendly and efficient than other approaches such as physical and chemical methods[22]. However, the production of xylo-oligosaccharides by natural xylanase-producing strains has some problems, such as low natural enzyme yield, low catalytic efficiency and unstable enzyme production. Therefore, the practical industrial application of xylanase has gradually become a focus of research[23]. With the continuous development of biotechnology, modifying the characteristics of strains and constructing engineered strains through genetic engineering has become an effective means to solve these problems[24].
The xynMF13-GH10 protein contains a carbohydrate binding module (CBM) in addition to a typical family 10 xylanase catalytic domain. Studies have shown that CBM can increase the catalytic activity of enzymes, their ability to bind and degrade insoluble substrates, their substrate binding specificity and their connection with the catalytic region, and can help maintain the stability of xylanase[25-26].
A carbohydrate binding module (CBM) is a contiguous amino acid sequence with carbohydrate-binding activity that is contained in a larger protein sequence, usually as part of a multi-module enzyme. The role of CBM is to bring the catalytic module of a carbohydrate-active enzyme into contact with its substrate by binding carbohydrate ligands, thereby improving the catalytic efficiency of multi-module carbohydrate-active enzymes[27]. Initially, CBM was known as the cellulose binding domain (CBD), because it was a binding domain that could specifically adsorb cellulose substrates and was considered an important component of cellulase, connected to one or more catalytic domains (CD) through linker peptides[28-29]. Later, a large number of studies showed that the binding domains of polysaccharide hydrolases also bind other polysaccharides, such as starch, chitin and xylan[30], so the name carbohydrate binding module (CBM) was adopted. CBM can determine the binding site of enzyme and substrate and increase the accessibility of insoluble fiber substrates to the enzyme, ultimately improving the catalytic efficiency of the enzyme[30-32].
With the rapid development of genetic engineering and fusion technology, CBM is valued not only for its biological functions, but also for its excellent performance as a tool for fiber modification, biological purification, bioremediation and analytical testing. As a novel raw material, CBM has attracted increasing attention.
Asian Agricultural Research, 2022, Issue 5