Zhizhen XU, Jing CHEN, Mu MU, Guokun DAI, and Yanan MA
1 Department of Atmospheric and Oceanic Sciences & Institute of Atmospheric Sciences, Fudan University, Shanghai 200438, China
2 Numerical Weather Prediction Center, China Meteorological Administration, Beijing 100081, China
3 Chinese Academy of Meteorological Sciences, China Meteorological Administration, Beijing 100081, China
ABSTRACT Accurately representing model uncertainties while accounting for the rapid nonlinear error growth in a convection-allowing system is a crucial issue for convective-scale ensemble forecasting. In this study, a new nonlinear model perturbation technique for convective-scale ensemble forecasts is developed to provide a nonlinear representation of model errors in the Global and Regional Assimilation and Prediction Enhanced System (GRAPES) Convection-Allowing Ensemble Prediction System (CAEPS). The nonlinear forcing singular vector (NFSV) approach, that is, conditional nonlinear optimal perturbation-forcing (CNOP-F), is applied to construct a nonlinear model perturbation method for GRAPES-CAEPS. Three experiments are performed: a control (CTL) experiment without any model perturbation, and two NFSV-perturbed experiments, NFSV-S and NFSV-L, which use small and large constraint radii, respectively, to test the sensitivity to the perturbation magnitude constraint. Verification results show that the NFSV-perturbed experiments achieve an overall improvement and produce more skillful forecasts than the CTL experiment, which indicates that the nonlinear NFSV-perturbed method can be used as an effective model perturbation method for convective-scale ensemble forecasts. Additionally, the NFSV-L experiment generally performs better than the NFSV-S experiment in the verification of upper-air and surface weather variables. For precipitation, however, the NFSV-S experiment performs better for light precipitation and the NFSV-L experiment performs better for heavier precipitation, indicating that the perturbation magnitude constraint must be carefully selected for different precipitation events. These findings lay a foundation for the design of nonlinear model perturbation methods for future CAEPSs.
Key words: Convection-Allowing Ensemble Prediction System, model uncertainty, nonlinear forcing singular vector
In recent years, convection-allowing ensemble prediction systems (CAEPSs) with high resolutions of 2–4 km have become a major research focus at NWP centers worldwide (Clark et al., 2010; Baldauf et al., 2011; Baker et al., 2014; Nuissier et al., 2016; Müller et al., 2017; Zhuang et al., 2021). Compared with global medium-range ensemble systems (with horizontal resolutions of 50–100 km) and regional short-range ensemble systems (with horizontal resolutions of 10–20 km), CAEPSs with higher spatial and temporal resolution can better represent extremes, better predict catastrophic small-scale weather, and provide probabilistic forecasting information on the occurrence and development of convective systems. Therefore, comprehensive CAEPSs based on high-resolution models have been developed at several scientific research institutions and NWP centers, such as the Storm-Scale Ensemble Forecast system of the Center for the Analysis and Prediction of Storms at the University of Oklahoma (Clark et al., 2010) and the convective-scale ensemble at the National Center for Atmospheric Research (NCAR) (Schwartz et al., 2015). In 2019, the Numerical Weather Prediction Center (NWPC) of the China Meteorological Administration (CMA) also began to develop a CAEPS, called GRAPES-CAEPS, with a horizontal grid spacing of 3 km, which has been tested in South China (15°–30°N, 105°–125°E). Applications of CAEPSs to severe convective weather forecasting, such as heavy rainfall, have proven superior to coarser-resolution ensembles. It is therefore desirable to develop CAEPSs to provide probabilistic guidance for the prediction and early warning of severe convective weather.
However, the optimal design of CAEPSs remains largely unresolved and may be quite different from that of coarser-resolution ensembles, since the error growth characteristics and the mechanisms of perturbation growth in high-resolution CAEPSs are very different from those of a synoptic-scale model. For example, nonlinear errors amplify faster in convection-allowing simulations (Lorenz, 1969; Zhang et al., 2003), and error growth rates are about 10 times larger than in synoptic-scale forecasts, where moist convective (instead of baroclinic) instabilities become the primary mechanism for error growth (Hohenegger and Schär, 2007). In addition, traditional model perturbation methods used to represent forecast uncertainties in coarser-resolution global and regional ensembles do not consider the rapid nonlinear error growth dynamics and strong nonlinearity in CAEPSs. These methods include the multimodel (e.g., Krishnamurti et al., 1999) and multiphysics (e.g., Houtekamer et al., 1996) approaches, as well as stochastic schemes (e.g., Buizza et al., 1999) such as the Stochastically Perturbed Parameterizations (SPP) scheme (Li et al., 2008; Hacker et al., 2011; Christensen et al., 2015; Xu et al., 2020a, b), the Stochastically Perturbed Parameterization Tendencies (SPPT) scheme (Buizza et al., 1999; Palmer et al., 2009), and the Stochastic Kinetic Energy Backscatter (SKEB) scheme (Shutts, 2005; Berner et al., 2009, 2011). More importantly, major problems in designing a CAEPS stem from the lack of knowledge about the mechanisms that facilitate rapid nonlinear perturbation growth and propagation, as well as the role of nonlinearities. Therefore, it is imperative to design a model perturbation method for CAEPSs according to their nonlinear error growth characteristics.
In this respect, the nonlinear forcing singular vector (NFSV) method (Duan and Zhou, 2013), which is also known as conditional nonlinear optimal perturbation-forcing (CNOP-F), is applied in this study to investigate the nonlinear dynamics and characteristics of CAEPSs more comprehensively. The NFSV is a natural extension of the CNOP (Mu et al., 2003), which refers to the initial perturbation that has the maximum nonlinear evolution at the prediction time. Research into CNOP originally focused only on the initial perturbation. Duan and Zhou (2013) extended CNOP to model perturbations. They first proposed the NFSV, which represents the optimal tendency perturbation that has the largest nonlinear evolution at the prediction time under physical constraint conditions. The NFSV considers the effect of nonlinearity in NWP models and is capable of describing the most sensitive tendency perturbation associated with model uncertainty (Duan and Zhou, 2013).
Duan et al. (2014) further proposed an NFSV-related assimilation method, in which an optimal tendency perturbation of the Zebiak–Cane model (Zebiak and Cane, 1987) was obtained and in turn imposed on the sea surface temperature equation of the model to bring the simulation closer to observations. Consequently, realistic ENSO evolutions were successfully simulated. Moreover, the NFSV method can depict the nonlinear growth of model error (perturbation) more realistically (Duan et al., 2016). Qin et al. (2020) used the NFSV method to identify the optimally growing tendency perturbations for tropical cyclone intensity forecasts, and found that the NFSV can improve the forecasting skill for tropical cyclone intensity. Additionally, since the NFSV induces the largest perturbation growth, it may be the most likely to increase the ensemble spread when applied in an ensemble prediction system (Huo, 2016).
The impressive attributes of the NFSV inspired us to apply the NFSV approach to develop a model perturbation method for a nonlinear representation of model uncertainty in a CAEPS. Therefore, in this study, the NFSV (i.e., CNOP-F) is first calculated in the GRAPES-CAEPS to find an optimal tendency perturbation having the largest nonlinear evolution during the forecast period. We then use the calculated NFSV to construct a nonlinear model perturbation scheme for the GRAPES-CAEPS, revealing the effect of nonlinearity in the convective-scale system and providing possible ways to depict the nonlinear growth of model errors more realistically.
This paper is organized as follows. In section 2, the detailed description, calculation method, and distribution of the NFSV are presented. We outline the model configurations, experimental design, and data in section 3. The verification results for precipitation and for surface and upper-air weather variables, as well as the analysis of difference total energy (DTE), are presented in section 4. Finally, section 5 presents the summary and discussion.
The NFSV, as proposed by Duan and Zhou (2013), represents the optimal tendency perturbation having the largest nonlinear evolution at a given future time under a given physical constraint. It is obtained by solving the following nonlinear optimization problem. A tendency perturbation $f_\delta$ is defined as an NFSV if and only if

$$J(f_\delta) = \max_{\|f\|_a \le \delta} \left\| M_\tau(f)(U_0) - M_\tau(0)(U_0) \right\|_b, \qquad (1)$$

where $J$ represents the objective function, $M_\tau$ denotes the propagator of a nonlinear NWP model, and $M_\tau(0)(U_0)$ and $M_\tau(f)(U_0)$ represent the propagator of the nonlinear model without and with a tendency perturbation $f$, respectively. $U_0$ is the initial basic state, and $\tau$ represents a given future time. $\|f\|_a \le \delta$ is the constraint on the amplitude of the tendency perturbation $f$, defined by the norm $\|\cdot\|_a$; that is, $f$ is confined to a sphere of radius $\delta$ in the chosen norm, where $\delta$ is a positive number. The objective function $J$, with norm $\|\cdot\|_b$, measures the magnitude of the departure from the basic state $M_\tau(0)(U_0)$ caused by superimposing the tendency perturbation. Mathematically, the NFSV is the global maximum of $J(f)$ over the sphere $\|f\|_a \le \delta$. In this study, $\|\cdot\|_a$ is set to the $L_2$ norm for constraining the tendency perturbation, and $\|\cdot\|_b$ is chosen as the moist energy norm for the objective function.
From Eq. (1), we can see that if $M_\tau(0)(U_0)$ represents a control forecast, the NFSV is the optimal tendency perturbation that causes the largest departure (nonlinear evolution) from the control forecast at the prediction time (Tao and Duan, 2019). If instead we want an NFSV that causes the largest nonlinear evolution over the entire forecast period, the objective function can be modified as follows:

$$J(f_\delta) = \max_{\|f\|_a \le \delta} \int_{t_0}^{\tau} \left\| M_t(f)(U_0) - M_t(0)(U_0) \right\|_b \, dt. \qquad (2)$$

In this form, the objective function measures the accumulated departure from the control forecast caused by the tendency perturbation. We applied the objective function in Eq. (2) in this study.
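As a toy illustration of the two objective functions, the sketch below replaces the NWP model with the Lorenz-63 system, uses a constant tendency forcing, and measures departures with the Euclidean norm. All of these choices, and the function names, are illustrative assumptions and not the GRAPES configuration.

```python
import numpy as np

def lorenz63_tendency(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Unperturbed model tendency (toy stand-in for the NWP model)."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def integrate(state0, forcing, n_steps=200, dt=0.01):
    """RK4 integration with a constant tendency perturbation added at every step."""
    f = np.asarray(forcing, dtype=float)
    rhs = lambda s: lorenz63_tendency(s) + f
    traj = np.empty((n_steps + 1, 3))
    traj[0] = state0
    for k in range(n_steps):
        s = traj[k]
        k1 = rhs(s)
        k2 = rhs(s + 0.5 * dt * k1)
        k3 = rhs(s + 0.5 * dt * k2)
        k4 = rhs(s + dt * k3)
        traj[k + 1] = s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return traj

def objective_final(state0, forcing, n_steps=200, dt=0.01):
    """Departure from the control run at the final time (form of Eq. 1)."""
    ctl = integrate(state0, np.zeros(3), n_steps, dt)
    pert = integrate(state0, forcing, n_steps, dt)
    return float(np.linalg.norm(pert[-1] - ctl[-1]))

def objective_accumulated(state0, forcing, n_steps=200, dt=0.01):
    """Departure accumulated over the whole forecast period (form of Eq. 2)."""
    ctl = integrate(state0, np.zeros(3), n_steps, dt)
    pert = integrate(state0, forcing, n_steps, dt)
    return float(dt * np.sum(np.linalg.norm(pert[1:] - ctl[1:], axis=1)))
```

The NFSV would be the forcing that maximizes one of these functions subject to the norm constraint; the accumulated form rewards perturbations that grow early and stay large, not only those that are large at the final time.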
In this study, the sum of moist energy norms during the forecast period over the area (19.99°–25.99°N, 106.5°–117°E) is chosen as the objective function. We find the NFSV by solving the optimization problem with respect to the following objective function:
where $u'_t$, $v'_t$, $T'_t$, $r'_t$, and $p'_s$ are the differences between the original model output, $M_\tau(0)(U_0)$, and the model output after superimposing the tendency perturbation $f$, $M_\tau(f)(U_0)$, for zonal wind, meridional wind, temperature, specific humidity, and surface pressure, respectively. These differences measure the nonlinear development of the perturbed $U_0$ from the initial time $t_0$ to the prediction time $\tau$. The integration is conducted over the full domain $D$ and the vertical direction, where $\sigma$ is the vertical coordinate. $L$ represents the latent heat, $c_p$ represents the specific heat at constant pressure, and $R$ is the gas constant of wet air. $T_r$ and $p_r$ represent the reference temperature (270 K) and the reference static pressure (1000 hPa), respectively.
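For orientation, a moist energy norm built from the quantities defined above is commonly written in the form below. This is a sketch and not necessarily the exact expression used in this study: the summation over output times $t$, the $1/D$ normalization, the factor $1/2$, and the placement of the surface-pressure term inside the vertical integral are all assumptions.

```latex
J(f) = \sum_{t=t_0}^{\tau} \frac{1}{2D} \int_D \int_0^1
\left[ u_t'^{\,2} + v_t'^{\,2}
+ \frac{c_p}{T_r}\, T_t'^{\,2}
+ \frac{L^2}{c_p T_r}\, r_t'^{\,2}
+ R\, T_r \left( \frac{p_s'}{p_r} \right)^{2} \right]
\, d\sigma \, dD
```

Each term in the bracket has units of energy per unit mass (m² s⁻²), which is what allows wind, temperature, humidity, and pressure differences to be added in a single measure.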
The constraint condition is defined by the $L_2$ norm as follows:
where
Here $s_{u'}$, $s_{v'}$, $s_{t'}$, and $s_{q'}$ represent the tendency perturbations of zonal wind, meridional wind, temperature, and humidity, respectively. The physical constraint radii ($\delta$) are set to $\delta_{u'} = 1 \times 10^{-5}$, $\delta_{v'} = 1 \times 10^{-5}$, $\delta_{th'} = 1 \times 10^{-4}$, and $\delta_{q'} = 1 \times 10^{-8}$, respectively. The values of $\delta$ are determined from the numerical magnitude of the original physical tendencies, and the constraints in Eq. (4) limit the perturbation amplitude of the zonal wind, meridional wind, temperature, and humidity tendencies so that the perturbed tendencies do not grow excessively and lose their physical meaning.
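One simple way to enforce per-variable radii of this kind is to rescale any tendency perturbation whose norm exceeds its radius. The sketch below assumes an unnormalized Euclidean norm over all grid points; the function names and the dictionary layout are hypothetical, and the study's optimization may handle the constraint differently.

```python
import numpy as np

# Constraint radii quoted in the text, keyed by hypothetical variable names.
RADII = {"u": 1e-5, "v": 1e-5, "th": 1e-4, "q": 1e-8}

def project_to_l2_ball(s, delta):
    """Return s unchanged if ||s||_2 <= delta, otherwise rescale it onto the sphere."""
    s = np.asarray(s, dtype=float)
    norm = np.sqrt(np.sum(s ** 2))
    if norm <= delta:
        return s.copy()
    return s * (delta / norm)

def constrain_tendencies(perturbations, radii=RADII):
    """Apply the projection variable by variable."""
    return {name: project_to_l2_ball(field, radii[name])
            for name, field in perturbations.items()}
```

Rescaling keeps the spatial pattern of the perturbation and changes only its amplitude, so a larger radius yields the same kind of structure with a stronger forcing.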
To calculate the NFSV, i.e., to solve the optimization problem above, we apply the principal component analysis (PCA)-based particle swarm optimization (PSO) algorithm, which has demonstrated a high capability for solving optimization problems (Kennedy and Eberhart, 1995; Shi and Eberhart, 1998; Mu et al., 2015). The PCA-based PSO (PPSO) algorithm was proposed by Mu et al. (2015) to deal with the problem of high dimensionality in complicated NWP models. Their work shows that the PPSO algorithm can effectively calculate the CNOP in a complex NWP model without requiring an adjoint model. This supports our use of the PPSO algorithm to calculate the NFSV (CNOP-F) of the GRAPES model in this study. For a detailed description of the PPSO algorithm, readers are referred to Mu et al. (2015).
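To make the adjoint-free search concrete, the following is a minimal sketch of a plain PSO that maximizes an objective over an L2 ball. It is not the PPSO of Mu et al. (2015): the PCA dimension reduction is omitted, and the inertia and acceleration coefficients, the projection step, and the function name are assumptions chosen for illustration.

```python
import numpy as np

def pso_maximize(objective, dim, delta, n_particles=20, n_iter=50,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize objective(f) subject to ||f||_2 <= delta with a basic PSO."""
    rng = np.random.default_rng(seed)

    def project(x):
        # Rescale any particle that left the ball back onto its surface.
        norms = np.linalg.norm(x, axis=1, keepdims=True)
        return x * np.minimum(1.0, delta / np.maximum(norms, 1e-300))

    pos = project(rng.normal(size=(n_particles, dim)) * delta)
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    g = int(np.argmax(pbest_val))
    gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])

    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity update: inertia + pull toward personal and global bests.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = project(pos + vel)
        vals = np.array([objective(p) for p in pos])
        better = vals > pbest_val
        pbest[better] = pos[better]
        pbest_val[better] = vals[better]
        g = int(np.argmax(pbest_val))
        if pbest_val[g] > gbest_val:
            gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])
    return gbest, gbest_val
```

Only forward evaluations of the objective are needed, which is why this family of methods avoids an adjoint model; in an NWP setting each evaluation is a full nonlinear model run, which motivates searching in a reduced space.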
Based on the PPSO algorithm described above, the NFSVs under two constraint radii were calculated. Figures 1 and 2 show the horizontal distributions of NFSVs under two different constraint radii from a randomly chosen ensemble member and model timestep. The first to fourth columns represent the horizontal distribution of NFSVs of zonal wind tendency (U-tendency), meridional wind tendency (V-tendency), temperature tendency (T-tendency), and humidity tendency (Q-tendency), respectively.
As shown in Fig. 1, the distribution of NFSV-S perturbations is relatively scattered for all levels and variables, and is more detailed over land than over the ocean, which may be attributed to the non-uniformity of the underlying surface having a certain influence on the NFSV structure. Figure 2 shows the horizontal distribution of NFSV-L fields with a larger perturbation magnitude constraint, demonstrating that the perturbations extend over a greater range under this larger constraint. Similarly, the NFSV perturbation distribution over land is generally more detailed than that over the ocean for all levels and variables.
Fig. 1. Horizontal distribution of NFSVs of the NFSV-S experiment (with a small perturbation magnitude constraint) from a randomly chosen ensemble member and model timestep for (a), (e), (i), (m), (q) U-tendency, (b), (f), (j), (n), (r) V-tendency, (c), (g), (k), (o), (s) T-tendency, and (d), (h), (l), (p), (t) Q-tendency at (a), (b), (c), (d) 1000 hPa, (e), (f), (g), (h) 850 hPa, (i), (j), (k), (l) 700 hPa, (m), (n), (o), (p) 500 hPa, and (q), (r), (s), (t) 200 hPa, respectively.
Fig. 2. As in Fig. 1, but for NFSVs of the NFSV-L experiment (with a large perturbation magnitude constraint).
The two sets of NFSVs calculated above with different constraint radii were in turn applied to the physical tendencies of U, V, T, and Q of the model to construct a model perturbation for a nonlinear representation of model uncertainties in the CAEPS. We used the NFSV-perturbed nonlinear model perturbation method to perturb the ensemble system at each model time step during a one-month ensemble forecasting process. Three experiments were conducted for one month (1–30 May 2020). The verification results are shown in section 4.
In this study, the GRAPES-CAEPS with a 3-km horizontal grid spacing was employed. It is a convective-scale ensemble prediction system based on the GRAPES 3-km convective-scale model and has been under development since 2019 at the NWPC of the CMA. Table 1 shows the system configuration of GRAPES-CAEPS. The GRAPES-CAEPS has 15 forecast members (one control forecast member and fourteen ensemble members) and covers the domain of South China (19.99°–25.99°N, 106.5°–117°E). It adopts height-based terrain-following coordinates and has 51 vertical levels. Additionally, the initial conditions and lateral boundary conditions are provided (directly downscaled) by different members of the T639 global ensemble prediction system of the CMA.
Table 1. Configuration of GRAPES-CAEPS.
As shown in Table 2, three experiments were carried out for one summer month (1–30 May 2020) based on the GRAPES-CAEPS over South China. No model perturbation was applied in the CTL experiment. The NFSV-S and NFSV-L experiments are NFSV-perturbed experiments with different constraints, and therefore have perturbations of different magnitudes. For the NFSV-S and NFSV-L experiments, we first applied the dynamically downscaled initial perturbation method to construct 14 ensemble members, and then calculated the NFSV, which in turn was applied to the model to construct a model perturbation in the GRAPES-CAEPS. The resulting nonlinear NFSV-perturbed method was used to perturb the ensemble system at each time step during the one-month ensemble forecasting process. In this way, we constructed the NFSV-perturbed experiments (i.e., NFSV-S and NFSV-L).
Table 2. Experiments conducted in this study.
The NFSV-perturbed experiments were compared with the CTL experiment to investigate whether introducing the NFSV could improve the forecasts and to test its impact on the ensemble prediction system. Because different perturbation magnitude constraints lead to NFSV perturbations of different magnitudes, the resulting ensemble forecasts may differ substantially. It is therefore important to also compare the NFSV-S and NFSV-L experiments to determine which perturbation magnitude constraint setting achieves the better forecasting performance. Forecasts were initialized at 0000 UTC and run out to a 24-h forecast length for all experiments.
The GRAPES high-resolution 3-km gridded analysis was used as truth, and the observed rain gauge data from ground-based stations in South China were used for precipitation verification.
In this study, we evaluated precipitation, temperature at four levels (250 hPa, 500 hPa, 850 hPa, and 2 m), and zonal wind at four levels (250 hPa, 500 hPa, 850 hPa, and 10 m) to compare the performance of the three experiments. The upper-air and surface weather variables are verified with a set of measures: ensemble spread, root-mean-square error (RMSE), consistency (defined as the ratio of the ensemble spread to the RMSE), the continuous ranked probability score (CRPS) (Hersbach, 2000), and outlier scores (i.e., the sum of the two end bins of the rank histograms). Precipitation is verified with the area under the relative operating characteristic curve (AROC) score (Mason, 1982), the Fractions Skill Score (FSS; Roberts and Lean, 2008), and the Brier Score (BS; Brier, 1950). Readers can refer to Jolliffe and Stephenson (2012) as well as the appendix for a detailed description of these verification metrics.
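The upper-air and surface measures listed above can be sketched for a single variable and lead time as follows. The array layout (members by grid points), the use of the unbiased variance for the spread, and the function name are assumptions; the appendix definitions take precedence where they differ.

```python
import numpy as np

def ensemble_scores(ens, truth):
    """Spread, RMSE of the ensemble mean, consistency, and outlier score.

    ens   : array of shape (n_members, n_points)
    truth : array of shape (n_points,), e.g. the verifying analysis
    """
    ens = np.asarray(ens, dtype=float)
    truth = np.asarray(truth, dtype=float)
    ens_mean = ens.mean(axis=0)
    rmse = np.sqrt(np.mean((ens_mean - truth) ** 2))
    # Spread: square root of the domain-mean ensemble variance.
    spread = np.sqrt(np.mean(ens.var(axis=0, ddof=1)))
    consistency = spread / rmse  # 1.0 would indicate a well-dispersed ensemble
    # Outlier score: fraction of points where truth lies outside the envelope,
    # i.e. the two end bins of the rank histogram.
    outside = (truth < ens.min(axis=0)) | (truth > ens.max(axis=0))
    return spread, rmse, consistency, float(np.mean(outside))
```

A consistency below 1.0 means the spread underestimates the error of the ensemble mean (under-dispersion), which typically goes together with a high outlier score.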
Additionally, statistical significance was assessed with an unpaired Student's t-test, rejecting the null hypothesis at the 0.05 significance level. The null hypotheses are that the difference between the reference CTL experiment and each NFSV-perturbed experiment is zero, and that the difference between the reference NFSV-L experiment and the NFSV-S experiment is zero. For the AROC, FSS, BS, and CRPS (Figs. 3, 4, 5, and 8), significant differences between NFSV-L and CTL at the 95% confidence level are denoted by red square points along the lines, and significant differences between NFSV-S and CTL are denoted by blue square points along the lines. For the spread, RMSE, consistency, and outlier scores, the specific significance levels are listed.
4.1.1. AROC
The AROC score is a commonly used metric for measuring the statistical discrimination capability of an EPS (Mason, 1982), and it is applied in this study to verify precipitation forecasts. The AROC scores of 6-h accumulated precipitation for the 0.1-mm, 4-mm, 13-mm, and 25-mm thresholds are shown in Fig. 3. In comparison with the CTL experiment, the NFSV-S and NFSV-L experiments are both characterized by higher AROC scores for all precipitation thresholds, which implies that the NFSV perturbations improve the precipitation forecasts. Additionally, the NFSV-S experiment achieves generally higher AROC scores than the NFSV-L experiment for the 0.1-mm (Fig. 3a) and 4-mm (Fig. 3b) thresholds, while the NFSV-L experiment produces higher AROC scores for the 13-mm (Fig. 3c) and 25-mm (Fig. 3d) thresholds, indicating that the NFSV-S and NFSV-L experiments produce more skillful forecasts for lighter and heavier rainfall events, respectively. The differences in AROC between the NFSV-L and NFSV-S experiments are statistically significant at the 99% level (t-test) for all forecast lead times.
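For readers unfamiliar with the score, the sketch below computes an AROC from ensemble-derived exceedance probabilities by sweeping the probability threshold and integrating the hit rate against the false alarm rate with the trapezoidal rule. The function names and the choice of using every distinct probability value as a decision threshold are assumptions; the sample must contain both events and non-events.

```python
import numpy as np

def exceedance_probability(ens_precip, threshold):
    """Fraction of members at or above the threshold at each point."""
    return np.mean(np.asarray(ens_precip) >= threshold, axis=0)

def roc_area(prob, event):
    """Area under the ROC curve for probabilities `prob` and boolean `event`."""
    prob = np.asarray(prob, dtype=float).ravel()
    event = np.asarray(event, dtype=bool).ravel()
    n_event = event.sum()
    n_non_event = (~event).sum()
    hit_rate, false_alarm_rate = [0.0], [0.0]
    # Lower the decision threshold step by step: forecast "yes" if prob >= p.
    for p in np.unique(prob)[::-1]:
        yes = prob >= p
        hit_rate.append((yes & event).sum() / n_event)
        false_alarm_rate.append((yes & ~event).sum() / n_non_event)
    hit_rate.append(1.0)
    false_alarm_rate.append(1.0)
    h = np.array(hit_rate)
    f = np.array(false_alarm_rate)
    # Trapezoidal integration of hit rate over false alarm rate.
    return float(np.sum(np.diff(f) * (h[1:] + h[:-1]) / 2.0))
```

A value of 1.0 corresponds to perfect discrimination and 0.5 to no discrimination, which is the baseline against which the scores in Fig. 3 can be read.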
Fig. 3. Domain-averaged (a–d) AROC scores of 6-h accumulated precipitation for four thresholds [(a) 0.1 mm, (b) 4 mm, (c) 13 mm, and (d) 25 mm] for the three experiments, varying with forecast hour. The results are the monthly averages for the 0000 UTC cycle during May 2020.
4.1.2. FSS
In recent years, the neighborhood approach has been widely employed to derive grid-scale probabilities and to verify high-resolution ensemble forecasts (Schwartz and Sobash, 2017). The FSS, which compares the forecast and observed fractional coverage of events in windows of increasing neighborhood size (Roberts and Lean, 2008; Roberts, 2008), is also applied for precipitation verification. According to the definition in Roberts and Lean (2008), the FSS is given by

$$\mathrm{FSS} = 1 - \frac{\dfrac{1}{N}\sum_{i,j}\left[P(i,j) - O(i,j)\right]^2}{\dfrac{1}{N}\left[\sum_{i,j} P(i,j)^2 + \sum_{i,j} O(i,j)^2\right]},$$

where $P(i, j)$ and $O(i, j)$ are the ensemble-mean forecast and observed fractions at location $(i, j)$, respectively. The fraction is defined as the proportion of the neighborhood covered by precipitation exceeding a given threshold, and $N$ represents the total number of grid points in the area.
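The FSS of Roberts and Lean (2008) can be sketched for one forecast field and one observed field as below. Zero padding at the domain edges, the square window of odd width `n`, and the function names are assumptions; for an ensemble, the forecast fractions of the members would be averaged before the score is formed.

```python
import numpy as np

def neighborhood_fraction(binary, n):
    """Fraction of exceedance points in the n x n window centred on each point."""
    pad = n // 2
    padded = np.pad(np.asarray(binary, dtype=float), pad, mode="constant")
    # Summed-area table with a leading row and column of zeros.
    c = np.cumsum(np.cumsum(padded, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)), mode="constant")
    ny, nx = np.asarray(binary).shape
    window_sum = (c[n:n + ny, n:n + nx] - c[:ny, n:n + nx]
                  - c[n:n + ny, :nx] + c[:ny, :nx])
    return window_sum / (n * n)

def fss(forecast, observed, threshold, n):
    """Fractions Skill Score for a threshold and an odd neighborhood width n."""
    p = neighborhood_fraction(np.asarray(forecast) >= threshold, n)
    o = neighborhood_fraction(np.asarray(observed) >= threshold, n)
    mse = np.mean((p - o) ** 2)
    reference = np.mean(p ** 2) + np.mean(o ** 2)
    if reference == 0.0:
        return float("nan")  # no event in either field: score undefined
    return float(1.0 - mse / reference)
```

With `n = 1` the score reduces to a grid-point comparison, so two displaced rain cells score 0; widening the neighborhood lets near misses earn partial credit, which is why the FSS is plotted against neighborhood length.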
Figure 4 shows the FSS values for 6-h accumulated precipitation against neighborhood length. For all thresholds of 0.1 mm (Fig. 4a), 4 mm (Fig. 4b), 13 mm (Fig. 4c), and 25 mm (Fig. 4d), the NFSV-S and NFSV-L experiments achieve higher FSSs and are more skillful than the CTL experiment over all neighborhood lengths, indicating an improvement in probabilistic precipitation skill, especially for heavier precipitation (Figs. 4c and 4d). Moreover, the NFSV-S experiment achieves similar or slightly higher FSSs for the 0.1-mm (Fig. 4a) and 4-mm (Fig. 4b) thresholds compared with the NFSV-L experiment, while the NFSV-L experiment achieves higher FSSs for heavier precipitation (Figs. 4c, 4d), which implies that the NFSV-S experiment generally performs better for light precipitation and the NFSV-L experiment performs better for heavier rainfall events (above the 13-mm threshold). The differences in FSS between the NFSV-S and NFSV-L experiments are statistically significant at the 99% level (t-test) for the 0.1-mm (Fig. 4a), 13-mm (Fig. 4c), and 25-mm (Fig. 4d) thresholds, and at the 95% level (t-test) for the 4-mm threshold (Fig. 4b).
Fig. 4. Graphs of FSS against neighborhood length for 6-h accumulated precipitation for the three experiments using precipitation thresholds of (a) 0.1 mm, (b) 4 mm, (c) 13 mm, and (d) 25 mm. The results are the monthly averages for the 0000 UTC cycle during May 2020.
4.1.3. BS
The BS is commonly used as a measure of probabilistic forecasting skill and is also evaluated in this study. As shown in Fig. 5, compared to the CTL experiment, the NFSV-S and NFSV-L experiments are characterized by overall lower BS values for all precipitation thresholds, which indicates a better performance for the NFSV-perturbed experiments. The NFSV-S and NFSV-L experiments have similar BS values for the 0.1-mm and 4-mm thresholds (Figs. 5a, 5b), indicating comparable forecasting skill. Additionally, the NFSV-L experiment produces generally lower BSs than the NFSV-S experiment for the higher thresholds of 13 mm (Fig. 5c) and 25 mm (Fig. 5d), indicating that the NFSV-L experiment outperforms the NFSV-S experiment for heavier rainfall events (above the 13-mm threshold). The differences in BS between the NFSV-S and NFSV-L experiments are statistically significant at the 95% level (t-test) for the verified precipitation thresholds and for most forecast lead times, except for 13 mm at 18 h (Fig. 5c) and 25 mm at 12 h (Fig. 5d).
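As a reminder of what the score measures, the BS (Brier, 1950) is the mean squared difference between the forecast probability and the binary outcome. The sketch below derives the probability as the fraction of members exceeding the threshold, which is an assumption about how the ensemble probabilities are formed; the function name is illustrative.

```python
import numpy as np

def brier_score(ens_precip, obs_precip, threshold):
    """Brier score of ensemble exceedance probabilities for one threshold.

    ens_precip : array of shape (n_members, n_points)
    obs_precip : array of shape (n_points,)
    """
    prob = np.mean(np.asarray(ens_precip) >= threshold, axis=0)
    outcome = (np.asarray(obs_precip) >= threshold).astype(float)
    return float(np.mean((prob - outcome) ** 2))
```

The score is negatively oriented: 0 is a perfect forecast, so the lower curves in Fig. 5 indicate the more skillful experiment.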
In summary, the precipitation verification results show that the NFSV-S and NFSV-L experiments achieve an overall better probabilistic forecasting performance for precipitation than the CTL experiment, which implies that the NFSV-perturbed method may be used as an effective model perturbation method for CAEPSs. Additionally, the NFSV-S experiment gives a better representation of lighter precipitation, while the NFSV-L experiment gives a better representation of heavier precipitation (thresholds above 13 mm), which implies that the perturbation magnitude constraint must be carefully selected for different precipitation events.
4.2.1. Spread, Root Mean Square Error (RMSE), and consistency
Figure 6 illustrates the RMSE of the ensemble mean and the ensemble spread (left panel) and the consistency (right panel) for the NFSV-S, NFSV-L, and CTL experiments for 250-hPa (Figs. 6a, 6b), 500-hPa (Figs. 6c, 6d), 850-hPa (Figs. 6e, 6f), and 10-m (Figs. 6g, 6h) zonal winds. Note that a perfect consistency has a value of 1.0.
As shown in Fig. 6 (left panel), the NFSV-S and NFSV-L experiments exhibit a larger spread than the CTL experiment, indicating that the NFSV perturbations increase the ensemble spread. Additionally, the NFSV-L experiment achieves a higher spread than the NFSV-S experiment for the 250-hPa (Fig. 6a), 500-hPa (Fig. 6c), 850-hPa (Fig. 6e), and 10-m (Fig. 6g) zonal winds, indicating a better probabilistic performance. Moreover, the NFSV-L experiment produces a similar or slightly higher RMSE compared with the NFSV-S experiment. Notably, neither NFSV experiment's RMSE exceeds that of the CTL experiment, indicating that, although increasing the NFSV perturbation magnitude may cause a slight increase in RMSE for zonal wind, the increase remains within an acceptable range. Additionally, the consistency is improved: it increases from severe under-dispersion (with values of 0.3–0.7) in the CTL experiment to values closer to 1.0 in the NFSV-L experiment, especially for the 500-hPa and 850-hPa zonal winds (Figs. 6d, 6f), which suggests an improvement in overall performance. The differences in ensemble spread, RMSE, and consistency between the NFSV-L and CTL experiments, as well as between the NFSV-S and CTL experiments, are statistically significant at the 99.99% level (t-test) for zonal wind at all levels.
Fig. 5. BSs for 6-h accumulated precipitation for the three experiments using precipitation thresholds of (a) 0.1 mm, (b) 4 mm, (c) 13 mm, and (d) 25 mm. The results are the monthly averages for the 0000 UTC cycle during May 2020.
However, the improvement for temperature (Fig. 7) is relatively slight compared to that for wind (Fig. 6). As shown in Fig. 7, the NFSV-L experiment produces a slightly higher ensemble spread than the CTL experiment, and the RMSE is similar overall for the NFSV-L and NFSV-S experiments, indicating that the NFSV-L experiment generally improves the forecasts without increasing the error for temperature. There is an exception for 850-hPa temperature (Figs. 7e, 7f), where the NFSV-L experiment produces a slightly higher RMSE than the NFSV-S experiment. Even though the RMSE increases for 850-hPa temperature (Fig. 7e), the increase in ensemble spread is greater, and thus the consistency is improved compared to the NFSV-S experiment (Fig. 7f). Thus, in general, the NFSV-L experiment improves the ensemble spread without causing an increase in the RMSE for temperature (except for a slight increase in RMSE for 850-hPa temperature). As a result, the consistency is improved. Overall, the NFSV-L experiment achieves better performance than the NFSV-S experiment. The differences in ensemble spread and consistency between the NFSV-S and CTL experiments, as well as between the NFSV-L and CTL experiments, are statistically significant at the 99% level (t-test) for temperature at all levels. The differences in RMSE are significant at the 95% level (t-test) for most of the forecast lead times.
Fig. 6. The (left) domain-averaged RMSE of the ensemble mean of mem00 (gray line), CTL (green line), NFSV-L (red line), and NFSV-S (blue line), respectively, and the ensemble spread for the CTL (green column), NFSV-L (red column), and NFSV-S (blue column) experiments, respectively, and (right) consistency for the (a, b) 250-hPa, (c, d) 500-hPa, (e, f) 850-hPa, and (g, h) 10-m zonal winds, varying with forecast hour. The results are the monthly averages for the 0000 UTC cycle during May 2020.
In summary, compared to the CTL experiment, the NFSV-L and NFSV-S experiments improve the ensemble spread and the consistency, which implies that the forecast uncertainty is better represented by the ensemble spread and that the NFSV-perturbed experiments can produce a more reliable EPS. However, the ensemble spread of both NFSV-perturbed experiments is still below the corresponding RMSE for all verified variables. This indicates that, although the NFSV-perturbed experiments lead to certain improvements, the under-dispersion has not been completely resolved, and further improvements are needed in the future.
4.2.2. The CRPS
The CRPS, which measures the mean absolute error between the forecast probability and observations, is also applied to evaluate the probabilistic performance (Hersbach, 2000). Figure 8 shows the CRPS for zonal wind and temperature at four levels. As shown in Fig. 8, the NFSV-L and NFSV-S experiments exhibit generally lower CRPS than the CTL experiment for all the verified variables, which implies that the NFSV-perturbed experiments reduce the mean error in the ensemble system. Additionally, the NFSV-L experiment exhibits a slightly lower or similar CRPS relative to the NFSV-S experiment for both the zonal wind and temperature, indicating a relatively higher probabilistic forecasting skill for the NFSV-L experiment than for the NFSV-S experiment.
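For a finite ensemble, the CRPS at one point can be estimated from the members directly. The sketch below uses the standard empirical-distribution form, mean |x_i − y| minus half the mean |x_i − x_j|; this is an assumption about the estimator and may differ in detail from the Hersbach (2000) decomposition used for the figures.

```python
import numpy as np

def crps_ensemble(members, obs):
    """CRPS of an ensemble forecast for a single scalar observation."""
    x = np.asarray(members, dtype=float)
    # Mean absolute error of the members against the observation.
    error_term = np.mean(np.abs(x - obs))
    # Half the mean absolute difference between all member pairs.
    spread_term = 0.5 * np.mean(np.abs(x[:, None] - x[None, :]))
    return float(error_term - spread_term)
```

For a single member the second term vanishes and the score reduces to the absolute error, which is the sense in which the CRPS generalizes the mean absolute error to probabilistic forecasts.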
Fig. 7. As in Fig. 6, but (a), (b) for 250-hPa, (c), (d) for 500-hPa, (e), (f) for 850-hPa, and (g), (h) for 2-m temperature.
4.2.3. Outliers
The outlier score, which indicates the frequency at which an observed event falls outside the ensemble envelope, is shown in Fig. 9. In comparison with the CTL experiment, the NFSV-L and NFSV-S experiments exhibit generally lower outlier scores for both the zonal wind and temperature, although the improvement for the 500-hPa zonal wind (Fig. 9c) and 500-hPa temperature (Fig. 9d) is less pronounced. Additionally, the outlier scores of the NFSV-L experiment are lower than those of the NFSV-S experiment, except for the 500-hPa zonal wind (Fig. 9c) and 850-hPa temperature (Fig. 9f), which implies that the NFSV-L experiment reduces the outliers more than the NFSV-S experiment. The differences in outlier scores between the NFSV-S and CTL experiments, as well as between the NFSV-L and CTL experiments, are statistically significant at the 99.9% level (t-test) for 250-hPa temperature (Fig. 9b), 850-hPa zonal wind (Fig. 9e), 850-hPa temperature (Fig. 9f), 10-m zonal wind (Fig. 9g), and 2-m temperature (Fig. 9h), and at the 95% level (t-test) for 250-hPa zonal wind (Fig. 9a), 500-hPa zonal wind (Fig. 9c), and 500-hPa temperature (Fig. 9d).
Fig. 8. The CRPS for the (a) 250-hPa, (c) 500-hPa, (e) 850-hPa, and (g) 10-m zonal wind, and (b) 250-hPa, (d) 500-hPa, (f) 850-hPa, and (h) 2-m temperature. The results are the monthly averages for the 0000 UTC cycle during May 2020.
Fig. 9. Outlier scores for (a) 250-hPa, (c) 500-hPa, (e) 850-hPa, and (g) 10-m zonal wind and (b) 250-hPa, (d) 500-hPa, (f) 850-hPa, and (h) 2-m temperature varying with forecast hour. The results are the monthly averages for the 0000 UTC cycle during May 2020.
Fig. 10. Evolution of DTE (in m² s⁻²) between the NFSV-perturbed experiments and the CTL experiment at (a) 1000 hPa, (b) 850 hPa, (c) 700 hPa, and (d) 500 hPa. The results are the monthly averages for the 0000 UTC cycle during May 2020.
Overall, all the verification results for upper-air and surface weather variables shown above indicate that the NFSV-perturbed experiments can effectively improve the probabilistic forecasting skill compared with the CTL experiment, and the NFSV-L experiment exhibits a generally better probabilistic forecasting performance and skill compared with the NFSV-S experiment. This indicates that using a larger perturbation magnitude constraint is better for the forecasting of the upper-air and surface weather variables.
With the aim of further assessing and quantifying the forecast divergence between the NFSV-perturbed and CTL experiments, the domain-integrated DTE (m² s⁻²) is calculated, as defined by Zhang et al. (2002):
$$\mathrm{DTE} = \frac{1}{2}\sum\left(U'^2 + V'^2 + \frac{c_p}{T_r}T'^2\right),$$
where $c_p$ = 1004.9 J kg⁻¹ K⁻¹ denotes the heat capacity at constant pressure, $T_r$ is a reference temperature with a value of 270 K, and $U'$, $V'$, and $T'$ represent the differences between the CTL and NFSV-perturbed experiments for the zonal wind, meridional wind, and temperature at each grid point, respectively; the sum is taken over the grid points.
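A minimal sketch of the DTE calculation follows, assuming the commonly used form DTE = 0.5 Σ (U'² + V'² + (cp/Tr) T'²) summed over grid points, with the cp and Tr values quoted above. The function name `difference_total_energy`, the flat-list layout of the fields, and the sample values are assumptions for illustration only.

```python
CP = 1004.9    # heat capacity at constant pressure (J kg^-1 K^-1), from the text
T_REF = 270.0  # reference temperature (K), from the text


def difference_total_energy(u_a, v_a, t_a, u_b, v_b, t_b):
    """Sum 0.5 * (U'^2 + V'^2 + (cp/Tr) * T'^2) over paired grid-point values.

    Each argument is a flat list of grid-point values; the `_a` and `_b`
    suffixes denote the two experiments being compared.
    """
    kappa = CP / T_REF
    total = 0.0
    for ua, va, ta, ub, vb, tb in zip(u_a, v_a, t_a, u_b, v_b, t_b):
        du, dv, dt = ua - ub, va - vb, ta - tb
        total += 0.5 * (du * du + dv * dv + kappa * dt * dt)
    return total


# Two hypothetical grid points: perturbed run (a) versus control run (b).
dte = difference_total_energy(
    [11.0, 12.0], [0.0, 3.0], [280.0, 282.0],
    [10.0, 12.0], [0.0, 1.0], [280.0, 281.0],
)
print(dte)
```

The factor cp/Tr scales a 1-K temperature difference to roughly the same energy as a 1.9 m s⁻¹ wind difference, so both wind and temperature differences contribute to the total.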
Figure 10 shows the evolution of the DTE between the NFSV-S experiment and CTL experiment, as well as the DTE between the NFSV-L experiment and CTL experiment, to represent the impact of introducing the NFSV perturbation on energy. In general, the NFSV-L experiment has a greater impact on energy than NFSV-S at the four levels, indicating that the NFSV perturbation with a larger magnitude generally exerts a greater impact on energy. Additionally, when comparing the energy changes for different verified levels, we find that the NFSV perturbation has a greater impact on the energy at lower levels than on the energy at upper levels (DTE at other levels exhibited similar behavior to that at the four selected levels, and is thus not shown here).
Figure 11 shows the horizontal distributions of DTE between the NFSV-perturbed experiments and the CTL experiment. In general, the NFSV-L experiment (right column) has a greater impact on DTE than the NFSV-S experiment (left column) for almost all lead times, except for the 12-h forecast (Figs. 11c, d), which implies that larger nonlinear NFSV perturbations will generally cause larger changes in energy. Additionally, comparing the DTE of the four forecast lead times (Figs. 11a–h), we also find that the influence of NFSV on energy shows a diurnal variation: for the NFSV-S experiment (Figs. 11a, c, e, g), DTE gradually increases during the first 6–12 h and gradually decreases after the 12-h lead time. The same is true for the NFSV-L experiment (Figs. 11b, d, f, h). This diurnal variation in DTE may be attributed to the diurnal changes in zonal wind and temperature.
The 2-Dimensional Discrete Cosine Transform (2D-DCT) method (Denis et al., 2001) is used to decompose the NFSV forcing for the zonal and meridional wind tendency, temperature tendency, and humidity tendency at different levels to investigate the scale characteristics of NFSV forcing and to evaluate at which scales the NFSV forcing is acting. The 2D-DCT method is widely used in limited-area models for spectral decomposition of two-dimensional atmospheric fields. It can generate a set of spectral coefficients whose spatial scales are related to the wavenumber or wavelength (Zhang et al., 2016). The spectral components of NFSV forcing for the U-tendency, V-tendency, T-tendency, and Q-tendency at 500 hPa, 700 hPa, and 850 hPa are shown in Fig. 12. The power of the NFSV forcing increases significantly at scales ranging from 10 km to 100 km for all tendencies, and in general, the power of NFSV forcing at wavelengths greater than 100 km is greater than that at wavelengths less than 100 km for all tendencies. More importantly, the peak of the power spectra is roughly located between 100 km and 500 km for different tendencies, indicating that the NFSV forcing has a greater impact on the intermediate (meso-) scale component between 100 km and 500 km for the zonal and meridional wind, temperature, and humidity tendencies in the GRAPES limited-area model.
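To make the link between DCT coefficients and spatial scale concrete, the sketch below computes an orthonormal 2-D DCT-II with deliberately naive loops and assigns each mode (m, n) a wavelength λ = 2Δ/α, where α = sqrt((m/Ny)² + (n/Nx)²) is the normalized wavenumber and Δ is the grid spacing. The function names, the per-mode (unbinned) output, and the test field are assumptions for illustration; the binning into wavelength bands used for Fig. 12 is not reproduced here. In practice, an optimized routine such as `scipy.fft.dctn` with `norm="ortho"` would replace the hand-written transform.

```python
import math


def dct2_ortho(field):
    """Orthonormal 2-D DCT-II of a rectangular list-of-lists field (naive loops)."""
    ny, nx = len(field), len(field[0])

    def basis(k, i, size):
        scale = math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size)
        return scale * math.cos(math.pi * k * (i + 0.5) / size)

    coeffs = [[0.0] * nx for _ in range(ny)]
    for m in range(ny):
        for n in range(nx):
            coeffs[m][n] = sum(
                field[j][i] * basis(m, j, ny) * basis(n, i, nx)
                for j in range(ny)
                for i in range(nx)
            )
    return coeffs


def variance_by_wavelength(field, dx_km):
    """Return (wavelength_km, variance) pairs for every non-mean DCT mode."""
    ny, nx = len(field), len(field[0])
    coeffs = dct2_ortho(field)
    spectrum = []
    for m in range(ny):
        for n in range(nx):
            if m == 0 and n == 0:
                continue  # the (0, 0) mode is the domain mean, not a wave
            alpha = math.sqrt((m / ny) ** 2 + (n / nx) ** 2)  # normalized wavenumber
            spectrum.append((2.0 * dx_km / alpha, coeffs[m][n] ** 2 / (nx * ny)))
    return spectrum


# Hypothetical 8 x 8 field on a 3-km grid containing a single cosine mode in x.
nx = ny = 8
field = [[math.cos(math.pi * 2 * (i + 0.5) / nx) for i in range(nx)] for _ in range(ny)]
peak = max(variance_by_wavelength(field, dx_km=3.0), key=lambda pair: pair[1])
print(peak)
```

Because the transform is orthonormal, the per-mode variances sum to the variance of the field, so the location of the largest values indicates which scales carry most of the forcing.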
Fig. 11. Horizontal distributions of (left column) DTE at 1000 hPa between the NFSV-S experiment and the CTL experiment for the (a) 6-h, (c) 12-h, (e) 18-h, and (g) 24-h forecasts, and (right column) DTE at 1000 hPa between the NFSV-L experiment and the CTL experiment at (b) 6-h, (d) 12-h, (f) 18-h, and (h) 24-h forecasts. The results are the monthly averages for the 0000 UTC cycle during May 2020.
Fig. 12. Spectral components of NFSV forcing for U-tendency [(a), (e), (i)], V-tendency [(b), (f), (j)], T-tendency [(c), (g), (k)], and Q-tendency [(d), (h), (l)] at 500 hPa [(a)–(d)], 700 hPa [(e)–(h)], and 850 hPa [(i)–(l)], respectively.
To optimally represent the model uncertainty for CAEPSs according to their rapid nonlinear error growth characteristics and dynamics, a new nonlinear model perturbation technique has been developed in this study to provide a nonlinear representation of model errors in the GRAPES-CAEPS with a horizontal grid spacing of 3 km. Specifically, the NFSV was calculated by the PCA-PSO method to find an optimal tendency perturbation that has the largest nonlinear evolution during the forecast period; the calculated NFSV perturbation was then applied to the model’s physical tendency to construct a nonlinear model perturbation for CAEPS. Finally, the developed nonlinear NFSV-perturbed scheme was used to perturb the ensemble prediction members at each time step during the 24-h forecasting process. To test the performance and effectiveness of the newly developed nonlinear model perturbation scheme and the sensitivity to the perturbation magnitude constraint, three experiments (NFSV-S, NFSV-L, and CTL) were performed daily from 1 to 30 May 2020 for forecasts initialized at 0000 UTC over south China. Forecasts were integrated for 24 h. The NFSV-S and NFSV-L experiments used the NFSV to construct nonlinear model physical perturbations (but with different settings of the perturbation magnitude constraint), whereas in the CTL experiment, no model physical perturbation method was applied. To compare and assess the performance of the NFSV-perturbed experiments and the CTL experiment, various verification metrics were employed for precipitation and for upper-air and surface weather variables.
For precipitation verification, the NFSV-L and NFSV-S experiments were both characterized by better forecasting performance than the CTL experiment, which implies that the NFSV-perturbed experiments can improve precipitation forecasts. Moreover, the NFSV-S experiment generally performed better in forecasts for light precipitation, and the NFSV-L experiment performed better in forecasts for heavier rainfall events (i.e., above the 13-mm threshold for 6-h precipitation), which implies that, for different precipitation events, the perturbation magnitude constraint must be carefully selected.
Regarding verification of upper-air and surface weather variables, the NFSV-S and NFSV-L experiments yielded an overall improvement in probabilistic forecasting skill compared to the CTL experiment. The verification metrics employed (ensemble spread, RMSE, spread-error consistency, CRPS, and outliers) were all improved for almost all lead times, indicating that the NFSV-perturbed experiments can improve the forecasts. Additionally, the NFSV-L experiment achieved better performance than the NFSV-S experiment for almost all verified metrics and variables, as it was characterized by higher spread and consistency, as well as lower outlier scores and CRPS. The NFSV-L experiment exhibited similar or slightly higher RMSE for zonal wind relative to the NFSV-S experiment, but the RMSE of neither experiment exceeded that of the CTL experiment. This indicates that, although increasing the magnitude of the NFSV perturbation may have caused a slight increase in RMSE, the increase was within an acceptable range, and, importantly, even where RMSE slightly increased, the consistency was still improved. Therefore, the NFSV-L experiment produced more skillful forecasts than the NFSV-S experiment in the verification for upper-air and surface weather variables, and we can conclude that using a larger perturbation magnitude constraint can produce better forecasts for upper-air and surface weather variables.
The analysis of difference total energy between the NFSV-perturbed and CTL experiments indicated that the NFSV perturbation had a greater impact on the energy at lower levels than on the energy at upper levels, and the NFSV perturbation with a larger magnitude generally exerted a greater impact on energy. The influence of NFSV on energy showed a diurnal variation. Additionally, the spectral analysis of the NFSV forcing showed that NFSV forcing generally had a greater impact on the intermediate (meso-)scale components in the GRAPES-CAEPS.
In summary, we can conclude from the above verification results that the NFSV-perturbed experiments can improve the convective-scale ensemble forecasts, and the NFSV-perturbed method may be used as an effective nonlinear model perturbation approach for representing model uncertainties in the CAEPSs. Additionally, attention should be paid to carefully selecting the perturbation magnitude constraint for different precipitation events.
It should be mentioned that solving for the NFSV is computationally expensive, and the algorithm needs to be optimized further in the future. Furthermore, due to limited computational and energy resources, the domain over which the NFSV was verified in this study is relatively small, which may lead to weak baroclinic instabilities and less reliable ensembles. This limitation may be eased as available computational power increases, which may contribute to a better performance of NFSV and a more reliable ensemble. Overall, this study reveals the importance and benefits of considering the influence of nonlinearities in convective-scale systems, and it may provide guidance for the future design and development of model perturbation methods for CAEPSs.
Acknowledgements. The authors are grateful to the reviewers for their careful review and invaluable comments. The research was supported by the National Key Research and Development (R&D) Program of the Ministry of Science and Technology of China (Grant No. 2021YFC3000902).
APPENDIX
1. The area under the curve (AROC) is calculated as follows:
where the hit rates H(n) and false alarm rates F(n) can be estimated by approximating probabilities with observed frequencies:
2. Brier Score (BS) is calculated as follows:
3. The root-mean-square error (RMSE), the ensemble spread and the corresponding consistency are calculated using the following equations:
Consistency equals Spread divided by RMSE, where $R(i,j)$ represents the forecast result; $\bar{R}(i,j)$ represents the forecast ensemble mean result; $Q(i,j)$ represents the analysis data; $m$ and $n$ represent the model grid dimensions; $N$ represents the number of ensemble members; and mem indexes the ensemble members in the ensemble prediction system.
4. The continuous ranked probability score (CRPS) is calculated as follows:
$$\mathrm{CRPS} = \int_{-\infty}^{\infty}\left[K(x) - K_a(x)\right]^2\,\mathrm{d}x,$$
where $K(x)$ represents the forecast probability; $K_a(x)$ represents the observed frequency.
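As a hedged illustration of items 2 and 3 above, the sketch below computes the RMSE of the ensemble mean against the analysis, the ensemble spread, their ratio (the consistency), and the Brier score. The function names and the flat-list field layout are assumptions, as is the choice to divide the member variance by the ensemble size N rather than N - 1; these choices are not confirmed by the source.

```python
import math


def ensemble_mean(members):
    """Grid-point-wise mean of a list of equally sized member fields (flat lists)."""
    n_mem = len(members)
    return [sum(vals) / n_mem for vals in zip(*members)]


def rmse(members, analysis):
    """RMSE of the ensemble mean against the analysis."""
    mean = ensemble_mean(members)
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(mean, analysis)) / len(analysis))


def spread(members):
    """Root of the grid-averaged member variance about the ensemble mean.

    Dividing by the ensemble size (rather than size - 1) is an assumption.
    """
    mean = ensemble_mean(members)
    n_mem = len(members)
    total = sum((x - mu) ** 2 for field in members for x, mu in zip(field, mean))
    return math.sqrt(total / (n_mem * len(mean)))


def consistency(members, analysis):
    """Spread divided by RMSE, as stated in the Appendix."""
    return spread(members) / rmse(members, analysis)


def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


# Hypothetical two-member ensemble on a two-point grid.
members = [[1.0, 3.0], [3.0, 5.0]]
analysis = [2.0, 6.0]
print(rmse(members, analysis), spread(members), consistency(members, analysis))
print(brier_score([0.5, 1.0, 0.0], [1, 1, 0]))
```

A consistency close to 1 indicates that the spread matches the error of the ensemble mean; values below 1 point to an under-dispersive ensemble.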
Advances in Atmospheric Sciences, 2022, Issue 9