
    Characterizing the Relative Importance Assigned to Physical Variables by Climate Scientists when Assessing Atmospheric Climate Model Fidelity

    Advances in Atmospheric Sciences, 2018, No. 9

    Susannah M. BURROWS, Aritra DASGUPTA, Sarah REEHL, Lisa BRAMER, Po-Lun MA, Philip J. RASCH, and Yun QIAN

    Pacific Northwest National Laboratory, Richland, Washington 99354, USA

    1. Introduction

    A critical aspect of any climate modeling research is an evaluation of the realism, or fidelity, of the model's simulated climate through a careful comparison with observational data. For the purposes of this discussion, we define a climate model's "fidelity" broadly as the agreement of the simulated climate with the observed historical and present-day climate state, typically using a combination of satellite and ground-based observations, field campaign measurements, and reanalysis data products as primary sources of observational data. At climate modeling centers around the world, the development of a new model version is always followed by a calibration ("tuning") effort aimed at selecting values for model parameters that are physically justifiable and lead to a credible simulation of climate (Hourdin et al., 2017). Model tuning involves the completion of a large number of simulations with variations in parameters, input files, and other features of the model. Each simulation is painstakingly evaluated, typically by examining a set of priority metrics, accompanied by manual inspection of a variety of plots and visualizations of various modeled fields, and detailed comparisons to determine which model configuration produces a credible realization of the climate. Tuning one coupled climate model requires thousands of hours of effort by skilled experts. Experts must exercise judgment, based on years of training, experience, and broad and deep understanding of the model, the physical climate system, and observational constraints, in determining which trade-offs are defensible when different optimization goals conflict.

    Comparisons of model fidelity across multiple model simulations are also carried out in multi-model intercomparison projects (e.g., Gleckler et al., 2008; Reichler and Kim, 2008), and in perturbed parameter ensemble experiments for the purpose of quantifying model uncertainty or sensitivities (Yang et al., 2013; Qian et al., 2015, 2016). Such studies aim to understand what factors lead to inter-model diversity and drive model sensitivities, and to identify potential improvements. Additionally, if an adequate single metric of overall climate model fidelity could be developed, it could be applied to construct weighted averages of climate simulation ensembles (Min and Hense, 2006; Suckling and Smith, 2013), and used in automatic parameter optimization algorithms (Zhang et al., 2015).

    Early efforts to characterize multi-variable climate model fidelity computed an index of climate model fidelity by calculating a normalized root-mean-square error or similar metric for each of a selected set of model variables, and then averaging these metrics over all variables (Gleckler et al., 2008; Reichler and Kim, 2008). More nuanced objective methods have been proposed to account for the inherent variability in each field (Braverman et al., 2011), and for spatial and temporal dependencies between variables (Nosedal-Sanchez et al., 2016).
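    An equal-weight index of this kind can be sketched as follows (a minimal illustration, not the exact procedure of Gleckler et al. (2008); normalizing each RMSE by the standard deviation of the observed field, and the field names, are assumptions made for the example):

```python
import numpy as np

def normalized_rmse(model, obs):
    """RMSE of a model field against observations, normalized here by
    the standard deviation of the observed field (one common choice)."""
    rmse = np.sqrt(np.mean((model - obs) ** 2))
    return rmse / np.std(obs)

def fidelity_index(model_fields, obs_fields):
    """Equal-weight average of the normalized RMSE over all variables,
    in the spirit of the early multi-variable indices described above."""
    return float(np.mean([normalized_rmse(model_fields[v], obs_fields[v])
                          for v in obs_fields]))
```

    A perfect model, identical to the observations in every field, yields an index of zero; larger values indicate poorer fidelity.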

    These objective methods characterize how closely models resemble observations of specific variables with an increasing degree of sophistication. Nevertheless, in all such approaches, expert judgement is exercised in the selection of which variables to include. In addition, in most previous studies, an implicit decision was made to treat all variables as being of equal physical importance. By contrast, when experts evaluate model fidelity, their decision-making implicitly incorporates their understanding of the physical importance of specific variables to the science questions they are interested in, and more emphasis is placed on the most physically relevant variables. Recent studies have emphasized that the selection of assessed variables should reflect physical understanding of the system under consideration (Knutti et al., 2017) and that different research teams may select different optimization criteria when weighting model ensemble members, depending on their goals (Herger et al., 2017).

    A potential path forward is to construct a fidelity index I that combines multiple metrics m_i, each characterizing a different aspect of model fidelity, weighted by their relative importance w_i:

        I = \sum_i w_i m_i , \qquad \sum_i w_i = 1 .

    However, since the relative "importance" of different optimization goals is inherently subjective, any such index, including one in which all w_i are equal, will be susceptible to criticism that the chosen weights are arbitrary.
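    For illustration, a weighted index of this form can be sketched as follows (the metric and weight names are hypothetical; dividing by the sum of the weights means the weights need not be pre-normalized, and the index reduces to a plain average when all w_i are equal):

```python
def weighted_fidelity_index(metrics, weights):
    """Combine per-variable fidelity metrics m_i into a single index
    I = sum_i w_i * m_i / sum_i w_i.  With equal weights this reduces
    to the simple average used in early multi-variable indices."""
    total = sum(weights.values())
    return sum(weights[v] * metrics[v] for v in metrics) / total
```

    Choosing the values of w_i is precisely the subjective step discussed above; the code only makes the dependence on those choices explicit.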

    Since expert judgement cannot be fully eliminated from the model evaluation process, we propose that it would be valuable to better understand and quantify the relative importance climate modelers assign to different aspects of model fidelity when making decisions about trade-offs. In addition, we believe it is important to quantify the degree to which consensus exists about the importance of such variables. In the longer term, we envision that this information can be used to develop metrics that quantify both the mean and the variability of the community's judgements about climate model fidelity.

    This paper reports on our first step towards this long-term goal: the establishment of a baseline understanding of the level of importance that experts explicitly state they assign to different variables when evaluating the mean climate state of the atmosphere of a climate model. To this end, we conducted a large international survey of climate model developers and users, and asked them to indicate their view of the relative importance of a subset of variables used in assessing model fidelity, in the context of particular scientific goals. The specific aims of this study are to: (1) quantify the extent of consensus among climate modelers on the relative importance of different variables in evaluating climate models; (2) document whether modelers adjust their importance weights depending on the scientific purpose for which a model is being evaluated; (3) determine whether either importance rankings or degree of consensus vary as a function of an individual's experience or domain of expertise; and (4) provide baseline information for a planned follow-up study, a mock model evaluation exercise. In the follow-up study, described in more detail in section 4, we will investigate whether experts' assessments of models, on the basis of plots and metrics describing model–observation comparisons, are consistent with the relative importance that these experts previously assigned to individual variables for the assessment of model fidelity, with respect to specific science goals.

    We describe the present study in the following sections. Section 2 describes the design of the survey, recruitment of participants, and methods used in analyzing survey responses. Section 3 describes the results of the survey, including the distribution of importance rankings, degree of consensus, dependence of responses on the specific science questions and respondents' level of experience, and perceived barriers to systematic quantification of climate model fidelity. Section 4 discusses a potential approach to synthesizing expert assessments of model fidelity and objective methods for fidelity assessment, by systematically measuring and explicitly accounting for the relative importance experts assign to different aspects of fidelity. Finally, section 5 summarizes the key points and conclusions from this study.

    2. Survey design and methods

    2.1. Survey aims, design and scope

    We conducted a large international survey to document and understand the expert judgments of the climate modeling community on the relative importance of different model variables in the evaluation of simulation fidelity.

    To keep the scope of this study focused, we only considered the evaluation of the annual mean climatology of an atmosphere-only model simulation with prescribed SST. In addition, participants were asked to assume that their evaluation would be carried out only on the basis of scalar metrics (e.g., RMSE, correlation) characterizing the agreement of the respective model field with observations.

    Transient features of climate were intentionally excluded from this study, but they are of critical importance in model evaluation and should be explored in future work. Similarly, coupled climate models have more complex tuning criteria that are not considered here.

    We chose to limit the number of variables and criteria under consideration in order to encourage broader participation, and in anticipation of a planned follow-up study (described in more detail in section 4). Briefly, the follow-up study will invite experts to compare and evaluate climate model outputs, and will aim to infer the importance that experts implicitly assign to different aspects of model fidelity in conducting this assessment. To the best of our knowledge, this would be the first attempt to experimentally characterize expert evaluations of climate model fidelity, and so we aim to initially test the approach using a small number of key variables, which will allow for a more controlled study. The relative importance ratings and other input from experts reported in this study will both inform the design of the follow-up study and provide a priori values for Bayesian inference of the weights w_i.

    The importance of a particular variable in model evaluation will depend on the purpose for which the model will be used. To better constrain the responses, as well as to explore how expert rankings of different model variables might change depending on the scientific objectives, we asked participants to rate the importance of different variables with respect to several different "Science Drivers". A list of the six Science Drivers used in this survey is shown in Table 1. For each Science Driver, participants were presented with a preselected list of variables thought to be relevant to that topic, and asked to rate the importance of each variable on a seven-point Likert scale from "Not at all Important" to "Extremely Important". Participants were also invited to provide written feedback identifying any "very important" or "extremely important" variables that they felt had been overlooked; many took the opportunity to provide these comments, which are summarized in Tables S1–S3 (see Electronic Supplementary Material). This feedback will be used to improve the survey design in the follow-up study.

    Table 1. Science Driver (SD) questions posed in this survey.

    2.2. Survey recruitment, participation, and data screening

    The survey was distributed via several professional mailing lists targeting communities of climate scientists, especially model developers and users, and by directly soliciting input from colleagues through the professional networks of the authors of this paper. Due to privacy restrictions, we are unable to report the identities or geographic locations of survey respondents, but we are confident that they are representative of the climate modeling community. The survey was open from 18 January 2017 to 25 April 2017. Participants who had not completed at least all items on the first Science Driver (N = 12), and participants who rated themselves as "not at all experienced" with evaluating model fidelity (N = 7), were excluded from analysis. Of the remaining 96 participants, 81 had completed all six Science Drivers.

    Our survey respondents were a highly experienced group, with the vast majority of participants rating themselves as either "very familiar" (40.6%) or "extremely familiar" (40.6%) with climate modeling. In addition, a large fraction of our participants had worked in climate modeling for many years, with the majority of participants (62) reporting at least 10 years' experience, and a substantial number of participants (31) reporting at least 20 years' experience with climate modeling. When asked to rate their experience in "evaluating the fidelity of the atmospheric component of global climate model simulations," 37.5% rated themselves as "very experienced," and 20.8% as "moderately experienced" in "tuning/calibrating the atmospheric component of global climate model simulations". An overview of the characteristics of the survey participants is shown in Fig. 1.

    2.3. Formal consensus measure: Coefficient of Agreement (A)

    To quantify the degree of consensus among our participants, we employ a formal measure of consensus called the coefficient of agreement, A (Riffenburgh and Johnstone, 2009), which varies from values near 0 (no agreement; random responses) to a maximum possible value of 1 (complete consensus). Calculated values of A for the two experience groups, and the probability p that they are significantly different from each other, are tabulated for all Science Drivers and variables in Supplementary Tables S4–S6.

    The coefficient of agreement is calculated from the observed disagreement, d_obs, and the expected disagreement under the null hypothesis of random responses, d_exp. Let r_max denote the number of possible options (7 on the Likert scale used here); let r = 1, ..., r_max denote the possible responses (r = 7 is "Extremely important", r = 6 is "Very important", and so on); let n_r denote the number of respondents choosing the rth option; and let r_med denote the median value of r over all respondents. The observed disagreement is then calculated as

        d_obs = \sum_{r=1}^{r_max} n_r |r_med - r| ,

    where |r_med - r| is the weight for the rth choice. The expected disagreement, under the null hypothesis that all N respondents choose among the r_max options uniformly at random, is calculated as

        d_exp = (N / r_max) \sum_{r=1}^{r_max} |r_med - r| .

    The coefficient of agreement A is then calculated as the complement of the ratio of observed to expected disagreement:

        A = 1 - d_obs / d_exp .

    For randomly distributed responses, d_obs would be close to d_exp and A would be close to zero, while for perfect agreement, d_obs = 0 and A = 1.
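    A minimal sketch of this calculation (assuming integer-coded Likert responses 1 to r_max and the uniform-response null hypothesis described above):

```python
import numpy as np

def coefficient_of_agreement(responses, r_max=7):
    """Coefficient of agreement A = 1 - d_obs / d_exp for Likert
    responses coded 1..r_max."""
    responses = np.asarray(responses)
    n = len(responses)
    r = np.arange(1, r_max + 1)
    weights = np.abs(np.median(responses) - r)              # |r_med - r|
    counts = np.array([(responses == k).sum() for k in r])  # n_r
    d_obs = (counts * weights).sum()
    d_exp = (n / r_max) * weights.sum()                     # uniform null
    return 1.0 - d_obs / d_exp
```

    Unanimous responses give A = 1, while responses spread uniformly over all seven options give A = 0.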

    Fig. 1. Characteristics of survey participants.

    Because the value of A is sensitive to the total number of respondents N, values of A are not comparable between subgroups of participants with different sizes. We performed additional significance testing to determine whether the degree of consensus was the same, or different, between our "high experience" and "low experience" groups, and/or between two survey drivers.

    We test for statistically significant differences between the coefficients of agreement of two groups of responses, A_1 and A_2, by performing a randomization test with the null hypothesis H_0: A_1 = A_2. To perform this test, we take l = 1, ..., 100 random draws, without replacement, from the two groups of survey responses. For each lth draw, we calculate the difference in the coefficient of agreement between the two groups, d_l = |A_{1l} - A_{2l}|. We then calculate the p-value for rejection of the null hypothesis, i.e., the probability that a difference in agreement larger than the observed mean could occur by chance:

        p = (1 / 100) \sum_{l=1}^{100} \mathbf{1}( d_l > d_{l,mean} ) ,

    where d_{l,mean} is the mean of all d_l.
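    A randomization test of this general kind can be sketched as follows (a simplified illustration: here the p-value is computed as the fraction of randomized differences at least as large as the observed difference, a common variant that may differ in detail from the procedure above; `coeff` stands in for any agreement statistic, such as the coefficient of agreement):

```python
import numpy as np

def randomization_test(group1, group2, coeff, n_draws=100, seed=0):
    """Randomization test for H0: the agreement statistic is the same
    in both groups.  Pools the responses, repeatedly re-splits them at
    random (without replacement), and returns the fraction of
    randomized differences d_l at least as large as the observed one."""
    rng = np.random.default_rng(seed)
    observed = abs(coeff(group1) - coeff(group2))
    pooled = np.concatenate([group1, group2])
    n1 = len(group1)
    d = np.empty(n_draws)
    for l in range(n_draws):
        perm = rng.permutation(pooled)
        d[l] = abs(coeff(perm[:n1]) - coeff(perm[n1:]))
    return float(np.mean(d >= observed))
```

    A small p-value indicates that a difference as large as the observed one is unlikely to arise by chance under the null hypothesis.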

    3. Survey results and discussion

    Here we report on selected analyses and results from the survey. We focus primarily on: (1) the degree of consensus among experts on the importance of different model variables; (2) how responsive experts' assessments of variable importance are to the defined scientific objectives; and (3) differences in expert ratings of variable importance between respondents with more climate modeling experience and those with less experience.

    We also performed similar analyses comparing survey responses from model users and model developers. The responses of these two groups were statistically nearly identical, so we do not report them in further detail.

    3.1. Importance of different variables to climate model fidelity assessments across six Science Drivers

    In this section, we discuss expert ratings of variable importance for the six Science Drivers. In order to understand whether participants' responses differed depending on their degree of expertise, we first divided the participants into two experience groups: those who rated themselves as "very experienced" in evaluating model fidelity were placed into the "high experience" group (N = 36); all other participants were placed into the "low experience" group (N = 60).

    We emphasize that our "low experience" group consists largely of working climate scientists over the age of 30 (95%), with a median of 10 years of experience in climate modeling. In other words, our "low experience" group consists mostly not of laypersons, students, or trainees, but of early-to-mid-career climate scientists with moderate levels of experience in evaluating and tuning climate models. Our "high experience" group consists largely of mid-to-late-career scientists: the majority are over the age of 50 (53%), with a median of 20.5 years of experience in climate modeling. Researchers on the development of expertise have argued that roughly 10 years of experience are needed for the development and maturation of expertise (Ericsson, 1996); 86% of our "high experience" group members have 10 or more years of climate modeling experience.

    3.1.1. Science Driver 1: How well does the model reproduce the overall features of the Earth's climate?

    Our first Science Driver asked respondents to assess the importance of different variables to "the overall features of Earth's climate". We believe that this statement summarizes the primary aim of most experts when calibrating a climate model. However, experts' typical practices are likely to be influenced by factors such as the tools and practices used by their mentors and immediate colleagues, their disciplinary background, and their research interests. Such factors could contribute to differences in judgments of what constitutes a "good" model simulation. The aim of this Science Driver is to understand what experts prioritize when the goal is relatively imprecisely defined as optimizing the "overall features" of climate; these responses can then be contrasted with the more specific questions in the following five Science Drivers.

    Figures 2 and 3 show the distribution of responses for each variable in Science Driver 1 for the high and low experience groups. Figure 4 (top) summarizes the mean and standard deviation of importance ratings for all variables in Science Driver 1. Overall, the variables most likely to be identified as "extremely important" were (in ranked order): rain flux (N = 31), 2-m air temperature (N = 28), longwave cloud forcing (N = 22), shortwave cloud forcing (N = 21), and sea level pressure (N = 20). The complete distributions of responses for all Science Drivers by experience group, together with statistical summary variables and significance tests, are shown in Tables S1–S13.

    Fig. 2. Science Driver 1: distributions of importance ratings, ranked by consensus as quantified by the coefficient of agreement A, for variables with high expert consensus about their importance.

    The distribution and degree of consensus are similar between the two groups, with no statistically significant differences for any variable (see Supplementary Tables S4–S6). This suggests that once an initial level of experience is acquired, additional experience may not lead to significant differences in judgments about model fidelity.

    Fig. 3. As in Fig. 2 but for variables with low expert consensus about their importance.

    It is instructive to examine which variables are the exceptions to this general rule; these exceptions hint at where and how greater experience matters most in informing the judgments experts make about model fidelity. The distributions of responses of the high experience and low experience groups differed for only one item in Science Driver 1, the oceanic surface wind stress (p < 0.01); for this variable, the median response of the high and low experience groups was "very important" and "moderately important," respectively. We speculate that the high-experience group may be more sensitive to this variable due to (1) its critical importance to ocean–atmosphere coupling, and (2) awareness of the relatively high-quality observational constraints available from wind scatterometer data.

    We also investigated the degree of consensus on the importance of different variables, and observe a clearly higher degree of consensus for some variables than for others. Across all participants (high and low experience groups together), there is a comparatively high degree of consensus on the importance of shortwave cloud forcing (A = 0.67), longwave cloud forcing (A = 0.62), and rain flux (A = 0.62). By contrast, there is comparatively little agreement on the importance of oceanic surface wind stress (A = 0.39), due to the discrepancy between experience groups on this item, and on the aerosol optical depth (AOD; A = 0.42). The data we collected do not allow us to be certain of the reasoning behind importance ratings, but the lack of consensus on the importance of AOD is perhaps unsurprising in light of the high uncertainty associated with the magnitude of aerosol impacts on climate (Stocker et al., 2013), and recent controversies among climate modelers on the importance of aerosols to climate, or lack thereof (Booth et al., 2012; Stevens, 2013; Seinfeld et al., 2016).

    3.1.2. Science Driver 2: How well does the model reproduce features of the global water cycle?

    Our second Science Driver included a comparatively limited number of variables related to the global water cycle (Fig. 4: middle). These should be considered in combination with Science Driver 6, which addresses the assessment of simulated clouds using a satellite simulator (Fig. 6).

    Fig. 4. Science Drivers 1–3: mean responses for the high and low experience groups, ranked by overall mean response from all participants; the color of the dots indicates the standard deviation of responses.

    While the differences did not pass our criteria for statistical significance, we note a slight tendency for the high experience group to assign higher mean importance ratings to net TOA radiative fluxes and precipitable water amount. We speculate that this might be due to a slightly greater awareness of, and sensitivity to, observational uncertainties among the high experience group, expressed as a higher importance rating for variables with stronger observational constraints from satellite measurements. This interpretation is supported by the comment of one study participant (with 20 years' experience in climate modeling), who observed that "surface LH [latent heating] and SH [sensible heating] are not well constrained from obs[ervations]. While important, that means they aren't much use for tuning."

    3.1.3. Science Driver 3: How well does the model simulate Southern Ocean climate?

    For Southern Ocean climate, surface interactions that affect ocean–atmosphere coupling, including wind stress, latent heat flux (evaporation), and rain flux, together with shortwave cloud forcing, were identified as among the most important variables by our participants (Fig. 4: bottom).

    Fig. 5. Science Drivers 4–5: mean responses for the high and low experience groups, ranked by overall mean response from all participants; the color of the dots indicates the standard deviation of responses.

    The high experience group rated rain fluxes as more important (median: "very" important) compared to the low experience group (median: "moderately" important; probability of difference: p = 0.02).

    It is interesting to compare the responses with Science Driver 1, which included many of the same variables. For instance, for AOD, the low experience group assigned a lower mean importance for overall climate (mean: 4.04; σ: 1.49) than for Southern Ocean climate (mean: 4.32; σ: 1.41), whereas the high experience group assigned a higher mean importance for overall climate (mean: 4.64; σ: 1.16) than for Southern Ocean climate (mean: 4.34; σ: 1.13).

    The reasons for this discrepancy are unclear. One possibility is that the high experience group may be more aware that, over the Southern Ocean, AOD provides a poor constraint on cloud condensation nuclei (Stier, 2016) and is affected by substantial observational uncertainties, with estimates varying widely between different satellite products.

    3.1.4. Science Driver 4: How well does the model simulate important features of the water cycle in the Amazon watershed?

    On Science Driver 4, which addresses the water cycle in the Amazon watershed (Fig. 5: top), participants identified surface sensible and latent heat flux, specific humidity, and rain flux as the most important variables for evaluation. It is possible that the more experienced group is more sensitive to the critical role of land–atmosphere coupling in the Amazonian water cycle. This interpretation would be consistent with the additional variables suggested by our survey participants for this Science Driver, which also focused on variables critical to land–atmosphere coupling, e.g., "soil moisture", "water recycling ratio", and "plant transpiration" (Supplementary Table S2). While the variables selected for the survey focused largely on mean thermodynamic variables, commenters also mentioned critical features of local dynamics in the Amazon region, such as surface topography and "wind flow over the Andes", "convection", and vertical velocity at 850 hPa.

    3.1.5. Science Driver 5: How well does the model simulate important features of the water cycle in the Asian watershed?

    For Science Driver 5, focused on the Asian watershed, participants rated rain flux, surface latent heat flux, and net shortwave radiative flux at the surface as the most important variables (Fig. 5: bottom). For variables included in both Science Drivers, the order of variable importance was the same as in the Amazon watershed, but different from that in the Southern Ocean; some of these differences will be discussed in section 3.3. Written responses again mentioned soil moisture (3×) and moisture advection (2×) as important variables missing from the list.

    3.1.6. Science Driver 6: How well does the model simulate the climate impact of clouds globally?

    The final Science Driver addressed the evaluation of cloud properties in the model (Fig. 6) using a satellite simulator, which produces simulated satellite observations and retrievals based on radiative transfer calculations in the model. "Very important" (6) was the most common response for all variables in Science Driver 6 (Supplementary Table S15).

    While differences in responses between the two experience groups did not pass our bar for statistical significance, the high experience group selected "extremely important" more frequently than the low experience group for the "high level cloud cover" and "low cloud cover" items, which also had the highest mean importance ratings in this Science Driver.

    Fig. 6. Science Driver 6: mean responses for the high and low experience groups, ranked by overall mean response from all participants; the color of the dots indicates the standard deviation of responses.

    Five participants indicated that longwave cloud forcing and shortwave cloud forcing should have been included, and one respondent noted: "A complete vertical distribution of cloud properties would be even more interesting than 'low', 'medium' and 'high' cloud cover. Cloud particle size and number would also be interesting." Another responded that "cloud fraction is a model convenience but is quite arbitrary."

    3.2. Impact of experience on judgments of variable importance

    We hypothesized that: (H1) respondents with less experience in climate modeling would differ from more experienced respondents in their judgments of relative variable importance; and (H2) respondents with greater experience in climate modeling would exhibit greater consensus in their judgments of the importance of different variables.

    (H1): Using a chi-squared significance test (details in the Supplementary Material), we find support for differences in the assessment of variable importance by the high and low experience groups, but only for certain selected variables. Compared to the low experience group, the high experience group rated ocean surface wind stress as more important to the evaluation of global climate (Science Driver 1) and rain flux as more important to the evaluation of Southern Ocean climate (Science Driver 3).
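    A chi-squared test of this kind can be sketched via the Pearson statistic for a contingency table of experience group versus response category (a generic illustration; the exact binning and test details used in the study are described in its Supplementary Material):

```python
import numpy as np

def chi_squared_stat(table):
    """Pearson chi-squared statistic for an observed contingency table
    (e.g., rows = experience groups, columns = binned Likert responses).
    Larger values indicate a stronger departure from independence."""
    table = np.asarray(table, dtype=float)
    expected = (table.sum(axis=1, keepdims=True)
                * table.sum(axis=0, keepdims=True) / table.sum())
    return float(((table - expected) ** 2 / expected).sum())
```

    The statistic would then be compared against a chi-squared distribution with (rows − 1) × (columns − 1) degrees of freedom to obtain a p-value.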

    Some other differences are observable between the two groups (see Supplementary Tables S10–S15), but they did not meet our criteria for significance; it is possible that additional differences would emerge if a larger survey population could be attained.

    (H2): We find no statistically significant differences in the degree of consensus between the high and low experience groups.

    The lack of large differences in responses between the high and low experience groups suggests that variations in importance ratings are mainly driven by factors unrelated to the amount of experience the scientists have. Examples could include the specific subdiscipline of the individual expert, or the practices and research foci that are common in their particular research community or geographic area. This result also suggests that expertise in climate model evaluation may reach a plateau after a certain level of proficiency is attained, with additional experience leading to only incremental changes in expert evaluations and judgments. One possible reason for this is that the process of model evaluation is constantly evolving as updated model versions incorporate additional processes and improvements, new observational datasets become available, and new tools are developed to support the evaluation process. As a result, climate scientists continually need to update their understanding of climate models and their evaluation to reflect the current state of the art. Another possible explanation is that the culture of the climate modeling community may promote an efficient transfer of knowledge, as more experienced scientists offer training and advice to less experienced colleagues and to other research groups, shortening the learning curve of new scientists entering the field.

    3.3. Impact of Science Drivers on judgments of variable importance

    We expected that survey participants would rate the importance of the same model variables differently depending on the science goals, and indeed this is what we found. In this section, we focus on the ratings from the high experience group, but results from the low experience group are similar.

    For instance, rain flux was rated as less important to the evaluation of the Southern Ocean (mean: 6.00; σ: 1.12) than to global climate (mean: 6.14; σ: 0.92) or the Asian watershed (mean: 6.32; σ: 1.00), while shortwave and longwave cloud forcing were rated as less important to the Asian watershed (shortwave: mean: 5.48; σ: 0.84; longwave: mean: 5.23; σ: 1.01) than to global climate (shortwave: mean: 5.89; σ: 1.02; longwave: mean: 5.78; σ: 1.02) or Southern Ocean climate (shortwave: mean: 5.63; σ: 0.86; longwave: mean: 5.56; σ: 0.90). Surface wind stress was rated more important in the Southern Ocean (mean: 5.84; σ: 1.30), and less important in the Asian watershed (mean: 5.10; σ: 1.33), compared to its importance to global climate evaluation (mean: 5.81; σ: 1.02). While total cloud liquid water path was rated as equally important in the Southern Ocean (mean: 5.09; σ: 1.10), Amazon watershed (mean: 5.06; σ: 1.29), and Asian watershed (mean: 5.13; σ: 1.13), total cloud ice water path was rated as less important to the evaluation of the model in the Amazon watershed (mean: 4.45; σ: 1.52) and Asian watershed (mean: 4.74; σ: 1.22) than in the Southern Ocean (mean: 5.03; σ: 1.13).

    These differences indicate that experts adjust the importance assigned to different metrics depending on the science question or region they are focusing on. As a result, we recommend that future work focused on understanding or quantifying expert judgments of model fidelity should always be explicit about the scientific goals for which the model under assessment will be evaluated.

    3.4. Perceived barriers to systematic quantification of model fidelity

    We also explored the community's perceptions about the current obstacles to systematic quantification of model fidelity (Fig. 7). Survey participants identified the lack of robust statistical metrics (28%) and lack of analysis tools (10%) as major barriers, with 17% selecting "all of the above".

    Fig. 7. Perceived barriers to systematic quantification of model fidelity. Answers were selected from a predetermined list in response to the prompt: "Which one among the following, do you feel, is the biggest barrier towards systematic quantification of model fidelity?"

    Many participants selected the option "Other" and contributed written comments. We grouped these into qualitative categories of responses. The most commonly identified issues related to:

    • Lacking or inadequate observational constraints and error estimates for observations (8×);

    • Laboriousness of the tuning process (7×); and

    • Challenges associated with identifying an appropriate single metric of model fidelity (7×).

    On the final point, many of the comments focused on the risk of oversimplifying the analysis and evaluation of models: "Focusing on single metrics oversimplifies the analysis too much to be useful. It is often hard to identify good vs. bad because one aspect works while others don't, and different models have different trade-offs." "No one metric tells the whole story; this may lead to false confidence in model fidelity." Another commenter noted that "it's very hard to create a single metric that accurately encapsulates subjective judgments of many scientists." Finally, several respondents noted other barriers, including a perceived lack of sufficient expertise in the community, a perception that some widespread practices are inadequate or inappropriate for model evaluation, and a lack of sufficient attention to model sensitivities, as opposed to calibration with respect to present-day mean climate.

    4. Prospects for synthesizing expert assessments and objective model fidelity metrics

    As discussed in section 1, there are many potential applications for a climate model index that summarizes the model's fidelity with respect to a particular science goal. However, one challenge is that an assessment of which models most resemble the observations depends in part on which observed variables are evaluated, and how much relative importance is assigned to each of them. A model fidelity index can be conceptualized as a weighted average of different objective metrics (Eq. 1), but different experts might reasonably make different choices in assigning values to the weights, resulting in models potentially being ranked differently by different experts, as illustrated in Fig. 8. Furthermore, the information that experts implicitly use, and the relative importance they assign to different aspects of the model's fidelity when evaluating actual model output, likely differ from their explicit statements about evaluation criteria. A systematic approach is needed to understand which information experts actually use in evaluating models, how much consensus exists among experts about variable importance when evaluating real model output, and how sensitive a proposed model fidelity index would be to differences in these judgments between experts.
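    To make the weighted-average formulation of Eq. (1) concrete, the short sketch below computes such an index in Python. The variable names, component scores, and weights are invented for illustration; they are not values from the study.

```python
# Sketch of a model fidelity index as a weighted mean of component
# metrics. All names and numbers below are hypothetical.

def fidelity_index(scores, weights):
    """Weighted mean of per-variable fidelity scores.

    scores  : dict mapping variable name -> objective metric (higher = better)
    weights : dict mapping variable name -> importance weight
    """
    total_w = sum(weights[v] for v in scores)
    return sum(weights[v] * scores[v] for v in scores) / total_w

# Hypothetical component scores for one model configuration
scores = {"rain_flux": 0.82, "swcf": 0.74, "lwcf": 0.69, "wind_stress": 0.77}

# Importance weights, e.g. mean survey ratings used directly as weights
weights = {"rain_flux": 6.14, "swcf": 5.89, "lwcf": 5.78, "wind_stress": 5.81}

print(round(fidelity_index(scores, weights), 3))
```

    With equal weights the index reduces to the plain mean of the component scores; the weights only matter when they differ across variables.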

    Fig. 8. Illustration of the concept of overall model fidelity rankings and their sensitivity to expert weights. Consider the pair of models uq1 and uq2, where the overall fidelity of the model is evaluated as a weighted mean of several component scores. If uq1 performs better than uq2 on some component scores, but worse on others, the ranking of these models according to their overall mean fidelity metric will be sensitive to how strongly each component metric is weighted. In this example, the rankings of several models using "naive weights" (unweighted average) are compared to rankings that use importance weights derived from the responses of two different experts in our survey.
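    The rank reversal that Fig. 8 illustrates can be reproduced with a toy calculation. The component scores and the two "expert" weight vectors below are hypothetical, chosen only so that the top-ranked model flips under different weightings.

```python
import numpy as np

# Component scores for two toy models across three metrics (higher = better).
scores = np.array([
    [0.90, 0.50, 0.80],   # uq1
    [0.60, 0.90, 0.75],   # uq2
])

weight_sets = {
    "naive":    np.array([1.0, 1.0, 1.0]),  # unweighted average
    "expert A": np.array([3.0, 1.0, 1.0]),  # emphasizes the first metric
    "expert B": np.array([1.0, 3.0, 1.0]),  # emphasizes the second metric
}

ranking = {}
for name, w in weight_sets.items():
    index = scores @ w / w.sum()            # weighted-mean fidelity index
    ranking[name] = "uq1" if index[0] > index[1] else "uq2"

print(ranking)   # the top-ranked model depends on which weights are used
```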

    The survey described in this paper represents a first step towards building that understanding. It also provides baseline information that will inform and be used in the analysis of a second planned study, in which experts will be invited to evaluate the output from real model simulations. This mock model assessment exercise will enable us to address additional questions, such as: (1) How much consensus exists among experts when evaluating the fidelity of actual model simulations (as opposed to assessing variable importance in the abstract)? (2) Can an index I_inferred be constructed by using experts' assessments of real model output to infer the weights w_i,inferred that they implicitly assign to fidelity of different model variables? (3) Do the weights w_i,inferred that are inferred from experts' assessments of real model output agree or disagree with the relative importance that experts assigned to different variables a priori, as reported in this study?
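    Question (2) amounts to an inverse problem. As a sketch under strong simplifying assumptions (a linear index, noise-free overall ratings, synthetic data), the implicit weights can be recovered from experts' overall ratings of several configurations by least squares:

```python
import numpy as np

# Hypothetical setup: an "expert" rates 20 mock model configurations, each
# described by 3 component scores, using a fixed (unknown to us) linear
# weighting. We then recover those weights from the ratings.
rng = np.random.default_rng(0)

true_w = np.array([0.5, 0.3, 0.2])       # weights the expert implicitly uses
X = rng.uniform(0, 1, size=(20, 3))      # component scores of 20 mock models
y = X @ true_w                           # the expert's overall ratings

# Least-squares estimate of the implicit weights
w_inferred, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w_inferred.round(3))
```

    Real expert ratings would be noisy and likely nonlinear, so a practical version would need regularization, non-negativity constraints, or a more flexible model; this sketch only shows the shape of the inference.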

    5. Summary and conclusions

    In this article we report results from a large community survey on the relative importance of different variables in evaluating a climate model's fidelity with respect to a particular science goal. We plan to use the results of this study to inform the development of a follow-up study in which experts are invited to evaluate actual model outputs.

    We show that experts' rankings are sensitive to the scientific objectives. For instance, surface wind stress was rated as among the most important variables in evaluation of Southern Ocean climate, and among the least important in evaluation of the Asian watershed. This suggests the possibility and utility of designing different and unique collections of metrics, tailored to specific science questions and objectives, while accounting explicitly for uncertainty in variable importance.

    We find no statistically significant differences between rankings provided by model developers and model users, suggesting some consistency between the developer and user communities' understanding of appropriate evaluation criteria. We also find that our "high experience" group, consisting mostly of senior scientists with many years of climate modeling experience, and our "low experience" group, consisting mostly of early- and mid-career scientists, were in agreement about the importance of most variables for model evaluation. However, within each group, there are also substantial disagreements and diversity in responses. The level of consensus is particularly low for AOD, which some participants rated as "extremely important" and others rated as "not at all important." Additionally, in our survey sample, greater experience with evaluating model fidelity was not associated with greater consensus about the importance of different variables in model evaluation, and led to only minor changes in estimates of variable importance, i.e., to small changes in the frequency distribution of importance ratings, which are only statistically significant for a small number of variables.
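    One simple way to test whether two groups' importance ratings differ significantly, in the spirit of the comparisons above, is a permutation test on the difference of group means. The ratings below are fabricated purely for illustration and do not come from the survey data.

```python
import numpy as np

# Fabricated 1-7 importance ratings for one variable from two groups.
rng = np.random.default_rng(1)
high = np.array([6, 5, 6, 7, 5, 6, 6, 5, 7, 6])   # "high experience" ratings
low  = np.array([5, 6, 5, 6, 6, 5, 7, 5, 6, 5])   # "low experience" ratings

observed = high.mean() - low.mean()
pooled = np.concatenate([high, low])

# Permutation test: how often does a random relabeling of the pooled
# ratings produce a mean difference at least as extreme as observed?
n_perm = 10000
count = 0
for _ in range(n_perm):
    perm = rng.permutation(pooled)
    diff = perm[:high.size].mean() - perm[high.size:].mean()
    if abs(diff) >= abs(observed):
        count += 1

p_value = count / n_perm
print(round(p_value, 3))   # a large p-value: no evidence of a group difference
```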

    It is important to note that when experts' responses on this survey differ, it does not necessarily imply that their evaluations of actual climate models would also differ. We anticipate that experts perform actual model evaluations in a more holistic manner and draw on much broader information than was included in this survey. In order to make initial progress on this extremely complex topic, we limited the scope of the study to evaluation of global mean climate, but the time-dependent behavior of the system is also critical to assess, as well as features of the coupled climate system. Future research should extend this approach to include evaluation of diurnal and seasonal cycles; multi-year modes of climate variability such as ENSO, QBO, and PDO; extreme weather events; frequency of extreme precipitation; and other time-dependent features of the climate system. Other, more complex metrics of model fidelity could also be considered, e.g., object-based verification approaches, and scale-aware metrics that would be robust to changes in model resolution.

    Several study participants noted that issues related to observational datasets continue to be a major challenge for model evaluation. This includes logistical issues, such as their availability through a centralized repository, in standardized formats, and in updated versions as new data become available. However, more fundamentally, the limitations of observational constraints continue to be a major obstacle, including the lack of observations of certain key model variables, and the lack of estimates of the observational uncertainty for many datasets. Climate model evaluation efforts could also benefit from the increased adoption of metrics and diagnostic visualizations that directly incorporate information on observational uncertainty and natural variability, providing greater transparency and richer contextual information to users of these tools.

    The labor-intensiveness of model evaluation efforts was noted by several survey participants, and is well known to most scientists familiar with climate model development. Climate modeling centers invest an enormous amount of computational and human resources in model tuning. At a rough estimate, tuning a coupled climate model requires the efforts of about five full-time equivalent (FTE) scientists and engineers for each major model component (atmosphere, ocean and sea-ice, and land), as well as five FTEs for the overall software engineering and tuning of the coupled system. An intense tuning effort for a new major version of a coupled climate model may last for about one year and be repeated every five years, for an average investment of four FTEs per year. Globally, there are at least 26 major climate modeling centers (the number that participated in the CMIP5 project: http://cmip-pcmdi.llnl.gov/cmip5/availability.html), of which five are located in the United States (DOE–ACME, NASA–GISS, NASA–GMAO, NCAR, NOAA–GFDL). Assuming that the typical cost to support a staff scientist at a climate modeling center is about $300 thousand per year (including salary, fringe, and overhead expenses), we estimate that the amount of money spent annually on the human effort involved in climate model tuning is roughly $6 million in the United States and $31.2 million globally.
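    The back-of-envelope cost figures above can be reproduced with a few lines of arithmetic; all inputs are the rough estimates stated in the text.

```python
# Reproducing the tuning-cost estimate from the text.

fte_per_component = 5
components = 3                   # atmosphere, ocean and sea-ice, land
coupled_overhead = 5             # coupled-system software engineering and tuning
tuning_fte = fte_per_component * components + coupled_overhead   # 20 FTEs

# A roughly one-year effort repeated every five years averages to:
avg_fte_per_year = tuning_fte / 5                                # 4 FTEs/yr

cost_per_fte = 300_000           # USD per year: salary, fringe, overhead
us_centers, global_centers = 5, 26

us_annual = avg_fte_per_year * cost_per_fte * us_centers          # $6 million
global_annual = avg_fte_per_year * cost_per_fte * global_centers  # $31.2 million
print(us_annual, global_annual)
```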

    If appropriate quantitative metrics can be developed that meaningfully capture the criteria important in a comprehensive model assessment, then algorithms could be applied to partially automate the calibration process, for instance by identifying an initial subset of model configurations that produce plausible climates, subject to further manual inspection by teams of experts. Further work is needed to assess the feasibility of such an approach; but if successful, similar approaches could be valuable in the development not only of global climate models, but also of regional weather models, large-eddy simulations, and other geophysical and complex computational models in which multiple aspects of fidelity must be assessed and weighed against each other.
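    A minimal sketch of the screening step described above: filter candidate configurations to those whose component skill scores all clear plausibility thresholds, leaving a short list for expert inspection. The metric names, scores, and thresholds here are invented for illustration.

```python
# Hypothetical automated screening of candidate model configurations.

def plausible(config_metrics, thresholds):
    """True if every component skill score meets its minimum acceptable value."""
    return all(config_metrics[k] >= thresholds[k] for k in thresholds)

# Invented minimum skill scores (higher = better agreement with observations)
thresholds = {"skill_precip": 0.6, "skill_swcf": 0.5, "skill_t2m": 0.7}

# Invented skill scores for three candidate configurations
candidates = {
    "cfg_a": {"skill_precip": 0.80, "skill_swcf": 0.70, "skill_t2m": 0.90},
    "cfg_b": {"skill_precip": 0.50, "skill_swcf": 0.90, "skill_t2m": 0.80},
    "cfg_c": {"skill_precip": 0.70, "skill_swcf": 0.60, "skill_t2m": 0.75},
}

# Only configurations passing every threshold go forward to manual review.
shortlist = [name for name, m in candidates.items() if plausible(m, thresholds)]
print(shortlist)   # cfg_b fails the precipitation threshold
```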

    We suggest that a closer integration of objectively computed metrics with expert understanding of their relative importance has the potential to dramatically improve the efficiency of the model calibration process. The concise variable lists and community ratings reported in this study provide a snapshot of current expert understanding of the relative importance of certain aspects of climate model behavior to their evaluation. This information will be informative to the broader climate research community, and can serve as a starting point for the development of more sophisticated evaluation and scoring criteria for global climate models, with respect to specific scientific objectives.

    Acknowledgements. The authors would like to express their sincere gratitude to everyone who participated in the survey described in this paper. While privacy restrictions prevent us from publishing their identities, we greatly appreciate the time that many busy individuals have taken, voluntarily, to contribute to this research. We would like to thank Hui WAN, Ben KRAVITZ, Hansi SINGH, and Benjamin WAGMAN for helpful comments and discussions that helped to inform this work. This research was conducted under the Laboratory Directed Research and Development Program at PNNL, a multi-program national laboratory operated by Battelle for the U.S. Department of Energy under Contract DE-AC05-76RL01830.

    Open Access This article is distributed under the terms of the Creative Commons Attribution License, which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

    Electronic supplementary material Supplementary material is available in the online version of this article at https://doi.org/10.1007/s00376-018-7300-x.

    REFERENCES

    Booth, B. B. B., N. J. Dunstone, P. R. Halloran, T. Andrews, and N. Bellouin, 2012: Aerosols implicated as a prime driver of twentieth-century North Atlantic climate variability. Nature, 484, 228–232, https://doi.org/10.1038/nature10946.

    Braverman, A., N. Cressie, and J. Teixeira, 2011: A likelihood-based comparison of temporal models for physical processes. Statistical Analysis and Data Mining: The ASA Data Science Journal, 4, 247–258, https://doi.org/10.1002/sam.10113.

    Ericsson, K., 1996: The Road to Expert Performance: Empirical Evidence from the Arts and Sciences, Sports, and Games. Lawrence Erlbaum Associates, 369 pp.

    Gleckler, P. J., K. E. Taylor, and C. Doutriaux, 2008: Performance metrics for climate models. J. Geophys. Res., 113, D06104, https://doi.org/10.1029/2007JD008972.

    Herger, N., G. Abramowitz, R. Knutti, O. Angélil, K. Lehmann, and B. M. Sanderson, 2018: Selecting a climate model subset to optimise key ensemble properties. Earth System Dynamics, 9, 135–151, https://doi.org/10.5194/esd-9-135-2018.

    Hourdin, F., and Coauthors, 2017: The art and science of climate model tuning. Bull. Amer. Meteor. Soc., 98, 589–602, https://doi.org/10.1175/BAMS-D-15-00135.1.

    Knutti, R., J. Sedláček, B. M. Sanderson, R. Lorenz, E. M. Fischer, and V. Eyring, 2017: A climate model projection weighting scheme accounting for performance and interdependence. Geophys. Res. Lett., 44, 1909–1918, https://doi.org/10.1002/2016GL072012.

    Min, S. K., and A. Hense, 2006: A Bayesian approach to climate model evaluation and multi-model averaging with an application to global mean surface temperatures from IPCC AR4 coupled climate models. Geophys. Res. Lett., 33, L08708, https://doi.org/10.1029/2006GL025779.

    Nosedal-Sanchez, A., C. S. Jackson, and G. Huerta, 2016: A new test statistic for climate models that includes field and spatial dependencies using Gaussian Markov random fields. Geoscientific Model Development, 9, 2407–2414, https://doi.org/10.5194/gmd-9-2407-2016.

    Qian, Y., and Coauthors, 2015: Parametric sensitivity analysis of precipitation at global and local scales in the Community Atmosphere Model CAM5. Journal of Advances in Modeling Earth Systems, 7, 382–411, https://doi.org/10.1002/2014MS000354.

    Qian, Y., and Coauthors, 2016: Uncertainty quantification in climate modeling and projection. Bull. Amer. Meteor. Soc., 97, 821–824, https://doi.org/10.1175/BAMS-D-15-00297.1.

    Reichler, T., and J. Kim, 2008: How well do coupled models simulate today's climate? Bull. Amer. Meteor. Soc., 89, 303–311, https://doi.org/10.1175/BAMS-89-3-303.

    Riffenburgh, R. H., and P. A. Johnstone, 2009: Measuring agreement about ranked decision choices for a single subject. The International Journal of Biostatistics, 5, https://doi.org/10.2202/1557-4679.1113.

    Seinfeld, J. H., and Coauthors, 2016: Improving our fundamental understanding of the role of aerosol-cloud interactions in the climate system. Proceedings of the National Academy of Sciences of the United States of America, 113, 5781–5790, https://doi.org/10.1073/pnas.1514043113.

    Stevens, B., 2013: Aerosols: Uncertain then, irrelevant now. Nature, 503, 47–48, https://doi.org/10.1038/503047a.

    Stier, P., 2016: Limitations of passive remote sensing to constrain global cloud condensation nuclei. Atmospheric Chemistry and Physics, 16, 6595–6607, https://doi.org/10.5194/acp-16-6595-2016.

    Stocker, T. F., and Coauthors, 2013: Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, 1535 pp, https://doi.org/10.1017/CBO9781107415324.

    Suckling, E. B., and L. A. Smith, 2013: An evaluation of decadal probability forecasts from state-of-the-art climate models. J. Climate, 26, 9334–9347, https://doi.org/10.1175/JCLI-D-12-00485.1.

    Yang, B., and Coauthors, 2013: Uncertainty quantification and parameter tuning in the CAM5 Zhang-McFarlane convection scheme and impact of improved convection on the global circulation and climate. J. Geophys. Res., 118, 395–415, https://doi.org/10.1029/2012JD018213.

    Zhang, T., L. Li, Y. Lin, W. Xue, F. Xie, H. Xu, and X. Huang, 2015: An automatic and effective parameter optimization method for model tuning. Geoscientific Model Development, 8, 3579–3591, https://doi.org/10.5194/gmd-8-3579-2015.
