
    Stroke Risk Assessment Decision-Making Using a Machine Learning Model: Logistic-AdaBoost

    2024-01-20 13:02:00

    Congjun Rao, Mengxi Li, Tingting Huang and Feiyu Li

    1 School of Science, Wuhan University of Technology, Wuhan, 430070, China

    2 Wuhan University of Technology Hospital, Wuhan University of Technology, Wuhan, 430070, China

    ABSTRACT Stroke is a chronic cerebrovascular disease that carries a high risk. Stroke risk assessment is of great significance in preventing, reversing and reducing the spread and the health hazards caused by stroke. Aiming to objectively predict and identify strokes, this paper proposes a new stroke risk assessment decision-making model named Logistic-AdaBoost (Logistic-AB) based on machine learning. First, the categorical boosting (CatBoost) method is used to perform feature selection for all features of stroke, and 8 main features are selected to form a new index evaluation system to predict the risk of stroke. Second, the borderline synthetic minority oversampling technique (SMOTE) algorithm is applied to transform the unbalanced stroke dataset into a balanced dataset. Finally, the stroke risk assessment decision-making model Logistic-AB is constructed, and the overall prediction performance of this new model is evaluated by comparing it with ten other similar models. The comparison results show that the new model proposed in this paper performs better than the two single algorithms (logistic regression and AdaBoost) on the four indicators of recall, precision, F1 score, and accuracy, and the overall performance of the proposed model is better than that of common machine learning algorithms. The Logistic-AB model presented in this paper can more accurately predict patients' stroke risk.

    KEYWORDS Stroke; risk assessment decision-making; CatBoost feature selection; borderline SMOTE; Logistic-AB

    1 Introduction

    As basic public services in a country, the quality and level of medical and health services have an important impact on people's health. The continuous improvement of medical and health services is also a key factor in promoting the sustainable development of the medical and health industry. The rise and development of artificial intelligence and big data have provided a strong boost to improving medical and health services. Combined with artificial intelligence and big data, personalized and intelligent assisted medical treatment and digital diagnosis technology not only improves diagnostic accuracy and efficiency but also reduces operating costs and increases economic benefits, promoting the sustainable development of the medical and health industry. Considering this, this paper is committed to studying intelligent assisted medical treatment and digital diagnosis technology for stroke and proposes a machine learning based method to assist the clinical diagnosis of stroke, providing more accurate, efficient and intelligent support for its clinical diagnosis.

    Stroke, also called cerebrovascular accident, is a type of disease in which brain tissue is damaged due to the sudden rupture or blockage of a blood vessel in the brain, preventing blood flow to the brain. Stroke, as a chronic noncommunicable disease, is very harmful, and its prevalence and mortality rate continue to rise. Stroke has become a serious health hazard worldwide. The incidence, recurrence, disability and mortality rates of stroke patients are very high, greatly reducing patient quality of life [1]. In China, two out of every five people die of cardiovascular disease. It is projected that approximately 330 million people currently suffer from cardiovascular disease, including 13 million stroke patients [2], who represent the second largest group among the total number of patients with cardiovascular diseases. Early prevention is very important because stroke is irreversible, not easy to cure, the cost of care is high, and the medical burden is increasing. However, many patients do not benefit from early treatment, usually because they do not know the symptoms of stroke, do not seek emergency treatment or do not receive an emergency response. There are many factors affecting the development of stroke, and some studies have pointed out that age, heart disease, diabetes mellitus, hypertension, sex, dyslipidemia and poor lifestyle habits are all factors contributing to an increased risk of stroke [3,4].

    There is no specific method for treating stroke, but we can accurately predict the risk of stroke and implement early prevention and early intervention. Thus, stroke risk assessment is of great significance in preventing, reversing and reducing the spread and health hazards of stroke. The early detection and prevention of stroke can accurately identify early and potential stroke patients in advance and accurately control their conditions in a targeted manner, effectively preventing the further development of stroke and improving the quality of life of patients. In addition, the early detection and prevention of stroke can effectively identify the main pathogenic factors of stroke for hierarchical management and early intervention in high-risk groups to reduce the risk of disease, which has important practical significance for the intelligent prevention and treatment of stroke.

    With the constant accumulation of medical data and the continuous development of machine learning algorithms, machine learning has entered the field of medicine, where large amounts of data provide training support for machine learning as well as new methods for discovering disease patterns [5]. Machine learning methods process data efficiently and mine it for hidden patterns. These excellent algorithmic features can find the source and related attributes of a disease better and faster, leading to disease diagnosis and prediction.

    Research on stroke risk assessment has been very intensive, but there are still some issues in the existing work. First, the evaluation indices for stroke risk assessment are not uniform enough, and there are no clear specifications, which easily leads to certain models performing well only when all of the evaluation indices are present. However, once some of the indices are missing or replaced, the assessment produced by the model will be greatly compromised. Second, there is a serious imbalance in the stroke dataset, and related studies have performed stroke risk assessment without providing a good solution to this problem. Some of the studies in the literature simply increase the sample size of the minority categories. However, when the sample sizes of two dataset categories differ greatly, simply increasing the sample size of the minority categories will lead to overfitting and is prone to generating spurious relationships. Other studies account for unbalanced datasets in ways that are too cumbersome, which improves the classification effect of the model but has little value in practical applications. Finally, there is the issue of the accuracy of the model's assessment. Related studies usually single out a certain machine learning algorithm for training and then compare the results with those of other machine learning algorithms to conclude that a certain algorithmic model is suitable for stroke risk assessment. Single algorithmic models are more or less flawed, which is caused by the algorithm itself and is difficult to avoid. Thus, assessing stroke incidence risk using these single algorithms may not be inaccurate, but the accuracy is not especially high either. Moreover, these models are usually compared with only 2-3 common machine learning algorithms and not with the rest of the algorithmic models, making the results not very convincing. Considering this, this paper aims to establish a stroke assessment model based on machine learning that can effectively reduce citizens' risk of stroke onset by using a number of methods to find the influencing factors associated with stroke and then constructing an integrated model to assess stroke risk. Compared to the literature, the main contributions of this paper are summarized as follows:

    (1) A new index system for stroke risk assessment is constructed. CatBoost is used to perform feature selection for all features of stroke, and the importance ranking of all features of stroke disease is determined. The index system screened by using the CatBoost feature selection method is not only representative but also more common, which makes it easier to generalize and adopt.

    (2) Borderline SMOTE is applied to transform the unbalanced stroke dataset into a balanced dataset, which overcomes the defect of fuzzy boundaries that arises after generating new samples with the SMOTE algorithm.

    (3) A new Logistic-AB model is developed to predict the risk of stroke. The model not only improves upon traditional logistic regression but also takes the output of AdaBoost as a reference to prevent obvious misclassification in logistic regression, which further improves classification. After comprehensive comparison with other models, the Logistic-AB model proposed in this paper is more predictive and more suitable for evaluating the risk of diseases.
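    As a library-free illustration of contribution (3), the "reference" idea can be sketched as a simple decision rule: follow the logistic regression probability unless AdaBoost disagrees with high confidence. This is a sketch under assumed interfaces (the probability, margin and threshold here are illustrative, not the paper's exact formulation):

```python
def logistic_ab_decision(logit_prob, ada_margin, margin_threshold=0.5):
    """Hypothetical combination rule: follow logistic regression unless
    AdaBoost disagrees with high confidence (|margin| above threshold)."""
    logit_label = 1 if logit_prob >= 0.5 else 0
    ada_label = 1 if ada_margin >= 0 else 0
    if ada_label != logit_label and abs(ada_margin) > margin_threshold:
        return ada_label  # AdaBoost output used as a corrective reference
    return logit_label

# A borderline logistic prediction is overridden by a confident AdaBoost margin
print(logistic_ab_decision(0.52, -1.3))  # -> 0
```

The point of such a rule is that the ensemble's confident votes patch the obvious mistakes of the single linear model, while borderline ensemble votes leave the logistic decision untouched.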

    The structure of the rest of this paper is as follows: Section 2 gives a literature review; Section 3 designs the stroke evaluation index system; Section 4 proposes a new stroke risk assessment decision-making model (Logistic-AB) based on machine learning; Section 5 provides an empirical analysis of stroke risk assessment; and Section 6 summarizes the whole paper, points out the shortcomings and puts forward future prospects.

    2 Literature Review

    In the research area of stroke risk prediction, Manuel et al. [6] suggested using patient self-reported information to accurately predict the health behaviors of patients with sudden stroke, and this information can be combined with the results from a survey of population health to predict individual stroke risk, which can be used to project the health of the population or to issue certain stroke prevention measures for the patients. Lumley et al. [7] developed a new stroke prediction model for Americans that used an interactive Java application for risk prediction to predict the factors associated with stroke. They used the model to empirically analyze a patient and determine their risk of stroke over a five-year period. In addition, foreign studies have evaluated prediction models with the help of the calculated AUC; for example, in an improvement to the Framingham stroke scale, the area under the curve (AUC) was determined to be 0.726 for males and 0.656 for females [8]. Domestic studies have also used this method, and the results obtained were similar to those from other countries. The area under the receiver operating characteristic (ROC) curve of the pooled queuing equation was 0.713 for males and 0.818 for females [9]. Moreover, a stroke risk calculator predicted the risk of stroke over 5-10 years, but not at an age less than 20 years, and the performance of the model for males and females was 0.740 and 0.715, respectively, as determined by the AUC [10,11]. In addition, by analyzing the factors influencing the onset of stroke, the impact of education has been used abroad to control the factors contributing to the onset of the disease, thus achieving "prevention of the disease before it occurs". The current system of disease prevention, control and health care delivery in China has not identified a cause for the high mortality rate of stroke.

    Ten years ago, academics used simple mathematical formulas for stroke risk prediction due to the small amount of available data. Currently, with the improvements in data collection techniques and computer data processing capabilities, researchers have begun to use advanced methods, such as multiple linear regression and neural networks, to process historical data with certain results. Although such methods can capture multiple nonlinear complexities, their accuracy remains low and suboptimal. Taking the study of Sun et al. [12] on stroke patients as an example, the risk factors for the occurrence of stroke were obtained through retrospective statistics, which are highly representative. Aslam et al. [13] studied the etiology and risk factors for stroke in young adults, and the research results showed that common risk factors for ischemic stroke in the local young population included hypertension, diabetes mellitus and smoking. Wang et al. [14] applied a novel metaCCA method to identify the risk genes for stroke that may overlap with seven correlated risk factors, including atrial fibrillation, hypertension, coronary artery disease, heart failure, diabetes, body mass index, and total cholesterol level. By empirical analysis, Asowata et al. [15] concluded that the main factors causing stroke are hypertension, dyslipidemia, diabetes mellitus, and a family history of cardiovascular disease.

    With the constant advancements in science, researchers have begun to apply mathematical statistics to disease prediction models, which has led to quantitative predictions of disease progression. Currently, simple mathematical models based on statistical theory are mainly used to predict trends of disease development [16]. Disease patterns are approximated by using methods such as regression, and calculations and predictions are made with the help of statistical analysis software. Researchers have proposed a variety of models to predict chronic disease pathogenic factors, risk factors, and treatment strategies and have achieved significant results in practical application [17]. A widely used model for predicting the 10-year risk of ischemic cardiovascular disease [18] has gained acceptance in the medical community. However, this model treats coronary heart disease and stroke as the same disease for the prediction, rather than creating a stroke-only prediction model. In clinical practice, it is most common to use algorithms based on Bayesian networks or neural networks to build different predictive models for diseases. Wang et al. [19] used decision trees to develop a risk prediction model for hemorrhagic transformation in acute ischemic stroke. Xu et al. [20] used factor analysis and logistic regression to conclude that the incidence of stroke in Dali was associated with blood glucose, age and sex. Xu [21] performed a screening to obtain the factors influencing the development of progressive ischemic stroke by comparison and logistic regression analysis. Other studies have used Cox regression model analysis to obtain the risk factors affecting the development of stroke, but the results were not accurate enough because of the small number of cases due to long intervals between the pre- and post-visits, resulting in many lost visits [22,23]. In addition, efforts have been made in China to prevent stroke early, but the outcomes have been less than satisfactory. For example, the Prediction for ASCVD Risk in China (China-PAR) model, developed by Gu Dongfeng et al., has attracted much attention in China as an atherosclerotic cardiovascular disease (ASCVD) risk prediction tool [24]. This model has high prediction accuracy among the Chinese population, but it mainly focuses on the prediction of cardiovascular diseases. Benameur et al. [25] compared the performance of three parametric imaging techniques (covariance analysis and parametric imaging based on the Hilbert transform and on the monogenic signal) used in cardiac MRI for the regional quantification of cardiac dysfunction, and the three approaches were evaluated using cine-MRI frames acquired from three planes of view.

    With the in-depth application of big data, stroke risk prediction methods based on machine learning have become the focus of research in recent years [26], because their superior algorithms can identify the source of morbidity and the related attributes faster and more accurately and provide strong support for subsequent precision medicine [27]. For example, Kumar et al. [28] applied curve fitting and an artificial neural network (ANN) to model the condition of patients to determine whether a patient is suffering from heart disease. Chang et al. [29] used machine learning algorithms to predict the risk of stroke incidence in Jiangxi and established two models, a support vector machine model and a naive Bayes model, and found that the support vector machine performed better after comparing the results. Yu et al. [30] used decision trees, multilayer perceptrons and convolutional networks in machine learning to compare the prediction results with the results from traditional multifactorial logistic regression and finally found that convolutional neural networks have higher accuracy in stroke risk prediction. Arif et al. [31] developed a Lasso-logistic regression model that can manage SARS-CoV-2 infections of varying severity (severe, moderate, and mild) by using machine learning, and the results showed that the number of deaths has been reduced thanks to the established prediction method that enables early detection in patients across these three severity levels.

    However, one of the problems machine learning faces in stroke research is how unbalanced data should be analyzed, and the general idea has been to reconstruct the dataset. For this, a combination of oversampling, under-sampling and SMOTE algorithms can be used [32-34]. Combining the active learning support vector machine (SVM) algorithm and the SMOTE algorithm [35] can provide a good solution to the problem of unbalanced datasets. On this basis, Xu et al. [36] proposed an improved synthetic minority oversampling technique (ISMOTE) algorithm from the perspective of oversampling, which improves the classification performance on unbalanced datasets. Tao et al. [37] integrated the idea of negative immunity to generate artificial minority class samples, which can offer a good solution to the problem of underrepresentation of minority class samples. The problem of SVM classification bias can be effectively improved by integrating cost-sensitive learning, oversampling and under-sampling [38,39]. For the problem of breast cancer data classification, Wang et al. [40] used a combination of SMOTE, particle swarm optimization and the C5.0 algorithm in their research and found that this method can significantly improve the classification effect. In addition, Sun et al. [41] observed that using the SMOTE algorithm can effectively solve the problem of unbalanced data. The above methods mainly focus on small datasets and do not consider the processing methods for large datasets.
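    The core step shared by SMOTE and its variants is linear interpolation between a minority-class sample and one of its minority-class neighbors; borderline SMOTE restricts this step to minority samples near the class boundary. A minimal, library-free sketch of the interpolation step (function name and inputs are illustrative):

```python
import random

def smote_interpolate(sample, neighbor, rng=random.Random(0)):
    """Generate one synthetic minority sample on the line segment between
    `sample` and a same-class `neighbor` (the core SMOTE step)."""
    gap = rng.random()  # random position along the segment, in [0, 1)
    return [s + gap * (n - s) for s, n in zip(sample, neighbor)]

minority_a = [1.0, 2.0]
minority_b = [3.0, 4.0]
synthetic = smote_interpolate(minority_a, minority_b)
print(synthetic)  # lies on the segment between the two minority samples
```

Repeating this step for many (sample, neighbor) pairs grows the minority class with plausible new points instead of exact duplicates, which is why it avoids the overfitting that simple replication causes.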

    Based on existing research and aiming to objectively predict and identify stroke, this paper proposes a new stroke risk assessment decision-making model based on machine learning named Logistic-AdaBoost (Logistic-AB). First, this paper preliminarily screens the stroke-related influencing factors, uses the CatBoost method to further select the initial screening indices to obtain the final indices, and constructs a new index system for stroke assessment. Second, the borderline SMOTE algorithm is used to balance the data, which can solve the defect of fuzzy boundaries after the generation of new samples by the SMOTE algorithm. Finally, after studying the common stroke risk assessment models, this paper proposes a stroke risk assessment decision-making model named Logistic-AB and uses 10 homogeneous machine learning algorithms to evaluate the overall prediction performance of this new model.

    3 Construction of the Stroke Evaluation Index System

    This section begins with the preprocessing of the collected data and correlation analysis to obtain the influential factors related to stroke. Then, the criteria for constructing the evaluation index system are elaborated and preliminary screening indicators are provided, which helps to more intuitively and comprehensively understand the relevant factors affecting the incidence of stroke. Finally, the CatBoost feature selection method is used to filter the preliminary screening indicators to obtain the final stroke risk assessment index system.

    3.1 Data Preprocessing

    The data in this paper are acquired from the publicly available Kaggle 2021 dataset (https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset). First, we examine the dataset and find that only the body mass index data have missing values; its overall distribution is shown in Fig. 1. Fig. 1 shows that the body mass index data tend to be normally distributed, so the median is used to fill in the missing data. Then, we review the dataset for outliers. Through observation, we find that there is only one outlier in the data for the attribute of sex, so it is directly removed.

    Figure 1: Distribution of body mass index and log blood glucose levels

    In addition, considering the large difference between the extreme values of the average blood glucose, the data show an obvious right-skewed trend, so the extreme outliers are deleted. However, we find that the average blood glucose values still remain heavily right-skewed, so we take the logarithm of all the average blood glucose values. Finally, we find that the processed data fluctuate less and that the distribution tends to be more normal, as shown in Fig. 1.
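    The two cleaning steps above (median imputation of the body mass index and log transformation of the right-skewed glucose values) can be sketched without any external libraries; the record layout and column names here are illustrative, not the dataset's actual schema:

```python
import math
from statistics import median

# Toy records: body mass index (bmi) with one missing value,
# and right-skewed average blood glucose values
records = [
    {"bmi": 24.1, "avg_glucose": 85.0},
    {"bmi": None, "avg_glucose": 210.0},
    {"bmi": 31.5, "avg_glucose": 95.0},
]

# Step 1: fill missing BMI with the median of the observed values
observed_bmi = [r["bmi"] for r in records if r["bmi"] is not None]
bmi_median = median(observed_bmi)
for r in records:
    if r["bmi"] is None:
        r["bmi"] = bmi_median

# Step 2: log-transform the right-skewed average glucose values
for r in records:
    r["log_glucose"] = math.log(r["avg_glucose"])

print(records[1])  # missing BMI filled with the median of observed values
```

In practice the same steps would be one `fillna(median)` and one `np.log` call on a dataframe; the sketch only makes the order of operations explicit.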

    3.2 Criteria for Selecting the Relevant Features for Prediction

    The following criteria are given for selecting the relevant features for stroke risk assessment decision-making:

    Systematic principle: There should be a certain logical relationship between the indicators, which should not only be related to stroke but also have an internal relationship among themselves.

    Principle of typicality: Evaluation indicators should be typically representative; the number of indicators should be as small as possible while still fully and comprehensively reflecting the risk of stroke.

    Dynamic principle: With the continuous improvement of technology, influencing factors related to stroke will continue to be discovered. Therefore, stroke risk assessment should be a dynamic concept, and the selection of indicators should have dynamic variability.

    Simple and scientific principle: The selection of indicators should follow a scientific basis, with neither too many nor too few, so that they objectively and truly reflect the risk of stroke. Moreover, indicator data should be easy to obtain and simple to calculate.

    Quantifiability principle: There are many factors affecting stroke, and the selected indicators should be quantified as much as possible to facilitate subsequent data analysis.

    Practicality principle: When selecting features, we should consider practicality and choose more common and understandable indicators while avoiding obscure and infrequently used ones.

    3.3 Preliminary Screening Indicators

    To construct a scientific and reasonable evaluation index system, the evaluation indices selected should be considered from various aspects. The risk factors for stroke vary from region to region depending on the population, but it is generally agreed that the main risk factors should meet the following criteria [42]. First, the risk factor exists in a large number of people. Second, the risk factor has a significant independent effect on the risk of stroke. Finally, the risk of the onset of the disease can be reduced through treatment and prevention. After reviewing the relevant literature [43-47], the following initial screening indicators are selected in this paper:

    (1) Age

    Uncontrollable factors such as sex and age are factors that influence the occurrence of cardiovascular disease (CVD), and the risk of cardiovascular disease usually increases with age. Studies have shown that men are generally more likely to develop CVD than women, but this difference decreases with age, and the sex difference in CVD recurrence rates also decreases [43].

    (2) Hypertension

    One of the major risk factors for stroke is high blood pressure, which stimulates the development of cerebral atherosclerosis. In the atherosclerotic region, the vessel wall becomes thicker, the lumen becomes narrower or the plaque ruptures to form a thrombus, causing conditions, such as cerebral arterial blockage, that can result in cerebral ischemia or stroke. Statistically, the effective control of blood pressure can reduce the prevalence of stroke in patients by 50% [44].

    (3) Heart disease and blood sugar levels

    Heart disease and blood glucose levels are also important factors that influence the incidence of stroke, and in general, the risk of stroke in patients with heart disease exceeds the risk of individuals without heart disease by more than twofold; moreover, high blood glucose levels also increase the risk of stroke. According to a previous study, there is a significant difference between the prevalence of stroke and the prevalence of heart disease (χ2 = 25.915, p = 0.000) [45], which indicates a strong association between heart disease and stroke. Additionally, approximately 40% of stroke patients also suffer from hyperglycemia, which can aggravate neurological damage and cause ischemic stroke progression.

    (4) Type of work, marital status and place of residence

    Different types of work, marital statuses, and places of residence subject people to different stresses: for example, engaging in high-intensity work for a long time, living in a depressing place for a long time, or facing various problems in a marriage will increase stress levels and make individuals more prone to disease. Some studies have shown that the incidence of stroke also varies across occupational groups [46].

    (5) Smoking and body mass index

    Cigarettes contain many toxic components, such as nicotine and carbon monoxide. Moreover, smoking can lead to increased blood viscosity and hypoxia in the cells lining the blood vessels and contribute to atherosclerosis, thus increasing the prevalence of stroke. Additionally, the higher the body mass index (BMI) is, the higher the prevalence of stroke is [47].

    The evaluation indices for stroke risk assessment in the initial screening of this paper are shown in Table 1.

    Table 1: The preliminary screening indicators for stroke risk assessment

    As the indicators from the initial screening may be highly correlated, the model may suffer from multicollinearity. Although the assessment can produce good results without processing, its practical significance is limited, so the correlation of the indicator data needs to be tested first. From Fig. 2, it can be seen intuitively which characteristics are better correlated with stroke: age, high blood pressure, heart disease, marital status, average blood glucose level and body mass index. Since age, type of work and marital status also have higher correlations with each other, it can be determined that the initial screening indicators need to be further screened.

    Figure 2: Correlation coefficients of initial screening indicators

    3.4 Feature Selection by CatBoost

    Based on the above, it is found that there is correlation between the indicators, so this paper further screens the indicators after the initial selection. Feature selection [48-50], also known as feature subset selection (FSS) or attribute selection, selects N features (N < M) from the existing M features to reduce the dimensionality of the dataset. Machine learning often suffers from overfitting, and to address this, four methods are usually considered: collecting more data; introducing a complexity penalty through regularization methods; using simple models with fewer parameters; and reducing the dimensionality of the data (e.g., through feature selection). Of these, the first is difficult to implement, and thus, the second and fourth methods are usually used. Feature selection methods generally include filter, wrapper, and embedded methods.

    The filter approach (see Fig. 3) evaluates the importance of each feature independently during the feature selection process, regardless of the training process of the model. This approach uses statistical methods or information theoretic techniques to measure the degree of association or importance of each feature with respect to the target variable. Some commonly used filter methods include mutual information, information gain, analysis of variance (ANOVA) and the chi-square test. These methods select important features by calculating some measure between each feature and the target variable and ranking the features by its magnitude. The advantage of filter methods is that they are simple and fast to use, but they ignore the correlation between features.
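    A filter method scores each feature independently of any model. As a library-free illustration of the idea, the sketch below ranks candidate features by the absolute value of their Pearson correlation with the target; the feature names and toy values are assumptions for illustration:

```python
def pearson_corr(xs, ys):
    """Plain Pearson correlation coefficient between two numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

# Toy data: two candidate features and a binary stroke label
features = {
    "age":       [40, 55, 62, 70, 35, 68],
    "residence": [0, 1, 0, 1, 1, 0],
}
label = [0, 1, 1, 1, 0, 1]

# Filter step: rank features by |correlation| with the target, keep the top k
ranked = sorted(features,
                key=lambda f: abs(pearson_corr(features[f], label)),
                reverse=True)
print(ranked)  # the informative feature is ranked first
```

Each score is computed feature-by-feature, which is exactly why filter methods are fast but blind to interactions between features.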

    The wrapper approach embeds feature selection into the model training process and evaluates the importance of features by iteratively selecting different subsets of features and training the model, as shown in Fig. 4. This approach uses the performance of the model directly as the criterion for feature selection, which is closer to the application scenario of the final model. Common wrapper approaches include recursive feature elimination (RFE) [51,52] and feature selection based on genetic algorithms [53,54]. RFE is an iterative approach that starts with all features, removes one or more of the less important features at a time, and then retrains the model and evaluates its performance. This process is repeated until a specified number of features or optimal performance is reached. The advantage of the wrapper approach is that it can take into account the correlation between features, but its computational complexity is higher because it requires repeated training of the model.

    Figure 4: Wrapper method

    The embedded approach integrates feature selection into the model training process and selects features through the model's own feature importance assessment, as shown in Fig. 5. This approach considers the contribution of each feature and the quality of the splits it produces during the training process of the model and ranks the features according to these metrics.

    Figure 5: Embedding method

    CatBoost [55,56] is an algorithm based on gradient boosting decision trees that calculates feature importance scores during the training of each tree. These scores reflect the extent of each feature's contribution to model performance. The get_feature_importance method provided by CatBoost can be used to obtain the feature importance scores. The core rationale of CatBoost is gradient boosting, an ensemble learning method that constructs a strong classifier by combining multiple weak classifiers, where each weak classifier is trained on the residual of the previous one. In this way, each weak classifier can focus on solving problems that the previous weak classifier could not solve, thereby gradually improving the performance of the entire model. Another important feature of CatBoost is its ability to automatically process categorical features. In traditional gradient boosting algorithms, categorical features need to be processed by one-hot encoding, which leads to a sharp increase in the dimension of the feature space, thus increasing model complexity and training time. CatBoost uses an ordering-based approach that converts categorical features into numerical features, avoiding the problems of one-hot encoding.

    Compared with the gradient boosting decision tree (GBDT), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM) algorithms, the CatBoost algorithm has many advantages, but the two most helpful for identifying important features related to stroke risk are the following. (i) Handling of categorical features. This allows us to avoid dealing with categorical features through feature engineering before training the model. (ii) Handling of prediction shift. This reduces overfitting and improves the prediction performance of the model.

    Under the CatBoost framework,the following methods can be used for feature selection:

    1) Control the process of feature selection by tuning the model parameters,e.g.,setting one_hot_max_size to limit the dimensionality of the one-hot encoded features or using colsample_bylevel and colsample_bytree to control the proportion of features sampled in each tree.

    2) Use the get_feature_importance method to obtain the importance score of each feature and perform feature ranking and selection based on this score.

    3) CatBoost can be combined with other feature selection methods,such as filtering or wrapping methods,to filter a specific subset of features.
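    As an illustration of method 2 above, the ranking-and-selection step can be sketched in a few lines. The importance scores below are hypothetical placeholders; in practice they would come from CatBoost's get_feature_importance:

```python
# Sketch of importance-based feature selection (method 2 above).
# The scores here are hypothetical; in practice they would be obtained
# from a trained CatBoost model via get_feature_importance.

def select_features(names, scores, k):
    """Rank features by importance score and keep the top k."""
    ranked = sorted(zip(names, scores), key=lambda p: p[1], reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical importance scores for illustration only.
names = ["age", "bmi", "glucose", "residence", "sex"]
scores = [32.1, 18.4, 15.0, 2.3, 1.1]

print(select_features(names, scores, 3))  # → ['age', 'bmi', 'glucose']
```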

    CatBoost can be expressed as:

    F_T(x) = Σ_{t=1}^{T} f_t(x),

    where F_T denotes the strong learner integrated from T weak learners, and f_t denotes the next tree, built sequentially on top of the existing ensemble. The loss function is:

    L(f) = Σ_{i=1}^{N} w_i · l(f(x_i), y_i) + J(f),

    where l(f(x_i), y_i) denotes the loss at sample point (x_i, y_i), w_i represents the weight of the ith objective, and J(f) represents the regularization term. CatBoost uses the prediction results of the previous trees to train the next tree, and through iteration, it effectively improves the accuracy of the final predictions and the stability of the model.
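    The residual-fitting loop described above can be sketched with a minimal gradient boosting implementation for squared loss, in which each round fits a one-split regression stump to the current residuals. This is a generic gradient boosting sketch, not CatBoost itself, and the toy data are invented for illustration:

```python
import numpy as np

def fit_stump(x, r):
    """Best single-split regression stump on 1-D inputs under squared loss."""
    best = None
    for t in np.unique(x):
        left, right = r[x <= t], r[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        pred = np.where(x <= t, left.mean(), right.mean())
        err = ((r - pred) ** 2).sum()
        if best is None or err < best[0]:
            best = (err, t, left.mean(), right.mean())
    return best[1:]  # threshold, left value, right value

def boost(x, y, n_rounds=20, lr=0.5):
    """Gradient boosting for squared loss: each stump fits the residuals,
    i.e., the negative gradient of the loss at the current prediction."""
    pred = np.zeros_like(y, dtype=float)
    for _ in range(n_rounds):
        resid = y - pred                       # residual of the previous rounds
        t, lv, rv = fit_stump(x, resid)
        pred += lr * np.where(x <= t, lv, rv)  # shrink each stump's contribution
    return pred

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.0, 1.2, 0.9, 3.1, 3.0, 2.8])
pred = boost(x, y)
print(np.abs(pred - y).max())  # residuals shrink round by round
```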

    The algorithm pseudo code is shown below:

    The preliminary screening indicators for stroke risk assessment determined from the above process are shown in Fig. 6. The indicators affecting whether one will have a stroke are ranked as follows: age, body mass index, blood glucose level, high blood pressure, heart disease, marital status, type of work, smoking status, place of residence, and sex. In view of the small number of features in the sample, place of residence and sex are deleted in this paper, and the remaining eight features are used as the main influencing factors of stroke. Therefore, the final indicator system constructed in this paper consists of age, body mass index, blood glucose level, high blood pressure, heart disease, marital status, type of work and smoking status.

    The insights into the 8 selected indicators used to construct the index evaluation system for stroke risk prediction are as follows. Age is one of the most important factors, suggesting that the risk of stroke increases with age. Body mass index and blood glucose level are also very important indicators, which suggests that obesity and high blood glucose are among the major risk factors for stroke. High blood pressure and heart disease are also relatively important characteristics, in line with previous findings. In addition, marital status and type of job were found to be strongly associated with stroke risk, which may be due to the effects of marital and job stability on physical and mental health. Finally, smoking is another important factor, as it can lead to vasoconstriction and increase the risk of stroke. Based on these results, this paper suggests that these major influencing factors should be the focus of stroke prevention and treatment. In addition, more frequent examinations and monitoring, as well as appropriate lifestyle modifications and pharmacological measures, are recommended for high-risk individuals to reduce the incidence of stroke.

    Figure 6:CatBoost feature selection results

    4 Stroke Risk Assessment Model

    In this section, based on the evaluation indices selected in Section 3, borderline SMOTE is first used to balance the data, and then a new fusion model, denoted the Logistic-AB model, is proposed to assess the risk of stroke incidence based on the theories of the logistic regression and AdaBoost algorithms. The test results of this model are then compared with those of ten machine learning algorithms.

    4.1 Data Balance

    In practice, many industries collect data with unbalanced characteristics. Existing algorithms handle majority class data better than minority class data, so the classification of minority class data needs to be improved for prediction. In actuality, the number of stroke patients is much lower than the number of normal people. If 99% of the sample consists of normal individuals, a classifier can achieve a global accuracy of 99% simply by labeling everyone as normal; however, such a classifier has little value in practical applications. The most critical aspect of the stroke risk assessment problem is the precise identification of stroke patients. However, commonly used classification algorithms tend to neglect the identification of minority class samples when building models on unbalanced data, leaving such models with insufficient practical application value.

    To avoid the above situation, this article first balances the unbalanced data during model implementation. The most common balancing methods are under-sampling and oversampling. Under-sampling randomly selects a portion of the samples from the majority class so that the majority class has the same, or nearly the same, number of samples as the minority class. This method is computationally fast, but some important information is lost, which may increase the model's error rate. Oversampling adds new samples to the minority class so that the majority and minority classes have the same number of samples. This approach avoids information loss but may lead to overfitting, as the newly generated samples can be very similar to the original samples.

    The simplest oversampling method is random oversampling, but since it only replicates minority class samples and does not generate new ones, it is prone to overfitting. Therefore, the SMOTE algorithm is applied here.

    4.1.1 SMOTE Algorithm

    SMOTE [57] is a technique for generating minority class samples that can effectively mitigate data imbalance. The flow of the algorithm is as follows:

    1. Calculate the Euclidean distance from each minority class sample x to all other minority class sample points to obtain its K nearest neighbors.

    2. Determine the imbalance ratio of the data and set the sampling multiplier N. For each minority class sample x, randomly select N samples x_n from its K nearest neighbors.

    3. For each randomly selected nearest neighbor x_n, a new sample is generated from the original sample using the following formula:

    x_new = x + rand(0,1) × (x_n − x)

    From the above process, it can be seen that the SMOTE algorithm generates new samples while ignoring the distribution characteristics of the minority samples, which easily pushes the data distribution toward the margins. In a binary classification problem, if a negative class sample lies at the edge of the sample set, a new sample synthesized from it will also lie at the edge, and the cycle continues to generate new samples ever closer to the edge. This tends to shrink the distance between the positive and negative class samples and the decision threshold, making it difficult to assign subsequently generated samples to the positive or negative class. Therefore, although the algorithm balances the dataset, it increases the difficulty of classification.
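    The sample-generation step of SMOTE (step 3 above) can be sketched as follows; this is a minimal single-sample sketch assuming numeric features and precomputed nearest neighbors:

```python
import numpy as np

rng = np.random.default_rng(0)

def smote_sample(x, neighbors):
    """Generate one synthetic sample: x_new = x + rand(0,1) * (x_n - x),
    where x_n is a randomly chosen K-nearest neighbor of x."""
    x_n = neighbors[rng.integers(len(neighbors))]
    return x + rng.random() * (x_n - x)

# Toy example: one minority sample and two precomputed neighbors.
x = np.array([1.0, 2.0])
neighbors = np.array([[2.0, 3.0], [0.0, 1.0]])
x_new = smote_sample(x, neighbors)
print(x_new)  # lies on the segment between x and the chosen neighbor
```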

    4.1.2 Borderline SMOTE Algorithm

    To address this defect of SMOTE, the borderline SMOTE algorithm [58] was proposed. It is also an oversampling method but, unlike SMOTE, performs nearest-neighbor linear interpolation only on boundary samples, making the newly generated samples more reasonable. The specific steps are described below:

    1. For each minority class sample p_i, calculate the Euclidean distance to all training sample points and obtain its m nearest neighbors.

    2. Classify the minority samples. Let m' be the number of majority class samples among the m nearest neighbors of the minority sample p_i; clearly, 0 ≤ m' ≤ m. If m' = m, p_i is considered noise; if m/2 ≤ m' < m, p_i is classified as a boundary sample; and if 0 ≤ m' < m/2, p_i is classified as a safe sample. The number of boundary samples in the minority class is denoted as dnum (0 ≤ dnum ≤ pnum), and the samples classified as boundary samples are denoted as p'_1, p'_2, …, p'_dnum.

    3. Using the sampling multiplier U, the s nearest neighbors of each boundary sample p'_i among the minority class samples are selected and linearly interpolated. Interpolation produces samples synthetic_j = p'_i + r_j × d_j (j = 1, 2, …, s), where d_j is the difference between p'_i and its jth nearest neighbor and r_j is a random number between 0 and 1.

    4. Combine the original training sample set T with the new synthetic samples into a new training set T'.

    Compared with SMOTE, the borderline SMOTE algorithm adopts nearest-neighbor linear interpolation for boundary samples, which avoids SMOTE's tendency to push the data toward the margins. In addition, borderline SMOTE focuses on boundary samples, preventing such samples from being misclassified, and it increases the density of boundary minority samples, making the sample distribution more reasonable. A sample set built with this improved algorithm therefore yields better learning and prediction results.
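    The noise/border/safe classification in step 2 above reduces to a simple rule on m', the number of majority class neighbors:

```python
def categorize(m_prime, m):
    """Classify a minority sample by the number m' of majority-class
    samples among its m nearest neighbors (borderline SMOTE, step 2)."""
    if m_prime == m:
        return "noise"      # all neighbors belong to the majority class
    if m_prime >= m / 2:
        return "border"     # m/2 <= m' < m: the sample sits on the boundary
    return "safe"           # fewer than half the neighbors are majority

print(categorize(5, 5), categorize(3, 5), categorize(1, 5))
# → noise border safe
```

Only the samples categorized as "border" are used for interpolation in step 3.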

    4.2 Logistic-AB Model

    Data classification is a fundamental problem in machine learning and data analysis, and many studies have been conducted on it. At present, the more common and representative classification algorithms include K-nearest neighbors (KNN), decision trees, Bayes, random forest (RF), SVM, logistic regression, neural networks and AdaBoost [59,60]. Logistic regression is widely used in disease classification: it is efficient to train, has a low computational cost, runs fast, requires few storage resources, and yields an interpretable model, making it easy to understand and implement. However, logistic regression cannot handle nonlinear problems well and is prone to overfitting. Under the AdaBoost framework, a variety of regression and classification models can be used to build the weak learners, which makes it very flexible. As a simple binary classifier, AdaBoost has a simple construction, high classification accuracy, and understandable results. Compared with the bagging and RF algorithms, AdaBoost fully accounts for the weight of each classifier, and in particular, it is not prone to overfitting. Thus, based on the advantages and disadvantages of the logistic regression model and the AdaBoost algorithm, this paper integrates these two machine learning methods into a new model, the Logistic-AB model, for stroke risk assessment.

    4.2.1 Logistic Regression

    Logistic regression is a generalized linear regression analysis model with the regression equation shown below:

    p_i = P(y_i = 1 | x_i) = exp(α + β^T x_i) / (1 + exp(α + β^T x_i)),

    where α and β are the parameters to be estimated and P(y_i = 1 | x_i) is the probability of the event {y_i = 1} occurring for the ith sample x_i, denoted p_i (0 < p_i < 1). This probability is modeled by the logit transformation as follows:

    ln(p_i / (1 − p_i)) = α + β^T x_i.

    When α + β^T x_i → −∞, p_i → 0, and when α + β^T x_i → ∞, p_i → 1. From Eq. (3), it can be seen that the logistic regression model is nonlinear, so maximum likelihood estimation can be used to estimate the parameters α and β.
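    A minimal sketch of estimating α and β by maximum likelihood (gradient ascent on the log-likelihood) follows; the tiny 1-D dataset is invented for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, steps=2000):
    """Maximum likelihood estimation of (alpha, beta) by gradient ascent
    on the log-likelihood of the logistic model."""
    n, d = X.shape
    alpha, beta = 0.0, np.zeros(d)
    for _ in range(steps):
        p = sigmoid(alpha + X @ beta)      # p_i = P(y_i = 1 | x_i)
        alpha += lr * (y - p).mean()       # log-likelihood gradient w.r.t. alpha
        beta += lr * X.T @ (y - p) / n     # log-likelihood gradient w.r.t. beta
    return alpha, beta

# Hypothetical 1-D dataset: the label switches from 0 to 1 as x grows.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
alpha, beta = fit_logistic(X, y)
p = sigmoid(alpha + X @ beta)
print(np.round(p, 2))  # fitted probabilities rise monotonically with x
```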

    4.2.2 AdaBoost Algorithm

    AdaBoost [61], short for adaptive boosting, is a boosting algorithm proposed by Yoav Freund and Robert Schapire. Boosting promotes weak learners to strong learners through continuous learning: using an iterative algorithm, each step generates a new, improved learner by modifying the learner obtained in the previous step, and a strong learner is then obtained by integrating the learners generated during the iterations.

    The AdaBoost algorithm is adaptive in the sense that, for the current base classifier h_k, the weights of the samples correctly classified by h_{k−1} decrease, while the weights of the misclassified samples increase. In this way, the classifier h_k "automatically" emphasizes the samples misclassified by the previous classifier.

    The algorithm steps are as follows:

    Input: training data T = {(x_1, y_1), (x_2, y_2), …, (x_N, y_N)}, where x_i ∈ χ ⊆ R^n and y_i ∈ γ = {−1, +1}; a weak learning algorithm.

    Output:final classifier.

    1) Initialize the weight distribution of the training data: D_1 = (w_{1,1}, …, w_{1,N}), with w_{1,i} = 1/N for i = 1, 2, …, N.

    2) For m = 1, 2, …, M:

    (a) Based on the weight distribution D_m of the training dataset, learn a basic classifier G_m(x).

    (b) Calculate the classification error rate of G_m(x) on the training dataset: e_m = Σ_{i=1}^{N} w_{m,i} I(G_m(x_i) ≠ y_i).

    (c) Calculate the coefficient of G_m(x): α_m = (1/2) log((1 − e_m)/e_m), where log is the natural logarithm.

    (d) Update the weight distribution of the training dataset: w_{m+1,i} = (w_{m,i}/Z_m) exp(−α_m y_i G_m(x_i)), i = 1, 2, …, N, where Z_m is the normalization factor that makes D_{m+1} a probability distribution.

    3) Construct a linear combination of the basic classifiers, f(x) = Σ_{m=1}^{M} α_m G_m(x), and obtain the final classifier G(x) = sign(f(x)).

    The parameters of the AdaBoost algorithm are as follows. (i) base_estimator: the weak learner. (ii) n_estimators: the number of weak learners, with a default value of 50. (iii) learning_rate: the weight shrinkage coefficient of each weak learner, ranging from 0 to 1. The step size and the maximum number of iterations jointly determine the fitting effect, so n_estimators and learning_rate should be tuned together. (iv) algorithm: the boosting variant, with the default being SAMME.R. (v) loss: the error calculation function, with linear, square, and exponential options; the default is linear.

    For random guessing in a binary classification problem, the classification error rate is 50%, so for any weak learner performing better than random classification, e_m < 50%. It can be seen that each subsequent iteration concentrates on the samples misclassified in the previous iterations, which makes the optimization direction of the subsequent weak learners clearer. Moreover, α_m decreases as e_m increases, which means that a base learner with a lower error rate contributes more to the final strong learner's output; this is the essence of the AdaBoost algorithm.
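    A single AdaBoost round, i.e., computing e_m, α_m and the weight update from the steps above, can be sketched as follows (the labels and weak-learner predictions are a toy example):

```python
import numpy as np

def adaboost_round(w, y, pred):
    """One AdaBoost round: compute the weighted error e_m, the classifier
    coefficient alpha_m, and the updated (normalized) sample weights."""
    miss = (pred != y)
    e_m = w[miss].sum()                       # weighted classification error
    alpha = 0.5 * np.log((1 - e_m) / e_m)     # coefficient of the weak learner
    w_new = w * np.exp(-alpha * y * pred)     # up-weight misclassified samples
    return e_m, alpha, w_new / w_new.sum()    # normalize to a probability dist.

y = np.array([1, 1, -1, -1, 1])
pred = np.array([1, -1, -1, -1, 1])           # one sample misclassified
w = np.full(5, 0.2)                           # initial uniform weights D_1

e_m, alpha, w2 = adaboost_round(w, y, pred)
print(e_m, alpha)   # e_m = 0.2, alpha > 0 since e_m < 0.5
print(w2)           # the misclassified sample now carries weight 0.5
```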

    4.2.3 The Architecture and Key Components of the Logistic-AB Model

    This paper divides the training sample points into four intervals based on the output probability of logistic regression and calculates the probability of correct classification in each interval. This is combined with the evaluation results from AdaBoost to provide credible support for logistic regression, thus reducing the risk of misjudgment. As noted in Section 4.1, the data have been balanced by the borderline SMOTE algorithm; thus, the prior probabilities of stroke patients and normal individuals are both 0.5, and the intervals are divided equally, i.e., the four intervals of the logistic output probability are:

    I_1 = [0, 0.25), I_2 = [0.25, 0.5), I_3 = [0.5, 0.75), I_4 = [0.75, 1].

    The algorithm steps of the Logistic-AB model are given as follows:

    Step 1: Based on the above four intervals, the training set is divided into four subsets, denoted as X_1, X_2, X_3, X_4, for which the classification accuracies under the logistic regression model and the AdaBoost model are computed and denoted as P_i^L and P_i^A (i = 1, 2, 3, 4), respectively.

    Step 2: The test set is likewise divided into four subsets based on the intervals I_1, I_2, I_3, I_4, and the logistic regression model and AdaBoost model are applied to each subset; assume that on the ith test subset, the output results are y_{ij}^L and y_{ij}^A (j = 1, 2, …, n; i = 1, 2, 3, 4).

    Step 3: Based on the classification accuracies P_i^L and P_i^A (i = 1, 2, 3, 4) from Step 1, the following discriminant rule is established: on the ith test subset, if P_i^L ≥ P_i^A, choose the classification result y_{ij}^L, i.e., the logistic regression result; otherwise, choose y_{ij}^A.
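    The interval-based discriminant rule above can be sketched as follows; all accuracy values and model outputs below are hypothetical illustration numbers:

```python
# Sketch of the Logistic-AB discriminant rule: locate a test sample's
# logistic output probability in one of the four intervals and keep the
# prediction of whichever model had the higher training accuracy there.
# All numbers below are hypothetical.

INTERVALS = [(0.0, 0.25), (0.25, 0.5), (0.5, 0.75), (0.75, 1.01)]  # 1.01 so p=1 falls in I4

def interval_index(p):
    for i, (lo, hi) in enumerate(INTERVALS):
        if lo <= p < hi:
            return i

def logistic_ab_predict(p, y_logit, y_ada, acc_logit, acc_ada):
    """Choose the logistic result when logistic was more accurate on the
    sample's interval during training; otherwise choose AdaBoost."""
    i = interval_index(p)
    return y_logit if acc_logit[i] >= acc_ada[i] else y_ada

acc_logit = [0.95, 0.80, 0.78, 0.96]   # hypothetical training accuracy per interval
acc_ada = [0.90, 0.85, 0.83, 0.92]

# A sample with logistic probability 0.3 falls in I2, where AdaBoost was
# more accurate, so its AdaBoost prediction (1) is kept.
print(logistic_ab_predict(0.3, 0, 1, acc_logit, acc_ada))  # → 1
```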

    The architecture and key components of the Logistic-AB model are shown in Fig.7.

    The architecture and key components of the Logistic-AB model given in Fig. 7 can be described as follows. First, because the data have been balanced, the probability interval [0, 1] is divided into four consecutive, equal-width intervals, and the training and test sets are divided into four subsets based on these intervals. Next, the data are classified using the logistic regression model and the AdaBoost model, and the classification accuracy is calculated for each of the four subsets. Finally, the test set is classified accordingly, with the training-set accuracies used to decide which model's result to adopt on each subset, and the results are comprehensively evaluated. The proposed model not only integrates the advantages of logistic regression and AdaBoost but also compensates for their disadvantages, achieving a fast training speed while maintaining good accuracy and precision.

    Figure 7:The architecture and key components of the Logistic-AB model

    5 Empirical Analysis of Stroke Risk Assessment

    This section presents the empirical analysis of the preceding discussion. It begins with a correlation analysis of age, mean blood glucose level, body mass index, stroke, heart disease and hypertension. The data are then balanced using borderline SMOTE. Finally, the data are trained and evaluated with the Logistic-AB model established in the previous section, and the training effect graph is constructed. The evaluation results are compared with the training results of ten common machine learning algorithms: random forest, SVM, logistic regression, KNN, Bayes, decision tree, AdaBoost, gradient boosting, XGB, and CatBoost, through which the strengths and weaknesses of the Logistic-AB model can be determined.

    5.1 Relevance Analysis

    The studies in this section aim to examine the relationship between age,average blood glucose levels,and body mass index with stroke,heart disease,and high blood pressure.The results of the study show that all of these factors are related to the development of these diseases.

    First,this paper examined the relationship between age and stroke,heart disease,and hypertension,as shown in the first panel of Fig.8.As seen from the graph,the probability of developing these diseases gradually increases with age.The probability of stroke and heart disease increases significantly,especially after the age of fifty.In addition,there was an increased probability of hypertension.It can therefore be concluded that people are more prone to these three types of diseases as they age.

    Second,this paper examined the relationship between average blood glucose levels and stroke,heart disease,and high blood pressure.As shown in the second panel of Fig.8,the graph has two crests.At a glycemic index of 80,the probability of stroke,heart disease and high blood pressure are all higher,suggesting that blood sugar levels are linked to these diseases.In addition,the probability of these diseases begins to reverse when the glycemic index exceeds 150,with the difference between the two reaching a maximum at approximately 175.It can therefore be concluded that elevated blood glucose increases the risk of these three types of diseases.

    Finally,this section looks at the relationship between body mass index and stroke,heart disease,and hypertension.As shown in the third panel of Fig.8,the graph shows that individuals with a BMI of 30 are more likely to suffer from stroke and heart disease,but there is no significant correlation between the development of high blood pressure and BMI.It can therefore be concluded that being overweight also increases the risk of stroke and heart disease.

    Figure 8:Correlation diagram

    Overall,the findings in this section suggest a relationship between age,mean blood glucose levels,and body mass index and stroke,heart disease,and hypertension.Therefore,attention should be given to these factors,and measures should be taken to prevent and treat these diseases.For example,the incidence of these diseases can be reduced by eating a balanced diet,exercising moderately and maintaining a healthy weight.

    5.2 Data Balancing Processing

    In the field of machine learning, the quality of the dataset determines the quality of the model, so the dataset must be split carefully before training to preserve its quality. In practice, however, dataset imbalance is often encountered, meaning that some classes have much smaller sample sizes than others.

    To solve this problem, this paper uses stratified cross-validation, which preserves the proportion of samples of each class in the training and test sets. In this paper, all the data are divided into five folds, and stratified cross-validation is used to ensure the quality of the dataset.

    However, even with stratified cross-validation, the dataset may still be unbalanced, so additional measures are needed to balance it. This paper uses borderline SMOTE, which generates synthetic data to increase the number of minority class samples and balance the dataset.

    The processing results are shown in Fig.9.The left panel shows the dataset before processing,and the right panel shows the results after borderline SMOTE processing.After treatment,the distribution of samples becomes more reasonable,and the number of samples of each type is relatively balanced.This allows the model to be trained and evaluated more accurately.

    Figure 9:Borderline SMOTE before and after treatment

    In conclusion, when dealing with unbalanced datasets, methods such as stratified cross-validation and borderline SMOTE should be used to ensure data quality. These methods lead to more accurate and useful models and provide better support for practical applications.

    5.3 Experimental Results and Analysis

    5.3.1 ROC and AUC

    ROC curves, i.e., receiver operating characteristic curves, are mainly used in the field of assessment. Plotting an ROC curve effectively demonstrates the relationship between sensitivity and specificity. The horizontal coordinate of the ROC curve represents the false positive rate (1 − specificity), while the vertical coordinate represents the true positive rate (sensitivity). A curve closer to the upper-left corner, i.e., with a lower horizontal coordinate and a higher vertical coordinate, indicates a more accurate algorithm.

    The AUC,also known as the area under the ROC curve,can be used as a measure of the evaluation accuracy of the algorithm.The larger this area is,the larger the AUC value is and the better the evaluation accuracy of the algorithm is.In machine learning model evaluation,using ROC curves and AUC values has become a very important method to better evaluate the effectiveness of the model and to improve the evaluation of the model.

    In summary,ROC curves are a very practical way of assessing the effectiveness of a model by plotting the relationship between sensitivity and specificity and showing the accuracy of the algorithm in graphical form.When using ROC curves and AUC values for model evaluation,it is necessary to select the appropriate threshold as accurately as possible to ensure the best performance of the algorithm.
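    Assuming untied scores, the AUC described above can be computed directly by sweeping thresholds over the scores and integrating the resulting ROC curve with the trapezoidal rule; the scores below are hypothetical:

```python
import numpy as np

def roc_auc(y_true, scores):
    """Sweep thresholds over the scores and integrate the ROC curve
    (true positive rate vs. false positive rate) with the trapezoidal
    rule. Assumes no tied scores, for simplicity."""
    order = np.argsort(-np.asarray(scores))           # descending score order
    y = np.asarray(y_true)[order]
    tpr = np.concatenate(([0.0], np.cumsum(y) / y.sum()))
    fpr = np.concatenate(([0.0], np.cumsum(1 - y) / (1 - y).sum()))
    # Trapezoidal rule: sum of slice widths times average heights.
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2))

# Hypothetical labels and predicted scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(roc_auc(y_true, scores))  # 8 of 9 positive-negative pairs ranked correctly → ≈0.889
```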

    In this paper,Logistic-AB was used to train the model,and the results are shown in Fig.10.In Fig.10,we can see that all four subsets are trained very well and have corresponding AUC values of 0.92,0.94,0.92 and 0.93.

    Figure 10:ROC graph

    These results demonstrate the effectiveness of the fusion algorithm used in this paper and demonstrate the broad applicability of this method on multiple datasets.It is also noted that in each of these subsets,the performance is different from the others,and therefore,careful consideration is required when selecting the best model.

    It is worth noting that these results only represent results under the particular dataset and parameter settings currently used in this paper.In practice,users are advised to adapt this model to meet their needs and further optimize it based on their specific dataset.

    In conclusion,very good training results have been achieved by using the fusion algorithm proposed in this paper.It is believed that these results will have a positive impact on future research and applications.

    5.3.2 Evaluation Metrics and Analysis of the Experimental Results

    In this paper, the results are evaluated by using accuracy, precision, recall and the F1 score [62]. Accuracy is the proportion of correctly classified samples out of the total number of samples predicted by the model and is calculated as follows:

    Accuracy = (TP + TN) / (TP + TN + FP + FN),

    where TP represents the number of positive class samples predicted as positive by the classification model, FN represents the number of positive class samples predicted as negative, TN represents the number of negative class samples predicted as negative, and FP represents the number of negative class samples predicted as positive.

    Precision refers to the proportion of samples predicted as positive that are actually positive (the larger the value, the better; 1 is ideal), which is defined by the following formula:

    Precision = TP / (TP + FP).

    Recall refers to the ratio of the number of positive cases correctly identified by the classifier to all actual positive cases (the larger the value, the better; 1 is ideal), and its formula is as follows:

    Recall = TP / (TP + FN).

    The F1 score is the weighted harmonic mean of precision and recall and is defined by Eq. (17):

    F1 = 2 × P × R / (P + R),

    where P and R represent precision and recall, respectively, and a higher F1 score indicates a better model.
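    The four metrics can be computed directly from the confusion counts; the counts below are hypothetical:

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, precision, recall and F1 score from the confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical confusion counts for illustration.
acc, p, r, f1 = classification_metrics(tp=80, fn=20, tn=90, fp=10)
print(acc, p, r, f1)  # accuracy 0.85, precision ≈0.889, recall 0.8, F1 ≈0.842
```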

    Here, to show the superiority of the new model, ten comparable machine learning algorithms are selected for comparison with the Logistic-AB model using the evaluation metrics of accuracy, precision, recall and F1 score; the results are shown in Table 2.

    Table 2 gives the comparative results of the 10 machine learning algorithms and the presented Logistic-AB model on prediction performance for stroke risk assessment. The overall prediction performance of all models is evaluated by four metrics: recall, precision, F1 score, and accuracy. From Table 2, we can see that the Logistic-AB algorithm performs better than both single algorithms (logistic regression and AdaBoost) on all four indicators. Although random forest, decision tree, and XGB performed well in terms of accuracy, they could not match the Logistic-AB algorithm on the other three indicators. In particular, the Logistic-AB algorithm is far ahead of the other algorithms on the two key indicators of F1 score and precision.

    Table 2: Results of model evaluation

    In summary,the overall performance of the Logistic-AB algorithm proposed in this paper when applied to stroke risk assessment is better than that of common machine learning algorithms.The traditional logistic regression method classifies the results with 0.5 as the cutoff,while the Logistic-AB algorithm proposed in this paper divides the interval into four parts,thus greatly reducing the risk of misjudgment.In addition,the Logistic-AB algorithm uses the output results of AdaBoost as a reference to prevent obvious misjudgments in logistic regression,which further improves the classification effect.In this sense,the Logistic-AB model proposed in this paper has excellent performance in stroke risk assessment.This method not only has important practical application significance in the medical field but also provides a new idea and method for risk assessment research on machine learning algorithms in other fields.

    6 Conclusion

    With the deepening of internet technology in the medical field,health management practices driven by medical big data are gradually taking shape.Of the three components of health management,i.e.,health detection,risk assessment and precision intervention,the most critical is the management of various risk factors throughout the process,which is achieved with effective predictive tools to improve health management.Therefore,a comprehensive study of the influencing factors and risk assessment of stroke patients can help with rehabilitation and early detection and promote the whole process of patient health management,changes in the medical service model and innovation of the management mechanism.This paper establishes a new stroke risk assessment model by screening important influencing factors as well as balancing data,which provides theoretical guidance for the rational diagnosis,timely treatment and effective intervention among high-risk groups and lays the foundation for achieving indirect economic and good social benefits.The main contributions of this study include the following:a new index system of stroke risk assessment is constructed by using the feature selection method of CatBoost;the unbalanced stroke dataset is transformed into a balanced dataset by using the borderline SMOTE algorithm;and a new Logistic-AB model is developed to predict the risk of stroke.

    In conclusion, this paper has successfully performed stroke risk assessment by constructing an integrated algorithmic model, Logistic-AB. The Logistic-AB model far exceeded other machine learning algorithms on the main evaluation metrics, and its application prospects are promising. Moreover, this model has practical significance and provides a theoretical basis and decision-making reference for related research.

    However,this Logistic-AB model still has some limitations,such as the interpretability of the results and the sensitivity to outliers and unbalanced data.In this sense,it is necessary to clean and transform the data before using this model;that is,remove the outliers and transform the unbalanced data into balanced data.

    In the future,we will consider choosing better data balancing algorithms or more realistic data for more effective predictive analyses,and we will combine multiple machine learning algorithms or improve the ensemble learning algorithms to achieve more accurate and efficient predictive models.

    Acknowledgement:The authors wish to express their appreciation to the reviewers for their helpful suggestions which greatly improved the presentation of this paper.

    Funding Statement:This work is supported by the National Natural Science Foundation of China(No.72071150).

    Author Contributions:The authors confirm contribution to the paper as follows:study conception and design:C.Rao,M.Li,and T.Huang;data collection:F.Li;analysis and interpretation of results:C.Rao and M.Li;draft manuscript preparation:M.Li and T.Huang.All authors reviewed the results and approved the final version of the manuscript.

    Availability of Data and Materials:The datasets generated and analyzed during the current study are available in the(Kaggle site survey report)repository(https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset).

    Conflicts of Interest:The authors declare that they have no conflicts of interest to report regarding the present study.

男人的好看免费观看在线视频| 亚洲欧美精品自产自拍| 免费在线观看成人毛片| 日韩制服骚丝袜av| 国产精品野战在线观看| 我要看日韩黄色一级片| 国产精品国产高清国产av| 亚洲精品久久国产高清桃花| 国产亚洲精品久久久久久毛片| 一区二区三区免费毛片| 一进一出抽搐gif免费好疼| 亚洲国产色片| 国产一区二区三区av在线 | 精品久久久久久久久亚洲| 老女人水多毛片| 欧美性猛交黑人性爽| 伊人久久精品亚洲午夜| 国产成人影院久久av| 国产精品久久久久久亚洲av鲁大| 少妇熟女欧美另类| 国产不卡一卡二| 久久精品国产鲁丝片午夜精品| 91麻豆精品激情在线观看国产| 观看免费一级毛片| 日韩强制内射视频| .国产精品久久| 亚洲欧美日韩高清专用| 只有这里有精品99| 在线播放国产精品三级| 床上黄色一级片| 久久久久免费精品人妻一区二区| 校园人妻丝袜中文字幕| 九九在线视频观看精品| 午夜激情福利司机影院| 久久99热这里只有精品18| 国产在线精品亚洲第一网站| 少妇的逼好多水| av在线亚洲专区| 亚洲精品久久国产高清桃花| 国内少妇人妻偷人精品xxx网站| 精品人妻熟女av久视频| 婷婷色综合大香蕉| av黄色大香蕉| 人人妻人人看人人澡| 嫩草影院入口| 日韩av不卡免费在线播放| 久久久国产成人精品二区| 26uuu在线亚洲综合色| 亚洲欧美日韩高清在线视频| 激情 狠狠 欧美| 国产精品一区二区在线观看99 | 看非洲黑人一级黄片| 天堂网av新在线| 嫩草影院新地址| 天堂中文最新版在线下载 | 亚洲国产精品合色在线| av视频在线观看入口| 在线a可以看的网站| 国产成人aa在线观看| 少妇被粗大猛烈的视频| 国产精品电影一区二区三区| 一级毛片久久久久久久久女| 亚洲色图av天堂| 1024手机看黄色片| 哪个播放器可以免费观看大片| 久久久久久久久久久丰满| 91久久精品国产一区二区三区| 少妇熟女aⅴ在线视频| 在线观看午夜福利视频| 日韩欧美 国产精品| 亚洲av电影不卡..在线观看| 亚洲国产欧洲综合997久久,| 久久精品国产亚洲av涩爱 | 高清午夜精品一区二区三区 | 久久精品久久久久久噜噜老黄 | 悠悠久久av| 成人亚洲精品av一区二区| 免费电影在线观看免费观看| 亚洲无线观看免费| 一本久久中文字幕| 亚洲精品自拍成人| 中文字幕人妻熟人妻熟丝袜美| 国产一区二区激情短视频| 国产探花极品一区二区| 国产亚洲av嫩草精品影院| 人妻夜夜爽99麻豆av| 欧美成人免费av一区二区三区| 深夜精品福利| 99久久九九国产精品国产免费| 99久国产av精品国产电影| 亚洲乱码一区二区免费版| 国产高清三级在线| 欧美zozozo另类| 一进一出抽搐gif免费好疼| 久久精品久久久久久久性| 久久中文看片网| 看免费成人av毛片| 麻豆成人午夜福利视频| 亚洲欧美成人综合另类久久久 | 国产女主播在线喷水免费视频网站 | 欧美一区二区国产精品久久精品| 黄色一级大片看看| 国产精品久久久久久精品电影小说 | 亚洲欧美日韩无卡精品| 日本三级黄在线观看| 九色成人免费人妻av| 日韩国内少妇激情av| 最近手机中文字幕大全| 中文欧美无线码| 成人美女网站在线观看视频| 久久韩国三级中文字幕| 日韩一本色道免费dvd| 亚洲图色成人| 乱人视频在线观看| 国产白丝娇喘喷水9色精品| 熟女人妻精品中文字幕| 国产精品久久久久久精品电影| av专区在线播放| 97热精品久久久久久| 97人妻精品一区二区三区麻豆| 99久久成人亚洲精品观看| 青青草视频在线视频观看| 12—13女人毛片做爰片一| 亚洲国产色片| 亚洲欧美精品专区久久| 91精品国产九色|