
    Evaluation of ensemble learning for Android malware family identification

Journal of Guangzhou University (Natural Science Edition), 2020, Issue 4

    Jordan Wylie, 譚志遠*, Ahmed Al-Dubai, 王建珍

(1. School of Computing, Edinburgh Napier University, Edinburgh EH10 5DT, UK; 2. Business College / School of Information, Shanxi University, Taiyuan 030031, Shanxi, China)

Abstract: Each Android malware instance belongs to a specific family whose members perform a similar set of actions and share some key characteristics. Being able to identify Android malware families is critical for addressing the security challenges posed by malware and its damaging consequences. Ensemble learning is believed to improve the solution of computational intelligence problems by strategically combining the decisions of multiple machine learning models. This paper therefore presents a study of the application of ensemble learning, and its effectiveness in Android malware family identification/classification, against other single-model-based identification approaches. To conduct a fair evaluation, a prototype of an ensemble learning based Android malware classification system was developed for this work, in which the Weighted Majority Voting (WMV) approach is used to determine the importance of individual models (i.e., Support Vector Machine, k-Nearest Neighbour, ExtraTrees, Multi-layer Perceptron, and Logistic Regression) to the final decision. The results of the evaluation, conducted using publicly accessible malware datasets (i.e., Drebin and UpDroid) and recent samples from GitHub repositories, show that the ensemble learning approach does not always perform better than single-model learning approaches. The performance of ensemble learning based malware family classification is heavily influenced by several factors, in particular the features, the values of the parameters, and the weights assigned to the individual models.

    Key words: Android malware; family identification; static analysis; ensemble learning

    1 Introduction

Android took the largest share of the mobile operating system (OS) market for the first time in 2012, and its share has continued to grow since then, as shown in Fig.1, which was drawn from the data in [1]. As of May 2019, Android had 2.5 billion monthly active devices[2]. With this popularity comes an increasing threat from malware, defined as software with malicious intent[3-4]. Every malware sample belongs to a family whose members share a similar set of characteristics in terms of how they are implemented and what they aim to achieve. Identifying the family of an Android malware sample helps determine the steps that could be taken to reverse any damage it has caused[5]. Techniques that can identify malware families efficiently and effectively are therefore vital.

    Fig.1 iOS and Android mobile OS market share[1]

This paper, therefore, develops an approach to identify Android malware families, in which an ensemble learning algorithm, termed weighted majority voting, is applied to build the malware family classifier. The ensemble learning approach was chosen over other approaches because such approaches perform well when applied to real-world problems, improve accuracy, and handle the problems of concept drift and unbalanced data[6]. To evaluate our proposed ensemble learning based malware family identification approach fairly, publicly available evaluation datasets are used. However, the majority of recent related approaches were evaluated on datasets that are out of date, which makes it difficult to assess how well these approaches would perform on samples from the current Android malware landscape; this also causes issues when comparing our proposed approach with these related approaches. To address these issues, multiple datasets, including commonly used and recently discovered malware samples, are used. Through this research, we aim to answer the two main questions defined below.

    (1) How does our ensemble learning approach for Android malware family identification compare to other related single-model-based Android malware classification approaches?

(2) How effective is the developed approach at classifying recent Android malware samples into their respective families?

    1.1 Impact

This work will have an impact on the general public by providing security experts with a tool to respond effectively to new variants of known malware families, which should ultimately enable them to determine the impact of such variants and the steps that could be taken to mitigate any damage caused.

In addition, the evaluation approach developed in this work will show how a varied set of classifiers performs when tackling the family identification problem, which will guide the development of new defence approaches.

    1.2 Paper organisation

    The rest of this paper is organised as follows.

• Section 2: Provides the relevant technical background, including the structure of an Android application, the different malware analysis techniques, the evaluation metrics, and the datasets;

• Section 3: Presents a critical review of recent related works;

• Section 4: Informed by Section 2, this section proposes a viable ensemble learning approach with the aim of answering the questions defined in Section 1;

• Section 5: Critically evaluates the proposed approach, making use of the metrics discussed in Section 4;

• Section 6: Concludes the paper and assesses how well the questions defined in Section 1 were answered. This section also outlines potential future work.

    2 Technical background

    2.1 Android application structure

    An Android Package (APK) is an archive file that contains various files and directories[7], defined in Table 1.

    Table 1 Contents of an APK file[8]

There are four types of app component through which the system or a user can enter an Android application: activities, services, broadcast receivers, and content providers[7,9]. They are defined in Table 2.

    Table 2 Types of Android components[7,9]

Activities, services, and broadcast receivers are activated by an Intent, a messaging object that an Android application uses to request an action from another component. A request can be either explicit (a specific app component is requested) or implicit (a type of component is requested instead). Recent research[9] shows that the broadcast receiver component is used by malware more often than the other components, and that the same components tend to be used across malware variants of the same family, as malware creators reuse some of their developed functions.

To protect user privacy, Android supports app permissions, which it categorises into three groups: normal, signature, and dangerous[10]. The 36 normal permissions are low-risk and automatically granted during installation[10-11]. The 29 signature permissions, in contrast, require that the requesting application be signed with the same certificate as the application that declares the permission[10-11]; two special signature permissions are considered sensitive and should not be used by the majority of applications[10]. The 28 dangerous permissions, which control access to user data and private resources, require explicit user confirmation[10].

    2.2 Malware analysis and anti-analysis techniques

Malware analysis aims to understand the workings and intention of malware in order to identify any infected devices and files[12]. This is commonly done through static analysis and dynamic analysis: static analysis does not require the malware of interest to be executed, whereas dynamic analysis does[12].

To hinder such analysis, on the other hand, anti-analysis techniques have been used in malware creation[13]. Techniques such as string encryption and encoding, dynamic loading, native payloads, and reflection help creators produce evasive malware that can bypass dynamic analysis[13-15].

    2.3 Machine learning

Several metrics have been commonly used by researchers and practitioners to evaluate the effectiveness of trained Machine Learning (ML) models. From the four most basic counts[16-17], namely true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), the advanced metrics shown in Table 3 can be derived.

    Table 3 Main machine learning metrics[18]

(1) Accuracy: The Accuracy is the ratio of instances that were classified correctly[16] and is calculated using Equation 1. This metric is only effective, and therefore only renders a fair evaluation, when the classes are balanced.

Accuracy = (TP + TN)/(TP + TN + FP + FN)    (1)

(2) Recall: The Recall shows how many relevant instances were classified into a specific class[11, 16, 17, 19] and is calculated using Equation 2.

Recall = TP/(TP + FN)    (2)

(3) Precision: The Precision is the ratio of instances that were correctly classified to a class against all instances, correct or incorrect, that were classified to that class[18-19]. It is calculated using Equation 3.

Precision = TP/(TP + FP)    (3)

(4) F1-Score: The F1-Score can also be used to evaluate the performance of a model. It is the harmonic mean of the recall and precision metrics[17, 20] and is calculated using Equation 4.

F1-Score = 2 × (Precision × Recall)/(Precision + Recall)    (4)

    These four advanced metrics will be used in our evaluation.
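For reference, the snippet below is a minimal sketch, assuming scikit-learn and two hypothetical family-label lists, of how these four metrics can be computed with macro averaging for the multi-class family identification task.

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = ["fakeinst", "plankton", "fakeinst", "droidkungfu"]   # hypothetical true family labels
y_pred = ["fakeinst", "fakeinst", "fakeinst", "droidkungfu"]   # hypothetical predicted labels

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred, average="macro", zero_division=0))
print("Precision:", precision_score(y_true, y_pred, average="macro", zero_division=0))
print("F1-score :", f1_score(y_true, y_pred, average="macro", zero_division=0))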

    2.4 Datasets

Seven candidate datasets were reviewed for this work: the Android Malware Genome Project (AMGP), the Drebin dataset, the Android Malware Dataset (AMD), UpDroid, Android PRAGuard, AndroZoo, and samples obtained from GitHub.

(1) AMGP: The AMGP is a dataset that has been used in previous Android malware family identification approaches[5, 15]; however, it has now been discontinued[13, 21]. The dataset contained a total of 1 260 samples gathered between August 2010 and October 2011 from multiple marketplaces, including the official Google marketplace[22].

(2) Drebin: The Drebin dataset[23] is another widely used dataset that has been employed by various approaches[5, 15]. It contains a total of 5 560 samples gathered between August 2010 and October 2012 from multiple sources, including the Google Play Store, third-party marketplaces, and the AMGP[23].

(3) AMD: The AMD[13] contains 24 560 samples obtained between 2010 and 2016 from the Google Play Store, third-party security companies, and VirusShare. Each sample is associated with a family.

(4) UpDroid: UpDroid[24] contains 2 479 samples, 50% of which were collected from 2015 onwards. Each sample had to be flagged as malicious by at least 20 anti-virus engines on VirusTotal, and the family name was also derived from the VirusTotal results.

(5) Android PRAGuard: The Android PRAGuard dataset[14] contains malware from the AMGP and the Contagio dataset. These samples were obfuscated to form this dataset.

(6) AndroZoo: The AndroZoo dataset[25] contains millions of benign and malicious applications, which were collected by crawlers from multiple sources including the Google Play Store, third-party marketplaces, torrents, and the AMGP.

(7) GitHub: There are also multiple repositories on GitHub[26-27] which host more recent samples. Samples obtained from this source have to be analysed further due to the unreliability of the source.

Among these seven reviewed datasets, some are more appropriate than others for evaluating our proposed malware family identification approach. Although the AMGP has been used in several research works, it is outdated and no longer accessible. The Drebin dataset has also been used to evaluate a significant number of detection approaches, but it too is outdated and would not give a realistic representation of current Android malware. Similarly, the AMD is an outdated dataset, although its large size could help establish a better overview of model performance. The more recent UpDroid dataset, however, is small and has not been used as widely. Beyond these, a few more recent prospective datasets are available. The Android PRAGuard dataset is more representative of the current malware landscape because its samples were modified for obfuscation purposes. The AndroZoo dataset could also be used, as it is large despite containing benign samples. Last but not least, recent samples from GitHub are accessible, but they are not as reliable as the other datasets mentioned. Two main criteria guide the selection of an evaluation dataset.

(1) Being able to provide a comparison with other family identification approaches.

(2) Being able to show how the approach of interest will perform on more recent samples.

    Based on the above criteria, three datasets are selected for this research work. They are the Drebin dataset, UpDroid, and samples obtained from GitHub.

    3 Related works

Android malware family identification has been widely studied, and the related approaches differ from each other mainly in the features used for analysis. The following paragraphs review several key recent related approaches; their quantified evaluation results are summarised in Table 4.

    Table 4 Summary of family identification approaches

    3.1 Approaches using static features

RevealDroid[5] uses static features for malware family classification, with the main aim of being resilient to obfuscation techniques. The Classification and Regression Trees (CART) algorithm was applied, and a total of 1 054 features were used to build the classifier. The approach was evaluated using 27 979 samples gathered from the AMGP, the Drebin dataset, VirusShare, and VirusTotal, and achieved an accuracy of 84%. However, no other metrics were used to evaluate its performance, which makes it difficult to ascertain how well the approach performed, because accuracy does not provide a good indication of performance when the classes are unbalanced[16, 19].

Similarly, DroidSieve[15] made use of the ExtraTrees (ET) algorithm and static features in order to infer Android malware families. Four datasets, the AMGP, the Drebin dataset, the Marvin dataset, and Android PRAGuard, were used to evaluate DroidSieve. It achieved outstanding performance on the Drebin dataset (an accuracy of 98.12% and an F-measure of 97.84%) and on the AMGP and Android PRAGuard datasets (an accuracy of 99.15%). However, the main issue with this approach is that the Android PRAGuard dataset is an obfuscated version of the AMGP and the Contagio dataset[14]; these obfuscated samples are not samples found in a real-life environment.

    3.2 Approaches using dynamic features

To characterise malware and its runtime behaviour, dynamic features of the malware of interest are extracted from a live environment. Recent research conducted by Massarelli et al.[28] made use of Support Vector Machine (SVM) to build a malware classifier from dynamic features. The Drebin dataset was used to evaluate this SVM-based classifier: a 70-30 train-test split was applied, and stratified 5-fold cross-validation was executed 20 times. Their approach obtained a mean accuracy of 82%. Precision and recall were reported, but they were calculated for each individual class label (family).

CANDYMAN[29] involves dynamic features and various classifiers, including Random Forest (RF). This approach was evaluated using the Drebin dataset, from which a total of 4 442 samples from 24 families were taken after removing the malware families with fewer than 20 samples. Features of the malware samples were extracted by building a Markov chain from the raw features obtained through emulation. With the RF classifier, CANDYMAN achieved an accuracy of 81.8%, an F-measure of 80.2%, a precision of 80.7%, and a recall of 81.8%.

EnDroid[17] is another approach that made use of dynamic features together with a stacking classifier. It achieved an accuracy of 94.5% when evaluated on the top 20 malware families (4 446 samples) from the Drebin dataset.

    3.3 Approaches using hybrid features

Some recent work attempts to take advantage of both static and dynamic features. EC2[30], for example, uses both types of feature to train RF and DBScan classifiers. EC2 was again evaluated using the Drebin dataset, from which the malware samples that could not be processed without issues were discarded. EC2 with the DBScan algorithm achieved a micro F-measure of 76% and a macro F-measure of 97%.

UpDroid[24] is another hybrid-feature approach, equipped with three algorithms, namely k-Nearest Neighbors (kNN), RF, and J48. Alongside the Drebin dataset, a newly collected dataset of 2 479 samples was used to evaluate UpDroid; the samples were executed in an emulated environment for 15 minutes each to obtain the dynamic features. Using the newly collected dataset and the kNN algorithm, UpDroid achieved an accuracy of 96.37%, a recall of 96.4%, and a false positive rate of 0.2%; on the Drebin dataset, it achieved an accuracy of 96.85%, a recall of 96.8%, and a false positive rate of 0.3%. This approach has been tested on more recent samples; however, its limitation lies in the dynamic features used, as dynamically analysing every sample in the newly collected dataset to extract them would take at least 25 days.

Overall, the Drebin dataset was the most popular and was used to evaluate all of the reviewed malware classification approaches; only UpDroid[24] was also evaluated against a more up-to-date dataset. In addition, these approaches were evaluated with varied metrics, which makes comparisons with the approach developed in this paper difficult.

    4 Methodology

    This section defines the process of preparing and organising the datasets, analysing the samples, and defining and evaluating the family identification model. The methodology is summarised in Fig.2.

    Fig.2 Overview of the methodology

    4.1 Dataset selection and preparation

The datasets were chosen and obtained first, resulting in a total of three datasets aimed at answering the research questions stated in Section 1: the Drebin dataset[23,31], the UpDroid dataset[24], and samples obtained from GitHub[26-27]. These datasets are split into two experimental datasets. The first experimental dataset contains the samples from the Drebin dataset and is used to answer research question 1; the second contains samples from the UpDroid dataset together with those we collected from GitHub and is used to answer research question 2.

To prepare the experimental datasets, all the samples are first organised in a similar manner and then assigned to their respective experimental datasets. Due to the nature of the GitHub samples, additional analysis was carried out on them. Similar to the sample preparation approach introduced in Ref.[24], only the samples discovered before May 2019 are kept, and each must be recognised as malicious by a predefined number (e.g., 20) of major anti-virus scanners. The results from AVClass, a Python tool that tags malware samples with their respective family label, are then used to build the ground truth of family labels.
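The following is a hedged sketch of this filtering step. It assumes a hypothetical CSV file (vt_reports.csv) holding, for each GitHub sample, its hash, discovery date, and the number of anti-virus engines that flagged it; neither the file nor its column names come from the paper.

import csv
from datetime import date

CUTOFF = date(2019, 5, 1)    # keep only samples discovered before May 2019
MIN_DETECTIONS = 20          # sample must be flagged by at least 20 engines

kept = []
with open("vt_reports.csv", newline="") as fh:               # hypothetical metadata file
    for row in csv.DictReader(fh):
        first_seen = date.fromisoformat(row["first_seen"])   # hypothetical column names
        if first_seen < CUTOFF and int(row["positives"]) >= MIN_DETECTIONS:
            kept.append(row["sha256"])

print(f"{len(kept)} GitHub samples retained for labelling with AVClass")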

    4.2 Feature extraction

AndroGuard, a Python library, is used to extract the static features shown in Table 5 from each sample. The extracted features are stored in a CSV file. If none of these features can be extracted from a sample, the sample is discarded.

    Table 5 Static features used in this project
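As an illustration, the snippet below is a minimal sketch, assuming the AndroGuard library and a hypothetical APK path, of extracting a few of the static features in Table 5 and appending them to a CSV row; the full feature set used in the prototype is broader than what is shown here.

import csv
from androguard.misc import AnalyzeAPK

apk_path = "samples/example.apk"      # hypothetical sample location
a, d, dx = AnalyzeAPK(apk_path)       # APK object, Dalvik code, analysis object

features = {
    "permissions": len(a.get_permissions()),
    "activities":  len(a.get_activities()),
    "services":    len(a.get_services()),
    "receivers":   len(a.get_receivers()),
    "providers":   len(a.get_providers()),
    "min_sdk":     a.get_min_sdk_version(),
    "target_sdk":  a.get_target_sdk_version(),
}

with open("features.csv", "a", newline="") as fh:
    csv.DictWriter(fh, fieldnames=features.keys()).writerow(features)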

    4.3 Family identification

The family identification process comprises data preprocessing, hybrid feature selection, hyperparameter selection, and the establishment of the ensemble learning based malware classifier. The details of these steps are discussed in the subsections below.

    4.3.1 Data preparation

The malware family identification begins with data preprocessing. Given a sample set X (defined in Equation 5) that contains s samples with c features, and a respective label set Y (defined in Equation 6), the training dataset D is defined by Equation 7.

Any family having fewer than 5 samples is removed before proceeding with 5-fold cross-validation. The non-Boolean features are then normalised to prevent the ML models from being biased by features with large values; the Boolean features are not normalised, so as to preserve their properties.

X = [x_{i,j}], i = 1, 2, …, s, j = 1, 2, …, c    (5)

Y = [y_1, y_2, …, y_s]^T    (6)

D = (Y|X)    (7)

where:

s = the number of samples

c = the number of features
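As a concrete illustration, the following is a minimal sketch of this preprocessing step, assuming pandas and scikit-learn, a hypothetical features.csv produced by the feature extraction step, and min-max scaling as one plausible normalisation choice (the paper only states that non-Boolean features are normalised).

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("features.csv")                            # hypothetical feature file
df = df.groupby("family").filter(lambda g: len(g) >= 5)     # drop families with < 5 samples

X, y = df.drop(columns=["family"]), df["family"]
bool_cols = [c for c in X.columns if set(X[c].unique()) <= {0, 1}]   # Boolean features
num_cols = [c for c in X.columns if c not in bool_cols]              # non-Boolean features

X[num_cols] = MinMaxScaler().fit_transform(X[num_cols])     # normalise non-Boolean features only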

    4.3.2 Hybrid feature selection

To reduce both the noise in the data and the complexity of computation, feature selection is carried out at this stage. Hybrid feature selection is recommended because it exploits the advantages of both filter and wrapper methods. On the one hand, filter methods are not prone to bias and are more time-efficient, but they do not always improve the effectiveness of a model[32]. On the other hand, wrapper methods usually improve the effectiveness of a model but are time-consuming[32]. As such, it is wise to apply a filter method first.

In this work, correlation-based feature selection using Pearson's correlation, as shown in Equation 8[33], is used as the filter method. The corresponding procedure is defined in Algorithm 1, which removes features involved in any pairwise correlation greater than 0.8 or less than -0.8; these thresholds are chosen because they indicate a close relationship between two features[34].

ρ_{A,B} = E[(A − μ_A)(B − μ_B)]/(σ_A σ_B)    (8)

where:

ρ_{A,B} = the Pearson's correlation between variables A and B

μ_A = the expectation of A

μ_B = the expectation of B

σ_A = the standard deviation of A

σ_B = the standard deviation of B

Algorithm 1 Correlation-based feature selection

Require: X feature set, c feature count
for x, y in X do
    CF_{x,y} ← ρ_{x,y}
end for
Initialise corrF {empty list that will hold highly correlated feature pairs}
Initialise remF {empty list that will hold features to be removed}
f ← list of features
for i = 1 to c do
    for j = i + 1 to c do
        if CF_{i,j} < −0.8 or CF_{i,j} > 0.8 then
            corrF ← corrF ∪ {(f_i, f_j)}
        end if
    end for
end for
freqs ← list of the features appearing in corrF, with their frequencies, sorted by frequency
for (feature, freq) in freqs do
    for CFT in corrF do
        if feature ∈ CFT and feature ∉ remF then
            remF ← remF ∪ {feature}
            remove CFT from corrF
        end if
    end for
end for
remove the features listed in remF from X
return X
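For illustration, the following is a compact pandas sketch of Algorithm 1, assuming X is a DataFrame of the extracted features; it is one possible realisation of the greedy removal strategy, not the authors' exact implementation.

from collections import Counter
import pandas as pd

def correlation_filter(X: pd.DataFrame, threshold: float = 0.8) -> pd.DataFrame:
    corr = X.corr(method="pearson")
    # collect every pair of features whose absolute correlation exceeds the threshold
    pairs = [(a, b) for i, a in enumerate(X.columns)
                    for b in X.columns[i + 1:]
                    if abs(corr.loc[a, b]) > threshold]
    removed = set()
    freq = Counter(f for pair in pairs for f in pair)
    # greedily drop the features that appear in the most highly correlated pairs
    for feature, _ in freq.most_common():
        if any(feature in p for p in pairs):
            removed.add(feature)
            pairs = [p for p in pairs if feature not in p]
    return X.drop(columns=list(removed))

X_filtered = correlation_filter(X)   # X from the preprocessing sketch above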

For the wrapper method, the Recursive Feature Elimination (RFE) algorithm with random forest as the estimator is then used. This works by training a classifier, random forest in this case, to obtain a rank for each feature. Once each feature has a rank, the lowest-ranked one is removed; this is repeated until the desired number of features is reached[35]. That number is set to 100 in this work. Once the features have been selected, they can be saved to a new CSV file if required.
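A minimal sketch of this wrapper step, assuming scikit-learn and the X_filtered and y objects from the earlier sketches, might look as follows.

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

rfe = RFE(estimator=RandomForestClassifier(n_estimators=100, random_state=0),
          n_features_to_select=100)                   # keep the 100 best-ranked features
X_selected = rfe.fit_transform(X_filtered, y)
selected_columns = X_filtered.columns[rfe.support_]   # names of the surviving features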

    4.3.3 Establishment of malware family identification using ensemble learning with weighted majority vote

Ensemble learning is used at this stage to build an identifier that classifies malware samples into families. The weighted majority voting mechanism is applied to strategically generate and combine models in the ensemble learning fashion to solve this classification problem. The ML algorithms used to generate the individual models are SVM, kNN, ET, Multi-layer Perceptron (MLP), and Logistic Regression (LR); these algorithms were chosen for their popularity in related research and are described below, followed by a sketch of how they can be instantiated.

(1) SVM: The SVM classifier is a kernel-method algorithm[36] that aims to find the hyperplane splitting the dataset into two classes. Because family identification is a multiclass problem, a one-vs-rest approach is taken, where each class is separated from all the others and the class with the largest distance from the hyperplane is the one predicted[36].

(2) kNN: The kNN classifier predicts the class label of a sample by looking at its k closest samples and assigning the label that appears most often among them[37]. It is known as a lazy algorithm because it has no training stage[37].

(3) ET: The ET classifier is similar to the Random Forest classifier used for the wrapper feature selection, in that both algorithms are ensembles of trees[38]. The difference is that ET uses random attributes and cut-point values to decide how a node is split[38].

(4) MLP: The MLP classifier is a feed-forward neural network that uses a set of neurons, arranged in at least one hidden layer and connected by weights[39]. The weights are adjusted during training until they produce the correct class labels[39].

(5) LR: The LR classifier aims to identify the relationship between a feature set and a class label[40] and tends not to overfit as much as other classifiers[41]. Like SVM, LR is a binary classifier, but it can handle multiclass problems using a one-vs-rest approach[42].
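The following is a minimal sketch, assuming scikit-learn, of how these five base classifiers can be instantiated; SVC and LogisticRegression are wrapped for one-vs-rest handling as described above, and all hyperparameter values shown are illustrative defaults rather than the tuned values used in the prototype.

from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

base_models = {
    "svm": OneVsRestClassifier(SVC()),                       # one-vs-rest SVM
    "knn": KNeighborsClassifier(n_neighbors=5),              # lazy, distance-based
    "et":  ExtraTreesClassifier(n_estimators=100),           # randomised tree ensemble
    "mlp": MLPClassifier(hidden_layer_sizes=(100,), max_iter=500),
    "lr":  OneVsRestClassifier(LogisticRegression(max_iter=1000)),
}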

To build the ensemble learning based classification approach, the hyperparameters of each of the five chosen classifiers need to be carefully selected. A randomised search algorithm[43] is applied to sample 10 random parameter sets and identify the best one. Randomised search was chosen for two properties: (a) the budget can be chosen independently of the number of parameters and their possible values; and (b) adding parameters that do not influence performance does not decrease efficiency.
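As an illustration, the sketch below uses scikit-learn's RandomizedSearchCV to sample 10 random parameter sets for one of the base classifiers (kNN); the parameter grid is illustrative only and is not the grid used in the paper.

from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Illustrative grid for the kNN base model only; each of the five classifiers
# would be given its own parameter distribution.
param_distributions = {"n_neighbors": [1, 3, 5, 7, 9, 11],
                       "weights": ["uniform", "distance"]}

search = RandomizedSearchCV(KNeighborsClassifier(),
                            param_distributions,
                            n_iter=10,            # 10 random parameter sets
                            scoring="f1_macro",
                            cv=5)
search.fit(X_selected, y)                         # data from the selection sketch above
best_knn = search.best_estimator_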

The weight of each individual classifier is then calculated using an approach similar to that proposed in Ref.[41], where the accuracy of each individual classifier is divided by the sum of the accuracies of all individual classifiers. In this work, the F1-score is used instead, because accuracy does not perform as well with unbalanced classes[16, 19]. Our weighting approach is defined by Equation 9 and Algorithm 2.

W_c = F1_c / Σ_{i=1}^{n} F1_i    (9)

where:

c is the index of the c-th individual classifier

W_c is the weight of the c-th individual classifier

F1_c is the F1-score of the c-th individual classifier

n is the number of individual classifiers

Algorithm 2 F1-score based weighting

Require: a list of classifiers {C_1, C_2, …, C_n}
n ← the number of classifiers; W ← empty weight set
for i = 1 to n do
    c ← C_i
    W ← W ∪ {W_c}    {W_c is computed using Equation 9}
end for
return W
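A minimal sketch of Equation 9 / Algorithm 2, assuming scikit-learn, already fitted base models, and a hypothetical held-out validation split (X_val, y_val), is shown below.

from sklearn.metrics import f1_score

def f1_weights(fitted_models, X_val, y_val):
    # Each classifier's macro F1-score on the validation split, divided by the
    # sum of all F1-scores, gives its voting weight (Equation 9).
    scores = [f1_score(y_val, m.predict(X_val), average="macro") for m in fitted_models]
    total = sum(scores)
    return [s / total for s in scores]

# weights = f1_weights(list_of_fitted_models, X_val, y_val)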

Once the initial candidate set of weights, W, is settled, our proposed WMV-based malware family identifier, defined in Equation 10, is evaluated using 5-fold cross-validation, during which the ensemble model is trained and the weights are tuned. Evaluation metrics including accuracy, macro F1-score, macro precision, macro recall, and macro AUC are used, alongside a macro-averaged ROC curve.

y = argmax_{a ∈ A} Σ_{j=1}^{m} w_j · I(C_j(x) = a)    (10)

where:

y is the predicted class of sample x

m is the number of classifiers

C_j is the j-th classifier

w_j is the weight of the j-th classifier

A is the set of unique class labels

I(·) is the indicator function, equal to 1 when its argument is true and 0 otherwise
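As one way of realising Equation 10 in code, the sketch below combines the five base models in a scikit-learn VotingClassifier with hard voting and the F1-based weights, and assesses it with 5-fold cross-validation on the macro F1-score; this mirrors, but is not claimed to be, the authors' exact implementation.

from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import cross_val_score

# base_models and weights come from the previous sketches
wmv = VotingClassifier(estimators=list(base_models.items()),
                       voting="hard",
                       weights=weights)                     # weights from Equation 9
cv_f1 = cross_val_score(wmv, X_selected, y, cv=5, scoring="f1_macro")
print("Macro F1 per fold:", cv_f1)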

To provide insight into the types of features that are significant for malware family identification, XGBoost feature importance scores are calculated and the top 20 most important features are plotted for each experimental dataset. Following the process introduced above, the experimental results can be reproduced.
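A minimal sketch, assuming the xgboost and matplotlib packages and the selected feature matrix from the earlier steps, of producing such a top-20 importance plot is given below.

import matplotlib.pyplot as plt
import xgboost as xgb
from sklearn.preprocessing import LabelEncoder

model = xgb.XGBClassifier()
model.fit(X_selected, LabelEncoder().fit_transform(y))   # XGBoost expects numeric class labels
xgb.plot_importance(model, max_num_features=20)          # top 20 most important features
plt.tight_layout()
plt.show()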

    5 Evaluation results and analysis

Two experiments were conducted and their results are presented in this section. The main goal of the first experiment is to compare our ensemble learning based malware family identification approach with other non-ensemble learning based approaches; the second experiment evaluates how well our proposed approach performs on recent malware samples.

    5.1 Evaluation results of ensemble learning and its sub-models

This section delivers the first evaluation of our proposed ensemble learning based malware family identification approach. The performance of the ensemble learning based approach and its sub-models is shown in Table 6. As can be seen from Table 6, the ET and kNN models outperform the proposed ensemble learning based approach, WMV, on the first experimental dataset. This could be due to the limited variation among the weights of the individual classifiers, which means that all classifiers were effectively treated equally.

    Table 6 The results of the first experimental dataset %

Table 7 shows that the proposed ensemble learning based approach, WMV, does outperform the individual models in terms of F1-score and precision on the second experimental dataset. However, it is not as good as ET with regard to accuracy and recall. These shortfalls could potentially be addressed through further research into weight calculation.

    Table 7 Experiment 2 results %

    5.2 Comparison between the proposed ensemble learning based malware family identification approach and other approaches

In order to answer research question 1, a comparison between our ensemble learning based approach (i.e., WMV) and other approaches[15, 17, 24, 28-30] is presented in this section. The different approaches were all evaluated using the same dataset. The evaluation results are shown in Table 8, which details the approaches, the types of features used, and the metrics used to evaluate each approach. These results show that our approach outperforms two out of the three approaches using dynamic features and falls marginally behind the approaches using static and hybrid features. The reason could be the reduced feature set: if the feature set were enlarged and non-linear filter methods were used, these results might improve at the expense of time. It would also be beneficial to obtain the evaluation scripts of the related approaches, which would make it possible to evaluate them on exactly the same samples, since although each of these approaches used the Drebin dataset, certain samples were removed in each case, which might affect the results.

    Table 8 Comparison with other approaches that used the Drebin dataset %

    5.3 ROC curves

As can be seen from Fig.3, the performance of the model on the older samples is far from perfect, showing that the model would classify a sample correctly only 59% of the time. This could be due to various reasons, including the fact that the dataset had a total of 83 different labels with only 100 features to differentiate them.

    Fig.3 Experiment 1 macro-average ROC curve

The performance in experiment 2 was significantly better, as can be seen in Fig.4, which shows that the model would classify a sample correctly 75% of the time. This further supports the above explanation: although the dataset is half the size of that used in experiment 1, it contains only 24 families.

    Fig.4 Experiment 2 macro-average ROC curve

    5.4 Feature importance

Fig.5 and Fig.6 make it possible to compare the features that are important for older samples with those that are important for newer samples. In general, the most important features for more recent samples are more varied, and some new features (e.g., the signature scheme) enter the list of the top 20.

    Fig.5 Experiment 1 top 20 features

    Fig.6 Experiment 2 top 20 features

    6 Conclusion

This paper set out to evaluate ensemble learning for malware family classification. To understand the impact of ensemble learning on decision making and its performance in comparison with other non-ensemble learning based malware classification approaches, a prototype of an ensemble learning based malware family classification system was developed. The weighted majority voting approach is used in this prototype to determine the importance of individual models to the final decision. The evaluation was conducted using well-known, publicly available datasets (i.e., Drebin and UpDroid) and recent malware samples collected from GitHub repositories.

The evaluation results show that the ensemble learning based malware family classification prototype outperforms almost all of the compared approaches that use dynamic features, but falls marginally behind those using static or hybrid features. The ROC curves show that the prototype has a 75% chance of distinguishing the families of the samples in the second experimental dataset.

The results imply that the type and quality of the features have a significant impact on the decision, and that the randomised search algorithm does not guarantee an optimal selection of parameters. The proposition that ensemble learning outclasses single-model-based learning holds only if the weights of the individual models are well tuned. Further investigation is needed to confirm these implications.
