
    Estimating Age in Short Utterances Based on Multi-Class Classification Approach

Computers, Materials & Continua, 2021, Issue 8

Ameer A. Badr and Alia K. Abdul-Hassan

1 College of Managerial and Financial Sciences, Imam Ja'afar Al-Sadiq University, Salahaddin, Iraq

2 Department of Computer Science, University of Technology, Baghdad, Iraq

Abstract: Age estimation from short speech utterances has many applications in daily life, such as human-robot interaction, custom call routing, targeted marketing, and user profiling. Despite comprehensive studies on extracting descriptive features, the estimation errors (in years) remain high. In this study, an automatic system is proposed to estimate age from short speech utterances independently of both the text and the speaker. First, four groups of features are extracted from each utterance frame using hybrid techniques and methods. After that, 10 statistical functionals are measured for each extracted feature dimension. The extracted feature dimensions are then normalized and reduced using the Quantile method and the Linear Discriminant Analysis (LDA) method, respectively. Finally, the speaker's age is estimated with a multi-class classification approach using the Extreme Gradient Boosting (XGBoost) classifier. Experiments have been carried out on the TIMIT dataset to measure the performance of the proposed system. The Mean Absolute Error (MAE) of the suggested system is 4.68 years for female speakers and 4.98 years for male speakers, and the Root Mean Square Error (RMSE) is 8.05 and 6.97 years, respectively. The results show a clear relative improvement in terms of MAE of up to 28% for female and 10% for male speakers in comparison to related works that utilized the TIMIT dataset.

Keywords: Speaker age estimation; XGBoost; statistical functionals; Quantile normalization; LDA; TIMIT dataset

    1 Introduction

Speech carries valuable linguistic context information as well as paralinguistic information about the speaker, such as identity, emotional state, gender, and age [1]. Automatic recognition of this kind of information can help Human-Computer Interaction (HCI) systems adapt automatically to various user needs [2].

Automatic age estimation from short speech signals has a variety of forensic and commercial applications. It may be used in forensic scenarios such as threat calls, kidnappings, and false alarms to help identify criminals, e.g., by shortening the list of suspects. Automatic age estimation may also be used to route calls effectively in call centers [1-3].

Estimating the age of a speaker is a difficult problem for several reasons. First, age is a continuous variable, which makes it difficult to estimate with machine learning algorithms that work with discrete labels. Second, there is usually a difference between a speaker's age as perceived (the perceptual age) and their actual age (the chronological age). Third, very few publicly available age-labeled datasets contain a sufficient number of speech utterances for the various age groups. Finally, speakers of the same age may sound different due to intra-age variability such as speaking style, gender, weight, speech content, height, emotional state, and so on [1-4].

The TIMIT read-speech corpus was designed to provide speech data for acoustic-phonetic studies and for developing and evaluating automatic speech recognition systems. TIMIT includes broadband recordings of 630 speakers covering 8 main American English dialects, each of whom reads 10 phonetically rich sentences. Recently, several studies have investigated age estimation based on the TIMIT dataset. Singh et al. [5] proposed an approach to estimating speakers' psychometric parameters such as height and age. They stated that when analyzing the signal at a finer temporal resolution, it may be possible to analyze segments of the speech signal that are obtained entirely while the glottis is open, thereby capturing some of the sub-glottal structure that may be represented in the voice. They used a simple bag-of-words representation together with random forest regression to make their predictions. For age estimation, the Mean Absolute Error (MAE) of their best results was 6.5 and 5.5 years for female and male speakers, respectively. Kalluri et al. [6] proposed an end-to-end architecture based on Deep Neural Networks (DNNs) for predicting the height as well as the age of a speaker from short durations of speech. For age estimation, the Root Mean Square Error (RMSE) of their best results was 8.63 and 7.60 for female and male speakers, respectively. Kalluri et al. [7] explored, in a multilingual setting, the estimation of multiple physical parameters of a speaker from a short speech duration. They used various feature streams, derived from the speech spectrum at different resolutions, to estimate body build and age. The statistics of these features over a speech recording are used to learn a Support Vector Regression (SVR) model for estimating body build and age. For age estimation, the MAE of their best results was 5.6 and 5.2 for female and male speakers, respectively.

The previous studies applied a variety of methods and techniques, such as random forest regression, DNNs, and SVR, to estimate age from short speech utterances accurately. However, the prediction errors (in years) are still too high for real-time applications like human-robot interaction. The reason is their inability to efficiently find the combination of features that characterizes the speaker's age, together with their reliance on older estimation techniques. The main objective of this study is to build an accurate speaker age estimator that bridges this gap by finding optimal feature vectors based on statistical functionals and the LDA method, so as to make the prediction errors as small as possible.

The primary contributions of the present study can be summarized as follows:

(1) Combining four feature groups, namely Mel-Frequency Cepstral Coefficients (MFCCs), Spectral Subband Centroids (SSCs), Linear Predictive Coefficients (LPCs), and formants, to extract 150-dimensional feature vectors from each utterance.

    (2) Measuring 10 statistical functionals for each extracted feature dimension to achieve the greatest possible gain from each feature vector.

    (3) Exploring the role of using the Quantile technique as a feature normalization method.

    (4) Exploring the role of using Linear Discriminant Analysis (LDA) as a supervised approach for dimensionality reduction.

(5) Treating age estimation as a multi-class classification problem by using the XGBoost classifier to predict the speaker's age from short utterances.

The rest of this study is organized as follows: Section 2 presents the theoretical background of the proposed system. Section 3 describes the proposed method. The results of simulations and experiments are shown in Section 4. Finally, Section 5 sets out the study conclusions and future work.

    2 Theoretical Backgrounds

In the present study, several methods and techniques were used to extract features from each speech utterance, reduce the dimensionality of the extracted features, and estimate speaker age. These methods and techniques are described briefly below.

2.1 Feature Extraction Methods

As mentioned before, the speech signal contains various types of paralinguistic information, e.g., speaker age. Features are determined in the first stage of every classification or regression system, where the speech signal is transformed into measured values with distinguishing characteristics. The methods used in this study are briefly described below.

2.1.1 The Mel-Frequency Cepstral Coefficients (MFCCs)

Among all speech-based feature extraction domains, cepstral-domain features are the most successful; a cepstrum is obtained by taking the inverse Fourier transform of the signal spectrum. MFCC is the most important method for extracting speech-based features in this domain [8,9]. The prominence of MFCCs stems from their ability to represent the speech amplitude spectrum in a compact form. The voice of the speaker is filtered by the articulator shape of the vocal tract, including the nasal cavity, teeth, and tongue. This shape affects the vibrational characteristics of the voice; if the shape can be determined precisely, it gives an accurate depiction of the phoneme being produced [10]. The procedure for obtaining the MFCC features is shown in Fig. 1 [11] and in the following steps [12]:

(1) Preemphasis: This is a filtering step that emphasizes the higher frequencies. It aims to offset the spectrum of voiced sounds, which has a steep roll-off in the high-frequency region. Preemphasis thereby removes some of the glottal effects from the vocal tract parameters.

(2) Frame blocking and windowing: Speech must be examined over a short period of time (i.e., in frames) to obtain stable acoustic characteristics. A window is applied to each frame to taper the signal toward the frame boundaries; Hamming windows are usually used.

(3) FFT spectrum: By applying the Fast Fourier Transform (FFT), each windowed frame is converted into a magnitude spectrum.

(4) Mel spectrum: The Mel spectrum is computed by passing the FFT signal through a set of bandpass filters referred to as the Mel filter bank. A Mel is a measurement unit based on the perceived frequency of the human ear. The Mel scale has approximately linear frequency spacing below 1 kHz and logarithmic spacing above 1 kHz. The Mel approximation of the physical frequency can be expressed as in Eq. (1): fMel = 2595 log10(1 + f/700). The warped axis, based on the non-linear function given in Eq. (1), is implemented to mimic the perception of human ears. The filter shape most commonly used is triangular. By multiplying the magnitude spectrum X(k) by each triangular Mel weighting filter, the Mel magnitude spectrum is calculated as expressed in Eq. (2).

Figure 1: The MFCC analysis process [11]

where fMel is the perceived frequency in Mels and f is the physical frequency in Hz; M is the total number of triangular Mel weighting filters, and Hm(k) is the weight of the kth energy spectrum bin contributing to the mth output band.

(5) Discrete cosine transform (DCT): The energy levels in adjacent bands tend to be correlated, since the vocal tract is smooth. A set of cepstral coefficients is produced by applying the DCT to the transformed Mel-frequency coefficients. Finally, MFCC is calculated as expressed in Eq. (3).

where c(n) are the cepstral coefficients and C represents the number of MFCCs.

(6) Dynamic MFCC features: Since the cepstral coefficients contain information only from the given frame, additional information on the time dynamics of the signal is obtained by computing their first and second derivatives. Eq. (4) shows the commonly used definition for computing the dynamic parameters.

where cm(n) is the mth feature for the nth time frame, ki is the ith weight, and T is the number of successive frames used for the computation.
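The Mel warping of Eq. (1) and the dynamic-feature computation of step (6) can be sketched as follows. This is an illustrative NumPy sketch, not the authors' implementation; the regression window T = 2 is an assumed typical value.

```python
import numpy as np

def hz_to_mel(f_hz):
    """Mel warping of the physical frequency f (Hz), the standard form of Eq. (1)."""
    return 2595.0 * np.log10(1.0 + f_hz / 700.0)

def delta(cepstra, T=2):
    """Dynamic (delta) features: a weighted slope over +/-T neighbouring frames,
    the common definition that Eq. (4) refers to. `cepstra` is (frames, coeffs)."""
    cepstra = np.asarray(cepstra, dtype=float)
    denom = 2.0 * sum(i * i for i in range(1, T + 1))
    padded = np.pad(cepstra, ((T, T), (0, 0)), mode="edge")  # repeat edge frames
    out = np.empty_like(cepstra)
    for t in range(cepstra.shape[0]):
        out[t] = sum(i * (padded[t + T + i] - padded[t + T - i])
                     for i in range(1, T + 1)) / denom
    return out
```

Applying `delta` once gives the first derivatives; applying it to the result gives the second derivatives used in the 60-dimensional MFCC group.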

2.1.2 Spectral Subband Centroids (SSCs)

The SSC feature, proposed by Paliwal [13], is intended to complement the cepstral features in speech recognition. High sensitivity to additive noise distortion is a major problem of cepstral-based features: adding white noise to a speech signal affects the speech power spectrum at all frequencies, but the effect is less noticeable in the higher-amplitude (formant) portions of the spectrum. Therefore, to ensure feature robustness, formant-like features should be investigated; SSC features are similar to the formant frequencies and can be extracted easily and reliably [13]. To compute SSCs, the entire frequency band (0 to Fs/2) is divided into N sub-bands, where Fs is the sampling frequency of the speech signal. SSCs are found by applying the filter bank to the signal power spectrum and then calculating the first moment (i.e., centroid) of each sub-band. The SSC of the mth sub-band is calculated as in Eq. (5), where ωm(f) is the frequency response of the mth bandpass filter, P(f) is the short-time power spectrum, and γ is the parameter that controls the dynamic range of the power spectrum [14].
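As an illustration of Eq. (5), the following sketch computes sub-band centroids with equal-width rectangular sub-bands standing in for the bandpass responses ωm(f); the filter shapes and the default γ are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def subband_centroids(power_spec, sample_rate, n_subbands, gamma=1.0):
    """SSCs as in Eq. (5): the first moment (centroid) of the power spectrum
    inside each sub-band of 0..Fs/2. Equal-width rectangular sub-bands stand
    in for the bandpass responses w_m(f); gamma scales the dynamic range."""
    power_spec = np.asarray(power_spec, dtype=float) ** gamma
    n_bins = len(power_spec)
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    edges = np.linspace(0.0, sample_rate / 2.0, n_subbands + 1)
    centroids = []
    for m in range(n_subbands):
        if m == n_subbands - 1:
            band = freqs >= edges[m]          # last band includes Fs/2
        else:
            band = (freqs >= edges[m]) & (freqs < edges[m + 1])
        num = np.sum(freqs[band] * power_spec[band])
        den = np.sum(power_spec[band]) + 1e-12
        centroids.append(num / den)
    return np.array(centroids)
```

For a flat spectrum, each centroid sits near the middle of its sub-band; spectral peaks pull the centroid toward the nearest formant, which is what makes SSCs formant-like.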

2.1.3 Linear Predictive Coefficients (LPCs)

LPC is a technique developed for speech analysis. The idea is to model speech production as a source-filter model, consisting of a source and a filter with one or more resonant frequencies. The source corresponds to the primary vibrations of the vocal folds, and the filter corresponds to the shapes and movements of the vocal tract, that is, the throat, the tongue, and the lips [15]. By predicting the formants, LPC analysis determines the spectral envelope of a signal, a process referred to as inverse filtering, and then estimates the frequency and intensity from the residual speech signal. Because the speech signal is time-varying, the estimate is computed over a short segment called a frame. The process for obtaining the LPC coefficients is illustrated in Fig. 2 and in the following steps [16]:

(1) Preemphasis: This is a filtering step that emphasizes the higher frequencies. It aims to offset the spectrum of voiced sounds, which has a steep roll-off in the high-frequency region. The preemphasis filter is expressed by the time-domain input/output relation in Eq. (6).

(2) Frame blocking and windowing: Speech should be examined over a short period of time (i.e., in frames) to obtain stable acoustic characteristics. A window is applied to each frame to taper the signal toward the frame boundaries. Hamming windows w(n) are usually used, as expressed in Eq. (7).

(3) Autocorrelation analysis: In this step, autocorrelation analysis is applied to each frame resulting from Eq. (7), as expressed in Eq. (8).

where p denotes the LPC order, which is typically between 8 and 16.

(4) LPC analysis: In this step, each frame's (p + 1) autocorrelation values are converted into a set of LPC parameters. This set becomes the LPC coefficients, or a transformation of them. The formal method for doing this is called the Durbin method.

Figure 2: The LPC process [16]
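Steps (3) and (4) above, autocorrelation followed by the Durbin (Levinson-Durbin) recursion, can be sketched as follows; this is an illustrative NumPy implementation, not the authors' code.

```python
import numpy as np

def lpc(frame, order=12):
    """LPC coefficients a[0..p] (with a[0] = 1) from the frame autocorrelation
    (Eq. (8)) via the Levinson-Durbin recursion, i.e. the 'Durbin method'."""
    x = np.asarray(frame, dtype=float)
    n = len(x)
    # autocorrelation r(0..p)
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])   # prediction error for lag i
        k = -acc / err                                # reflection coefficient
        new_a = a.copy()
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        a = new_a
        err *= 1.0 - k * k                            # prediction error update
    return a
```

On a frame generated by a known autoregressive process, the recursion recovers the (negated) AR coefficients, which is what makes LPC an all-pole model of the vocal tract filter.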

    2.1.4 Formant Based Features

The vocal tract shape carries much of the relevant information and is commonly represented in numerous speech-related applications. The formants, a representation of vocal tract resonance, can be modeled with LPC [17]. Formants are simply the spectral peaks of the voice spectrum. In speech phonetics, the formant frequencies are the acoustic resonances of the human vocal tract, measured as amplitude peaks in the sound frequency spectrum. In acoustics, formants are known as peaks in the sound envelope and/or resonances in sound sources and sound chambers. The process for obtaining the formant features is shown in Fig. 3 [18].

Figure 3: The formants detection process [18]
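One common realization of the pipeline in Fig. 3 finds formant candidates as the angles of the LPC polynomial roots; the following sketch assumes this standard root-finding variant, since the paper does not specify its exact formant tracker.

```python
import numpy as np

def formants_from_lpc(a, sample_rate):
    """Formant candidates (Hz) from the roots of the LPC polynomial a[0..p]:
    each complex-conjugate root pair corresponds to one resonance, whose
    frequency is the root angle mapped back to Hz."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0.01]     # keep one root per conjugate pair
    freqs = np.angle(roots) * sample_rate / (2.0 * np.pi)
    return sorted(freqs)
```

Taking the four lowest resulting frequencies would give the F1-F4 values used in the formant feature group; practical trackers also filter roots by bandwidth, which is omitted here.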

    2.2 Quantile Normalization Method

Numerous normalization techniques are used with machine learning algorithms, such as the min-max transformation, the z-score transformation, and the power transformation. Among them, Quantile normalization was originally developed for gene expression microarrays, but today it is applied to a wide range of data types. Quantile normalization is a global adjustment method that assumes the statistical distribution of each sample is the same. The method builds on the idea behind the quantile-quantile plot: if the plot is a straight diagonal line, two data vectors have the same distribution, and they do not if it deviates from the diagonal. This definition extends to N dimensions; if every one of N data vectors has an identical distribution to the others, then plotting the quantiles in N dimensions yields a straight line along the unit vector. This indicates that projecting the points of the N-dimensional quantile plot onto the diagonal creates a set of data with the same distribution. Consequently, an identical distribution can be given to every array by taking the average quantile and substituting it for the data value in the original dataset [19,20]. This motivates the following steps for normalizing a set of data vectors by giving them the same distribution [19]:

(1) Given n arrays of length p, form X of dimension (p × n) where each array is a column;

(2) Sort each column of X to obtain Xsort;

(3) Take the mean across each row of Xsort and assign that mean to every element of the row to obtain X̃sort;

(4) Obtain the normalized data by rearranging each column of X̃sort back into the ordering of the original X.
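The steps above can be sketched compactly with rank-based indexing; this is an illustrative NumPy sketch, where the final re-ordering step that places each averaged quantile back at its original position follows from the substitution described in the text.

```python
import numpy as np

def quantile_normalize(X):
    """Quantile normalization of X (p x n, one data vector per column):
    rank every entry within its column, average the sorted columns row-wise,
    then place each row-mean back at the entry's original position."""
    X = np.asarray(X, dtype=float)
    ranks = np.argsort(np.argsort(X, axis=0), axis=0)  # per-column ranks
    row_means = np.mean(np.sort(X, axis=0), axis=1)    # mean quantile values
    return row_means[ranks]
```

After normalization every column contains exactly the same set of values (the mean quantiles), only permuted, so all data vectors share one distribution.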

2.3 Linear Discriminant Analysis (LDA) Method

There is usually a degree of redundancy in extracted high-dimensional features. Subspace learning may be utilized to eliminate these redundancies by further processing the obtained features so that their semantic information is reflected sufficiently. Numerous dimensionality reduction techniques are used with machine learning algorithms, such as Principal Component Analysis (PCA), Non-negative Matrix Factorization (NMF), and Factor Analysis. Among them, LDA is one of the most common supervised techniques for dimensionality reduction as a preprocessing step in pattern classification and machine learning applications. The LDA technique aims at projecting the original data matrix into a lower-dimensional space. Three steps are required to achieve this goal. The first step is calculating the between-class variance (in other words, the distance between the means of the various classes). The second step is calculating the within-class variance, which is the distance between the mean and the samples of each class. The third step is constructing the lower-dimensional space that minimizes the within-class variance and maximizes the between-class variance [21]. Tab. 1 illustrates the main steps of the supervised LDA algorithm.
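The three steps can be sketched as follows; this is an illustrative NumPy implementation using the standard scatter-matrix formulation, which may differ in detail from Tab. 1.

```python
import numpy as np

def lda_fit(X, y, n_components):
    """LDA projection: between-class scatter Sb, within-class scatter Sw,
    then the top eigenvectors of pinv(Sw) @ Sb span the reduced space."""
    X = np.asarray(X, dtype=float)
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sb += len(Xc) * np.outer(mc - mean_all, mc - mean_all)  # between-class
        Sw += (Xc - mc).T @ (Xc - mc)                           # within-class
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1]
    return eigvecs[:, order[:n_components]].real
```

Projecting with the returned matrix (X @ W) separates the class means while keeping each class compact, which is exactly the objective described in the third step.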

2.4 Extreme Gradient Boosting Machine (XGBoost)

Based on the ensemble boosting idea, XGBoost combines the predictions of a set of weak learners through an additive training strategy to develop a strong learner. XGBoost seeks to avoid over-fitting while also optimizing computing resources. This is accomplished by modeling the objective function so that regularization and predictive terms are combined while maintaining an optimal computation speed. During XGBoost training, parallel calculations are performed automatically for each function [22].

Table 1: The class-dependent LDA

In the XGBoost learning process, the first learner is fitted to the entire input data space, while a second model tackles the drawbacks of the weak learner by being fitted to its residuals. This fitting process is repeated a number of times until the stopping criterion is met. The model's final prediction is the sum of the predictions of all learners [22]. The general prediction function at step t is given as [22]:

where ft(xi) is the learner at step t, ŷi(t) and ŷi(t−1) are the predictions at steps t and t − 1, and xi is the input variable.

The XGBoost model prevents the overfitting issue without compromising the model's computational speed by evaluating the goodness of the model with an objective function of the following form [23]:

where L represents the loss function, n represents the number of observations used, and Ω represents the regularization term, which is obtained from [23]:

where ω represents the vector of leaf scores, λ represents the regularization parameter, γ represents the minimal loss required to further partition a leaf node, and T is the number of leaves.
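The additive-training idea, where each learner is fitted to the residuals of the ensemble so far and the final prediction is the sum of contributions, can be illustrated with a minimal from-scratch sketch using one-split regression stumps. This omits XGBoost's regularization term Ω and second-order optimization; it is not the XGBoost algorithm itself.

```python
import numpy as np

def fit_stump(X, r):
    """Weak learner: single-feature threshold split minimizing squared error
    against the current residuals r."""
    best_err, best = np.inf, None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:        # keep both sides non-empty
            left = X[:, j] <= t
            lv, rv = r[left].mean(), r[~left].mean()
            err = np.sum((r[left] - lv) ** 2) + np.sum((r[~left] - rv) ** 2)
            if err < best_err:
                best_err, best = err, (j, t, lv, rv)
    return best

def stump_predict(stump, X):
    j, t, lv, rv = stump
    return np.where(X[:, j] <= t, lv, rv)

def boost(X, y, n_rounds=20, lr=0.5):
    """Additive training: learner t fits the residuals left by the ensemble
    at step t-1, mirroring the update y_hat(t) = y_hat(t-1) + f_t(x)."""
    pred = np.full(len(y), y.mean())
    stumps = []
    for _ in range(n_rounds):
        s = fit_stump(X, y - pred)   # residual = negative gradient of squared loss
        pred = pred + lr * stump_predict(s, X)
        stumps.append(s)
    return y.mean(), stumps

def boost_predict(model, X, lr=0.5):
    """Final prediction: base value plus the sum of all learners' contributions."""
    base, stumps = model
    return base + lr * sum(stump_predict(s, X) for s in stumps)
```

Each round shrinks the residuals geometrically; the learning rate lr plays the role of the shrinkage that, together with Ω, controls over-fitting in the full XGBoost objective.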

    3 The Proposed Speaker Age Estimation System

As can be seen from Fig. 4, the methodology of this study consists of five main stages: feature extraction, statistical functional measurement, feature normalization, dimensionality reduction, and speaker age estimation. Initially, appropriate features are extracted from each speaker's utterance, and the features are then scaled to fall within a smaller range using normalization techniques. Next, the dimensionality reduction method transforms the high-dimensional features into more meaningful low-dimensional features. Finally, an estimator based on the XGBoost classifier is used to predict the speaker's actual age.


    3.1 Utterance Based Features Extraction

As mentioned earlier, estimating a speaker's age is a difficult problem in which the extracted features need to be speaker-independent. Therefore, four groups of features are incorporated in this study; the estimation errors from the different feature groups are complementary, allowing estimates from these groups to be combined to further enhance system performance. First, each speaker's utterance is split into frames with a window size of 250 milliseconds and a frame shift of 10 milliseconds to ensure that each frame contains robust information. Then, four groups of features are extracted from each utterance frame: MFCC (20 dimensions plus first and second derivatives), LPC (20 dimensions plus first and second derivatives), SSC (26 dimensions), and formants (F1, F2, F3, and F4). The total dimensionality of the features extracted in this stage is 150, as seen in Fig. 5.
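The frame-blocking configuration stated above (250 ms window, 10 ms shift) can be sketched as follows; this is illustrative NumPy, and the boundary handling is an assumption.

```python
import numpy as np

def frame_signal(signal, sample_rate, win_ms=250, shift_ms=10):
    """Split an utterance into overlapping frames with a 250 ms window and a
    10 ms shift, the configuration stated above. The signal is assumed to be
    at least one window long; any trailing partial frame is dropped."""
    win = int(sample_rate * win_ms / 1000)
    shift = int(sample_rate * shift_ms / 1000)
    n_frames = 1 + (len(signal) - win) // shift
    return np.stack([signal[i * shift : i * shift + win] for i in range(n_frames)])
```

At 16 kHz this gives 4000-sample frames starting every 160 samples, so each of the four feature groups is computed roughly 100 times per second of speech.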

    3.2 Statistical Features Generation

To overcome the issue of varying feature sizes between different speaker utterances, as well as to achieve the greatest possible gain from each feature dimension, the dynamically sized features extracted in the previous stage (150 dimensions) are turned into fixed-size features by measuring 10 statistical functionals for each dimension. These statistical functionals are the mean, min, max, median, standard deviation, skewness, kurtosis, first quartile, third quartile, and interquartile range (IQR). The total feature dimensionality output by this stage is 1500, as seen in Fig. 5.
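The 10 functionals can be computed per feature dimension as in the following sketch; this is illustrative NumPy, where skewness and kurtosis use the standardized-moment definitions (with excess kurtosis assumed, since the paper does not state the convention).

```python
import numpy as np

def functionals(x):
    """The 10 statistical functionals listed above, computed over one feature
    dimension across all frames of an utterance."""
    x = np.asarray(x, dtype=float)
    mu, sd = x.mean(), x.std()
    q1, med, q3 = np.percentile(x, [25, 50, 75])
    z = (x - mu) / sd                       # standardized values
    return {
        "mean": mu, "min": x.min(), "max": x.max(), "median": med,
        "std": sd,
        "skewness": np.mean(z ** 3),        # third standardized moment
        "kurtosis": np.mean(z ** 4) - 3.0,  # excess kurtosis
        "q1": q1, "q3": q3, "iqr": q3 - q1,
    }
```

Applying this to each of the 150 frame-level dimensions yields the fixed-length 1500-dimensional vector per utterance.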

Figure 4: The general framework of the proposed system

    3.3 Feature Normalization Using Quantile Method

Expressing features in smaller units results in a wider range for those features and thus tends to give them a greater effect. The normalization process transforms the data to fall within a smaller range. Therefore, given the great usefulness of normalization in machine learning methods, the 1500-dimensional features extracted in the previous stage are normalized using the quantile method.

Figure 5: The proposed feature fusion

    3.4 Dimensionality Reduction Using LDA Method

At this stage, LDA takes as input the set of 1500-dimensional normalized features grouped by label. It then finds an optimal transformation that maps these input features to a lower-dimensional space while maintaining the label structure. Employing the LDA algorithm as in Tab. 1 maximizes the between-label distance while minimizing the within-label distance, thereby achieving maximal differentiation. The output dimensionality of this stage is determined by Eq. (19) (for LDA, at most min(NC − 1, NF)), producing feature vectors that contain the most important information for accurately estimating speakers' ages from their voices.

where NC denotes the number of classes and NF denotes the number of feature dimensions.

    3.5 Age Estimation Using the XGBoost Classifier

In this study, XGBoost is adopted because of its apparent superiority over most other ensemble algorithms in many respects, such as parallelization, cache optimization, optimal computational speed, and easy control of over-fitting. XGBoost has recently been used in several speech-based applications, such as epilepsy detection [23] and Parkinson's disease classification [24]. Age estimation is typically considered a regression problem; however, some studies have recently treated it as a multi-class classification problem, as in Ghahremani et al. [25]. Therefore, to take advantage of the strength of the XGBoost classifier, it is trained on the output of the previous stage to estimate the speaker's age with the minimum error rate.

    4 Experimental Results and Discussions

This section describes the dataset used in this study and explains and discusses the experiments in detail. Two objective measures used in earlier studies [6,25] are considered to evaluate the efficiency of the proposed age estimation system.

MAE is calculated according to Eq. (20); a lower MAE means better performance. RMSE is computed as in Eq. (21); a lower RMSE likewise means better performance.

where N represents the number of test utterances, Ŷn represents the predicted age, and Yn represents the ground-truth age.

To provide a more objective comparison with related works, the relative improvements of MAE and RMSE over a prior system are calculated as in Eqs. (22) and (23), respectively [26].

where MAE and RMSE denote the estimation error measures of the proposed system, while MAEprior and RMSEprior denote the same measures for the related system.
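Eqs. (20)-(23) can be written directly as the following illustrative sketch:

```python
import numpy as np

def mae(y_true, y_pred):
    """Eq. (20): mean absolute error over the N test utterances (years)."""
    return np.mean(np.abs(np.asarray(y_pred) - np.asarray(y_true)))

def rmse(y_true, y_pred):
    """Eq. (21): root mean square error over the N test utterances (years)."""
    return np.sqrt(np.mean((np.asarray(y_pred) - np.asarray(y_true)) ** 2))

def relative_improvement(err, err_prior):
    """Eqs. (22)/(23): relative improvement (%) over a prior system's error."""
    return 100.0 * (err_prior - err) / err_prior
```

For example, an MAE of 4.68 years against a prior MAE of 6.5 years gives a relative improvement of 28%, matching the female-speaker comparison reported later in this section.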

    4.1 TIMIT Dataset Description

The TIMIT corpus of read speech was designed for developing and evaluating automatic speech recognition systems. The text corpus design was a joint effort among the Stanford Research Institute (SRI), Texas Instruments (TI), and the Massachusetts Institute of Technology (MIT). The speech was recorded at TI and transcribed at MIT. The sampling frequency of the recorded utterances is 16 kHz with 16-bit resolution. The duration of each utterance is about 1-3 seconds. TIMIT includes a total of 6,300 sentences: 10 sentences spoken by each of 630 speakers from 8 main dialect regions of the U.S. [27]. The statistics of the dataset are given in Tab. 2.

Table 2: TIMIT dataset statistics

To train, test, and compare the proposed system consistently, the TIMIT dataset has been divided into two parts: the training set contains 154 female and 350 male speakers (i.e., 80%), while the test set contains 38 female and 88 male speakers (20%). To prevent overfitting during training, overlap of speakers as well as utterances across the partitions has been avoided.

    4.2 Results and Discussions

Different experiments have been carried out to find the optimal configuration of the proposed age estimation system's parameters. The first experiment evaluates the performance of the suggested age estimation system in terms of MAE and RMSE. Tab. 3 lists the results of this experiment for both the gender-dependent and gender-independent systems, and compares the XGBoost regressor with the XGBoost classifier. It demonstrates the higher efficiency of the XGBoost classifier over the XGBoost regressor, owing to the classifier's strength on multi-class classification problems. In the gender-dependent system, the MAE and RMSE of the XGBoost classifier are better than those of the XGBoost regressor for both female and male speakers: the MAE decreases from 4.96 (regression) to 4.68 (classification) for females, and drops significantly from 7.73 (regression) to 4.98 (classification) for males. Likewise, the RMSE decreases significantly from 8.50 and 10.15 (regression) to 8.05 and 6.97 (classification) for female and male speakers, respectively. In the gender-independent system, the XGBoost classifier also outperforms the regressor, with the MAE and RMSE decreasing significantly from 10.75 and 13.03 (regression) to 6.06 and 8.66 (classification), respectively. The table demonstrates the high efficiency of the proposed system in both the gender-dependent and gender-independent settings.

Table 3: Performance evaluation of the proposed age estimation system in terms of MAE (years) and RMSE (years) on the TIMIT dataset

The second experiment shows the impact of each feature group on the performance of the suggested system in terms of MAE. Tab. 4 shows the results of this experiment. The table demonstrates that the estimation errors of the different feature groups are complementary, allowing estimates from these groups to be combined to further enhance the results.

Table 4: The impact of each feature group on the proposed system performance in terms of MAE on the TIMIT dataset

The third experiment compares the proposed normalization method (quantile) with other normalization methods in terms of MAE. Tab. 5 shows the results of this experiment: a comparison between the quantile method and three baseline methods (min-max transformation, z-score transformation, and power transformation), in addition to the case without normalization. The table demonstrates that the proposed Quantile normalization method gives better performance (i.e., lower MAE), because the quantile transformation tends to spread out the most frequent values and thereby reduces the impact of outliers.

Table 5: Comparison in terms of MAE (years) between the proposed normalization method and other normalization methods on the TIMIT dataset

The fourth experiment compares the proposed dimensionality reduction method (LDA) with other dimensionality reduction methods in terms of MAE. Tab. 6 shows the results of this experiment: a comparison between LDA and three baseline methods (PCA, NMF, and Factor Analysis), in addition to the case without reduction. The table demonstrates that the LDA method gives significantly better performance (i.e., lower MAE) because it reduces the dimensions depending on the class label (i.e., supervised reduction).

Table 6: Comparison in terms of MAE (years) between the proposed dimensionality reduction method and other dimensionality reduction methods on the TIMIT dataset

Finally, the fifth experiment compares the proposed system with related works that utilize the same dataset (TIMIT) in terms of MAE and RMSE. Tab. 7 shows the results of this experiment. In terms of MAE, the relative improvement (iMAE) of the proposed system is up to 10% and 28% for male and female speakers, respectively. In terms of RMSE, the relative improvement (iRMSE) is up to 14% and 10% for male and female speakers, respectively. The table demonstrates the superiority of the proposed system, which takes advantage of feature fusion, statistical functionals, supervised LDA, and the XGBoost classifier.

    Table 7: Comparison in terms of MAE (years) and RMSE (years) between the proposed system and related works that utilized the TIMIT dataset

    5 Conclusion and Future Works

    An automatic system to estimate age in short speech utterances, independent of both the text and the speaker, is proposed in this study. Four groups of features are combined to further improve system performance. Then, the dynamic-size features are turned into static-size features by measuring 10 statistical functionals for each dimension. After that, the LDA method has a major impact on the efficiency of the system by producing a reduced, informative feature vector. Finally, the proposed system treats the age estimation problem as a multi-class classification problem, taking advantage of the strength of the XGBoost classifier. The experimental results clearly show the effectiveness of the proposed system in both gender-dependent and gender-independent settings, with MAEs of 4.68, 4.98, and 6.06 years for female, male, and combined male & female speakers, respectively, using the TIMIT dataset. For future work, a DNN may be utilized for joint gender and age estimation from short utterances, where the network can be fed the same reduced feature vectors proposed in this study.

    Funding Statement:The authors received no specific funding for this study.

    Conflicts of Interest:The authors declare that they have no conflict of interest to report regarding the present study.
