
    A Novel Auto-Annotation Technique for Aspect Level Sentiment Analysis

    2022-03-14 09:24:48
    Computers Materials & Continua, 2022 Issue 3

    Muhammad Aasim Qureshi,Muhammad Asif,Mohd Fadzil Hassan,Ghulam Mustafa,Muhammad Khurram Ehsan,Aasim Ali and Unaza Sajid

    1Department of Computer Sciences, Bahria University, Lahore Campus, 54000, Pakistan

    2Computer and Information Science Department, Universiti Teknologi PETRONAS, 32610, Malaysia

    Abstract: In machine learning, sentiment analysis is a technique to find and analyze the sentiments hidden in text. For sentiment analysis, annotated data is a basic requirement. Generally, this data is annotated manually. Manual annotation is a time-consuming, costly and laborious process. To overcome these resource constraints, this research proposes a fully automated annotation technique for aspect level sentiment analysis. The dataset is created from the reviews of the ten most popular songs on YouTube. Reviews of five aspects—voice, video, music, lyrics and song—are extracted, and an N-gram based technique is proposed. The complete dataset consists of 369,436 reviews that took 173.53 s to annotate using the proposed technique, while the same dataset might have taken approximately 2.07 million seconds (575 h) to annotate manually. For validation of the proposed technique, one sub-dataset—Voice—is annotated manually as well as with the proposed technique. Cohen’s Kappa statistic is used to evaluate the degree of agreement between the two annotations. The high Kappa value (i.e., 0.9571) shows the high level of agreement between the two. This validates that the annotation quality of the proposed technique is as good as that of manual annotation, at far less computational cost. This research also contributes by consolidating the guidelines for the manual annotation process.

    Keywords: Machine learning; natural language processing; annotation; semi-automated technique; reviews annotation; text annotation; corpus annotation

    1 Introduction

    In recent years, the internet has gained popularity and has become an eminent platform for socializing among users [1]. It has transformed the real world into a cyber-world [2]. Now almost everyone has easy access to hand-held devices with a reliable internet connection. Over the last few years, it has been witnessed that, due to the popularity of these handheld gadgets, massive data is being generated on a daily basis [3]. This bulk data is generated from diverse sources like social media platforms, e-commerce, games, etc. [4]. The generated data is both in structured and unstructured form [3]. Most of the unstructured data is produced by e-users (i.e., people using Twitter, WhatsApp, Facebook, Instagram, YouTube, etc.) [4]. This unstructured data is the prime challenge for data analysts.

    To overcome this challenge, there exist pre-processing techniques like data cleaning, dimensionality reduction, data standardization, data transformation and data annotation that structure the unstructured data. Data annotation is one of these preprocessing techniques [5,6]. It is a process to label the data into its targeted classes [7,8]. There exist different schemes of data annotation, like fully-automated annotation [9], manual annotation [10] and semi-automated annotation [11]. Applications and platforms use the annotation scheme that suits them best based on their customized requirements. Manual annotation is a way to declare the subjectivity of each entity present in the data according to a metadata file, i.e., annotation guidelines, by involving humans [12]. It is considered more reliable and accurate than the other two schemes; at the same time, it requires time, cost and effort, which makes it less practical. A fully-automated annotation scheme annotates documents with the help of fully automated tools like Portable document format annotation (PDFAnno), MyMiner, Brat rapid annotation tool (BRAT) and the team-text annotation tool (TeamTat). A semi-automated annotation scheme annotates the dataset by combining the manual and fully-automated annotation schemes.

    The availability of different social networks, online blogs and other forums enables people to discuss various aspects of products or services. Using new technologies, most social networking or e-commerce websites allow users to express their experience regarding products, services and features [13]. These reviews can help in analyzing any product, service or company [14].

    The exponential growth of reviews/comments can be witnessed due to the drastic increase in the number of e-users [15]. The internet has re-modelled the communication world as the backbone of a digital era [16]. Now the showbiz industry is also using this paradigm to progress [17]. Due to its accessibility and innovativeness, people can now easily access entertainment content, watch it and give their feedback in the form of likes and reviews. Further, this content is judged by the likes, ratings and reviews on it [18]. A simple formula to check the quality of the content is:

    CQ = TL - TD (1)

    where CQ quantifies the content quality using the metrics of Total likes (TL) and Total dislikes (TD).
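    As a quick illustration, this like/dislike formula can be computed directly; the function name and the sample counts below are hypothetical:

```python
def content_quality(total_likes: int, total_dislikes: int) -> int:
    # CQ = TL - TD: likes minus dislikes as a crude quality score
    return total_likes - total_dislikes

# hypothetical like/dislike counts for a song's video
print(content_quality(1500, 230))  # 1270
```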

    This provides limited insight into the quality of the content. A better way to evaluate the quality of content is through the analysis of comments/reviews. If the count of these reviews is not very high, this goal can easily be achieved by reading and analyzing the reviews manually. It becomes humanly impossible to analyze these reviews when they exist in huge amounts. That creates a need to analyze these reviews through a proper automated channel. In this context, Sentiment analysis (SA), as an important paradigm, is used to know the general opinion towards the content [19]. SA is a way to categorize people’s opinions towards an entity into positive, neutral or negative [20]. It can also be said that it is a way to classify the sentiments according to the class assigned by the reviewer [21].

    As discussed above, a huge amount of unstructured data is being generated by e-users on a daily basis [22] in the form of comments and reviews. To analyze this data and mine the hidden patterns, the data is required to be in a structured form. There exist different preprocessing [23] techniques to overcome different data anomalies and prepare the data for analysis. In data annotation, every entity of the dataset is assigned a label according to its subjectivity. There exist different semi-automated and automated tools to annotate different types of data content like video [12], audio [24], image [11] and text [25].

    The rest of the paper is organized into six sections. Section 2 focuses on the previous related research on annotation. Section 3 discusses the entire corpus generation process, including the steps involved. Section 4 focuses on the proposed N-gram based technique of auto-annotation for English text data. Section 5 discusses the experimental results obtained from the proposed technique. Finally, Section 6 concludes the paper.

    2 Literature Review

    State-of-the-art studies have presented different tools for annotation. These tools can be classified into two categories: annotation for image data and annotation for text data.

    2.1 Annotation for Image Data

    Bio-notate is a web-based annotation tool for biomedical annotation, used to annotate gene-disease relations and binary associations between proteins [26]. AlvisAE, reported in [27], is a semi-automated annotator that assigns annotation tasks and different roles based on expertise and generates automatic annotations, which can also be modified by the users. It is mostly used in biology and crop-sciences. GATE Teamware [28] is a web-based open-source semi-automatic annotator which performs pre-annotation of fungal enzymes with the facility of manual correction.

    2.2 Annotation for Text Data

    Catma [29] is a web-based annotator which allows users to import text data by document browsing as well as from a Hypertext markup language (HTML) document by entering its Uniform resource locator (URL). It allows corpus creation. It also has the capability of automated document annotation as well as assigning manual tag sets. FLAT [30] is a web-based annotator which provides linguistic and semantic-based annotation using the FoLiA format to annotate biomedical documents. MAT [31] is an active learning tool to annotate text by importing a file in Extensible markup language (XML); it exports the annotations in either XML or JavaScript object notation (JSON) format. It is an offline application to annotate text. BRAT, reported in [32], is a web-based text annotation tool to support Natural language processing (NLP) tasks such as named entity recognition and part of speech tagging. BioQRator [33] is another web-based tool to annotate biomedical literature.

    TeamTat [34] presents an open-source web-based document annotation tool that annotates plain text inputs as well as document input (BioC XML or XML); the output is a BioC XML inline annotated document. Djangology [35] is a collaborative document annotator to annotate documents using web services. To annotate, the document is imported as plain text and, after annotation, the annotated document is exported in plain text format. In [36] geo-annotator is presented. It is a collaborative semi-automated platform for constructing geo-annotated text corpora. The annotator is a semi-automatic web-based tool with collaborative visual analytics to resolve place references in natural language.

    Some article annotators also exist. Loomp [37] was a web-based tool for the annotation of articles. RDFa [38] is based on a general-purpose annotation framework to annotate news articles automatically. MyMiner [39], a web-based annotation tool, can retrieve abstracts and create a corpus for annotation. A plain document is imported to find a binary relationship or to tag entities, and the output is also exported as plain text. WebAnno [40] provides full functionality for syntax and semantic-based annotations. It allows a variety of formats to import the document for annotation as well as to export the annotated document. PDFAnno [41] is an open-source PDF document annotator. The document is imported in PDF file format; PDFAnno performs annotation and finds relationships between entities. It also provides the facility to annotate figures and tables. The tagtog [42] tool provides annotation at the entity level as well as the document level. It uses an active learning approach to annotate retrieved abstracts or full text retrieved for annotation purposes. LightTag [43] is a commercial tool to annotate text, and it supports different languages. It can learn from active annotators using machine learning and annotate unseen text.

    There also exist different automated tools to annotate both image data and text data. BRAT [32] is a tool that performs intuitive annotation, named entity annotation and dependency annotation, whereas the ezTag [44] tool is used to annotate medical text data using lexicon-based tagging concepts. CAT [45] is a tool that annotates Ribonucleic acid (RNA) sequences, annotates clades and identifies orthology relationships. To the best of our knowledge, there hardly exists any tool to annotate English text (comments/reviews) for SA.

    Recent studies like [46-48] have witnessed that researchers are annotating text manually, and some of them by using TextBlob [49-52]. There exist tools like PDFAnno [41], MyMiner [39] and BRAT [32] for text annotation, but no literature has witnessed text annotation for sentiment analysis at the aspect level. Manual annotation of reviews is a very hectic and time-consuming task [12]; e.g., this research has figured out that, on average, 5.6 s are required to annotate one review.

    This study presents a corpus of 369,436 reviews, annotated through the proposed N-gram based technique. If manual annotation were performed, it might have taken 2.07 million seconds (574.68 h), i.e., approximately 24 days of continuous work. Manual text annotation is a bottleneck in NLP because it is very time consuming [12]. To overcome this bottleneck, this study presents an automated annotation technique for English text at the aspect level using an N-gram based technique. The technique is validated with Cohen’s Kappa Coefficient. After the validation of the technique, the entire corpus is annotated at the aspect level using the proposed N-gram based technique.

    3 Corpus Generation

    A quality corpus needs systematic collection and thorough preprocessing, which can be divided into three sub-tasks, namely data collection, preprocessing and data annotation. Details can be seen below:

    3.1 Dataset Collection

    Data is a vital part of any analysis; no analysis can be performed without data. To collect data and build a gold-standard dataset, the top ten songs are selected [50]. Details can be seen in Tab. 1.

    3.2 Preprocessing

    Data quality directly affects data analysis [18]. To separate reviews carrying the targeted aspects and to get processed data, different preprocessing techniques are applied, like aspect filtration, data integration, lowercasing, emoji removal and string size standardization. Details are as below.

    Table 1: Songs and number of reviews scraped

    3.2.1 Aspect Filtration

    In this study, five aspects/features (lyrics, music, song, video and voice) are targeted for auto-annotation of reviews. Reviews that contain these aspects are separated by applying filters and saved in CSV file format. In total, 4,886,406 reviews are scraped and, after aspect level filtration, 369,436 records are obtained.

    The pre-processing extracted 7,916 reviews for lyrics, 49,238 for music, 199,248 for song, 106,127 for video and 6,907 for voice. The dataset is now in fifty data files; details can be viewed in Fig. 1.
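    A minimal sketch of this aspect filtration step; the review list and the helper name below are hypothetical (the real pipeline filters scraped CSV files):

```python
# Target aspects used in this study.
ASPECTS = ["lyrics", "music", "song", "video", "voice"]

def filter_by_aspect(reviews, aspect):
    # Keep only the reviews whose text mentions the given aspect keyword.
    return [r for r in reviews if aspect in r.lower()]

reviews = [
    "Best song ever justin",
    "his voice it's so soft and cool",
    "three billion views already",
]

# One filtered subset per aspect, later merged into per-aspect sub-datasets.
for aspect in ASPECTS:
    print(aspect, filter_by_aspect(reviews, aspect))
```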

    Figure 1: Number of reviews per song per aspect after aspect level filtration

    3.2.2 Data Integration

    The data from the ten files of each aspect are gathered in one file. As this study covers five different aspects, this results in five files overall, one for each aspect; we call each of them a sub-dataset. Each sub-dataset is named after its aspect, e.g., sub-dataset—Voice.

    3.2.3 Lowercasing

    Case, by itself, has no special impact on the analysis of the data, but when the same data is presented in different cases it has adverse effects [48]; e.g., algorithms will consider “Yes” and “yes” two different values. Therefore, to overcome this effect, the whole dataset is converted into lowercase.

    3.2.4 Noise Removal

    It has been reported, time and again, that noise directly affects classification results [51]. It was noted that the collected data contained a lot of noise like white spaces, special characters, punctuation signs, etc. that have nothing to do with the analysis. To improve the quality of the data, all these characters were removed.

    3.2.5 Remove Numbers

    The dataset contained English text as well as numbers, while this study analyzes English text only. The extra data increases the required computational power and can also distort the results [49]. To address these problems, all numbers are removed.

    3.2.6 Remove Emojis

    Emojis are a popular way to show one’s feelings, widely used by e-users to express their feelings towards an entity; they leave their sentiments using emojis [52]. This study focuses only on the text, therefore emojis are removed from the dataset.
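    The cleaning steps above (lowercasing, noise removal, number removal and emoji removal) can be sketched in a few lines; this is a simplified regex-based version, not the exact pipeline used in the study:

```python
import re

def preprocess(review: str) -> str:
    # Lowercase, then strip digits, punctuation, special characters and
    # emojis, keeping only plain lowercase English words.
    review = review.lower()                    # 3.2.3 lowercasing
    review = re.sub(r"\d+", " ", review)       # 3.2.5 remove numbers
    review = re.sub(r"[^a-z\s]", " ", review)  # 3.2.4/3.2.6 drop noise and emojis
    return re.sub(r"\s+", " ", review).strip() # collapse leftover whitespace

print(preprocess("Yes!!! Best song of 2021 🎵🎵"))  # yes best song of
```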

    3.2.7 Trim String Size

    In the dataset, there were several reviews of extraordinary length. For example, in sub-dataset—Lyrics, a review had 11,487 tokens; in the same way, in sub-dataset—Music there was a review that had 9,914 tokens. Such lengthy reviews are outliers and have a bad impact on classification [53]. To overcome the issue, length standardization is applied. To improve the quality of the data while keeping the impact of data loss at a minimum, the maximum length is defined as 150 tokens for all sub-datasets except Lyrics (due to its very small number of reviews).

    To resolve this issue, the string size of sub-dataset—Lyrics is trimmed to 300 tokens, which covers 77.68% of the data, and for the rest of the sub-datasets the maximum string size is defined as 150 tokens. The rest of the details can be seen in Tab. 2.
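    A simple sketch of this trimming step (token-based truncation; the helper name is ours, not from the paper):

```python
def trim_review(review: str, max_tokens: int = 150) -> str:
    # Keep at most max_tokens whitespace-separated tokens.
    tokens = review.split()
    return " ".join(tokens[:max_tokens])

long_review = "la " * 400                          # a 400-token outlier review
print(len(trim_review(long_review).split()))       # 150 (default cap)
print(len(trim_review(long_review, 300).split()))  # 300 (cap for sub-dataset Lyrics)
```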

    Table 2: Trimming ratio

    The average reduction in tokens in sub-datasets Lyrics, Music, Song, Video and Voice is 47.85%, 43.23%, 27.36%, 54.10% and 84.25% respectively. Details of tokens before and after preprocessing can be seen in Fig. 2.

    Figure 2: Number of tokens, before and after preprocessing

    3.3 Data Annotation

    Data annotation is a process of categorizing text (e.g., an instance, review or comment) into positive, neutral or negative based upon its subjectivity. Previous studies show different ways to annotate data, i.e., auto-annotation, semi-annotation as well as manual annotation. Many automated tools are witnessed to annotate image and video data.

    Very few tools exist to annotate text data; in particular, no automated tool exists to annotate English text for sentiment analysis. Generally, manual annotation is used to label text data. Details can be seen in the subsequent section.

    3.3.1 Manual Annotation

    In manual annotation, each review is labelled according to its subjectivity, manually. Each review is read one by one and assigned a class according to its behaviour as positive, neutral or negative. Following the process explained in [54], the manual annotation process was divided into four steps.

    In step-I, the guidelines to annotate the reviews manually are prepared. In step-II, three volunteers were contacted to annotate the data; to start with, they were given basic training based upon the guidelines. In step-III, the dataset was given to the annotators for annotation, and the conflicts were resolved using an inter-annotator agreement. Finally, in step-IV, the computational value of the inter-annotator agreement was calculated using Kappa statistics. The details of all steps are as below:

    Annotation Guidelines Preparation

    Guidelines for each class (positive, negative and neutral) presented in different research works [55-57] are mapped onto the current problem. Details are as below:

    Guidelines for Positive Class

    A review will be assigned “Positive”

    • If it shows positive sentiments [53].

    • If its behaviour is both neutral and positive [53,58].

    • If there exist some positive word(s) in the sentence [59], e.g., good, beautiful, etc.

    • If there exist illocutionary speech acts like wow, congrats and smash, classified as positive [60].

    Examples: In “best music yet” the word “best” clearly shows the positive polarity of the aspect—music. In another review, “his voice it’s so soft and cool,” the behaviour is positive towards the aspect—voice.

    Guidelines for Negative Class

    A review will be assigned “Negative”

    • If it shows negative sentiments [55].

    • If the use of language is abusive [1].

    • If the behaviour is un-softened [59].

    • If there exist some negative word(s) in the sentence [60], e.g., bad, ugly, annoying, etc.

    • If there exists negation in a review, e.g., not good.

    Examples: In “music is trash,” the word “trash” expresses the negative polarity of the review for the aspect—music. In “this is the stupid voice,” the word “stupid” shows the negative sentiments of the reviewer on the aspect—voice.

    Guidelines for Neutral Class

    A review will be assigned “neutral”

    • If it is not showing any positive or negative sentiments [56].

    • If a review has a piece of realistic information [57].

    • If a review has both positive and negative sentiments [61].

    Examples: In “that music going too far away” the subjectivity of the review isn’t clear, so it will be annotated as neutral. In “the video is neither good nor bad,” both sentiments are present in the review, so it will be annotated as neutral.

    Training of Annotators

    For the manual annotation, the help of three volunteers was sought; let’s call them A, B and C. The volunteers were graduates, well familiar with reviews and the concepts of annotation, and had a good grip on the English language. A three-hour hands-on training session was conducted to explain the guidelines and discuss possible issues with them.

    Conflict Resolution

    For conflict resolution, a short sample dataset of 100 reviews was created; let’s call it SSD100. SSD100 was given to the first two annotators, annotator A and annotator B. Once they completed the annotation, a short meeting was arranged to resolve the conflicts by involving the third annotator too. After the conflict resolution, SSD100 was given to annotator C for annotation.

    3.3.2 Problems Faced During Manual Annotation

    Though the volunteers were very cooperative, the process still faced a few problems, as listed below:

    (i) Training

    (ii) Individual’s perception

    (iii) Clash removals

    (iv) Confidence level

    Even after 3 h of hands-on practice, annotators were still consulting the trainers for the resolution of issues. There was a big issue of an individual’s perception: 5.33% of the manual annotations done by the three annotators were later updated by the trainer. During the annotation process, the annotators were also asked to mention their confidence level (1-10) regarding each annotated label; the average confidence was 90.50%. It took almost 6 h to annotate 3,700 reviews, an average of 5.6 s per review. This shows that manual annotation, even with qualified and trained annotators, is not perfect, and unseen and unreported lapses always remain. A sample of annotated data is shown in Tab. 3.

    Table 3: Sample annotated data

    4 Proposed Technique for Auto-Annotation

    To overcome the hectic and time-consuming process of manual annotation, this study presents a new fully automated technique for text annotation (at the aspect level) based upon the N-gram language modelling technique.

    4.1 N-Gram

    Models that assign probabilities to sequences of words are called language models (LMs). The N-gram model is one of the simplest models that assign probabilities to sentences and sequences of words. An N-gram is a sequence of N words. For example, in the sentence “Best song ever justin...”, a 2-gram (or bigram) is a two-word sequence like “Best song,” “song ever,” or “ever justin,” and a 3-gram (or trigram) is a three-word sequence like “Best song ever,” or “song ever justin.” An N-gram model estimates the probability of the last word of an N-gram given the previous words, and also assigns probabilities to entire sequences; thus the term N-gram is used to mean either the word sequence itself or the predictive model that assigns it a probability.
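    Extracting the N-grams of a review is a one-liner; a small sketch (the helper name is ours, not from the paper):

```python
def ngrams(text: str, n: int):
    # Return the list of n-grams (as word tuples) of the text.
    tokens = text.split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

print(ngrams("best song ever justin", 2))
# [('best', 'song'), ('song', 'ever'), ('ever', 'justin')]
print(ngrams("best song ever justin", 3))
# [('best', 'song', 'ever'), ('song', 'ever', 'justin')]
```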

    For the joint probability of each word in a sequence having a particular value, P(W = w1, X = w2, Y = w3, ..., Z = wn), we’ll use P(w1, w2, w3, ..., wn).

    Applying the chain rule to the words,

    P(w1, w2, w3, ..., wn) = P(w1) P(w2|w1) P(w3|w1, w2) ... P(wn|w1, w2, ..., wn-1) (2)

    where the sequence is w1, w2, w3, ..., wn and P(wx|w1, w2, ..., wy) is the conditional probability of occurrence of wx given the occurrence of w1, w2, w3, ..., wy.

    For the Bi-gram, i.e., N = 2, Eq. (2) can be updated, approximating each conditional by its immediately preceding word, as

    P(w1, w2, w3, ..., wn) ≈ P(w1) P(w2|w1) P(w3|w2) ... P(wn|wn-1) (3)
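    In practice a bigram conditional is estimated from counts, P(wn|wn-1) = count(wn-1, wn) / count(wn-1); a minimal sketch with a toy corpus (both the corpus and the helper name are ours):

```python
from collections import Counter

def bigram_prob(corpus_tokens, prev_word, word):
    # MLE estimate P(word | prev_word) = count(prev, word) / count(prev).
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    unigrams = Counter(corpus_tokens)
    return bigrams[(prev_word, word)] / unigrams[prev_word]

tokens = "best song ever best song best video".split()
print(bigram_prob(tokens, "best", "song"))  # 2/3: "best" occurs 3x, followed by "song" 2x
```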

    4.2 Re-Definition of Ps

    This research redefines P to solve the problem at hand. We need to find the polarity of a text having n words, i.e., w1, w2, w3, ..., wn; we define it as P. The polarity is counted only if the behaviour of a word is with respect to the aspect, so the polarity is checked in the form of pairs of words, out of which one is supposed to be the aspect. P(wx|wy) defines the two-word polarity, where wx is the aspect and wy can be anything. If wy is positive, then the polarity of the two-word combination is positive; if wy is negative, then the polarity of the two-word combination is negative; otherwise it is neutral.

    The polarity of a two-word combination is required to be checked for all occurrences of wy, where 1 ≤ y ≤ n and y ≠ x. To find the aspect among all words of the text, wx is varied from 1 to n. Hence we have Eq. (2) to express all this, where:

    the sequence is w1, w2, w3, ..., wn,

    wk is the aspect,

    P(wx) is the occurrence of the aspect, and

    P(wx|wy) is the polarity of occurrence of the aspect, wx, given the occurrence of wy.

    We define the value of P as below:

    P(wa|wa-1) = p, if wa = Aspect and wa-1 ∈ Bagp
                 n, if wa = Aspect and wa-1 ∈ Bagn
                 p, if wa ≠ Aspect and wa, wa-1 ∈ Bagp
                 n, if wa ≠ Aspect and wa ∈ Bagp and wa-1 ∈ Bagn
                 n, if wa ≠ Aspect and wa or wa-1 ∈ Bagn
                 1, otherwise (8)

    where Bagp and Bagn are the bag of positive words and the bag of negative words respectively (lists of such words can easily be found online, e.g., on GitHub, and can then be updated according to the tokens).

    For a complete sentence, this results in an expression of the form p^x n^y. Irrespective of the powers, p^x is assigned 1 and n^y is assigned -1. For example, if the value of P(wn|wn-1) = p^3 n^2, we will assign p^3 = 1 and n^2 = -1.

    In this way, the nearest words are associated with the targeted aspect and, after computing the value, a label is assigned. To aid understanding, Eq. (8) is explained, condition by condition, through examples as below:

    Example: p, if wa = Aspect and wa-1 ∈ Bagp

    The value p will be assigned to P(wa|wa-1) if wa is an aspect and wa-1 is a word that belongs to the bag of positive words. E.g., in “justin bieber you have amazing voice”, the value of P(wa|wa-1) at the highlighted words is p, as wa = voice, which is an aspect, and wa-1 = amazing, which belongs to the bag of positive words.

    Example: n, if wa = Aspect and wa-1 ∈ Bagn

    The value n will be assigned to P(wa|wa-1) if wa is an aspect and wa-1 is a word that belongs to the bag of negative words. E.g., in “justin bieber you have annoying voice”, the value of P(wa|wa-1) at the highlighted words is n, as wa = voice, which is an aspect, and wa-1 = annoying, which belongs to the bag of negative words.

    Example: p, if wa ≠ Aspect and both wa and wa-1 ∈ Bagp

    The value p will be assigned to P(wa|wa-1) if wa is not an aspect and both wa and wa-1 are words that belong to the bag of positive words. E.g., in “love justin bieber voice look is so amazing”, the value of P(wa|wa-1) at the highlighted words is p, as wa ≠ voice, i.e., not an aspect, and both wa and wa-1 belong to the bag of positive words.

    Example: n, if wa ≠ Aspect and wa ∈ Bagp and wa-1 ∈ Bagn

    The value n will be assigned to P(wa|wa-1) if wa is not an aspect, wa is a word that belongs to the bag of positive words and wa-1 is a word that belongs to the bag of negative words. E.g., in “sad to say but justin biebers voice is lighter not good as baby old justin bieber”, at the highlighted words wa belongs to the bag of positive words while wa-1 belongs to the bag of negative words.

    Example: n, if wa ≠ Aspect and wa or wa-1 ∈ Bagn

    The value n will be assigned to P(wa|wa-1) if wa is not an aspect and wa or wa-1 is a word that belongs to the bag of negative words. E.g., in “three billion viewers of justin hates i want to hear his voice”, the value of P(wa|wa-1) at the highlighted word ‘hates’ is n, as wa ≠ voice, i.e., not an aspect, and wa belongs to the bag of negative words.

    Example: 1, otherwise

    If a review has no word that belongs to the bag of positive words nor to the bag of negative words, the value 1 is assigned and the review is labelled as neutral. E.g., in “the october from bangladesh who are with me rise your voice”, not a single word belongs to the bag of positive words or to the bag of negative words, so the polarity of all such reviews is declared neutral.
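    Pulling the conditions above together, a highly simplified sketch of the annotation rule: the tiny word bags below are hypothetical stand-ins for the full Bagp/Bagn lists, and a sign-of-score shortcut replaces the pairwise p^x n^y bookkeeping:

```python
# Hypothetical mini word bags; the real Bagp/Bagn lists are much larger.
BAG_P = {"amazing", "best", "good", "soft", "cool", "love"}
BAG_N = {"annoying", "trash", "stupid", "hates", "not"}

def annotate(review: str) -> str:
    # Score each token against the bags (p -> +1, n -> -1) and
    # map the overall sign to a sentiment label.
    score = 0
    for token in review.lower().split():
        if token in BAG_P:
            score += 1      # a p term, assigned 1 irrespective of powers
        elif token in BAG_N:
            score -= 1      # an n term, assigned -1
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"        # no bag word found: value 1, labelled neutral

print(annotate("justin bieber you have amazing voice"))  # positive
print(annotate("music is trash"))                        # negative
print(annotate("rise your voice"))                       # neutral
```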

    5 Validation of Proposed Technique

    The technique is validated using Cohen’s Kappa statistic, which indicates the inter-rater reliability between annotators [62]. According to Cohen’s Kappa, a value of Kappa > 90% indicates almost perfect agreement between the annotators [63]. Tab. 4 presents the interpretation of the different levels of Cohen’s Kappa values.
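    For reference, Cohen’s Kappa can be computed from two annotators’ label sequences as kappa = (po - pe) / (1 - pe), where po is the observed agreement and pe the agreement expected by chance; a small sketch with made-up labels:

```python
from collections import Counter

def cohens_kappa(ann1, ann2):
    # Cohen's kappa between two annotators' label sequences.
    assert len(ann1) == len(ann2)
    n = len(ann1)
    po = sum(a == b for a, b in zip(ann1, ann2)) / n      # observed agreement
    c1, c2 = Counter(ann1), Counter(ann2)
    pe = sum(c1[l] * c2[l] for l in set(c1) | set(c2)) / (n * n)  # chance agreement
    return (po - pe) / (1 - pe)

a = ["pos", "pos", "neg", "neu", "pos", "neg"]
b = ["pos", "pos", "neg", "pos", "pos", "neg"]
print(round(cohens_kappa(a, b), 4))  # 0.7
```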

    Table 4: Interpretation of Cohen’s Kappa

    To prove the efficacy of the proposed technique, the inter-annotator agreement is calculated using Cohen’s Kappa statistics. For this purpose, three experiments were conducted: two on SSD100 and one on sub-dataset—Voice. The details of the experiments are shown in Tab. 5.

    Table 5: Comparison of kappa statistics value

    5.1 Experiment 1

    In this experiment, SSD100 was used for manual annotation. The experiment was conducted during the training of annotators A, B and C. The Kappa value, i.e., the inter-annotator agreement between the three annotators, was calculated and came out to 85.28%.

    5.2 Experiment 2

    In this experiment, the annotation results of SSD100 using manual annotation and using the proposed N-gram based technique are compared, and the Kappa statistics value is calculated. The estimated value of Kappa is 90.96%. The level of agreement for this value of Kappa is shown in Tab. 4.

    5.3 Experiment 3

    In this experiment, the sub-dataset—Voice is annotated once with the manual annotation technique and then using the proposed technique. The Kappa statistics value is calculated to validate the reliability of the results of the proposed technique; the value came out to 95.71%. This proves that the proposed technique gives results as good as manual annotation at far less computational cost. The details of the Kappa values for different sizes of datasets can be seen in Tab. 5.

    The remaining sub-datasets of the four other aspects are also annotated using the proposed N-gram based technique. Details of all sub-datasets can be seen in Tab. 6.

    Table 6: Number of reviews regarding each aspect

    It took almost 2 min 53.45 s to annotate the complete dataset of 369,436 reviews (24,534,205 tokens), an average of 0.46963 milliseconds per review and 7.07 microseconds per token. If it were attempted manually, it would have taken approximately 14 weeks, 1 day and 7 h while working 40 h a week, with an average of 5.6 s per review and 23.44 microseconds per token. Fig. 3 explains the difference between the two: the proposed technique completes in seconds a task whose manual counterpart would have taken days and weeks. In terms of the expected time of manual annotation, the proposed technique is efficient with a ratio of 1:11934.28.
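    The timing arithmetic can be sanity-checked directly, using the 5.6 s/review manual average and the 173.53 s total runtime reported in the abstract (the small gap to the reported 1:11934.28 ratio comes from rounding of the measured runtime):

```python
reviews = 369_436
manual_sec_per_review = 5.6           # measured manual average
auto_total_sec = 173.53               # proposed technique, total runtime

manual_total_sec = reviews * manual_sec_per_review
print(round(manual_total_sec))                    # 2068842  (~2.07 million seconds)
print(round(manual_total_sec / 3600, 2))          # 574.68 hours
print(round(manual_total_sec / auto_total_sec))   # ~11922x speed-up
```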

    Figure 3: Comparison of time between manual and proposed technique

    6 Conclusion

    This study established that manual annotation is highly subjective and not as good as it is supposed to be; inaccuracies creep in unknowingly. This research has presented a new technique to annotate large datasets as well as manual annotation does, at far less computational cost. A dataset of English text reviews is scraped, preprocessed and annotated manually as well as with the proposed technique. The technique may benefit users in multiple ways: it needs no additional resources, financial or human, and it is very efficient, requiring very little time without any additional cost. The performance ratio of manual to proposed annotation, i.e., 1:11934.28, shows its efficiency. If the complete dataset were annotated manually, it might have taken approximately 2.07 million seconds, i.e., 14 weeks, 1 day and 7 h while working 40 h a week (575 h in total), but this technique has done the same in 173.53 s (2 min and 53.45 s). The high Kappa statistics value, i.e., 95.71%, validates the reliability of the results generated by the proposed technique. Machine learning and deep learning algorithms can be applied to the two datasets, one with manual annotation and the other annotated with the proposed technique, to extend this analysis and study the variation between the two techniques.

    Acknowledgement: The authors would like to express their most profound gratitude to Mr. Umar Shoukat, Mr. Muneeb Fazal, Mr. Burhan Ul Haq Zahir and Ms. Rabbia Abrar for their valuable time and effort in helping with data collection and the annotation process.

    Funding Statement:The authors received no specific funding for this study.

    Conflicts of Interest:The authors declare that they have no conflicts of interest to report regarding the present study.
