
    Big Data Analytics in Telecommunications: Literature Review and Architecture Recommendations

    2020-02-29 14:13:30
    IEEE/CAA Journal of Automatica Sinica, 2020, Issue 1

    Hira Zahid, Tariq Mahmood, Ahsan Morshed, and Timos Sellis

    Abstract—This paper focuses on facilitating state-of-the-art applications of big data analytics (BDA) architectures and infrastructures to the telecommunications (telecom) industrial sector. Telecom companies are dealing with terabytes to petabytes of data on a daily basis. IoT applications in telecom are further contributing to this data deluge. Recent advances in BDA have exposed new opportunities to get actionable insights from telecom big data. These benefits and the fast-changing BDA technology landscape make it important to investigate existing BDA applications to the telecom sector. For this, we first identify published research on BDA applications to telecom through a systematic literature review, through which we filter 38 articles and categorize them into frameworks, use cases, literature reviews, white papers and experimental validations. We also discuss the benefits and challenges mentioned in these articles. We find that the experiments are all proofs of concept (POC) on a severely limited BDA technology stack (as compared to the available technology stack), i.e., we did not find any work focusing on a full-fledged BDA implementation in an operational telecom environment. To facilitate these applications at the research level, we propose a state-of-the-art lambda architecture for BDA pipeline implementation (called LambdaTel) based completely on open source BDA technologies and the standard Python language, along with relevant guidelines. We discovered only one research paper which presented a relatively limited lambda architecture using the proprietary AWS cloud infrastructure. We believe LambdaTel presents a clear roadmap for telecom industry practitioners to implement and enhance BDA applications in their enterprises.

    I. INTRODUCTION

    THE telecommunications (telecom) industry is facing an avalanche of data on a daily basis due to smartphone usage and the boom of social media and IoT, along with the availability of next-generation communication networks. Data arrives in both batch and real-time modes. Notable examples are call detail records, user clickstream, mobile network usage, geographical user data, network performance, network monitoring, customer/subscriber profiles, hardware and VoIP data. In telecom, big data can be characterized by the standard 3V’s: volume, variety and velocity [1]-[4]. The value of this data (generally the 4th V) is realized through big data analytics (BDA) [5], [6], which is the process of extracting valuable insights from big data streams that can help align business strategies to meet critical KPIs. BDA can harness big data for telecom by employing knowledge from diverse domains, notably machine learning, statistics, pattern recognition, and business intelligence. BDA is mostly implemented in the context of NoSQL databases, which move away from tight relational storage towards looser, unstructured and semi-structured data models [7], [8]. Well-known examples include MongoDB (which stores data as JSON documents) and Redis (which stores data as key-value pairs), along with Apache Hadoop and its ecosystem [2], [9], [10]. These databases are capable of addressing the ACID (Atomicity, Consistency, Isolation and Durability) requirements of relational databases [8]. In telecom, BDA can enhance customer relationship management through more efficient resource management, identification of root causes of service failure, more intelligent marketing campaigns, increased sales, detection of high-velocity fraud activities in real time, and timely inception of new business partnerships [5], [11].

    BDA is an expensive, resource-intensive and complicated process which is plagued by many problems, leading to significant project failures in different industries [5], [6], [12]-[19]. According to Gartner, up to 85% of BDA projects were failing in 2017 [15]. A McKinsey survey has determined the impact of investment in BDA initiatives by telecom companies on the actual benefits; of the 273 telecom companies that invested in BDA, only 5% are getting more than a 10% benefit. Also, 75% to 80% of companies ran into a loss due to BDA application [11]. The more important problems in BDA initiatives are lack of data quality, poor data management, mistakes in selecting the analytical model, lack of an existing BDA infrastructure, lack of expenditure, non-scalable BDA infrastructures, difficulty in creating a roadmap for BDA skills, and a rising complexity in integrating heterogeneous big data [20]. The BDA tool landscape is also growing at an exponential pace, described in industry as “firing on all cylinders” [21]. Hence, the speed of innovation largely outpaces the speed of adoption. Most of these tools are open-source initiatives and require expert skills to understand and employ directly in an operational environment. This learning time slows down adoption and discourages a majority of businesses from investing in BDA [5], [6], [17], [22].

    BDA complexity is another challenge. In a BDA process, a considerable number of activities/tasks are executed as a pipeline. Each of these activities can be implemented through an increasing diversity of both open-source and proprietary tools. There is a lack of skilled BDA pipeline developers due to the diversity of tasks to be performed, e.g., data upload, data transformation/cleaning, statistical analysis, communication of the back-end activities with front-end GUIs, along with different types of analytics and visualization activities. Each tool has a learning curve, and the problem becomes severe when BDA developers need to integrate several tools together in the same pipeline. Moreover, the BDA pipeline runs perpetually until the analytical requirement is fulfilled, which requires the automation of core tasks like ETL and machine learning. The rise and now the dominance of Python as a pipelining language has largely facilitated the development of BDA pipelines in the last decade [20]. Some tools have also matured and have seeded the rise of BDA applications in telecom, for instance, MongoDB, Redis, HBase, Spark, Flink, and Hadoop (described in Section II). Due to these technologies, BDA applications in telecom are increasing and are likely to increase further [5], [6], [23], [24]. For instance, BDA can identify traffic delay sensitivity, accurately identify small packet traffic, and considerably reduce delay and processing complexity [25]-[28].

    In this paper, our intent is to determine the extent to which the huge potential of BDA has been realized by the telecom sector in academic research, and to identify and address the concrete challenges. We focus on academic research because the fast-changing BDA landscape leaves much space for formal research activities and projects to determine the impact of BDA tools on telecom. In other words, we want to gauge the actual benefits of BDA that the research community has brought to the telecom sector. For this, we formulate three research questions:

    1) RQ1: How much research literature is focused on BDA applications to the telecom sector, and what is the BDA technology stack in these articles?

    2) RQ2: What are the benefits and challenges mentioned in these articles, and how much benefit has actually been realized?

    3) RQ3: How can the challenges be addressed effectively to facilitate BDA applications to the telecom sector?

    To investigate these questions, we conduct a Systematic Literature Review (SLR) according to standard guidelines [29], [30]. To the best of our knowledge, this is the first application of an SLR to the telecom sector. We have modeled the SLR and this paper from a big data perspective and avoid any operational detail of telecom domains and technologies. For the latter, we refer readers to [31], [32]. Later on, we address the BDA challenges in telecom by proposing and describing a comprehensive, state-of-the-art BDA architecture called LambdaTel for telecom practitioners.

    II. BACKGROUND

    Gartner defines big data as “high volume, high velocity and high variety information assets that demand cost effective, innovative forms of information processing for enhanced insight and decision making” [33]. Here, four properties pertinent to our SLR are: 1) Volume is the large size of big data, generally ranging from terabytes to petabytes, 2) Velocity is the speed of data generation and the required processing of both batch and real-time data, 3) Variety is the different types of data from heterogeneous data sources, grouped as structured, unstructured and semi-structured data, and 4) Value indicates the hidden, previously unknown information or knowledge in data that is potentially useful for business decision making. The process of extracting value from big data sets is called big data analytics (BDA) [5], [6], [12], [13].

    A. Apache Hadoop and MapReduce

    A big challenge facing telecommunication companies today is the difficulty of employing a software and hardware infrastructure to handle big data. Apache Hadoop is an open source framework used for distributed processing of big data across a cluster of commodity hardware [34]. Each Hadoop cluster is highly available and fault tolerant. Hadoop version 2.x is a three-layered model comprising a storage layer, a processing layer and a management layer (Fig. 1), described as follows. HDFS is Hadoop’s file system, which provides fault tolerance and high throughput over low-cost commodity hardware. Large files are split into smaller blocks, which are replicated to achieve fault tolerance and stored across multiple machines to provide easy access. HDFS also provides file permission and authentication rights. MapReduce is the batch processing framework which works over Hadoop based on the divide-and-conquer rule. It comprises a ‘Map’ and a ‘Reduce’ function. Input key-value pairs are processed during the map step, which generates intermediate key-value pairs. Then, all the intermediate values related to the same key are grouped so that the reduce function can access them and compress the value set into a smaller set. MapReduce relieves the programmer of the overhead of steps like data scheduling, fault tolerance, and inter-node communication [18]. YARN is Hadoop’s resource management framework, which relieves MapReduce of managing resources (as was the case in Hadoop version 1.x). Finally, we have the common utilities, which are components needed to operate Hadoop submodules and projects. Shared libraries support other operations like error detection, implementation of compression codecs, and I/O utilities.
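
    The map, group and reduce steps described above can be illustrated with a minimal single-process sketch in plain Python. It does not use Hadoop itself; the CDR tuples and the function names (map_step, shuffle, reduce_step, run_job) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical call detail records: (caller_id, call_duration_seconds).
CDRS = [("A", 60), ("B", 30), ("A", 120), ("C", 45), ("B", 15)]


def map_step(record):
    """Map: emit one intermediate (key, value) pair per input record."""
    caller, duration = record
    yield caller, duration


def shuffle(pairs):
    """Group all intermediate values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_step(key, values):
    """Reduce: collapse the value list for one key into a smaller result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on a list of records."""
    intermediate = (pair for record in records for pair in map_step(record))
    grouped = shuffle(intermediate)
    return dict(reduce_step(key, values) for key, values in grouped.items())


total_seconds = run_job(CDRS)  # total call time per caller
```

    In a real cluster the framework, not the programmer, distributes the map and reduce calls over the nodes; here everything runs in one process so that only the data flow is visible.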

    Fig. 1. Hadoop V2.x Architecture.

    Hadoop’s data management occurs through a master-slave architecture (Fig. 2). The master is called the Name Node and the slaves are Data Nodes. The Name Node manages the file system namespace, regulates client access to files and executes file system operations such as renaming, closing, and opening files and directories. Data Nodes perform read-write operations on HDFS, as per client request. They also perform operations such as block creation, deletion, and replication. To process MapReduce tasks, the Name Node runs the Job Tracker process, which distributes and assigns work to Task Tracker daemon processes running on Data Nodes. The Hadoop ecosystem is a set of software APIs available as open source Apache projects which use Hadoop to provide different functionalities, e.g., database (HBase), data warehouse (Hive), SQL querying (Hive and Drill), stream processing (Spark, Storm, Flink), machine learning (Mahout, H2O), MapReduce programming (Pig) and cluster coordination (ZooKeeper) [34]-[36]. The parts of the ecosystem relevant to our SLR are:

    Fig. 2. Master-slave architecture of Hadoop.

    1) Apache HBase: This is Hadoop’s database built on HDFS [37]. It is capable of providing real-time read and write operations on big data sets stored as a wide columnar store (discussed below). In HBase, data is stored column-wise, with each row having a sorted key indexed with a timestamp. Columns can be grouped together to form column families, which can be grouped in super column families. These column families are the basic units for access control. The timestamps are 64-bit integers used to maintain different versions of a cell’s content in HBase. Clients flexibly determine the number of cell versions stored. These versions are sequenced in descending order of timestamps, so the latest version is always read first. Fig. 3 shows column families over voice and sms entities being grouped into a single super column family.
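
    The versioned-cell behavior described above can be mimicked with a small in-memory sketch. This is not the HBase client API; the class VersionedTable, its methods and the sample row keys are hypothetical, and only the ideas of sorted row keys, family:qualifier columns and newest-first versions are taken from the text.

```python
from collections import defaultdict


class VersionedTable:
    """Toy wide-column table: row key -> "family:qualifier" -> versions."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self.rows = defaultdict(lambda: defaultdict(list))

    def put(self, row_key, family, qualifier, value, timestamp):
        versions = self.rows[row_key][f"{family}:{qualifier}"]
        versions.append((timestamp, value))
        # Keep versions in descending timestamp order, newest first.
        versions.sort(key=lambda tv: tv[0], reverse=True)
        # Drop the oldest versions beyond the configured limit.
        del versions[self.max_versions:]

    def get(self, row_key, family, qualifier):
        """Return the newest value of a cell, or None if it is absent."""
        versions = self.rows.get(row_key, {}).get(f"{family}:{qualifier}", [])
        return versions[0][1] if versions else None

    def scan(self):
        """Iterate over row keys in sorted order."""
        for row_key in sorted(self.rows):
            yield row_key


table = VersionedTable(max_versions=2)
table.put("user42", "voice", "minutes", 10, timestamp=1)
table.put("user42", "voice", "minutes", 25, timestamp=3)
table.put("user42", "voice", "minutes", 18, timestamp=2)
table.put("user07", "sms", "count", 7, timestamp=1)
latest_minutes = table.get("user42", "voice", "minutes")
```

    With max_versions=2 the oldest of the three writes is discarded, and a read returns the value with the highest timestamp regardless of insertion order.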

    2) Apache Hive: Hive provides the SQL interface and a relational model for big data processing over Hadoop [34]. Hive is also considered a data warehousing application infrastructure on top of Hadoop that provides summarization, query and analysis.

    Fig. 3. A snapshot of a super column family for the choice of services. (Adapted from [35]).

    3) Apache Pig: Pig Latin is an ETL-level language which facilitates textual programming, parallel execution and optimization of complex tasks comprising multiple interrelated data transformations, by encoding them as data flow sequences [34]. It also lets users encode their own user-defined functions.

    4) Apache Spark: Spark is an execution engine in which data streams are interpreted as a series of deterministic batch-processing jobs, and which is reported to run up to 100 times faster than traditional MapReduce [38]. Spark is based on a master/slave architecture. The master instance runs the user-defined driver program and can launch a set of workers in the cluster and read data from HDFS. Spark uses resilient distributed datasets (RDDs) that are partitioned across multiple machines to achieve fault tolerance, and the slaves create partitions in RAM for RDDs as defined by the driver program. Spark Streaming is a Spark API for stream data processing.
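
    The idea of treating a stream as a series of small deterministic batch jobs can be sketched in plain Python as follows. This is not Spark code: the helper names micro_batches and process_batch and the sample usage events are hypothetical, and the sketch cuts batches by event count, whereas Spark Streaming cuts them by time interval.

```python
from itertools import islice


def micro_batches(stream, batch_size):
    """Cut a (possibly unbounded) iterator into small finite batches."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def process_batch(batch):
    """A deterministic batch job: total bytes in one batch of usage events."""
    return sum(event["bytes"] for event in batch)


# Hypothetical mobile data usage events arriving as a stream.
events = [{"bytes": n} for n in (100, 200, 300, 400, 500)]
batch_totals = [process_batch(b) for b in micro_batches(events, batch_size=2)]
```

    Because each batch is processed by an ordinary deterministic function, a failed batch can simply be recomputed from its input, which is the property the paragraph above attributes to this design.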

    5) Apache Kafka: Kafka is an ingestion API which processes real-time data streams and stores them in queues [39]. Each queue has a topic, which is a user-defined category. The topic decides which event is put in which queue. As events arrive randomly, they are sorted and arranged in a queue so that they can be easily consumed by the message broker components, which are servers consuming the queue. These servers can be based on Apache Spark, Apache Flink or Apache Storm.
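
    A minimal in-memory sketch of topic-based ingestion is given below. It is not the Kafka client API; the class TinyLog and its methods are hypothetical. It assumes an append-only list per topic and a per-consumer read position, which loosely mirrors how Kafka retains events and tracks consumer offsets, so that several downstream engines can read the same topic independently.

```python
from collections import defaultdict


class TinyLog:
    """Toy topic store: one append-only event list per topic."""

    def __init__(self):
        self.topics = defaultdict(list)  # topic -> ordered events
        self.offsets = defaultdict(int)  # (consumer, topic) -> next index

    def publish(self, topic, event):
        # The topic name decides which list the event is appended to.
        self.topics[topic].append(event)

    def poll(self, consumer, topic):
        """Return the next unread event for this consumer, or None."""
        log = self.topics[topic]
        position = self.offsets[(consumer, topic)]
        if position >= len(log):
            return None
        self.offsets[(consumer, topic)] = position + 1
        return log[position]


log = TinyLog()
log.publish("cdr", {"caller": "A", "seconds": 60})
log.publish("billing", {"account": 7, "amount": 12.5})
log.publish("cdr", {"caller": "B", "seconds": 30})

first_for_spark = log.poll("spark-job", "cdr")
first_for_flink = log.poll("flink-job", "cdr")
second_for_spark = log.poll("spark-job", "cdr")
```

    Both hypothetical consumers receive the first CDR event, since each one advances only its own position in the topic.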

    6) Apache Flink: Flink is a data flow streaming engine and implements “true streaming” in that the whole job is deployed concurrently in the cluster [40]. Long-running operators continuously consume input and produce output. The output tuples are immediately forwarded for further processing by next-level operators, which enables pipeline parallelism.

    7) Apache Storm: Apache Storm is a distributed real-time computation system which can reliably process data streams [41]. In Storm, spouts represent information sources and bolts represent data manipulations. The Storm architecture is a processing pipeline modeled as a directed acyclic graph with spouts and bolts as vertices and data streams as edges. Streams can be repartitioned as needed to enhance efficiency (over a million tuples processed per second per node). It is efficient, fault-tolerant and can integrate with database sources.
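
    The spout-and-bolt pipeline can be pictured with chained Python generators, where each generator is a vertex and the values flowing between them are the stream. This is only a single-process analogy, not the Storm API; the function names and the "caller,callee,seconds" record format are hypothetical.

```python
def cdr_spout(records):
    """Spout: the source vertex that emits raw tuples into the topology."""
    for record in records:
        yield record


def parse_bolt(stream):
    """Bolt: split "caller,callee,seconds" strings into typed tuples."""
    for line in stream:
        caller, callee, seconds = line.split(",")
        yield caller, callee, int(seconds)


def long_call_bolt(stream, min_seconds):
    """Bolt: keep only calls lasting at least min_seconds."""
    for caller, callee, seconds in stream:
        if seconds >= min_seconds:
            yield caller, callee, seconds


raw = ["A,B,30", "B,C,400", "C,A,90", "A,C,600"]
topology = long_call_bolt(parse_bolt(cdr_spout(raw)), min_seconds=300)
long_calls = list(topology)
```

    Each tuple passes through the whole chain before the next one is read, which resembles the tuple-at-a-time processing of a streaming topology.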

    Spark Streaming, Flink, and Storm (along with Kafka ingestion) have successful use cases in real-time analytics, online machine learning, continuous computation, distributed RPC, and data preprocessing (ETL). From the SLR, we found that social network analysis (SNA), machine learning, stochastic modeling, data mining, cluster computing and cloud computing have been proposed/applied. As these domains are vast and generally well known, we do not present any background here.

    B. Big Data Storage Technologies

    NoSQL (Not Only SQL) is a new breed of databases that address the high scalability, complexity, and elastic schema requirements of big data [42]. They allow storage over four data models: wide columns, documents, key-value pairs, and graphs. Initially NoSQL compromised somewhat on ACID, formalized through CAP (Consistency, Availability and Partition Tolerance), i.e., given a tolerance to partitioning of nodes through system failures, we can provide availability at the cost of consistency, or vice versa. When availability is favored, the system is in BASE, i.e., basically available in a soft (temporarily inconsistent) state which will eventually become consistent with time. CAP and BASE are still used in NoSQL, e.g., in Amazon’s DynamoDB, which forms the storage backbone of Amazon Web Services. However, NoSQL now largely caters for ACID in powerful databases such as MongoDB and Redis [42].

    We now define key-value, columnar, document and graph stores with examples of telecom data as shown in Fig. 4.

    1) Key-Value Data Stores: In key-value stores, data is input and accessed using key-value pairs. Keys are randomly generated and the value can be any data type associated with inbuilt database objects. Notable examples include Redis and DynamoDB. Keys are stored in hash tables and logically grouped into a ‘bucket’. Both bucket and key can be used to access the value as they are hashed for unique indexing. Key-value stores provide much faster query response times than relational or other NoSQL stores due to indexing and simplicity of storage. They also support ad hoc querying for complicated unstructured analytical applications like web usage, social network feeds, and real-time response processes. Fig. 4(a) shows telecom call detail records (CDRs) contained in a key-value schema. In this scenario, a CDR instance (or a collection of instances) can be inserted as the value while the key is a set of CDR flags.
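
    The bucket-plus-key lookup can be sketched with a plain dictionary indexed by a hash of both parts. This is an illustrative toy, not the API of Redis or DynamoDB; the class BucketedStore, the bucket name and the key format are hypothetical, and the value is stored as an opaque serialized string.

```python
import hashlib
import json


class BucketedStore:
    """Toy key-value store that indexes values by a hash of bucket and key."""

    def __init__(self):
        self._table = {}

    @staticmethod
    def _index(bucket, key):
        # Hash bucket and key together to obtain a unique index.
        return hashlib.sha256(f"{bucket}/{key}".encode("utf-8")).hexdigest()

    def put(self, bucket, key, value):
        # The store treats the value as an opaque blob (here: JSON text).
        self._table[self._index(bucket, key)] = json.dumps(value)

    def get(self, bucket, key):
        raw = self._table.get(self._index(bucket, key))
        return None if raw is None else json.loads(raw)


store = BucketedStore()
store.put("cdr", "2020-01-01:A", {"caller": "A", "callee": "B", "seconds": 60})
record = store.get("cdr", "2020-01-01:A")
missing = store.get("billing", "2020-01-01:A")
```

    A lookup needs only one hash computation and one dictionary access, which is the simplicity that the paragraph above credits for fast response times; the same key in a different bucket is a different entry.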

    2) Document Data Stores: In document stores, data is stored in documents comprising a collection of key-value pairs. Every document has its unique identifier (key) and serializes data in semi-structured formats, particularly JSON, which provides a widespread, flexible and information-rich structure for data modeling. A new field can be added at any time without changing a predefined schema. The document data model maintains data locality and is hence easy to distribute. It is useful for storing complicated data formats related to web applications, blogs, mobile/smartphone usage, chat applications, and social media clients. MongoDB is a world-renowned NoSQL document store. A sample document model for telecom data is shown in Fig. 4(b). Here, the JSON document with key 1001 stores a set of key-value pairs (attributes) related to a CDR instance.
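
    The schema-free behavior of a document store can be shown with JSON documents held in a dictionary. This is a sketch, not MongoDB code; the helper functions and the attribute names of the CDR are hypothetical, while the document key 1001 follows the example in Fig. 4(b).

```python
import json

collection = {}  # hypothetical collection: document key -> JSON text


def insert(doc_id, document):
    """Serialize a document to JSON and store it under its key."""
    collection[doc_id] = json.dumps(document)


def update(doc_id, **new_fields):
    """Add or overwrite fields; no schema has to be declared beforehand."""
    document = json.loads(collection[doc_id])
    document.update(new_fields)
    collection[doc_id] = json.dumps(document)


def find(doc_id):
    """Load a document back into a Python dictionary."""
    return json.loads(collection[doc_id])


insert(1001, {"caller": "A", "callee": "B", "seconds": 60})
update(1001, roaming=True)  # a new field added after the first insert
cdr = find(1001)
```

    The roaming field did not exist when the document was first inserted, yet adding it requires no change to any other document in the collection.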

    3) Wide Columnar Data Stores: In wide columnar stores, data is stored and processed in the form of tables which are schema-free; it is not necessary to provide a value for each cell and each row can have its own schema. For instance, data in HBase is stored in tables which are further stored in logical spaces called regions. Due to the large size of an HBase table, it is partitioned into multiple regions which are assigned to region servers across the cluster. Each region server contains multiple regions and each region contains multiple storage units. Fig. 4(c) shows an HBase table of telecom CDR data segmented into three regions being managed by two region servers. The data model is a multi-dimensional sorted map as discussed above (refer to Fig. 3). Each row is indexed by key and columns can be combined together to form column families. Two or more column families form a super column family.
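
    The partitioning of a sorted table into regions and region servers can be sketched as a range lookup. The split keys, server names and round-robin assignment below are assumptions made for illustration; they reproduce the three-regions-on-two-servers layout of Fig. 4(c) but not HBase's actual assignment logic.

```python
from bisect import bisect_right

# Hypothetical split points: region 0 holds keys below "h", region 1 holds
# keys from "h" up to (not including) "p", region 2 holds keys from "p" on.
SPLIT_KEYS = ["h", "p"]
REGION_SERVERS = ["rs1", "rs2"]


def region_for(row_key):
    """Return the index of the region whose key range contains row_key."""
    return bisect_right(SPLIT_KEYS, row_key)


def server_for(row_key):
    """Assign regions to region servers in round-robin fashion."""
    return REGION_SERVERS[region_for(row_key) % len(REGION_SERVERS)]


placement = {key: (region_for(key), server_for(key))
             for key in ["alice", "mike", "zoe"]}
```

    Because row keys are sorted, locating a row only requires a binary search over the split points rather than a scan of the table.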

    4) Graph Data Stores: Graph data stores hold data consisting of objects (nodes) and edges linking nodes through relationships. Indexes are used to traverse the graph (either directed or undirected), which can be scaled out and distributed across nodes. Frequent analytical queries include identification of clusters, shortest path between two nodes and community detection. New edges can be added and existing edges removed so that social graph entities like friends, followers, endorsements, messages, and responses can be accommodated along with their relationships. Time-evolving graphs can be analyzed by monitoring changes to their structure over time. Fig. 4(d) shows a sample graph store for telecom. Nodes are entities (company, call date, voice call, call from, call to) or actions (connect) with relevant attributes (connection time, company name). Edge labels are characterized by their roles, e.g., caller, callee, type of call, etc.
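
    One of the analytical queries mentioned above, the shortest path between two nodes, can be sketched with a breadth-first search over an adjacency map. The call edges and function names are hypothetical, and a graph database would normally answer such a query through its own query language rather than application code.

```python
from collections import deque

# Hypothetical "who called whom" edges between subscribers.
CALLS = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "D")]


def build_graph(edges):
    """Build an undirected adjacency map from (caller, callee) pairs."""
    graph = {}
    for caller, callee in edges:
        graph.setdefault(caller, set()).add(callee)
        graph.setdefault(callee, set()).add(caller)
    return graph


def shortest_path(graph, source, target):
    """Breadth-first search; returns a shortest node list or None."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(CALLS)
path_a_to_d = shortest_path(graph, "A", "D")
```

    Adding or removing an edge only changes the adjacency sets of its two endpoints, which is why such stores accommodate evolving relationships easily.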

    III. SLR RESEARCH METHODOLOGY

    We focus our SLR on the following domains of research: telecom analytics, big data applications to telecom, big data analytics (BDA) applications to telecom, and NoSQL applications to telecom. We selected these domains to be generic enough to cover all the possible (hundreds of) different BDA solutions available in the market today, and so this domain list is complete to the best of our knowledge. We use the more common and popular terms from these domains to develop our search queries (described below). We targeted digital sources commonly known for computer science related publications, i.e., Institute of Electrical & Electronics Engineers (IEEE), Association for Computing Machinery (ACM), Elsevier, Springer and Google Scholar. We relied on Google Scholar for coverage of the remaining digital sources, as it was the most frequently referenced digital source in 2018 and continues to approach and make contracts with sources for indexing their research databases [3], [43] (the complete list of indexed resources is not made public by Google). To ensure state-of-the-art results, we focused on research content from 2010 onwards but decided to include earlier content also if we deemed it critical. To manage the retrieved articles, we used the Mendeley tool, as we found it to be widely acclaimed and comprehensive for our needs. It is also broadly connected in scholarly groups and has a large online community [44].

    Fig. 4. NoSQL stores’ examples for telecom: (a) key value store; (b) document data store; (c) wide columnar data store; (d) graph data store.

    To formulate search queries, we selected eleven (11) keywords related to big data, i.e., big data, NoSQL, NewSQL, Hadoop, columnar store, key-value store, document store, graph database, big data tools, big data techniques, and big data analytics. We also selected three keywords related to telecom, i.e., telecommunications, VoIP, and mobile communication. Then, we combined each big data keyword with each telecom keyword to generate 33 queries, for instance big data AND telecommunications. We executed these queries on our selected digital sources.
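
    The construction of the 33 search strings can be reproduced with a short script. The keyword lists are taken from the paragraph above; the exact query syntax accepted by each digital source may differ, so the "X AND Y" format is an assumption based on the single example given.

```python
from itertools import product

BIG_DATA_KEYWORDS = [
    "big data", "NoSQL", "NewSQL", "Hadoop", "columnar store",
    "key-value store", "document store", "graph database",
    "big data tools", "big data techniques", "big data analytics",
]
TELECOM_KEYWORDS = ["telecommunications", "VoIP", "mobile communication"]

# Pair every big data keyword with every telecom keyword: 11 x 3 = 33.
queries = [
    f"{big_data} AND {telecom}"
    for big_data, telecom in product(BIG_DATA_KEYWORDS, TELECOM_KEYWORDS)
]
```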

    We adopted a three-step methodology to extract the relevant research articles pertaining to our research questions. In the first step, we scanned the title of each paper (on each digital source) to determine its significance to our scope. We excluded completely irrelevant papers but did not exclude papers which had fuzzy or unclear titles. This gave us 233 articles, which we added to our Mendeley database. We considered the most appropriate resulting papers from each source that are closely related to the topic. We used Mendeley to remove articles duplicated through Google Scholar, leaving us with 222 articles. In the second step, we read the abstracts of the papers filtered in the first step. This provided more insights about the scope and helped in further filtering out irrelevant papers. This gave us 61 papers. In the third step, we read the first two sections of the articles filtered in the second step, to perform a final filtration of papers which did not match our scope. This finally gave us 38 articles matching our scope, which we review in this paper. The breakdown of the 222 papers filtered in the initial step with respect to digital sources is shown in Fig. 5. Google Scholar retrieved the most articles (60), followed closely by IEEE (57) and ACM (56), while Elsevier provided the fewest hits (23). Moreover, Fig. 6 shows the distribution of our selected 38 articles with respect to the complete set of 33 search strings, summed over all digital sources. The horizontal axis shows the big data keywords, and different colors represent different telecom keywords. There were no articles for search strings combining any telecom keyword with NewSQL, columnar store, key-value store, document store and graph database (hence, we do not display them). This shows that our selected literature does not use the more technical terms related to BDA. Also, a majority of the articles are retrieved through combination with the telecommunications keyword, followed by mobile communication and, sparsely, by VoIP.

    Fig. 5. Distribution of 222 articles in first step of filtration with respect to digital sources.

    Fig. 6. Distribution of selected 38 articles with respect to 33 search strings over all digital sources. BD=big data, BDA=big data analytics, BDTch = big data techniques, BDTls = big data tools, Mob = mobile communication.

    In Table I, we show the distribution of selected articles with respect to type of research content. The majority of the content came from top conferences (17) and journals (8), followed by a couple of magazines. Master’s theses, workshop or symposium papers and commentaries had limited contributions. We did not find any book matching our scope.

    Finally, we show the distribution with respect to citation year in Fig. 7. From 2011 to 2015, no more than 5 articles were published each year. A spike was observed in 2016 with 18 publications, signaling the time when major interest and research was generated. The number fell again in 2017.

    Fig. 7. Distribution of selected 38 papers with respect to year of citation.

    A. Quality Assessment

    We assessed the quality of the selected 38 studies against eight (8) selected quality criteria (QC), to determine whether our selected research articles suitably address our research objectives. We created 8 questions to assess the quality of our articles, with “Yes”/“No” responses having weights “1” and “0” respectively. After these responses, we evaluated the results through Cohen’s kappa score for inter-rater agreement. We debated the discrepancies using the Delphi method [77] until consensus was reached on the filtered articles. This activity was carried out by five researchers who are faculty members of three different universities in Pakistan (none from the authors’ universities), including three females and two males. Based on a selected threshold of 5, our QC process did not exclude any of the 38 selected articles: all obtained a score ≥ 5 (see Table II, which records the mode response for each QC). Following are our quality checklist questions:

    1) QC 1: Are research objectives clearly stated?

    2) QC 2: Is methodology well-defined?

    3) QC 3: Is big data tool utilization present?

    4) QC 4: Is any of the 4V characteristics of big data addressed in the study clearly described, along with its solution with respect to telecommunication?

    5) QC 5: Is the BDA process, or any one step of it, clearly stated as a means to resolve major big data issues in telecom?

    6) QC 6: Does the study compare the proposed approach with existing baseline approaches?

    7) QC 7: Were the performance metrics fully defined?

    8) QC 8: Are results properly interpreted and discussed, and does the conclusion reflect the research findings?
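
    The scoring scheme can be made concrete with a small sketch. It assumes two raters and the standard two-rater form of Cohen’s kappa; how the responses of the five raters were paired or aggregated is not stated in the text, and the answer vectors below are invented purely for illustration.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters giving binary (0/1) answers."""
    n = len(rater_a)
    # Observed agreement: share of questions answered identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal "Yes" rate.
    p_a_yes = sum(rater_a) / n
    p_b_yes = sum(rater_b) / n
    expected = p_a_yes * p_b_yes + (1 - p_a_yes) * (1 - p_b_yes)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)


def passes_quality(answers, threshold=5):
    """An article passes when its count of "Yes" (1) answers meets the threshold."""
    return sum(answers) >= threshold


# Hypothetical answers of two raters to QC 1 to QC 8 for one article.
rater_1 = [1, 1, 1, 0, 1, 0, 1, 1]
rater_2 = [1, 1, 0, 0, 1, 0, 1, 1]

kappa = cohen_kappa(rater_1, rater_2)
accepted = passes_quality(rater_1)
```

    In this invented example the raters agree on seven of the eight questions, and the article scores 6 for the first rater, which is above the threshold of 5.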

    IV. RESULTS

    We now discuss the 38 articles extracted from the SLR. The article frequency distribution over telecom domains is shown in Fig. 8; mobile telecom is most frequent (23), followed by general telecom (12) and VoIP (3). In mobile, 20 articles (87%) discuss BDA applications to 2G, 3G, 4G and 5G networks, 2 address cellular networks and 1 addresses wireless networks. In Table III, we show a classification of our 38 articles with respect to telecom domain, big data technology stack and the type of article. We identified five such types: literature reviews, frameworks, use cases, white papers, and validation (with experiments). BDA frameworks have been proposed in 23 (61%) articles, out of which 15 (40%) include validation with experiments, 5 (13%) present a use case without experiments, while 3 (8%) discuss only the framework without any use case or experiments. Also, in 7 (18%) articles, the authors do an experimental validation without proposing any framework. There are 4 (11%) literature reviews, a single white paper (commentary), and 3 (8%) use cases. Of the 22 (58%) articles on experimental validation, 16 (42%) employ Hadoop or its ecosystem APIs, and the remaining use different NoSQL databases. Of the 16 (42%) papers that do not involve any experiments, only 5 (13%) employ Hadoop while the remaining focus on NoSQL. Of these 16, 2 papers also focus on stream analytics. The analytical applications targeted in our articles focus on the Hadoop ecosystem, machine learning/data mining, deep learning, distributed databases, self-organizing networks, computational visualization, graph analytics, Monte Carlo simulations, social network analysis and cloud computing-based data processing, particularly for mobile telecom. Finally, Spark has also been used in 2 works for experimental validation and once in a case study. We also estimated the number of articles focusing on BDA benefits and the challenges of its implementation, along with future research directions and characteristics of telecom big data (shown in Fig. 9). The authors discuss benefits most frequently, probably due to the hype around big data and BDA, followed by challenges, which are significant as outlined in Section I. Apparently, the benefits/opportunities of BDA are clearly known, and so are its implementation challenges. However, experimental validations to realize these benefits and address the challenges remain very limited in the research literature. We now discuss our 38 papers in detail. We have further classified them according to topics (labels) and subtopics of the telecom domain which we identified during our SLR (shown in Fig. 10 and Table IV).

    TABLE I CLASSIFICATION OF PAPERS WITH RESPECT TO ARTICLE TYPE AND THEIR RANKING

    A. Frameworks

    We now discuss big data frameworks over the following five topics: mobile network operators, 5G, network optimization, CDR analytics and streaming data.

    1) Mobile Network Operators: Authors target mobile network operators (MNOs) in [36], [45], [51] to address challenges in BDA implementation. One framework proposes automation of a manual reporting system of a Moroccan MNO to deal with unstructured data and CDRs, server logs, billing, and social network data. Kafka is used for data ingestion and Flink to process HDFS data in both streaming and batch mode, with final visualizations being shown on dashboards. Another framework uses Hadoop, Spark and machine learning to achieve network KPIs and enhance revenue, and the third proposes a lambda architecture that caters for both batch and stream data processing, together with the self-organizing network (SON) approach [78], which works by inferring data from a relevant knowledge base. Case studies are presented which extend previous works to create data intervals for data reduction, identify sleeping telecom cells, and find correlations in telecom data, all employing MapReduce to estimate parameters before a SON application. Generally, authors propose key-value data stores for storing mobile data.
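
    Since a lambda architecture is mentioned here without further detail, the following sketch illustrates the general pattern under simplifying assumptions: a batch layer that periodically recomputes a view from the full event log, a speed layer that keeps an incremental view of events received since the last batch run, and a serving step that merges both. The class LambdaPipeline and the per-caller call count are hypothetical and do not reproduce the framework of any reviewed article.

```python
from collections import Counter


class LambdaPipeline:
    """Toy lambda architecture counting calls per caller."""

    def __init__(self):
        self.master_log = []         # append-only raw events
        self.batch_view = Counter()  # recomputed from the whole log
        self.speed_view = Counter()  # incremental, recent events only

    def ingest(self, caller):
        """Every event goes to the log and to the speed layer."""
        self.master_log.append(caller)
        self.speed_view[caller] += 1

    def run_batch(self):
        """Batch layer: rebuild the view from scratch, then reset the speed layer."""
        self.batch_view = Counter(self.master_log)
        self.speed_view.clear()

    def query(self, caller):
        """Serving layer: merge the batch view with the real-time view."""
        return self.batch_view[caller] + self.speed_view[caller]


pipeline = LambdaPipeline()
for caller in ["A", "B", "A"]:
    pipeline.ingest(caller)
before_batch = pipeline.query("A")

pipeline.run_batch()
pipeline.ingest("A")
after_batch = pipeline.query("A")
```

    Queries stay correct both before and after a batch run, because events not yet covered by the batch view are always present in the speed view.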

    Fig. 8. Distribution of 38 papers with respect to telecom domain.

    Fig. 9. Distribution of 38 articles with respect to telecom big data characteristics, benefits of applying BDA, challenges faced and specification of future research directions.

    2) 5G: In [54], the authors discuss a novel SON-based approach for 5G networks. They first identify challenges hindering SONs from meeting 5G requirements and then propose a BDA framework for SON geared towards 5G, based on machine learning and data analytics, which could be exploited to extract insights for creating end-to-end network intelligence. Through a case study, they show how this approach can diagnose a sleeping telecom network cell. The authors also conduct simulations to demonstrate superior performance over 3G/4G SON. Also, in [25], the authors propose a BDA architecture for mobile wireless 5G communication to achieve optimized resource allocation, mobile content distribution and network optimization. The authors categorize four types of big data, i.e., application, user, network and link channel data, and describe the protocol stack, the procedure for signaling as well as operations at the physical layer. It is claimed that this architecture can provide a drilled-down view of operations and customers. There is no mention of any specific big data storage or processing technology in this paper, although the benefits of machine learning are mentioned.

    3) Network Optimization: Authors expound on the benefits of using machine learning and deep learning approaches [28], [54], [56] for network optimization. To meet network requirements (e.g., for 5G services), a generalized BDA framework is presented. NoSQL databases are proposed in the ETL step. BDA through machine learning extracts insights for creating end-to-end network intelligence and generates network optimization solutions for different types of big data (private, redundant, distributed, etc.). The authors also investigate the application of BDA to 5G in [28] and show that it can produce architectures which are flat, green and soft, i.e., more agile and efficient.

    4) CDR Analytics: In [49], the authors consider CDRs and show that they are a big data source and a candidate for BDA with respect to storage, processing and CDR analysis. Considerable research has been done to address CDR analytics challenges through the specification of BDA architectures, the utilization of big data tools and techniques, and use case scenarios that present better performance measures and cost-efficient solutions in both batch and real-time processing. Authors stress the importance of using Apache Hive for querying with HDFS as a file system, while using Apache Cassandra for storing streaming data.

    5) Streaming Data: In [26], the author presents an architecture for real-time predictive BDA for the telecom domain. This model has four BDA capabilities: a) real-time analytics, b) detecting the most probable cause of network failure, c) modeling the users’ telecom usage experience, and d) real-time actuation of business goals. There are three relevant business areas: a) optimizing the management of telecom customer experience, b) enhancing business efficiency through real-time (stream) processing, and c) social network analysis (SNA) to analyze social and business relations among telco customers. The salient features of the proposed architecture are: a) gathering more real-time data from all nodes, b) ETL connectors for extracting data from any source (NoSQL, relational, etc.), c) real-time analytics and opinion mining of users along with real-time SNA, d) developing business processes and rules within the organization, and e) measurement of relevant KPIs.

    B. Literature Reviews, White Papers, Use Cases

    We now explain literature reviews, white papers and use cases over two topics: BDA applications to telecom sector and BDA usecases.

    1) BDA Applications to Telecom Sector: In order to reap BDA benefits for wireless telecom, authors in [46], [74] present actionable steps such as identifying big data sources in the domain, selecting new technology (particularly open source) and vendors, managing resources including employees, and defining the BDA architecture and executing it with optimization. Further, the BDA techniques identified according to network type, particularly for wireless, are stochastic modeling (e.g., Markov models, time series), machine learning (e.g., classification, regression, dimensionality reduction, reinforcement learning, deep learning) and data mining (pattern matching, clustering) algorithms as solutions to specific problems such as predicting the mobility of a user.

    The authors in [57] direct attention towards the format of input data generated by IoT, crowdsourcing and social media, which may be structured, semi-structured or unstructured, and towards the use of pseudo-random sampling, compressive sampling or distributed source coding to shred the streams for management. Such a streaming situation occurs when Internet-based mobile networking is performed in real time on the cloud and the off-loaded and uploaded streams need to be managed in a real-time fashion. Therefore, a case study on BDA for streaming big mobile telecom data is presented via a streaming architecture which utilizes distributed database technology to manage the streams; none of the well-known BDA tools like Spark, Storm and Kafka have been used.

    TABLE II QUALITY ASSESSMENT OF 38 SELECTED ARTICLES WITH DELPHI METHOD OVER 8 QUESTIONS QC1-QC8; MODE RESPONSE IS SHOWN FOR EACH QC FOR FIVE DIFFERENT RATERS

    Fig. 10. Classification and sub-classifications of article types of our 38 SLR papers.

    2) BDA Usecases: Hadoop, NoSQL and machine learning are the primary big data technologies being employed by the companies. In [24], [59], [65], [73], authors review BDA usecases identified through interviews, online research and critical analysis of a representative sample of global telecom companies. The domains identified are marketing, sales, customer analysis, security, business development, innovation of new business models, products and services development, billing, intelligent transportation service, quality control, partner analysis, cost and contribution analysis, public sector, healthcare, media and entertainment, banking and insurance, quality of experience (QoE) or satisfaction, mobile retail shopping, mobile pricing analysis of products, and SIM box detection, i.e., recognizing fraudsters who do not use their mobile SIMs as per policy.

    The findings of the review show that remarkable benefits can be achieved through earlier adoption of BDA, although it comes with significant challenges. Also, real-time analytics and data management are now core BDA requirements in telecom, and they are expected to become more pressing due to the IoT boom [79]. The primary motivation for a BDA initiative is to enhance customer satisfaction by providing a unique customer experience each time. The author recommends that telecom companies focus BDA efforts on satisfying customers, develop BDA architectures that are applicable to the complete organizational business process, initiate BDA with currently available data to achieve streaming results rather than wait for more data, and build a BDA skillset based on business requirements by defining measurable outcomes.

    C. Validation

    We now discuss experimental validation articles based on the following architectural topics: performance, accuracy, scalability, reliability, security and usability. In some cases, we identified classifications of these aspects and discuss papers according to these classifications. (Footnote 3: Where necessary, we have itemized the discussion of validation-related papers to enhance readability.)

    1) Performance: We found the following five sub-classifications of articles related to the performance of BDA applications to telecom: NoSQL Data Stores, Cloud Computing, Machine Learning, Use of Hadoop with Spark, and Use of R Language.

    A. NoSQL Data Stores: The objective is to highlight the role of NoSQL in improving the performance of a system and to provide some guidelines for selecting the data store that is most efficient in terms of computational complexity.

    ● In [35], the authors present a method of migrating from a relational store to a NoSQL store and demonstrate the superior performance of Apache Cassandra over the PostgreSQL relational store in a telecom scenario. The data model consists of customer data, customer account data, bucket (balance of customer), bucket type, subscribed service, service type, and tariff plan. The specific query determines the list of services subscribed by a customer (ordered by priority) and the account information on receiving an incoming call. The authors create a super column family for each user (row key is the caller number) where super columns contain the list of services (and details) subscribed by each user. A regular day workload and a Christmas workload are created. In the former, Cassandra is able to handle 0.24 million calls over 600 seconds and in the latter 0.34 million calls, as compared to 0.16 million and 0.18 million respectively for PostgreSQL. Similarly, in [69], the authors utilize NoSQL data storage technology.

    ● In [66], the authors employ a NoSQL document store in an enterprise BDA telecommunication application. This system collects, merges and analyzes data from several subsystems, with the major entities being staff, shift, permit, work activity, service, turnstile transaction, and department. Traditional relational storage is converted to a document model; for instance, the shift, permit, work activity, and turnstile transaction tables are combined into a document structure called Activity Package through a synchronization service. The use of NoSQL facilitates much better automatic load balancing under heavy querying load. The authors do not mention the specific document store they have implemented, and detailed experimentation is not shown.

    TABLE III DISTRIBUTION OF 38 ARTICLES WITH RESPECT TO TELECOM DOMAIN, BIG DATA TECHNOLOGY STACK AND ARTICLE TYPE

    ● In [52], the authors implement a BDA cellular network planning system which is based on a common telecom use case of combining OLAP and OLTP technologies for a real-world concurrency scenario. The authors focus on HP’s Vertica, which is an MPP columnar warehouse providing support for both OLTP and OLAP type queries. In a cluster with standard configuration, inserting 10 K records in a table consumes an average of 200 ms, updating 10 K records consumes 3500 ms on average and deleting 10 K records consumes 2000 ms on average. A theoretical comparison with SAP HANA is also presented, which addresses the limitation of older versions of Hadoop in providing transaction processing, i.e., the low-latency, SQL-oriented Hadoop solution was inefficient. The authors then propose a BDA architecture which can perform unified data processing, handling analytical and highly concurrent transactional tasks efficiently within one system that serves various applications over the loaded data and answers requests from different end users.
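
    The figures reported above for [35] and [52] can be converted into rates for easier comparison. The following is a minimal sketch of that arithmetic only; it assumes that the 600-second window stated for the regular day workload also applies to the Christmas workload, which the text does not state explicitly, and all function and variable names are illustrative.

```python
# Call counts reported for [35]; the 600 s window for the Christmas
# workload is an assumption (the text gives it only for the regular day).
WINDOW_S = 600

CALLS = {
    ("Cassandra", "regular"): 240_000,
    ("Cassandra", "christmas"): 340_000,
    ("PostgreSQL", "regular"): 160_000,
    ("PostgreSQL", "christmas"): 180_000,
}


def calls_per_second(total_calls, window_s=WINDOW_S):
    """Average number of calls handled per second over the window."""
    return total_calls / window_s


def speedup(workload):
    """Ratio of Cassandra calls to PostgreSQL calls for one workload."""
    return CALLS[("Cassandra", workload)] / CALLS[("PostgreSQL", workload)]


def records_per_second(n_records, elapsed_ms):
    """Throughput for a bulk operation such as the 10 K-record tests in [52]."""
    return n_records * 1000 / elapsed_ms


if __name__ == "__main__":
    for (store, workload), total in CALLS.items():
        print(store, workload, round(calls_per_second(total), 1), "calls/s")
    for op, ms in (("insert", 200), ("update", 3500), ("delete", 2000)):
        print(op, round(records_per_second(10_000, ms)), "records/s")
```

    Under this assumption, the regular day workload corresponds to a 1.5 times higher call volume for Cassandra than for PostgreSQL.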

    TABLE IV CLASSIFICATION AND SUB-CLASSIFICATIONS (IN ITALICS) OF ARTICLE TYPES OF OUR 38 SLR PAPERS

    B. Cloud Computing: Latency affects the quality of service and thus lowers performance as well. This section highlights the importance of cloud computing, as this solution appears frequently in the reviewed papers.

    ● In [67], [68], the authors tackle the problem of delayed transfer of mobile data from smartphones and related devices to mobile cloud computing data centers over wireless channels. To overcome this latency, an efficient BDA-based data transfer approach is proposed that exploits the overlapping coverage of heterogeneous wireless networks (HetNets) by splitting data into chunks that are transferred simultaneously. In this way, data is first transferred to small clouds called cloudlets (associated with small telecom cells), which then transmit it to the public cloud. The efficiency of the proposed approach is demonstrated through Monte Carlo simulations. The authors do not mention any use of standard NoSQL databases for big data storage management.

    ● In a similar vein, the authors in [58] attribute the complexity of maintaining QoS guarantees in a cloud computing (software defined network) scenario to unoptimized network design, load balancing, access control and prioritization of traffic.

    C. Machine Learning: Here, we report the findings from a couple of papers to exhibit the role of ML algorithms in improving the performance of a telecom big data system. A number of ML algorithms are available that can be leveraged. These range from supervised learning (e.g., Logistic Regression, Support Vector Machine, Naive Bayes, Random Forest, and Decision Trees) to unsupervised learning algorithms (e.g., K-means and Neural Networks). When selecting an ML algorithm, several factors need to be considered. These include the time complexity, incremental update capability, offline/online mode, and generalization capacity of the algorithm, and most importantly the impact of the algorithm on the detection rate (accuracy) of a system. Because algorithms play such diverse roles, it is quite challenging to pick the most appropriate and efficient one.

    ● The authors in [58] propose and implement a BDA approach for QoS optimization, based on quantifying the correlation between telecom KPIs (e.g., packet size, number of packets transmitted, length of queues) and QoS metrics (e.g., end-to-end delay, throughput, packet loss, etc.). Initially, correlation coefficients are computed for each KPI, and then machine learning using a combination of decision trees and linear regression is used to predict the QoS metrics. One core finding shows that CPU utilization is correlated with the number of packets transmitted, and packet loss with average communication delay. There are some unexpected results, e.g., bi-directional delay increases end-to-end delay by a factor of 4. As above, the authors do not mention any use of standard NoSQL databases for big data storage management.

    ● In [53], the authors combine the outputs of random forest, logistic regression and support vector machine algorithms to build a predictive model for their customers. In [70], machine learning algorithms, specifically neural networks, decision trees and logistic regression, are also used to identify influential telecom subscribers.
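
    The two-step idea described for [58] above (first correlate each KPI with a QoS metric, then fit a predictor) can be illustrated as follows. This is a minimal sketch, not the method of [58]: it covers only the correlation and linear regression parts (no decision trees), uses a single KPI, and the function names and any data passed to them are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between a KPI series and a QoS series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def predict(slope, intercept, kpi_value):
    """Predicted QoS metric for a new KPI observation."""
    return slope * kpi_value + intercept
```

    In practice, a KPI would typically be kept as a predictor only if the absolute value of its correlation coefficient exceeds some chosen threshold; that threshold is a design choice not specified in the text.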

    D. Use of Hadoop With Spark: In [62], the authors propose an efficient statistical BDA solution to identify encrypted, non-encrypted, or tunneled VoIP media (voice) flows. A system is also proposed to efficiently process high-speed real-time network traffic. The BDA solution uses association rule mining to extract rules regarding VoIP parameters/KPIs, for instance, packet size, current flow, and packet transmission time. The authors treat this as a classification problem, with classes depicting the different outputs generated by combinations of the rules. A single-node Hadoop setup is used for batch processing along with Spark for processing streaming VoIP data. The authors demonstrate that the system’s efficiency, in terms of number of packets transmitted, exceeds that of existing VoIP analytics systems.
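
    Association rule mining, as mentioned above, scores candidate rules by their support and confidence. The sketch below shows these two measures on discretized flow features; it is an illustration under assumptions, since the feature labels and the representation of a flow as a set of labels are hypothetical and are not taken from [62].

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical flows, each described by discretized feature labels.
FLOWS = [
    {"small_packet", "short_gap", "voip"},
    {"small_packet", "short_gap", "voip"},
    {"small_packet", "long_gap", "other"},
    {"large_packet", "long_gap", "other"},
]

if __name__ == "__main__":
    rule = ({"small_packet", "short_gap"}, {"voip"})
    print("support:", support(FLOWS, rule[0] | rule[1]))
    print("confidence:", confidence(FLOWS, *rule))
```

    Rules whose support and confidence exceed chosen minimum values can then serve as classification rules for flow types.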

    E. Use of R Language: In [60], the authors tackle the problem of understanding and analyzing VoIP applications through standard parameters such as bandwidth, packet loss rate, delay, jitter, codec type and CPU power of the end devices. Performing ETL on VoIP big data is a challenge due to the diversity of Internet data generated by a single device. As part of a performance measurement study, a robust ETL approach is proposed which is based on executing certain scripts during extraction, converting the extracted data to XLS and txt formats during transformation, and loading the data into R for analytics in the loading phase. The main results show that GoogleTalk, Skype and Express Talk are sensitive to packet loss rate and jitter rather than to delay. Also, bandwidth, the de-jitter buffer, and gateway CPU and memory are important for producing a good-quality VoIP service.

    2) Accuracy: The following articles dealing with the accuracy of BDA application results have used Spark:

    ● In [53], the authors attempt to increase the adoption of 4G technology by predicting the relevant customers currently using 2G/3G from big data streams of a Chinese telecommunication firm. Specifically, the work aims to enhance 4G transfer rates through prediction of peer influence in CDR graphs of customers, with data such as service subscription, service usage, demographic information, and calling and messaging history. The graph contains 0.15 million nodes and 62000 edges. In a field study, the authors first perform feature selection and then build a predictive model that combines the outputs of ML algorithms using an Apache Spark cluster setup. The authors demonstrate excellent predictive accuracy by comparing real and predicted values.

    ● In [75], the authors propose a deep learning framework for video analytics. Specifically, convolutional neural networks are used to classify each frame of the video in real time to determine its importance, in which case it is retained. This controls the size of the input video, leading to load reduction by discarding frames that are unnecessary for analytics. The decision to discard is taken by determining the correlation between consecutive frames in a streaming fashion. The authors show that their system has better accuracy as compared to other deep learning models for stream processing (temporal stream convnet and two stream model).

    ● The authors in [27] also use neural networks to predict anomalies in their mobile network when implementing a BDA framework for analyzing customer-centric mobile wireless big data using Hadoop for processing and Apache Pig for ETL. Initially, customer CDR data are used to detect anomalous behavior using k-means and hierarchical clustering, for instance, unusual traffic at a given location and time. These outputs are compared with ground truths for verification, and help identify regions in the network for specific actions such as resource allocation.
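
    The frame-discarding step described for [75] above relies on the correlation between consecutive frames. The following sketch shows one simple way such a filter could work; it is a simplification under assumptions, not the framework of [75]: frames are flat lists of pixel intensities, no neural network is involved, and the threshold value is arbitrary.

```python
from math import sqrt


def frame_correlation(a, b):
    """Pearson correlation between two equally sized flattened frames."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Constant frames: treat identical frames as fully correlated.
        return 1.0 if a == b else 0.0
    return cov / sqrt(var_a * var_b)


def filter_frames(frames, threshold=0.95):
    """Return indices of frames to keep, processing the stream in order.

    A frame is dropped when it is highly correlated with the last kept
    frame, i.e. when it adds little new information.
    """
    kept = []
    last = None
    for index, frame in enumerate(frames):
        if last is None or frame_correlation(last, frame) < threshold:
            kept.append(index)
            last = frame
    return kept
```

    Comparing against the last kept frame, rather than the immediately preceding one, is a design choice of this sketch that prevents slow drift from being discarded indefinitely.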

    3) Scalability: We now discuss papers that have implemented scalable BDA applications, over two classifications: Use of Hadoop and Social Network Analysis.

    a) Use of Hadoop:

    ● In [55], [63], MapReduce is used for processing and analysis of data sets at different layers of a telecom platform. For this purpose, the authors employ Apache Pig or Hive for programming MapReduce, and HBase is used as the persistent NoSQL data store. The authors in [64] implement a NoSQL infrastructure for criminal investigation through telecom CDRs. The infrastructure uses Apache Hive with Apache Hadoop/HDFS at the backend. The authors vary several parameters including block and partition size in HDFS. The ideal parameters demonstrate the superiority of using Hive, with considerable improvements to the efficiency of query execution and scalability along with reduced cost of cloud usage.

    ● In [55], the authors implement scalable network traffic monitoring for a large-scale telecommunications company based in China. MapReduce is used to execute traffic analysis, application analysis, and user behavior analysis on telecom big data streams. Some specific KPIs which are estimated are traffic statistics in terms of bytes and packets, traffic classification at the application level, web service provider analytics from the perspective of mobile Internet, and clustering of user behavior data to extract useful homogeneous groups. The authors persist data in HBase and employ Apache Pig for programming MapReduce. It is shown that Hadoop can efficiently process 4.2 terabytes of traffic daily from 123 Gb/s links with high performance and reduced cost.

    ● In [63], the authors implement a big data platform with a domain specific language (DSL) for the telecom sector, which uses MapReduce for processing and HBase as the persistent NoSQL store, along with a visualization layer for displaying results. A specific type of file descriptor is also created for organizing communication in the above process. The DSL is a high-level language which spares the user from directly writing MapReduce code or performing ETL through Apache Pig or Hive. It transforms the datasets in such a way that various information about a particular grid, such as multiple call and SMS activities on a particular date and time, can be obtained with a few lines of code. The authors demonstrate scalability (the rate of change of throughput exceeds the increase in the number of nodes), reduced average execution times and linearly increasing write performance of HBase with increase in data.

    ● In [69], the authors propose a three-tiered BDA architecture for the agriculture domain to solve the problem of farmers not being able to access crop yield information online due to the unstable nature of wireless communication. A middleware allows farmers to access the data over a WiFi or 3G/4G connection. Another tier stores farmers’ requests in a NoSQL database which acts as a cache to avoid DoS messages. Finally, in an offline mode, farmers can communicate through Bluetooth to synchronize information. Strangely enough, the authors do not specify the exact NoSQL technology which could be useful in their application.
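
    The MapReduce pattern used in [55] and [63] for traffic statistics can be illustrated in plain Python. This is a single-process sketch of the map, shuffle and reduce phases, not Hadoop code; the record fields `app` and `bytes` are hypothetical, and each record is assumed to represent one packet.

```python
from collections import defaultdict


def map_record(record):
    """Map phase: emit (application, (bytes, packet_count)) for one packet."""
    yield record["app"], (record["bytes"], 1)


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Reduce phase: total bytes and packets for one application."""
    total_bytes = sum(b for b, _ in values)
    total_packets = sum(p for _, p in values)
    return key, {"bytes": total_bytes, "packets": total_packets}


def run_job(records):
    """Run the three phases over an iterable of packet records."""
    pairs = (pair for record in records for pair in map_record(record))
    return dict(reduce_group(k, vs) for k, vs in shuffle(pairs).items())
```

    On a real cluster the map and reduce functions run in parallel on different nodes and the shuffle is performed by the framework; tools such as Apache Pig, Hive or the DSL of [63] generate this kind of job from a higher-level description.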

    b) Social Network Analysis: In [70], the authors propose a scalable BDA solution for identifying influential telecom subscribers through several social network analysis metrics combined through machine learning, specifically neural networks, decision trees and logistic regression. To support scalability, a prototype system is implemented on Hadoop and algorithms are executed through MapReduce. Results show that the system can scale to millions of users through actual data from a telecom company with 2.4 million subscribers and experimental data for networks with 100 million subscribers.
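
    Social network analysis metrics of the kind mentioned above are typically derived from a call graph built from CDRs. The sketch below computes two simple ones, in-degree and out-degree, which could then be fed as features to a learning algorithm; the choice of these two metrics and the (caller, callee) record layout are assumptions, since the text does not list the metrics used in [70].

```python
from collections import defaultdict


def degree_features(calls):
    """Compute per-subscriber degree metrics from (caller, callee) pairs.

    Repeated calls between the same pair are counted once, so the degrees
    reflect the number of distinct contacts.
    """
    called = defaultdict(set)     # caller -> distinct callees
    called_by = defaultdict(set)  # callee -> distinct callers
    for caller, callee in calls:
        called[caller].add(callee)
        called_by[callee].add(caller)
    users = set(called) | set(called_by)
    return {
        user: {
            "out_degree": len(called[user]),
            "in_degree": len(called_by[user]),
        }
        for user in users
    }
```

    Each subscriber's feature dictionary can be extended with further metrics before being passed to a classifier.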

    4) Reliability: We now discuss articles focusing on the reliability aspect of validating BDA applications to the telecom sector, over the following classifications: Use of Hadoop, Use of Splunk and NoSQL Databases as Caching Mechanisms in Service Oriented Architectures (SOAs).

    a) Use of Hadoop:

    ● In [27], the authors implement a BDA framework for analyzing customer-centric mobile wireless big data using Hadoop for processing and Apache Pig for ETL. Initially, customer CDR data are used to detect anomalous behavior using k-means and hierarchical clustering, for instance, unusual traffic at a given location and time. These outputs are compared with ground truths for verification, and help identify regions in the network for specific actions such as resource allocation. The authors also use neural networks to predict these anomalies in advance.

    ● In [50], the authors develop a MapReduce framework over a previously implemented distributed file system for mobile cloud computing (MDFS). In this way, due to the parallel processing nature of Hadoop, no single mobile device can become a bottleneck for the mobile cloud. Also, resource allocation and task management are handled efficiently by Hadoop in a fault-tolerant manner, and Hadoop has proven its worth in many applications in varied domains. The authors do not employ HDFS as it does not cater for the energy limitations of mobile devices and it requires heavy I/O processing to maintain fault-tolerance, as compared to the lightweight nature of mobile data processes. In experiments, the authors determine optimal parametric settings for HDFS block size, Hadoop cluster size, and node failure rate, along with the effect of changing input sizes on the throughput. The superior performance of MDFS over traditional HDFS is also validated.

    ● In [47], a real-time video/voice over IP (VVoIP) application is implemented using a Hadoop cloud computing system to resolve head-of-line blocking, handover interruption, and non-real-time transmission problems in VoIP communication. The authors employ the TCP-based Real Time Messaging Protocol instead of the traditional Stream Control Transmission Protocol, and also employ a neural network to tune parameters to optimize handover and analyze network traffic at any time. VoIP data is moved from user devices to the Hadoop cloud with access control to implement rapid facial/fingerprint identification and reduce the amount of data to be processed. The authors demonstrate that their system has a faster response time and a lower misclassification rate in access control.

    ● In [71], the authors employ a Hadoop application to solve the problem of detecting signal discontinuity regions for 3G connectivity through a combination of standard KPIs. Experiments are performed through simulations over a cluster of 3 nodes with commodity configuration, where the cluster receives around 1 million messages from different access regions daily and MapReduce is used to execute the queries using KPI values. The results indicate that the Hadoop system has superior accuracy and efficiency as compared to traditional approaches.

    b) Use of Splunk: In [61], the authors evaluate the health of a university wireless network in order to analyze patterns of outages and failures for reliability improvement. For this, the proprietary Splunk tool is used for analyzing the huge volume of big data generated by node outages, link failures and topology information. Simple Network Management Protocol and syslog data are used to investigate reliability, along with standard documents of causes and recommended actions and the input of network operators on special events and actions. The overall result is that wireless networks are less reliable than wired networks.

    c) NoSQL Databases as Caching Mechanisms in SOA: In [69], the authors propose a three-tiered BDA architecture for the agriculture domain to solve the problem of farmers not being able to access crop yield information online due to the unstable nature of wireless communication. A middleware allows farmers to access the data over a WiFi or 3G/4G connection. Another tier stores farmers’ requests in a NoSQL database which acts as a cache to avoid DoS messages. Finally, in an offline mode, farmers can communicate through Bluetooth to synchronize information. Strangely enough, the authors do not specify the exact NoSQL technology which could be useful in their application.

    5) Security: In this section, we discuss the single paper related to security aspects of BDA applications to telecom. In [48], a graph analytics platform (GAP) is implemented which provides the network operator with an extended toolkit to obtain an overview of the whole network, allowing the operator to gradually focus on the desired information and acquire useful insights. It facilitates data mining by providing modules for extraction of behavioral patterns, behavioral similarity, and detection of anomalies and attacks against networks. For validation, the authors perform root cause analysis of denial of service (DoS) attacks on a mobile network operator, along with early detection of an emerging (hot) event in Twitter streams. Most of the previous solutions have not managed graph-based data mining at this depth, and GAP is a much better visualization platform for big data. The authors do not mention any NoSQL graph database in their work.

    6) Usability: Finally, this section discusses the single paper related to the usability aspect of BDA applications to telecom. In [72], the authors implement a BDA application called SPATE, which is an innovative big data exploration framework for telecom data using Hadoop and Spark to achieve comparable response times with orders of magnitude less storage space for spatio-temporal queries. The authors use lossless compression to ingest streaming telecom data and use a concept of decaying to distinguish between ‘old’ and ‘new’ data. Experiments have been conducted using network data traces and a variety of telecom analytics tasks. SPATE’s future includes advanced smart city application scenarios, namely an automated car traffic mapping system and an emergency recovery system, which is critical after natural disasters.

    V. CHALLENGES AND BENEFITS

    We now describe the challenges and benefits of telecom BDA applications reported in our 38 papers. We also list concrete gaps, identified from the challenges, that must be addressed to realize the benefits properly.

    A. Benefits of BDA in Telecom

    We have categorized the benefits of BDA applications to telecom as follows:

    1) BDA As a Smart Solution: BDA brings special infrastructures and tools that provide considerable advantages for the telecom industry in terms of infrastructure, programming models, high-performance schema-free databases and process analysis, all of which offer new and innovative opportunities to telcos, for instance, lower power consumption and optimized resource management and network performance [35], [36], [46], [59], [63].

    2) Cost Reduction and Revenue Generation: BDA can assist in reducing the cost of different operations of communication networks. BDA stream processing technologies help to process complex events with real-time requirements, which reduces risks and cost and improves decision making and revenues [76], [80].

    3) Improving Customer Care Services: The business case for big data is substantially focused on addressing customer-centric objectives. Companies can use BDA to enhance customer care services as a result of being able to truly understand customer needs and anticipate future behaviors. Operators can set up automated procedures to meet customer requirements such as faster calling [65], [80].

    4) Improving Diverse Usecases: BDA can be applied to SIM box detection and quality of experience (QoE), the two most compelling use cases in the telecom domain. Important telecom applications that can benefit from big data include QoE analysis, churn prediction, target marketing, and fraud detection [73].

    5) BDA for Next Generation Mobile Communication: BDA can be used to analyze 4G LTE and 5G networks from multiple dimensions and provide optimized solutions. Some examples include end-to-end visibility of the wireless network, self-coordination among network functions and entities, building faster and proactive networks, smart and proactive caching, and energy-efficient network operation [27], [74]. The future 5G network design will be greener and softer and will better meet the user requirements of mobile communication [28].

    6) Future BDA: Future BDA will encapsulate many different data models and algorithms as well as data integration components, for instance, advanced probes and adapters for retrieving data from all network nodes in real time, advanced adapters for pulling relevant customer data from traditional big data and data warehouse systems, real-time analytics for customer activities, quality indexes for sentiment analysis, and opinion mining in real time [26].

    B. Challenges of BDA Application in Telecom

    In the research papers which address challenges [36], [46], [48], [51], [54], [60], [65], [72], [73], [76], we have identified three categories:

    1) Lack of a Formal Architecture for BDA Pipeline Implementation: BDA initiatives pose serious challenges in integrating different data sources, carrying out complicated and time-consuming ETL activities, and ensuring the quality of BDA outputs by uncovering correlations and actionable insights using distributed machine learning. In fact, BDA offers two types of architectures (pipelines) to streamline the process and solve these problems, i.e., lambda and kappa [17]. Briefly, lambda allows processing of streaming and batch data in parallel, while kappa considers everything as streaming data; data is first processed as a stream and then in batch mode (if required). We have not found any lambda or kappa implementations or proposals in any of our reviewed papers. Perhaps the best match is the work in [81], which proposes an AWS-based (Amazon’s cloud service) lambda architecture for IoT data processing. However, this paper does not propose much innovation (due to the already available AWS infrastructure) and lacks several important components (shown in the next section) which are critical to deal with the heterogeneous nature of telecom’s BDA requirements. Surprisingly, state-of-the-art NoSQL solutions (e.g., MongoDB and CouchDB document stores and Redis and Riak key-value stores) which have had a global impact are not demonstrated in published research [82], [83].
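
    The lambda idea summarized above (a batch path and a streaming path running side by side, merged at query time) can be sketched in a few lines. This is a toy, single-process illustration under assumptions: the class name, the `cell` event field and the per-cell event count are all hypothetical, and a real deployment would use distributed batch and stream processors.

```python
from collections import Counter


class LambdaPipeline:
    """Toy lambda architecture counting events per telecom cell."""

    def __init__(self):
        self.master = []             # append-only master dataset
        self.batch_view = Counter()  # recomputed from the master dataset
        self.speed_view = Counter()  # incremental view of recent events

    def ingest(self, event):
        """New events go to the master dataset and to the speed layer."""
        self.master.append(event)
        self.speed_view[event["cell"]] += 1

    def run_batch(self):
        """Batch layer: recompute the view from scratch, then reset the
        speed layer because its events are now covered by the batch view."""
        self.batch_view = Counter(e["cell"] for e in self.master)
        self.speed_view.clear()

    def query(self, cell):
        """Serving layer: merge the batch and real-time views."""
        return self.batch_view[cell] + self.speed_view[cell]
```

    A kappa-style pipeline would drop `run_batch` and the batch view, and would instead rebuild any view by replaying the stored event stream through the same streaming code.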

    2) Lack of BDA Expertise and Knowledge: The BDA technology stack is growing rapidly and it is difficult to find human resources who are technology experts for operational BDA architecture implementations. Guidance is clearly needed in selecting the best architecture and the best combination of BDA technologies within it. We also recommend selecting the standard Python development language, which has a massive online community in BDA pipeline development.

    3) Lack of Security Policies: The problems of ensuring data security, privacy, confidentiality and protection are quite significant in a BDA application for telecom companies. Many companies will avoid a BDA initiative if the security policies are not specified or enforced. Cloud computing (through Amazon AWS and Microsoft Azure) has resolved doubts over privacy invasion for companies of diverse types [84], [85]. However, it is still not an acceptable solution for many companies because the data is not available within the company’s own data center and because of its increased cost as compared to an in-house BDA architecture based on open-source technologies, where the data security policies specified by the IT department can be applied to the BDA pipeline.

    In our opinion, the only solution to these challenges is to propose a formal BDA architecture for the telecom sector, to be implemented in-house with standard open-source technologies and programming languages in order to reduce cost and increase privacy. We firmly believe this is the need of the moment and it will provide a clear roadmap for telecom’s BDA practitioners.

    VI. LAMBDATEL: PROPOSED LAMBDA ARCHITECTURE (BDA PIPELINE) FOR TELECOM SECTOR

    Our proposed BDA architecture for the telecom sector, called LambdaTel, is shown in Fig. 11. The engineering behind LambdaTel is lambda in nature, allowing both batch and streaming data processing to execute in parallel with each other (we have adapted this architecture from one of our previous works [86]). It consists of seven layers, which we describe below.

    1) Connection Layer: The Connection layer allows the different types of telecom data sources to feed data to our BDA pipeline. In other words, this layer implements an application programming interface (API) of data connectors for a potentially large set of standard NoSQL databases, SQL databases, IoT feeds and other streaming or batch telecom data feeds. Python’s support for connecting to NoSQL and other databases facilitates the implementation of this layer, e.g., the pymongo API for connecting to a MongoDB instance and the redis-py API for connecting to a Redis database instance.
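
    As a minimal sketch of the connector idea, the following Python code puts two kinds of sources behind one common interface. It uses only the standard library so that it runs anywhere: sqlite3 stands in for a SQL source and a plain list stands in for an IoT feed. The names SourceConnector, SqlTableConnector, FeedConnector and collect are hypothetical and do not come from LambdaTel itself; a connector built on pymongo or redis-py would presumably implement the same records() method.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, Iterable, Iterator, List


class SourceConnector(ABC):
    """Hypothetical common interface for any telecom data source."""

    @abstractmethod
    def records(self) -> Iterator[Dict]:
        """Yield the source's records as plain dictionaries."""


class SqlTableConnector(SourceConnector):
    """Reads one table from a connection (sqlite3 is only a stand-in)."""

    def __init__(self, conn: sqlite3.Connection, table: str) -> None:
        self.conn = conn
        self.table = table  # assumed to be a trusted identifier

    def records(self) -> Iterator[Dict]:
        cursor = self.conn.execute(f"SELECT * FROM {self.table} ORDER BY rowid")
        columns = [col[0] for col in cursor.description]
        for row in cursor:
            yield dict(zip(columns, row))


class FeedConnector(SourceConnector):
    """Wraps any iterable of dictionaries, e.g. a batch file or IoT feed."""

    def __init__(self, feed: Iterable[Dict]) -> None:
        self.feed = feed

    def records(self) -> Iterator[Dict]:
        yield from self.feed


def collect(connectors: Dict[str, SourceConnector]) -> List[Dict]:
    """Pull every record and tag it with the name of its source."""
    return [
        {"source": name, **record}
        for name, connector in connectors.items()
        for record in connector.records()
    ]


# Illustrative data only.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cdr (caller TEXT, seconds INTEGER)")
db.executemany("INSERT INTO cdr VALUES (?, ?)", [("A", 60), ("B", 120)])

rows = collect({
    "billing_sql": SqlTableConnector(db, "cdr"),
    "iot_feed": FeedConnector([{"device": "d1", "temp": 21.5}]),
})
```

    Tagging each record with its source name is one simple way to let the Integration layer decide later which store a record belongs in.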

    Fig. 11. LambdaTel: Proposed Lambda Architecture (BDA Pipeline) for Telecommunication Companies (adapted from our previous work [86]).

    2) Integration Layer: The Integration layer is responsible for integrating telecom data from the Connection layer and inserting it into an integrated data lake. We propose to deploy a master database (on either a single server or multiple servers) to store the data lake. We also propose to use MongoDB for this purpose because of its flexible storage schema and its support for both batch and streaming data. The actual integration can be done by storing each individual telecom source’s data in its most relevant database (preferably NoSQL) and then implementing a controller API over these different stores for coordination. For example, news feeds from social networks can be continuously stored in Neo4j and call detail records in MongoDB, while the controller keeps metadata about associations, data and storage-capacity activities in order to provide access. We recommend Redis as the metadata store for faster retrieval and storage with minimal management overhead. Talend and Pentaho tools are not suitable here for data integration. To maintain the efficiency of the BDA process, we propose Python programming for every stage.

    3) Batch Layer: The Batch layer is responsible for batch (static) processing of big telecom data from the master database. We recommend using a Hadoop cluster to tackle the major ETL tasks for telecom big data in this layer. Tasks that need to be processed faster can be run through Apache Spark, while lengthier, more time-consuming tasks can be processed through MapReduce, for instance, computing the average call time for five years over 250 TB of data [87]. The Batch layer provides a thorough drilled-down analysis to supplement the processing done in the Streaming layer.
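
    The average-call-time example can be sketched in plain Python to show the map, shuffle and reduce steps. This is not Hadoop code: it runs in a single process on a handful of made-up records, and the function names and record fields (caller, seconds) are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Pair = Tuple[str, Tuple[float, int]]


def map_cdr(cdr: Dict) -> Iterator[Pair]:
    """Map step: emit (caller, (duration, 1)) for one call detail record."""
    yield cdr["caller"], (cdr["seconds"], 1)


def shuffle(pairs: Iterable[Pair]) -> Dict[str, List[Tuple[float, int]]]:
    """Shuffle step: group all emitted values by key."""
    groups: Dict[str, List[Tuple[float, int]]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_sums(values: List[Tuple[float, int]]) -> Tuple[float, int]:
    """Reduce step: add up durations and counts for one key."""
    total = sum(duration for duration, _ in values)
    count = sum(n for _, n in values)
    return total, count


def average_call_time(cdrs: Iterable[Dict]) -> Dict[str, float]:
    """Average call duration per caller, computed map-reduce style."""
    pairs = (pair for cdr in cdrs for pair in map_cdr(cdr))
    result = {}
    for caller, values in shuffle(pairs).items():
        total, count = reduce_sums(values)
        result[caller] = total / count
    return result


# Illustrative records only.
cdrs = [
    {"caller": "A", "seconds": 60},
    {"caller": "A", "seconds": 120},
    {"caller": "B", "seconds": 30},
]
averages = average_call_time(cdrs)
```

    Emitting (sum, count) pairs rather than averages keeps the reduce step associative, so partial results from different nodes can be combined before the final division.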

    4) Streaming Layer: The Streaming layer is responsible for processing real-time (dynamic) telecom streams. This layer presents real-time views and basic analytics at an abstract level. We recommend Apache Kafka for ingesting the streams and Apache Spark’s Spark Streaming feature for processing them. A competitor to Kafka is Flume (geared particularly towards log analysis), although Kafka’s use cases outnumber Flume’s by a large margin. Similarly, Apache Storm is a competitor to Spark Streaming, but the latter is considered applicable to more use cases. The selection of the tool should be preceded by a concrete analysis, as BDA technologies continue to evolve. High-velocity dynamic data streams could be stored in MongoDB or Redis if a use case requires it; otherwise, storing them could be inefficient [88].
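
    The kind of basic real-time analytics meant here can be illustrated with a tumbling-window aggregation. The sketch below is plain Python over a finite list of events, not Spark Streaming or Kafka code; the function name and the (timestamp, duration) event shape are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def tumbling_window_stats(
    events: Iterable[Tuple[float, float]], window_seconds: int
) -> Dict[int, Tuple[int, float]]:
    """Group (timestamp, call_duration) events into fixed, non-overlapping
    windows and return {window_start: (call_count, average_duration)}."""
    sums: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for timestamp, duration in events:
        # Every event belongs to exactly one window.
        start = int(timestamp // window_seconds) * window_seconds
        sums[start] += duration
        counts[start] += 1
    return {s: (counts[s], sums[s] / counts[s]) for s in sorted(counts)}


# Illustrative events: (seconds since start, call duration in seconds).
events = [(0, 30), (12, 90), (61, 45), (119, 15), (120, 60)]
stats = tumbling_window_stats(events, window_seconds=60)
```

    With a window of 60 seconds, the five events fall into three windows starting at 0, 60 and 120. A streaming engine would emit each window's result as soon as the window closes instead of waiting for the whole input.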

    5) Serving Layer: The Serving layer consolidates the results of both the Batch and Streaming layers. It acts as a staging area where the batch and stream processing results are integrated as per the requirements of the end-users, e.g., C-level executives of the telecom company. We recommend implementing the layer on a separate server machine, which we call the analytical data lake. This layer prepares processed data for display in the end-user dashboards.
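
    To make the consolidation step concrete, the sketch below joins a batch result (customer segments) with a streaming result (calls per customer in the latest window). The data shapes, the function name build_serving_view and the "unsegmented" fallback label are assumptions for illustration; the actual Serving layer may combine results differently.

```python
from typing import Dict


def build_serving_view(
    batch_segments: Dict[str, str], realtime_calls: Dict[str, int]
) -> Dict[str, Dict[str, int]]:
    """Aggregate real-time call counts by the segment assigned in batch."""
    view: Dict[str, Dict[str, int]] = {}
    for customer, calls in realtime_calls.items():
        # Customers the batch layer has not seen yet get a fallback label.
        segment = batch_segments.get(customer, "unsegmented")
        entry = view.setdefault(segment, {"customers": 0, "calls": 0})
        entry["customers"] += 1
        entry["calls"] += calls
    return view


# Illustrative data only.
batch_segments = {"c1": "heavy", "c2": "light", "c3": "heavy"}
realtime_calls = {"c1": 5, "c2": 1, "c3": 7, "c4": 2}
serving_view = build_serving_view(batch_segments, realtime_calls)
```

    The fallback label matters in a lambda setting because the streaming side usually sees new customers before the next batch run has segmented them.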

    6) Interface Layer: The Interface layer connects all the back-end layers (Connection, Integration, Batch, Streaming and Serving) with the front-end layer (Dashboard layer). Here, well-known Python frameworks for implementing REST interfaces and web front-ends, e.g., Flask and Dash, can be employed. In addition, we recommend using Node.js web server technology due to its enhanced scalability.

    7) Dashboard Layer: The Dashboard layer displays a series of dashboards to be viewed by various telecom end-users. Every dashboard connects to the Serving layer through standard connectors, e.g., BI connectors provided by various BI tools (such as Tableau, Oracle’s BI and QlikView), or Apache’s Sqoop. Plotly, an open-source Python library, is useful for creating dashboards. As per client needs, the whole pipeline (or simply the front-end) could be deployed on any of the well-known cloud service providers, e.g., Amazon’s AWS, Microsoft’s Azure or Google’s Cloud; we recommend using SSL technology to connect dashboards to the Serving layer through the Interface layer.

    Parallel and mutually exclusive data streams can be processed by both the Batch and Streaming layers. Both the static (Batch) and dynamic (Streaming) layers ought to include ETL (data cleaning) activities and statistical modeling, e.g., big data predictive analytics, which can be termed BigML. To support this, the following data management modules are programmed in Python; they provide the background routines for processing static and dynamic data:

    8) Workflow Management: This module manages the large assortment of potential workflows that are possible in a BDA pipeline for telecom data. Apache’s Oozie is recommended here as a task scheduler. Oozie is Python-compatible and allows Hive, MapReduce and Sqoop tasks to be composed as Directed Acyclic Graphs (DAGs).
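
    The DAG idea behind such a scheduler can be shown with Python's standard graphlib module (Python 3.9 or later). This is not Oozie configuration; the task names are invented, and the sketch only shows how dependencies determine a valid execution order and how a cyclic workflow is rejected.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def execution_order(dag: Dict[str, Set[str]]) -> List[str]:
    """Return one valid run order for a workflow given as
    {task: set of tasks it depends on}."""
    try:
        return list(TopologicalSorter(dag).static_order())
    except CycleError as err:
        # err.args[1] holds the list of nodes that form the cycle.
        raise ValueError(f"workflow is not acyclic: {err.args[1]}") from err


# Hypothetical workflow: each task lists its prerequisites.
workflow = {
    "extract_cdrs": set(),
    "clean_cdrs": {"extract_cdrs"},
    "train_model": {"clean_cdrs"},
    "export_results": {"train_model", "clean_cdrs"},
}
order = execution_order(workflow)
```

    In this workflow the dependencies form a single chain, so there is only one valid order. Tasks without a dependency between them could be run in parallel by a real scheduler.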

    9) Session Management: This module stores the session of every activity in the BDA pipeline in a stateless manner. As this is itself likely to generate big data at high velocity, we recommend creating archives with respect to a windowing period. Here, our recommendation is to use Redis as the session database. If storage is required over a longer timeframe, we can employ Apache Cassandra or HBase.

    10) Cache Management: This module speeds up telecom BDA processing by implementing a separate caching mechanism. Redis is highly robust and is our preferred choice for cache management, as it offers more use cases, a variety of data structures and better strategies for removing data from main memory [89], [90].
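
    Two common strategies for removing data from main memory are least-recently-used (LRU) eviction and time-to-live (TTL) expiry. The sketch below implements both in plain Python to show the behavior; it is not Redis code, and the class name LruTtlCache and the injectable clock are assumptions made so the example can be run and checked without waiting.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class LruTtlCache:
    """Small in-memory cache with LRU eviction and per-entry expiry."""

    def __init__(
        self,
        max_items: int,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_items
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: "OrderedDict[str, tuple]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._data[key]  # entry is too old: drop it
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key: str, value: Any) -> None:
        self._data[key] = (value, self._clock() + self._ttl)
        self._data.move_to_end(key)
        while len(self._data) > self._max:
            self._data.popitem(last=False)  # evict least recently used


# A fake clock keeps the example deterministic.
now = [0.0]
cache = LruTtlCache(max_items=2, ttl_seconds=10, clock=lambda: now[0])
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # "a" becomes the most recently used entry
cache.set("c", 3)  # capacity exceeded, so "b" is evicted
after_eviction = (cache.get("a"), cache.get("b"), cache.get("c"))

now[0] = 11.0      # move past the 10-second lifetime of "a"
after_expiry = cache.get("a")
```

    Redis provides comparable behavior through per-key expiry and configurable eviction policies, so a production cache would normally delegate this logic to the store.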

    11) Log Management: In line with standard practice, client logs, server processing information and troubleshooting information are handled through log management. Administrators should not neglect to capture client clickstreams in sessions and logs. Logged information can be acquired through Flume and processed as MapReduce tasks.

    12) Queue Management: Because analytical tasks have diverse requirements at different events, queuing up the tasks (e.g., in an Oozie instantiation) may be required. Kafka is an ideal data queueing system for streaming real-time data. For static data, we recommend RQ (Redis Queue), a Python library that runs job queues backed by Redis. RQ can also be used for real-time data in case Kafka is not applicable.
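
    The pattern of queuing analytical tasks for background workers can be sketched with Python's standard queue and threading modules. This does not use the RQ or Kafka APIs; it keeps the queue in process memory, and the function run_tasks and the task tuples are hypothetical names chosen for the example.

```python
import queue
import threading
from typing import Any, Callable, Dict, List, Tuple

Task = Tuple[str, Callable[..., Any], tuple]


def run_tasks(tasks: List[Task], workers: int = 2) -> Dict[str, Any]:
    """Run (name, function, args) tasks on worker threads via a queue."""
    pending: "queue.Queue" = queue.Queue()
    results: Dict[str, Any] = {}
    lock = threading.Lock()

    def worker() -> None:
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more work for this worker
                pending.task_done()
                return
            name, func, args = item
            try:
                value = func(*args)
            except Exception as exc:  # keep the worker alive on failure
                value = exc
            with lock:
                results[name] = value
            pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for task in tasks:
        pending.put(task)
    for _ in threads:
        pending.put(None)  # one sentinel per worker
    pending.join()
    for thread in threads:
        thread.join()
    return results


def average(values: List[float]) -> float:
    return sum(values) / len(values)


# Illustrative tasks only.
results = run_tasks([
    ("avg_call_time", average, ([60, 120],)),
    ("total_sms", sum, ([1, 2, 3],)),
])
```

    A Redis-backed queue adds what this sketch lacks: tasks survive a process restart and can be consumed by workers on other machines.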

    13) Resource Management:This module coordinates the different BDA pipeline resources and activities. The best software for this task is Apache’s Zookeeper which can detect master node and slave node failures and help recover from such faults. It also provides interfaces to manage cluster resources in an effective and efficient manner.

    In our opinion, the aforementioned BDA pipeline is standard as per the current BDA technology stack. Last but not least, we recommend implementing this pipeline in a Dockerized manner, with each activity running in its own Docker container for more efficient processing and ease of coordination with other containers. Also, BDA pipelining requires a development operations (DevOps) type of structure, with continuous integration and continuous deployment managed by the standard Jenkins tool. All data and results being generated should be stored in private GitHub repositories for enhanced security and as an identity and access management (IAM) solution.

    It is important to discuss the strengths and weaknesses of both lambda and kappa architectures to justify our selection. Our motivation for selecting lambda is that telecommunication analytics use cases all require both batch-level analyses and real-time analyses (primarily due to the requirement of data cleaning and machine learning at the batch level). In a kappa architecture, there is no operational component for batch-level analyses, and analyses are computed on-the-fly. To illustrate, let us discuss two important use cases of the telecom industry for LambdaTel: a) Customer Relationship Management (CRM): call detail records (CDRs) of customers (streaming in nature) are fed into both the batch layer and the streaming layer in parallel. In the batch layer, the CDRs are pre-processed and cleaned, and then machine learning is applied to extract important customer segments, all in batch mode over a period of one hour. These segments are then fed to the serving layer. In the streaming layer, real-time analytics immediately starts to show basic results such as customer call throughput and average calling time per unit time, which are also fed to the serving layer. There, the real-time results are shown with respect to the segments available from the batch layer, to present the required CRM picture to business decision makers. This use case can also be applied to other machine learning applications such as prediction (classification, regression, time series forecasting) of telecom KPIs (calling time, SMS per second, mobile data usage frequency, revenue, sales, etc.). b) Customer Attrition: CDRs and historical attrition data are fed to the batch and streaming layers in real time, while marketing data and competitors’ data are fed to the batch layer only, at a predetermined time. The batch layer performs ETL to clean all data and then predicts customer churn, with the results sent to the serving layer. The streaming layer presents basic attrition analytics with respect to customer segments (computed previously), and in the serving layer, the results are combined to present an attrition prediction for each segment. We can similarly show the need for batch-level analytics in other use cases related to marketing, cross-selling/up-selling, human resource management and operational analyses.

    Also, the lambda architecture guards against errors in the data execution process through its batch layer, hence maintaining a good balance between speed and reliability, along with a fault-tolerant and scalable architecture due to the Hadoop (plus Spark) implementation at the batch level. It is interesting to note a blog post questioning the lambda architecture by Kreps in 2014 [91], who actually proposed kappa. According to him, lambda brings much coding overhead for ETL at the batch layer, primarily required for machine learning. However, the rise of Python as a data science and big data language has made coding practices much simpler in the last 5 years. In lambda, we may need to re-process or repeat executions per batch, but this can be handled by using in-memory and/or columnar storage solutions, which have also matured since 2014. Finally, a lambda architecture is still difficult to migrate or re-organize, but considering the lack of any published lambda architecture for telecom, we think the time for such migration is still far off; the need of the moment is to first implement and use it. The kappa use case allows execution of real-time queries, either on real-time data or on data previously stored in some in-memory or streaming database, without focusing on ETL. These situations can also be handled by lambda, in which we can temporarily disable the batch layer for such requirements (for more information, kindly refer to [17], [91]-[95]).

    A. An On-Going Application of LambdaTel

    We are currently implementing LambdaTel for a local company (jazz.com.pk) to conduct a proof of concept (POC) application for their use case related to cross-selling/up-selling of customer services for the marketing division. This is ongoing work for which we can mention the current results without providing sensitive information. The company currently maintains an enterprise resource planning (ERP) implementation of SAP, with an Oracle back-end consisting of 5 different databases and around 1450 tables in all. There was no analytical infrastructure in place previously. All queries were executed through structured query language (SQL), which caused delays for several complex queries. Results were shown through a standard business intelligence (BI) tool. We also discovered a major data quality problem in these tables, particularly missing values, incomplete values, inconsistent data and data entry errors. For the cross-selling/up-selling use case, the requirement was to get customers to purchase more than one service or upgrade in a single attempt; specifically, to predict which customers will purchase in this manner. For this, we identified 410 relevant tables, measuring around 825 GB. We extracted this data through the connection layer and inserted it into a MongoDB cluster (master database) along with the metadata. This activity consumed two months. Then, we implemented the batch layer through a Hadoop cluster with a Spark front-end. We cleaned the data thoroughly through ETL functions encoded in Pig Latin and running on Hadoop. We then used the processed data for prediction through MLlib over Hadoop, and fed the results to the serving layer. This activity also consumed two months. Once the customers to target were identified, we started inserting their real-time CDRs into the streaming layer to compute day-to-day behavioral analyses of mobile phone usage, which were all fed to the serving layer. This layer now shows the predictions of customers to target for cross-sell/up-sell, along with their real-time behavior to put things in perspective. This activity consumed one month. Currently (as of September 2019), the marketing division is testing the predictions at the serving layer, and has demonstrated a small increase in customer loyalty due to some successful predictions. We have used exactly the same technology stack as proposed for LambdaTel. For the company, LambdaTel has brought the following advantages: a) implementation of a complete data quality execution pipeline which can be replicated for different use cases, b) implementation of a master database (data lake) for analytics, which was previously unavailable, c) a combination of both real-time analyses and batch-level machine learning to understand the consumers more deeply, and d) implementation of a personalized dashboard using Flask, which was more convenient for top-level management than the current BI tool implementation. The batch layer keeps updating its machine learning model on a daily basis, while the streaming layer always receives the CDRs for the predicted customers, all in an automated manner through Python scripts. All of this has been very effective, efficient and reliable for the marketing department.

    We must mention that LambdaTel is not a solution for companies which do not require machine learning or real-time analytics on-the-fly, or do not tend to focus much on maintaining data lakes for multiple analytical use cases.

    VII. ANSWERING THE RESEARCH QUESTIONS

    We now answer our research questions as follows:

    1) RQ1: How much research literature is focused on BDA applications to the telecom sector and what is the BDA technology stack in these articles? Answer: In all, 38 articles are focused on BDA applications to the telecom sector (primarily from 2010 to March 2018). The technology stack includes Hadoop and some of its ecosystem APIs, MapReduce, Spark and some of its component APIs, Kafka, Flink, R, NoSQL databases, statistical analysis, machine learning, deep learning, cloud computing and social network analysis.

    2) RQ2: What are the benefits and challenges mentioned in these articles and how much benefit has actually been realized? Answer: Optimized costs and better customer experience, revenues and security are the major benefits. Challenges include the lack of a standard BDA architecture for implementation in industry, along with a lack of security and BDA expertise. Benefits are realized in a limited manner in academic research with respect to the frequency of articles, experimental validations in industry and the use of standard tools from the ever-expanding BDA landscape. Although cloud service providers like AWS and Azure can address the data security concerns of telecom practitioners, many telcos’ policies prevent data from leaving the premises [96], [97].

    3) RQ3: How can the challenges be strongly addressed to facilitate BDA applications to the telecom sector? Answer: We have answered this by proposing a state-of-the-art lambda architecture for telecom practitioners called LambdaTel; we specify the exact components of this architecture and propose the use of Python to implement it due to its massive online community and the availability of Python APIs for all NoSQL solutions.

    VIII. CONCLUSIONS AND FUTURE WORK

    Big data analytics (BDA) has much to offer the telecommunications industry, and its importance can hardly be overestimated. In this paper, we determined through a systematic literature review that the practical applications of BDA to telecom are limited in academic research, owing to the lack of an architecture and of usage of the latest solutions in an expanding technology stack. To solve this problem and address other challenges, we have proposed and described LambdaTel, a state-of-the-art lambda architecture for BDA implementations in the telecom sector. It is important to note that we have successfully implemented LambdaTel in a telecom solution called Darbi (https://www.darbi.io/). Darbi currently has one implementation in the military, but the experiments cannot be discussed due to the confidential nature of the application. A limitation of LambdaTel is that the BDA implementation solutions it proposes are not permanent, e.g., MongoDB could be replaced by a better NoSQL document store five years from now. This also defines our future work: there is a need to “keep up” with the pace of BDA innovation and to “keep on” modifying LambdaTel’s implementation solutions accordingly. We believe LambdaTel gives telecom practitioners a strong opportunity to implement BDA pipelines in their own enterprises now. In fact, fulfilling the requirements of [15], LambdaTel presents a type of roadmap for BDA application to the telecom sector through technology and process improvements. We strongly believe that it can be directly implemented in the industry, with minor modifications if needed. As future work, we intend to target BDA applications to telecom (and related sectors) with respect to the IoT domain, which has an apparently rapidly expanding user base, with many companies/startups planning IoT applications in their business operations, along with increasing publication of research papers [98], [99]. We would be interested in determining the exact number of telecom applications and then proposing and implementing an IoT-BDA framework for the telecom sector.

    ACKNOWLEDGMENT

    We acknowledge useful inputs regarding LambdaTel from Mr. Uzair Ahmed (Project Lead for Darbi) and from Muhammad Zahid Raza (Assistant Vice President IT) of Meezan Bank (www.meezanbank.com).
