Abstract

A Telecommunication company (Telco) is traditionally perceived only as an entity that provides telecommunication services, such as telephony and data communication access, to users. However, the radio and backbone infrastructure of such entities, which densely spans most urban spaces and widely covers most rural areas, nowadays provides a unique opportunity to collect immense amounts of data that capture a variety of natural phenomena on an ongoing basis, e.g., traffic, commerce, mobility patterns and emergency response. The ability to perform analytics on the generated big data within a tolerable elapsed time and to share the results with key Smart City Enablers (e.g., municipalities, public services, startups, authorities, and companies) elevates the role of Telcos in the realm of future Smart Cities from pure network access providers to information providers. In this talk, we overview the state of the art in Telco big data analytics by focusing on a set of basic principles, namely: (i) real-time analytics and detection; (ii) experience, behavior and retention analytics; (iii) privacy; and (iv) storage. We also present experiences from developing an innovative architecture of this kind and conclude with open problems and future directions.

Introduction

Unprecedented amounts and variety of spatiotemporal big data are generated every few minutes by the infrastructure of a telecommunication company (telco). The rapid expansion of broadband mobile networks, the pervasiveness of smartphones, and the introduction of dedicated Narrow Band connections for smart devices and the Internet of Things (NB-IoT) have contributed to this explosion. For example, a telco in the city of Shenzhen, China, which serves 10 million users, produces 5TB per day (i.e., thousands to millions of records every second). Huang et al. break their 2.26TB per day of Telco Big Data (TBD) down as follows: (i) Business Supporting Systems (BSS) data, which is generated by the internal workflows of a telco (e.g., billing, support), amounting to a moderate 24GB per day; and (ii) Operation Supporting Systems (OSS) data, which is generated by the Radio and Core equipment of a telco, amounting to 2.2TB per day and occupying over 97% of the total volume.
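The volumes quoted above can be put in perspective with a little arithmetic. The following is a minimal sketch that only reuses the figures from the text; it assumes decimal units (1TB = 10^12 bytes, 1GB = 10^9 bytes), which the text does not specify, and all variable names are illustrative.

```python
# Assumption: decimal units; the source does not say whether TB/GB are decimal or binary.
TB = 10**12
GB = 10**9
SECONDS_PER_DAY = 24 * 60 * 60

# Breakdown reported by Huang et al. (figures taken from the text above).
total_per_day = 2.26 * TB
bss_per_day = 24 * GB
oss_per_day = 2.2 * TB

oss_share = oss_per_day / total_per_day
bss_share = bss_per_day / total_per_day

# Sustained ingestion rate implied by the 5TB-per-day Shenzhen example.
shenzhen_bytes_per_second = 5 * TB / SECONDS_PER_DAY

print(f"OSS share of daily volume: {oss_share:.1%}")
print(f"BSS share of daily volume: {bss_share:.1%}")
print(f"Shenzhen ingestion rate: {shenzhen_bytes_per_second / 10**6:.1f} MB/s")
```

Under these assumptions the OSS share indeed comes out above 97%, and 5TB per day corresponds to a sustained rate of several tens of megabytes per second. The number of records per second additionally depends on the average record size, which is not stated here.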

Data exploration queries over big telco data are of great interest to both the telco operators and the smart city enablers (e.g., municipalities, public services, startups, authorities, and companies), as these allow for interactive analysis at various granularities, narrowing it down for a variety of tasks. Effectively storing and processing TBD workflows can address a wide spectrum of challenges, ranging from churn prediction of subscribers and city localization to 5G network optimization, user-experience assessment and road traffic mapping. Data exploration and visualization might be the most important tools in the big data era, where decision makers, ranging from CEOs to frontline support engineers, aim to draw valuable insights and conclusions visually. Our tutorial will tackle the topic from a wide range of perspectives: fundamentals, definitions, current state, academic and industrial perspectives, real and visionary scenarios, as well as future challenges. The seminar captures the big picture, such that interested researchers and practitioners can expand their study by following the references. Our presentation is carried out through the lens of an experimental Telco Big Data system we developed at the University of Cyprus, coined SPATE, which is a SPAtio-TEmporal framework that uses both lossless data compression and lossy data decaying to ingest large quantities of telco big data in the most compact manner.

Compression refers to the encoding of data using fewer bits than the original representation and is important as it shifts the resource bottlenecks from storage and network I/O to the CPU, whose cycles are increasing at a much faster pace. It also enables data exploration tasks to retain full resolution over the most important collected data. Decaying, on the other hand, as suggested in [15], refers to the progressive loss of detail in information as data ages with time, until it has completely disappeared (the schema of the database does not decay). This enables data exploration tasks to retain high-level data exploration capabilities for predefined aggregates over long time windows, without consuming enormous amounts of storage.
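The difference between the two ideas can be illustrated with a small sketch. This is not SPATE's implementation: the record layout (`ts`, `cell`, `bytes`), the function names, the use of gzip over JSON, and the single-step decay into per-cell aggregates are all assumptions made purely for illustration.

```python
import gzip
import json
from collections import defaultdict


def compress_records(records):
    """Lossless: serialize the records and gzip them."""
    return gzip.compress(json.dumps(records).encode("utf-8"))


def decompress_records(blob):
    """Inverse of compress_records; restores the records exactly."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


def decay_records(records, now, max_age):
    """Lossy: keep recent records in full detail and reduce
    older ones to per-cell aggregates (count and total bytes)."""
    recent = [r for r in records if now - r["ts"] <= max_age]
    aggregates = defaultdict(lambda: {"count": 0, "bytes": 0})
    for r in records:
        if now - r["ts"] > max_age:
            agg = aggregates[r["cell"]]
            agg["count"] += 1
            agg["bytes"] += r["bytes"]
    return recent, dict(aggregates)


# Hypothetical records: timestamp, cell identifier, traffic volume in bytes.
records = [
    {"ts": 100, "cell": "A", "bytes": 500},
    {"ts": 200, "cell": "A", "bytes": 700},
    {"ts": 900, "cell": "B", "bytes": 300},
    {"ts": 950, "cell": "A", "bytes": 100},
]

blob = compress_records(records)
restored = decompress_records(blob)  # identical to the input records

recent, aggregates = decay_records(records, now=1000, max_age=200)
print(recent)      # the two records with ts 900 and 950, in full detail
print(aggregates)  # older records survive only as per-cell summaries
```

In this sketch compression is fully reversible, whereas decaying discards the individual old records and keeps only a predefined aggregate, which is the trade-off the paragraph above describes.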

Our tutorial aims to provide extensive coverage of telco big data research, which falls under the following categories: (i) real-time analytics and detection; (ii) experience, behavior and retention analytics; (iii) privacy; and (iv) storage. There is also traditional telco research that is not related to big data but rather comprises topics related to business (BSS) data in relational databases. The presentation should allow the audience to grasp basic and advanced concepts, ranging from the anatomy of a telco network and the structure of telco big data all the way up to the applications and benefits of Telco Big Data. We will conclude the seminar with a presentation of the challenges and opportunities in the field, including telco big data processing challenges.

In particular, this seminar addresses the following audience:

  1. Graduate and Undergraduate Students
  2. Mobile Data Management Researchers/Educators
  3. Industry Developers

Short Biographies

Constantinos Costa is a full-time Ph.D. Candidate and a Research Assistant at the Department of Computer Science (UCY), where he is involved in research at the Data Management Systems Laboratory (DMSL). He holds an M.Sc. degree in Computer Science (2013) and a B.Sc. degree in Computer Science (2011) from the University of Cyprus. His research interests include databases and mobile computing, particularly distributed query processing for spatial and spatio-temporal datasets. Costa has contributed extensively to open source projects for indoor navigation, crowd messaging and telco big data. For more information please visit: https://www.cs.ucy.ac.cy/~costa.c/.

Demetrios Zeinalipour-Yazti is an Associate Professor of Computer Science at the University of Cyprus, where he leads the Data Management Systems Laboratory (DMSL). His primary research interests include Data Management in Computer Systems and Networks, particularly Mobile and Sensor Data Management; Big Data Management in Parallel and Distributed Architectures; Spatio-Temporal Data Management; Network and Telco Data Management; Crowd, Web 2.0 and Indoor Data Management; Data Privacy Management. He holds a Ph.D. in Computer Science from University of California - Riverside (2005). Before his current appointment, he served the University of Cyprus as an Assistant Professor and Lecturer, and the Open University of Cyprus as a Lecturer. He has held visiting research appointments at Akamai Technologies, Cambridge, MA, USA, the University of Athens, Greece, the University of Pittsburgh, PA, USA and the Max Planck Institute for Informatics, Saarbrücken, Germany. He is a Humboldt Fellow, Marie-Curie Fellow, an ACM Distinguished Speaker (2017-2020), a Senior Member of ACM, a Senior Member of IEEE and a Member of USENIX. He serves on the editorial boards of Distributed and Parallel Databases (Springer) and Big Data Research (Elsevier), and is an independent evaluator for the European Commission (Marie Skłodowska-Curie and COST actions). He has an h-index of 24, over 2600 citations and an Erdős number of 3; he has won 10 international awards (ACMD17, ACMS16, IEEES16, HUMBOLDT16, IPSN14, EVARILOS14, APPCAMPUS13, MDM12, MC07, CIC06) and delivered over 30 invited talks. He has participated in over 20 projects funded by the US National Science Foundation, the European Commission, the Cyprus Research Promotion Foundation, the Univ. of Cyprus, the Open University of Cyprus and the Alexander von Humboldt Foundation, Germany. Finally, he has also been involved in industrial Research and Development projects (e.g., Finland, Taiwan and Cyprus) and has technically led several mobile data management services (e.g., Anyplace, Rayzit and Smartlab), reaching over 35K users worldwide with over 140K sessions. For more information please visit: https://www.cs.ucy.ac.cy/~dzeina/ or the DMSL website: https://dmsl.cs.ucy.ac.cy/.

Tutorial Material

Location

Contact Details

Constantinos Costa

PhD Candidate


Demetrios Zeinalipour

Associate Professor
