Abstract
A telecommunication company (telco) is traditionally perceived only as the entity that provides telecommunication services, such as telephony and data communication access, to users. However, the radio and backbone infrastructure of such entities, which densely spans most urban spaces and widely covers most rural areas, nowadays provides a unique opportunity to collect immense amounts of data that capture a variety of natural phenomena on an ongoing basis, e.g., traffic, commerce, mobility patterns and user service experience. The ability to perform analytics on the generated big data within tolerable elapsed time and share it with key smart city enablers (e.g., municipalities, public services, startups, authorities, and companies) elevates the role of telcos in the realm of future smart cities from pure network access providers to information providers. In this tutorial, we overview the state-of-the-art in telco big data analytics by focusing on a set of basic pillars, namely: (i) background and respective architectures; (ii) real-time analytics and detection; (iii) experience, behavior and retention analytics; (iv) privacy; and (v) storage. We also present experiences from developing such an innovative architecture and conclude with open problems and future directions.
Introduction
A telecommunication company (telco) is the entity that provides telecommunication services, such as telephony and data communication access, to users. The rapid expansion of broadband mobile networks, the pervasiveness of smartphones, and the introduction of dedicated Narrow Band connections for smart devices and the Internet of Things (NB-IoT) have contributed to the expansion of the radio and backbone infrastructure of such entities, so that these nowadays densely span most urban spaces and widely cover most rural areas. This has led to the generation of a very large amount and variety of spatiotemporal telco big data (TBD) that encapsulate a wide range of natural phenomena on an ongoing basis, e.g., traffic, commerce, mobility patterns and user service experience.
Data exploration queries over such TBD are of great interest to both the telco operators and the smart city enablers (e.g., municipalities, public services, startups, authorities, and companies), as these allow for interactive analysis at various granularities, narrowing it down for a variety of tasks. Effectively storing and processing TBD workflows can address a wide spectrum of challenges, ranging from churn prediction of subscribers, city localization, and 5G network optimization and user-experience assessment to optimizing public transportation and road traffic mapping. Data exploration and visualization might be the most important tools in the big data era, where decision makers, ranging from CEOs to front-line support engineers, aim to draw valuable insights and conclusions visually.
Compression refers to the encoding of data using fewer bits than the original representation and is important as it shifts the resource bottlenecks from storage- and network-I/O to CPU, whose cycles are increasing at a much faster pace. It also enables data exploration tasks to retain full resolution over the most important collected data. Decaying, on the other hand, as suggested in~\cite{kersten2015fungus}, refers to the progressive loss of detail in information as data ages with time, until it has completely disappeared (the schema of the database does not decay). This enables data exploration tasks to retain high-level data exploration capabilities for predefined aggregates over long time windows, without consuming enormous amounts of storage.
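The contrast between the two notions can be illustrated with a minimal Python sketch. It is not the method of the cited work nor of any architecture presented in the tutorial: the record fields (`cell_id`, `ts`, `bytes`), the hourly aggregation granularity and the function names are all assumptions chosen for illustration. Compression is lossless and keeps every record recoverable, whereas the decay step keeps only a predefined aggregate and discards the per-event detail. The sketch models a single decay step only, not the eventual disappearance of the data.

```python
import json
import zlib
from collections import defaultdict


def compress_records(records):
    """Losslessly compress a batch of records; full resolution is retained."""
    raw = json.dumps(records).encode("utf-8")
    return zlib.compress(raw, 9)


def decompress_records(blob):
    """Recover the exact original records from a compressed batch."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


def decay_to_hourly(records):
    """Lossy step: keep only per-(cell, hour) aggregates, drop event detail."""
    totals = defaultdict(lambda: {"events": 0, "bytes": 0})
    for r in records:
        key = (r["cell_id"], r["ts"] // 3600)  # hypothetical hourly bucket
        totals[key]["events"] += 1
        totals[key]["bytes"] += r["bytes"]
    return [
        {"cell_id": cell, "hour": hour, **agg}
        for (cell, hour), agg in sorted(totals.items())
    ]


# Hypothetical sample records (timestamps in seconds).
records = [
    {"cell_id": "A", "ts": 10, "bytes": 100},
    {"cell_id": "A", "ts": 20, "bytes": 50},
    {"cell_id": "B", "ts": 3700, "bytes": 70},
]

restored = decompress_records(compress_records(records))  # identical to records
aggregates = decay_to_hourly(records)  # one row per (cell, hour)
```

In this sketch, recent data would be kept in compressed form for full-resolution exploration, while older data would be replaced by its aggregates, which is what bounds storage over long time windows.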
Our tutorial aims to provide extensive coverage of TBD research, which falls under the following categories: (i) background on TBD and respective architectures; (ii) real-time analytics and detection; (iii) experience, behavior and retention analytics; (iv) privacy; and (v) storage. There is also traditional telco research that is not related to big data but rather comprises topics related to business (BSS) data in relational databases. The given presentation should allow the audience to grasp basic and advanced concepts, ranging from the anatomy of a telco network and the structure of TBD all the way up to applications and benefits of TBD. We will conclude the tutorial with a presentation of the challenges and opportunities in the field.
In particular, this seminar addresses the following audience:
- Graduate and Undergraduate Students
- Mobile Data Management Researchers/Educators
- Industry Developers
Short Biographies
Constantinos Costa is a Visiting Lecturer at the Department of Computer Science at the University of Pittsburgh, PA, USA and a Research Associate at the Advanced Data Management Technologies Laboratory (ADMT). His primary research interests include Spatial Big Data Management, particularly distributed query processing for spatial and spatio-temporal datasets. He holds a Ph.D. in Computer Science (2018) from the University of Cyprus and his thesis was titled ``Algorithms and Indexing Structures for Spatial Big Data''. Besides research, he has distinguished himself in several programming and innovation competitions and is an active contributor to several industrial and open-source systems for telco big data, indoor navigation, and crowd messaging. He has industrial experience in the telco big data sector. For more information please visit: https://www.cs.ucy.ac.cy/~costa.c/.
Demetrios Zeinalipour-Yazti is an Associate Professor of Computer Science at the University of Cyprus, where he leads the Data Management Systems Laboratory (DMSL). His primary research interests include Data Management in Computer Systems and Networks, particularly Mobile and Sensor Data Management; Big Data Management in Parallel and Distributed Architectures; Spatio-Temporal Data Management; Network and Telco Data Management; Crowd, Web 2.0 and Indoor Data Management; Data Privacy Management. He holds a Ph.D. in Computer Science from the University of California - Riverside (2005). Before his current appointment, he served the University of Cyprus as an Assistant Professor and Lecturer, and the Open University of Cyprus as a Lecturer. He has held visiting research appointments at Akamai Technologies, Cambridge, MA, USA, the University of Athens, Greece, the University of Pittsburgh, PA, USA and the Max Planck Institute for Informatics, Saarbrücken, Germany. He is a Humboldt Fellow, Marie-Curie Fellow, an ACM Distinguished Speaker (2017-2020), a Senior Member of ACM, a Senior Member of IEEE and a Member of USENIX. He serves on the editorial board of Distr. and Par. Databases (Elsevier), Big Data Research (Springer) and is an independent evaluator for the European Commission (Marie Skłodowska-Curie and COST actions). He has an h-index of 24 and over 2800 citations, has an Erdős number of 3, has won 10 international awards (ACMD17, ACMS16, IEEES16, HUMBOLDT16, IPSN14, EVARILOS14, APPCAMPUS13, MDM12, MC07, CIC06) and has delivered over 30 invited talks. He has participated in over 20 projects funded by the US National Science Foundation, the European Commission, the Cyprus Research Promotion Foundation, the Univ. of Cyprus, the Open University of Cyprus and the Alexander von Humboldt Foundation, Germany.
Finally, he has also been involved in industrial Research and Development projects (e.g., Finland, Taiwan and Cyprus) and has technically led several mobile data management services (e.g., Anyplace, Rayzit and Smartlab) reaching over 35K users worldwide with over 140K sessions. For more information please visit: https://www.cs.ucy.ac.cy/~dzeina/ or the DMSL website: https://dmsl.cs.ucy.ac.cy/.
Tutorial Material
Location
Contact Details
Constantinos Costa
- Phone: -
- Email: costa.c@cs.pitt.edu
- Address: 5425 Sennott Square, 210 S Bouquet St, Pittsburgh, PA 15213, USA.
- Website: https://www.cs.ucy.ac.cy/~costa.c/
Demetrios Zeinalipour
- Phone: 357-22-892755
- Email: dzeina@cs.ucy.ac.cy
- Address: 1 University Ave., P.O. Box 20537, 1678 Nicosia, Cyprus.
- Website: https://www.cs.ucy.ac.cy/~dzeina/