DATA MANAGEMENT SYSTEMS LABORATORY
The main objective of this graduate-level course is to provide an in-depth understanding of advanced concepts and research directions in the field of databases. The course is organized in three parts: (i) Fundamentals of Database Systems Implementation; (ii) Distributed, Web and Cloud Databases; (iii) Spatio-temporal Data Management, Sensor Data Management and other selected advanced topics from the recent scientific literature.
Outline: (i) Fundamentals of modern Database Management Systems (DBMSs): storage, indexing, query optimization, transaction processing, concurrency and recovery. (ii) Fundamentals of Distributed DBMSs, Web Databases and Cloud Databases (NoSQL / NewSQL): Semi-structured data management (XML/JSON, XPath and XQuery), Document data-stores (e.g., CouchDB, MongoDB, RavenDB), Key-Value data-stores (e.g., BerkeleyDB, Memcached), Introduction to Cloud Computing (GFS, NFS, Hadoop HDFS, Replication/Consistency Principles), 'Big-data' analytics (MapReduce, Apache's Hadoop, Pig), Column-stores (e.g., Google's BigTable, Apache's HBase, Apache's Cassandra), Graph databases (e.g., Twitter’s FlockDB) and Overview of NewSQL (Google's Spanner and Google's F1). (iii) Spatio-temporal data management (trajectories, privacy, analytics) and index structures (e.g., R-Trees, Grid Files) as well as other selected and advanced topics, including: Embedded Databases (SQLite), Sensor / Smartphone / Crowd data management, Energy-aware data management, Flash storage, Stream Data Management, etc. The last part of the course will feature both invited talks from external speakers and student presentations.
The main objective of this undergraduate course is to provide an in-depth understanding of Database Management Systems. In particular, students will be exposed to the internal structures and algorithms of a relational database system. Students will get a deeper understanding by implementing components of the Minibase database system in the C++ language. Minibase is a database management system intended for educational use that includes a parser, optimizer, buffer pool manager, storage mechanisms (heap files, secondary indexes based on B+ Trees), and a disk space management system. The course is organized in four parts: i) Storage and Indexing, ii) Query Optimization, iii) Transaction Management and iv) Advanced Topics (Distributed Databases and XML Data Management).
Outline: Introduction to Storage and Indexing, Storing Data: Disks and Files, Tree-based Indexing and Hash-based Indexing, Overview of Query Evaluation, External Sorting, Evaluating Relational Operators, Structure of a Typical Relational Query Optimizer, Overview of Transaction Management, Introduction to Concurrency Control (2PL, Serializability, Recoverability, Lock Conversions, Deadlocks), Concurrency Control with Locking, Dynamic Databases and the Phantom Problem, CC in B+trees, Multigranular locking, Concurrency Control without Locking (Optimistic, Timestamp, Multiversion), Introduction to Crash Recovery (ARIES, LOG, WAL, Checkpointing), Recovering from a System Crash (Analysis, Redo, Undo), Media Recovery, Distributed Databases (Architectures, Storage, Catalog Management and Query Processing) and XML Data Management (Models, Query Processing and XQuery).
In this course, students will learn to develop complex system-level software in the C programming language while gaining an intimate understanding of the UNIX operating system (and all operating systems that belong to this family, such as Linux, the BSDs, and even Mac OS X) and its programming environment. Topics covered will include the user/kernel interface, fundamental concepts of UNIX, user authentication, basic and advanced I/O, filesystems, signals, process relationships, and interprocess communication. Fundamental concepts of software development and maintenance on UNIX systems will also be covered. Students are expected to have a good working knowledge of the C programming language (EPL132) and of fundamental Operating System Concepts (EPL221).
Outline: Main concepts of System Programming, Introductory and Advanced UNIX commands, System utilities and stream editors (awk, sed), Advanced Shell programming with an emphasis on Bash, Low-Level I/O in C, Files and Filesystem, Processes: Environment, Control and Signals, Interprocess Communication (IPC) with an emphasis on Pipes and Named Pipes (FIFO) in C, XSI IPC (Semaphores, Shared Memory and Message Queues) in C, Network IPC (TCP Sockets) and the client/server model in C, Multithreading in C, Performance evaluation (profiling), Issues in system security and system engineering, Systems Programming in Windows (threads, processes, IPC, sockets and PowerShell programming), Scripting Languages: Perl, PHP, Python, Tcl/Tk.
The main objective of this undergraduate course is to provide an in-depth understanding of concepts related to the design and utilization of a database management system. Students will get a deeper understanding by implementing these concepts in a commercial database management system. The course is organized in four parts: i) Introduction and Conceptual Modeling using the ER Model, ii) Relational Model and Relational Algebra, iii) Structured Query Language III, and iv) Database Design Theory and Methodology.
Outline: Introduction: Databases and Database Users, Database System Concepts and Architecture, Data Modeling Using the Entity-Relationship (ER) Model, The Enhanced Entity-Relationship (EER) Model, The Relational Data Model and Relational Database Constraints, Relational Algebra, Relational Database Design by ER and EER-to-Relational Mapping, SQL-99: Schema Definition, Constraints, Queries, and Views, Introduction to SQL Programming Techniques, Functional Dependencies and Normalization for Relational Databases, Relational Database Design Algorithms and Further Dependencies, Practical Database Design Methodology, Introduction to Data Storage, Indexing, Query Processing, and Physical Design.
The course teaches intermediate and advanced programming concepts, techniques and tools through a language that compiles to machine code. It familiarizes students with advanced programming constructs utilized for handling memory and files. It also covers advanced topics in compilation, debugging, documentation and optimization of software; methodological aspects of developing large-scale system software that addresses complex problems; and basic commands for programmers in the UNIX operating system.
Outline: i) Introduction to C for Programmers: types (x86/x64), loops, selections, expressions, arrays, functions, IO, basic program organization. ii) Advanced C programming constructs: program anatomy and processes, memory and addresses (pointers, pointers and arrays, strings and examples), structures, unions and enumerations. Linear and non-linear programming data structures (dynamic memory allocation, lists, queues, doubly-linked lists, trees, applications and examples). iii) Advanced Compilation Topics and Tools: preprocessor directives, compiling multiple files with makefiles, static (.a) and dynamic (.so) linking of object files (.o), error handling (assert.h), static and dynamic code analysis (valgrind and gprof). iv) Low-level programming (binary operators and examples, binary files and hexdump). v) Basic commands for programmers in the UNIX operating system: file system, redirection and pipes, permissions and basic filters.
The main objective of this undergraduate course is to provide an in-depth understanding of programming principles underlying modern application and systems software. In particular, the course familiarizes the students with advanced programming constructs utilized for handling memory and files, basic software structures and their associated algorithms, low-level programming, and building, debugging, documenting and optimizing large-scale software systems individually and in groups through integrated software environments. The course is taught in the C programming language.
Outline: The course is organized into the following four topics: A) Building Entry-scale Programs: Fundamental constructs of the C programming language: Types (x86, x64), Expressions, Arrays, Functions, IO, basic program organization. Program Anatomy in Memory and Disk, Processes and Addresses. Memory (Pointers, Pointers and Arrays, Strings and Examples), Structures, Unions and Enumerations. Disk (File Streams, Formatted, Character, Block I/O, File positioning, buffering). B) Building Mid-scale Programs: C Preprocessor (Source/Header Files, Building multiple files with Makefiles, linking, macros, etc.). Linear and Non-Linear Programming Data Structures (Dynamic memory allocation, lists, queues, doubly-linked lists, trees, applications and examples). Advanced Programming Constructs and Error Handling: Pointer-to-Pointers, Pointers-to-Functions, Inline Functions and library functions (assert, errno, perror). C) Building Large-scale Programs: Modules, Information Hiding, Design Issues, Extreme Programming and Subversion, Open-source collaborative software development. D) Selected Topics: Low-level Programming, Introduction to Secure Programming and Introduction to C++.
The main objective of this undergraduate course is to provide an in-depth understanding of concepts related to the efficient organization and manipulation of data as well as the design and analysis of algorithms. The course familiarizes the students with data structures and their associated algorithms, techniques for evaluating the complexity of algorithms and also develops skills for efficient algorithm design and implementation.
Outline: Advanced programming techniques based on the programming language C: Recursion, Structures, Pointers, File and Memory management. Data types and abstract data types. Algorithm complexity analysis: worst-case and average-case analysis. Linear data structures: List, Stack and Queue, using static and dynamic memory allocation methods. Applications of linear data structures. Sorting algorithms: SelectionSort, InsertionSort, MergeSort, QuickSort and BucketSort. Tree data structures: Binary Trees, Binary Search Trees, Balanced Trees, B-trees. Priority Queues and Heaps. Graphs: definitions, data structures, topological sorting algorithms, graph traversal algorithms. Hashing techniques, hash functions and collision resolution techniques.