Big Data Management Systems

Course Code: 
Elective Courses

The use of data in making accurate, reliable and timely decisions has become a "sine qua non" of success for most modern businesses and organizations. At the same time, the development of new technologies and applications in recent years (such as the spread of social networks, the extensive use of smartphones, and the installation of sensors) has dramatically changed the volume and format of data: we now deal with petabytes and exabytes of data in text, audio, video, and image formats. The need to manage and exploit this data has led to a new generation of systems, models and programming tools that are still at an early stage of development, such as MapReduce, Hadoop and its ecosystem, and NoSQL. These technologies enable parallel, fault-tolerant data processing on a large scale. The purpose of this course is to present the basic principles of these systems and how they work.

The course contents include:

  • Basic knowledge: query processing, distributed and parallel query processing, distributed systems
  • Programming language: Python
  • MapReduce, Hadoop and its ecosystem
  • NoSQL: key-value systems, learning Redis
  • NoSQL: document-store systems, learning MongoDB
  • Data Flow Management and Applications
  • Interconnectivity in Big Data Management Systems