Welcome to Aspects of Advanced Analytics
Description: Big Data is being collected in ever larger amounts, e.g., from the (geo-social) web, sensors/IoT devices in cyber-physical systems, or scientific experiments. However, getting the full benefit requires going beyond merely storing and querying the data. Advanced analytics (data mining, prediction, forecasting, ...) is applied to the huge data volumes to extract trends and patterns, and to use historical data to predict future events, so-called predictive analytics. Recently, optimization has been added on top, resulting in so-called prescriptive analytics, which prescribes the best course of action given the data, the associated predictions, and the optimization goals.

Traditional data analytics systems do not scale, support only some of these tasks, or do not meet the deep requirements of specific types of data, which hurts developer productivity and limits functionality.

This course will cover selected aspects of advanced analytics including concepts, algorithms, and systems. The course will feature a mix of theoretical concepts and algorithms with practical hands-on exercises using specific advanced analytics systems on a number of realistic case datasets, focusing on the application area of energy analytics.

Prerequisites: A general background in computer science and general familiarity with database management and analytics are expected. Specifically, students are required to have the following knowledge:

  • Mandatory:
    • Linear algebra: linear equations, vector and matrix operations
    • Database systems: relational data model, SQL, DBMS internals/query processing
    • Algorithms and data structures: basic algorithms (e.g., sort, search) and data structures (e.g., list, tree, table), and basic knowledge of designing and analyzing algorithms
    • Programming languages: knowledge of at least one of Python, Java, or C/C++
  • Preferable:
    • Distributed computing: concepts of distributed systems, Cloud Computing, MapReduce and big data tools (e.g., Hadoop, Hive, Pig, Spark)
    • Machine learning: basic algorithms for classification (e.g., regression, SVM, kNN, neural networks) and clustering (e.g., k-means)
    • Optimization: linear/integer programming
    • Time series forecasting: forecasting models for time series, such as ETS and ARIMA models
    • Other scripting languages and tools: R/Matlab

Learning objectives: The objectives of the course are to provide students with a working understanding of concepts, algorithms and systems for selected aspects of advanced analytics, with an application focus on energy analytics.

Organizer: Professor Torben Bach Pedersen, email: tbp@cs.aau.dk

Lecturers: Associate Professor Lukasz Golab, University of Waterloo, and postdocs Thi Thao Nguyen Ho and Laurynas Siksnys, AAU.

ECTS: 2

Time: December 5-7, 2017

Place: Selma Lagerløfsvej 300, room 0.2.90. 

Zip code: 9220

City: Aalborg Ø

Number of seats: 20

Deadline: 20 October 2017

Important information concerning PhD courses: For some time, we have experienced problems with no-shows for both project and general courses. It has now reached a point where we are forced to take action. Therefore, the Doctoral School has decided to introduce a no-show fee of DKK 5,000 for each course where the student does not show up. Cancellations are accepted no later than 2 weeks before the start of the course. Registered illness is, of course, an acceptable reason for not showing up on those days. Furthermore, all courses open for registration approximately three months before the start. This will hopefully also give new students a chance to register for courses during the year. We look forward to your registrations.