Social Data Science: An applied introduction to machine learning 

    PhD School at the Faculty of Social Science at Aalborg University

    Link to registration coming soon. Please do not register through Moodle.

    Developments in computing technology and the growing amount of accessible data present a range of new methodological opportunities for the social sciences and humanities.

    Data from websites, social media and electronic devices (often referred to as ‘Big Data’) allow for new approaches to and perspectives on issues relevant to both the social sciences and the humanities. Meanwhile, increasing computational power and the development of artificial intelligence algorithms provide the means for accessing, combining and analyzing a variety of data types (numerical, textual, relational) in new and meaningful ways.

    This course is a hands-on, practical introduction to applying computer science techniques (such as programming and machine learning) in humanities and social science research. It has no prerequisites. It covers a broad range of techniques and methods representing the latest methodological innovations in social science and humanities applications of machine learning and artificial intelligence. The techniques include:

    • Collecting data from the web using web scraping methods and APIs
    • Processing textual data for quantitative analysis (Natural Language Processing)
    • Working with and visualizing networks (network analysis)
    • Dimensionality reduction and clustering techniques (topic models and k-means clustering)
    • Visualization techniques for text data and networks
    • Building and understanding machine learning classifiers
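
To give a flavour of what "processing textual data for quantitative analysis" can look like in Python, here is a minimal sketch that turns a few sentences into a document-term matrix of word counts. It is an illustration only, not course material: the example sentences and the helper names (`tokenize`, `document_term_counts`, `vocabulary`, `to_matrix`) are assumptions made for this sketch, and it uses only the Python standard library rather than the libraries taught in the course.

```python
import re
from collections import Counter

# Hypothetical example documents (assumed for illustration; not course data).
DOCS = [
    "Social media data opens new perspectives for social science.",
    "Machine learning helps researchers analyze text data.",
]


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_counts(docs):
    """Return one Counter of token frequencies per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def vocabulary(counts):
    """Return the sorted list of all distinct tokens across documents."""
    return sorted(set().union(*counts))


def to_matrix(counts, vocab):
    """Build a document-term matrix: one row per document, one column per term."""
    return [[c[term] for term in vocab] for c in counts]


counts = document_term_counts(DOCS)
vocab = vocabulary(counts)
matrix = to_matrix(counts, vocab)

# Print each document's row of counts next to the shared vocabulary.
print(vocab)
for row in matrix:
    print(row)
```

Each row of `matrix` is a numerical representation of one text, which is the kind of input that clustering methods and classifiers operate on.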

    This course is meant as a hands-on tools course focusing on the practical use of these methods; it will not go into depth on their mathematical and theoretical foundations. Instead, it provides a broad overview of the data science ecosystem and toolbox and enables immediate application.

    Structure / teaching format:
    Each day will consist of a mixture of lectures and exercises using interactive online notebooks, allowing participants to try out and use the various methods as they are being taught.

    Participants are expected to work on a portfolio during the week. Each day includes hours dedicated to portfolio work, with the opportunity to discuss it with the course lecturers. In these sessions, participants apply the presented methods and techniques to various cases.

    The course teaches the methods in Python using Jupyter notebooks on Google Colab. Knowing Python beforehand is not a prerequisite: access to relevant courses will be provided, and the first day of the course gives the necessary introduction.

    Participants are expected to complete assigned introductory e-courses on DataCamp before the course. Access to DataCamp will be provided 4 weeks in advance. Two mandatory online check-in sessions are scheduled to properly prepare participants for the course.

    If you are already familiar with Python programming and the application of statistical tools, you can skip the first day and sign up for the 4-day version, which has a reduced fee and gives fewer credits.

    Please bring your own laptop for the course. Make sure to have a Google Account to use on Google Colab. An account can be created for free at

    Learning objectives:
    The objective of the course is to obtain knowledge of key data science concepts and their relevance in the social sciences and humanities, as well as to gain practical competencies in applying and embedding data science methods in quantitative and qualitative workflows using a variety of data types (numerical, textual, relational).

    Participants are expected to hand in a portfolio assignment no later than 2 weeks after the conclusion of the course. Credits can only be received by handing in the portfolio assignment by December 10th 2021.

    Access to relevant e-course material from DataCamp will be provided 4 weeks in advance. All other materials and notebooks will be provided during the course.

    • Associate Professor Daniel Hain (Aalborg University Business School)
    • Associate Professor Roman Jurowetzki (Aalborg University Business School)
    • Assistant Professor Rolf Lyneborg Lund (Department of Sociology and Social Work, Aalborg University)
    • Associate Professor Anders Kristian Munk (TANT-Lab, Department of Culture and Learning, Aalborg University)
    • Professor Thomas B. Moeslund (Department of Architecture, Design and Media Technology, Aalborg University)
    • + assisting data scientists from CALDISS and CLAAUDIA

    ECTS: 4 (3 for 4-day version)

    Course fee:
    • For AAU PhD students: 455,- DKK (incl. VAT)
    • For non-AAU PhD students (entire course): 4,800,- DKK (excl. VAT)
    • For non-AAU PhD students (4-day version): 3,600,- DKK (excl. VAT)

    November 22nd-26th 2021, 9:00-16:00 + two online check-in sessions (dates TBD)


    Number of seats: 30

    Deadline for registration: November 8th 2021

    Registration will open in June 2021.

    NOTE: This is a PhD course. PhD students will therefore have prior claim to seats over non-PhD students.

    COVID-19 contingencies
    This course will be held with physical attendance (with the exception of the online check-ins). Should precautionary measures for COVID-19 still be in effect at the time of the course, the course will be converted to an entirely digital course.

    Should the course be held digitally, participants will be refunded 455,- DKK corresponding to the price of catering.

    Registered participants will be notified via e-mail if the course is converted to an entirely digital format.

    No-show fee:
    For some time, we have experienced problems with no-shows for both project and general courses. It has now reached a point where we are forced to take action. Therefore, the Doctoral School has decided to introduce a no-show fee of DKK 5,000 for each course where the student does not show up. Cancellations are accepted no later than 2 weeks before the start of the course. Registered illness is, of course, an acceptable reason for not showing up on those days. Furthermore, all courses open for registration approximately three months before they start. This will hopefully also give new students a chance to register for courses during the year. We look forward to your registrations.

    Organizing Committee:

    • Associate Professor Daniel Hain (Aalborg University Business School)
    • Associate Professor Roman Jurowetzki (Aalborg University Business School)
    • Assistant Professor Rolf Lyneborg Lund (Department of Sociology and Social Work, Aalborg University)
    • Professor Thomas B. Moeslund (Department of Architecture, Design and Media Technology, Aalborg University)
    • Kristian Gade Kjelmann (General Manager of CALDISS, Aalborg University)

    Please contact Kristian Gade Kjelmann with any questions regarding this course: