Welcome to Tools for Scientific Software Development and Data Science (2022)

The growth of eScience and data science across research fields means that many researchers spend a significant amount of time at their computers. We therefore need to keep our skill set and toolbox up to date so that we can conduct computer-based research accurately, effectively and in a scientifically justifiable manner.

Who is this course for?
If you do any of the following in your daily work, this course is for you:

  1. Process data on a computer
  2. Adapt code and scripts from colleagues or peers
  3. Write code/scripts used by you, your colleagues or peers

Objectives
In this course you will learn the practical skills and craftsmanship needed to increase your day-to-day research productivity and to produce scientific software that complies with modern research standards. After the course you should be able to:

  1. Apply the widely used command-line shell Bash in your daily work.
  2. Apply the widely used version control system Git in your daily work.
  3. Understand concepts related to computational reproducibility and data management.

Format
Hands-on, interactive three-day event with participatory live coding, demos and presentations. Participants are encouraged to follow along and run the same examples as shown during the course. The workshop includes several short practical exercises of 5-10 minutes each, as well as breaks.

Course structure
Day 1:

    • Introduction: why are we here?
    • Get efficient with the command-line interface (the Bash shell on Linux)
    • Be smart: using automatic testing (with examples in MATLAB, R and Python)

Day 2:

    • Version control with Git
    • What you need for your everyday work
    • Advanced topics (continuous integration, pull requests)

Day 3:

    • Get more out of your code: computational reproducibility
    • Show off your examples with Jupyter notebooks
    • Get more out of your data: FAIR (findability, accessibility, interoperability and reusability)
    • Work in practice: what IT resources are available to me?

We will not teach a specific programming language and will try to keep the presented material as language-independent as possible.

Prerequisites

  • You will need to bring a laptop running Linux, macOS (OS X) or Windows.
  • You know the basics of at least one programming language. You can navigate your computer, locate files etc.
  • Read Wilson et al., “Good enough practices in scientific computing”, and start thinking about the presented ideas and to what extent they can be adapted to your work.
  • We will be using uCloud as a common platform for our experiments throughout the course. Please go to https://cloud.sdu.dk prior to the course and make sure you can log into the platform.
    Further instructions on using uCloud will be given in the course.
  • Should you wish to experiment with the demonstrated tools on your own computer, you will need the following:
    • Please install Git on your system: https://git-scm.com/
    • Please have a working installation of one of the following: Python 3, MATLAB or R.
      If you do not have any of these already, good starting points could be: Anaconda or RStudio.
  • Please make sure to have a GitHub account prior to the course: https://github.com/.

We expect that:

  • You actively participate and work on the examples and exercises.
  • You talk to your neighbors and help each other.
  • You ask for help if both you and your neighbors are stuck.

Course project
The course project will combine several elements from the course. Participants are presented with a default project, or can take on a project based on their existing work if they find this option suitable. The project will require additional work after the three course days.

ECTS: 2
Participants attending at least 80% of the course and submitting an acceptable course project receive credits.

Lecturers
Special consultants Thomas Arildsen, Tobias Lindstrøm Jensen, and a data management specialist.

Dates: 12 and 26 September and 7 October 2022

Location: 9220 Aalborg Øst

12 September: Fredrik Bajers Vej 7G/5-109

26 September: Fredrik Bajers Vej 7C/2-209

7 October: Niels Jernes Vej 14/3-119

Deadline: 17 August 2022

Important information concerning PhD courses: For some time, we have experienced problems with no-shows for both project and general courses. It has now reached a point where we are forced to take action. Therefore, the Doctoral School has decided to introduce a no-show fee of DKK 3,000 for each course where the student does not show up. Cancellations are accepted no later than 2 weeks before the start of the course. Registered illness is of course an acceptable reason for not showing up on those days. Furthermore, all courses open for registration approximately four months before the start of the course. This will hopefully also give new students a chance to register for courses during the year. We look forward to your registrations.