Welcome to Data Science using R


Description:

After the completion of the course, the participants

  • will be able to streamline data import and data handling using the tidyverse
  • can make presentation-ready graphics to visualise their own data using ggplot2
  • are able to fit, interpret and present various statistical models for both numeric and categorical data, using methods such as linear regression (lm, also in penalised form using the LASSO, glmnet), support vector machines (svm), the naive Bayes classifier (naiveBayes), and classification and regression trees (CART, rpart)
  • can reduce the dimensionality of their data by principal components analysis (princomp) for visualisation and modelling
  • can perform cluster analysis using K-means and hierarchical clustering (hclust), the latter used to produce dendrograms and in heatmap graphics
  • have acquired the skills to present their analysis in standalone documents using rmarkdown and knitr
  • have seen shiny applications and dashboards for interactive demonstrations of data and models through the use of responsive graphics and tables

These skills will enable you to efficiently wrangle your data into a desired format for further analysis. This includes the ability to aggregate, summarise and visualise the data at various steps in the data analysis. As R is a scripting language, you are free from the constraints of a typical spreadsheet program like Excel. The scripts also serve as a transparent and reproducible framework for redoing your analysis over and over again, as well as for reusing essential parts in other analyses. The graphics produced by R, and in particular ggplot2, are used professionally by academics, data visualisation communities and data scientists.



Organizer: Torben Tvedebrink, Søren Højsgaard and Mikkel Meyer Andersen

Lecturers: Torben Tvedebrink, Søren Højsgaard and Mikkel Meyer Andersen

ECTS: 4

Time: 20 - 24 August (8:30 - 16:00) and 4 September (8:30 - 15:00)

Place: Skjernvej 4 A, room skj 4/2.115

City: 9220 Aalborg

Number of seats: 40



Deadline: 1 August



Important information concerning PhD courses:

For some time we have experienced problems with no-shows at both project and general courses. It has now reached a point where we are forced to take action. Therefore, the Doctoral School has decided to introduce a no-show fee of DKK 5,000 for each course where the student does not show up. Cancellations are accepted no later than 2 weeks before the start of the course. Registered illness is of course an acceptable reason for not showing up on those days. Furthermore, all courses open for registration approximately three months before the start. This will hopefully also give new students a chance to register for courses during the year. We look forward to your registrations.