Latest news
  • Final, Housing data (.csv). Use read.csv to load the data into R.
  • Extra office hours during exam period. Fri 26/5 13-15, Mon 29/5 15-17, Tue 30/5 11-13, Wed 31/5 13-15, Thur 1/6 13-15, Fri 2/6 13-15.
  • The schedule for the course can be found via the link to webTimeEdit at the top of the page.

  • Student representatives:
  • Teachers
    Course coordinator: Rebecka Jörnsten
    Office: MVH 3029
    Office hours: Thur 8:30-10:00 in MVH 3029

    The Elements of Statistical Learning, Hastie, T., Tibshirani, R., and Friedman, J.

    Weblink to the book. (Right-click to open in new tab or window).

    We will also use journal papers and other materials. These will be posted under "Programme".

    Recommended texts include:

  • "Statistics for High-Dimensional Data: Methods, Theory and Applications", Springer, 2011. P. Bühlmann and S. van de Geer.
  • "Handbook of Big Data", Chapman and Hall/CRC, 2016. P. Bühlmann, P. Drineas, M. Kane, and M. van der Laan, editors.

  • Programme

    Topics, Chapters, Lecture materials

    Introduction + CART + RF

    Chapters 2.1-2.3.
    Chapters 8.7, 9.2, 15

    Lecture 1, puppy image as text, R code. See link to Mini1 at end of Lecture 1 notes.
    Lecture 2, R code


    2.1-2.7, 3.1-3.8, 4.1-4.4, 7.1-7.10, 13.3

    Lecture 3, wine data, R code. See link to Mini2 at end of Lecture 1 notes.
    Lecture 4, R code
    Lecture 5, R code, Caret slides, Caret paper, sparse LDA paper

    High-dimensional modeling

    3.8, 18.2-18.4, 18.6

    Lecture 6, R code, R code - Numbers
    Lecture 7, R code, More R, Even more R
    HDI paper

    Data representations

    14.4-14.9, Journal papers

    Lecture 10, R code
    sparse SVD paper, The Extraordinary SVD
    Mini 3 R code, Mini 3 more R code
    A simulation function
    Cats and Dogs Rdata object for Mini4
    Lecture 11, R code
    Fun paper on NMF, sparse SVD paper for high-dim data


    14.1-14.3 + journal papers

    Lecture 12, R code, More R code
    Lecture 13, R code, R code, Spectral clustering R code
    Here are some journal papers. You don't have to read all the details, but skimming these papers is recommended to get a better idea of the model-based clustering methods we've discussed in class: Model-based clustering, Variable selection
    The HDclassif package, The Highdim class paper, High-dim clustering
    Subspace clustering, Spectral clustering, Consensus clustering
    TCGAdata.RData: TCGA data and class labels (load with load("TCGAdata.RData"))

    Clustering. Big n.

    Lecture 14, R code
    Lecture 15, Bootstrap R code, R code leveraging and BLB
    Statistical methods and computing for big data
    Bag of Little Bootstraps, Leveraging
    Lecture 16, Divide and Conquer paper
    Maximum mean likelihood, Split and Conquer - penalized regression

    Bayesian vs Frequentists. Review
    Lecture 16
    Sullivan and Feinn: P-values and Effect Size, A. Gelman: induction and deduction, A. Gelman: P-values and Statistical Practice
    B. Efron: A 250-year argument
    Raftery et al.: Bayesian Model Averaging, Park and Casella: Bayesian Lasso
    Review lecture, Review demo


    There will be 6 Mini-Analysis projects. You can work in pairs for these, but with a different partner each time. If you prefer to work on your own, this is fine too.
    You have to hand in slides and be prepared to present results in class. Mini-Analyses are compulsory. You have to present at least 2 projects, and I will randomly choose presenters each time. Mondays are Mini-Analysis day.

    Your final grade will be based on the 6 Mini-Analyses (in-class presentations that you can work on in teams of 2-3, with slides handed in electronically) and one final project that you will work on individually. The Minis count for 50% of your final grade and are compulsory.
    Examination procedures
    In the Chalmers Student Portal you can read about when exams are given and what rules apply to exams at Chalmers.
    At the link Schedule you can find when exams are given for courses at the University of Gothenburg.
    At the exam, you must be able to show valid identification.
    Before the exam, it is important that you sign up for the examination. If you study at Chalmers, you do this via the Chalmers Student Portal; if you study at the University of Gothenburg, you sign up via GU's Student Portal.

    You can see your results in Ladok by logging on to the Student portal.

    At the annual examination:
    When practical, a separate review is arranged. The date of the review will be announced here on the course website. Anyone who cannot participate in the review may afterwards retrieve and review their exam at the Mathematical Sciences Student Office during opening hours. Check that you have the right grade and score. Any complaints about the marking must be submitted in writing at the office, where there is a form to fill out.

    At re-examination:
    Exams are reviewed and picked up at the Mathematical Sciences Student Office during opening hours. Any complaints about the marking must be submitted in writing at the office, where there is a form to fill out.
    Old exams