Data Analysis – MoBi Bachelor – WS 2021/2022
Dr. Carl Herrmann carl.herrmann (at) bioquant.uni-heidelberg.de
Dr. Maiwen Caudron-Herger m.caudron (at) dkfz-heidelberg.de
This course introduces the basic techniques of scientific data analysis that every scientist will encounter. We will cover:
data visualization (plots)
descriptive statistics (mean, variance, correlation,…)
data exploration and reduction (PCA, clustering,…)
inferential statistics (p-values, tests,…)
data modelling (linear/logistic regression,…)
Each lecture (Thursday) is followed by a practical session (Friday) using R/RStudio.
Reference
Books:
Intuitive Biostatistics by Harvey Motulsky (Oxford University Press)
Discovering Statistics Using R by A. Field, J. Miles, Z. Field (SAGE Publications)
Websites:
Practical information
Lecture slides
Part 1 (17.10 - ): Intro, plots, descriptive statistics
Part 2 (31.10 - ): Clustering, Principal Component Analysis
Part 3 (7.11 - ): Probability distributions
Part 4 (7.11 - ): Statistical inference
Part 5 (7.11 - ): Hypothesis testing, statistical tests
Part 6 (5.12 - ): Power of a test (multiple testing)
Part 7 (12.12 - ): Regression analysis
Link to practical parts
Exercises
This link leads to a bundle of exercise sheets from previous years, together with solutions for most of them. Working through them will help you prepare for the exam.
Demos
Some demos were shown during the lectures. The corresponding Markdown documents, with the code and its output, are available at the following links:
Central Limit Theorem
[click here]
Confidence Intervals
[click here]
Shiny applets
Shiny makes it possible to build interactive R applications that are operated through a graphical user interface. No programming skills are needed to use them!
For some lecture topics, we provide Shiny applets that illustrate key concepts and (hopefully) make them easier to understand:
Topic 1 : QQ Plots
Topic 2 : Statistical inference
Topic 3 : Distributions
Topic 4 : Confidence intervals
Topic 5 : Power of a test