This website concerns the “R for Statistics” PhD course at the University of the Faroe Islands. Read more about the course here: https://www.setur.fo/fo/utbugving/stakgreinalestur/depilin-fyri-heilsu-og-almannagransking/r-for-statistics-basic-course-in-the-statistical-program-r-and-rstudio/
In this course, we learn how to use R and RStudio for statistics. We do not teach statistics in this course, and assume that participants are at least somewhat familiar with statistics. The course is very practical, and is based on short case studies where we go through four basic steps of a statistical analysis: loading, cleaning, plotting, and modelling data.
Attendees of this course should read this page carefully, in particular the “Preparation” section, including the “Reading material” and “Installing R/RStudio” sections.
The course will be very hands-on, focused on doing actual statistical analyses in R, but we want you to understand some basic R concepts before the course starts. To that end, you should read parts 1 and 2, that is, chapters 1 through 8, of “Hands-on Programming with R”. It is important that you read the material; if you don’t, you may find it difficult, or impossible, to follow the course.
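As a rough illustration of these four steps, here is a minimal sketch using only base R. The built-in `mtcars` data set and the temporary CSV file are assumptions made for this example; the case studies use their own data and may use other packages.

```r
# 1. Load: write the built-in data to a temporary CSV file and read it back
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = FALSE)
cars <- read.csv(csv_path)

# 2. Clean: keep two columns and drop rows with missing values
cars <- na.omit(cars[, c("wt", "mpg")])

# 3. Plot: fuel efficiency against weight
plot(cars$wt, cars$mpg, xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")

# 4. Model: simple linear regression of mpg on weight
fit <- lm(mpg ~ wt, data = cars)
summary(fit)
```

Each step produces an object (`cars`, `fit`) that the next step builds on, which is the pattern the case studies follow.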
“Hands-On Programming with R: Write Your Own Functions and Simulations” by Garrett Grolemund
We want you to understand the concepts laid out in these chapters to the best of your abilities. In general, we want you to understand:
We recommend that you have R open while you read, and try some of the commands to improve your understanding of the material in the book. The book has short exercises, but don’t worry if you don’t know how to solve all of these.
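For example, the kind of commands worth typing along with the reading might look like the sketch below. It is written in the style of the dice project in the first part of the book, but the exact names used here are illustrative and may differ from the book's.

```r
# Create a vector representing the six faces of a die
die <- 1:6

# A small function that rolls two dice and returns their sum
roll <- function() {
  dice <- sample(die, size = 2, replace = TRUE)
  sum(dice)
}

# Call the function a few times; the result changes from call to call
roll()
roll()
```

Try changing `size` or the contents of `die` and see how the results change.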
To help you get the most out of your reading, here’s what you should focus on when reading the chapters:
You need to bring a laptop to the course, and before you arrive you need to install R and RStudio. Download and install the free desktop version of RStudio. R and RStudio are available for Windows, Mac and Linux.
Download and install the newest version of R:
Download and install RStudio Desktop with the Open Source License:
If you get stuck installing R and/or RStudio, a quick Google search will give some helpful resources, including:
After installing RStudio, try installing the packages below. Hopefully, the packages will be installed on your system without problems.
```r
install.packages('tidyverse')
install.packages('magrittr')
install.packages('devtools')
```
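One way to check that the installation worked is sketched below. It assumes the three package names listed above; the variable names `pkgs`, `installed` and `missing_pkgs` are made up for this example.

```r
# Check which of the three packages can be found on this system
pkgs <- c("tidyverse", "magrittr", "devtools")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Names of packages that still need installing (empty if all went well)
missing_pkgs <- pkgs[!installed]
print(missing_pkgs)
```

If `missing_pkgs` is not empty, try running `install.packages()` again for those packages and read any error messages carefully.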
An overview of the course program is in the table below.
Time | Tuesday | Wednesday | Friday |
---|---|---|---|
8:30-11:30 | Introduction and R/RStudio | Case study 2 | Case study 4 |
12:00-12:30 | Lunch | Lunch | Lunch |
12:30-16:00 | Case study 1 | Case study 3 | Workshop and exam |
Every case study session will include a small hands-on exercise session, a summary, and a coffee break.
First and foremost, our advice to you when the course is over is to start analysing data, writing scripts and notebooks, and making projects in R. Statistics and programming are things you primarily learn by experience. Here follow a few book ideas you can try if you want to read more.
You can of course continue reading Hands-on Programming with R, as part 3 contains some more advanced material covering, among other things, more programmatic aspects of R.
“R for Data Science” covers topics similar to this course, including plotting with ggplot2 and data manipulation with dplyr.
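To give a flavour of these two packages, here is a minimal sketch. It assumes that the tidyverse is installed and uses the built-in `mtcars` data set, which is not part of the course material.

```r
library(dplyr)
library(ggplot2)

# Data manipulation with dplyr: mean fuel efficiency per cylinder count
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n = n())

# Plotting with ggplot2: one bar per cylinder group
p <- ggplot(by_cyl, aes(x = factor(cyl), y = mean_mpg)) +
  geom_col() +
  labs(x = "Cylinders", y = "Mean miles per gallon")
print(p)
```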
“R for Data Science” by Garrett Grolemund and Hadley Wickham
“Data Analysis for the Life Sciences” covers statistics using R, and is very practical and clear. It is more geared toward people who know R and are trying to learn statistics, rather than people trying to learn R, but is nonetheless very useful.
“Data Analysis for the Life Sciences” by Rafael A. Irizarry and Michael I. Love
The Tidyverse site (https://www.tidyverse.org/) is a hub for many of the packages we use in this course such as ggplot2 (https://ggplot2.tidyverse.org/) and readr (https://readr.tidyverse.org/). These sites have really good examples for how to load and plot data and many other things.
Cheatsheets for many common tasks can be found on the RStudio website (https://www.rstudio.com/resources/cheatsheets/), including cheatsheets for many of the packages we use in this course.
Of course, Google is our best friend for finding help. When we search for R questions in Google, it will often lead us to StackOverflow (https://stackoverflow.com/) where someone has had the same problem as us, and is asking for solutions.
Below is an overview of all the case study notebooks.
Number | Description | Notebook | Data |
---|---|---|---|
1.1 | Reading data, plotting, and t test | Notes | Data |
1.2 | Linear regression and more plotting | Notes | Data |
2.1 | Cleaning up messy data | Notes | Data (raw), Data (cleaned) |
3.1 | Logistic regression | Notes | Clean data from case 2.1 |
3.2 | ANOVA | Notes | Data |
3.3 | Genomics with Bioconductor | Notes | Data available at https://www.ebi.ac.uk/gwas/ |
4.1 | Survival analysis | Notes | R built-in |
4.2 | Working with map data | Notes | Web API |
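As a rough illustration of two of the topics listed above, the t test (1.1) and ANOVA (3.2), here is a minimal sketch using base R. The built-in `sleep` and `PlantGrowth` data sets are stand-ins chosen for this example; they are not the data used in the case study notebooks.

```r
# t test: compare the extra hours of sleep between the two groups
# in the built-in `sleep` data (Welch two-sample t test by default)
tt <- t.test(extra ~ group, data = sleep)
print(tt)

# One-way ANOVA: compare plant weight across the three groups
# in the built-in `PlantGrowth` data
fit <- aov(weight ~ group, data = PlantGrowth)
anova_table <- summary(fit)
print(anova_table)
```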