...
- Download the D2K simulated ALSPAC dataset and store it somewhere appropriate.
- Read the data dictionary for this simulated dataset to familiarise yourself with the variables.
...
```r
subset.4 <- subset(dataframe, x < 5)   # subset of the whole dataframe where x < 5
subset.5 <- subset(dataframe, x == 5)  # subset of the whole dataframe where x equals 5
```
- Create a subset of `sim.alspac` for males called `subset.male` and one for females called `subset.female`.
- How many participants are female and how many are male? HINT: Use `dim` to check the dimensions of `subset.male` and `subset.female`.
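The two steps above could be sketched as follows. This is only an illustration: a tiny made-up data frame stands in for `sim.alspac`, and the column name `sex` and its labels are assumptions, so check the data dictionary for the real variable name and coding before adapting it.

```r
# Toy stand-in for sim.alspac. The column name `sex` and its labels are
# hypothetical; the real names are listed in the data dictionary.
toy.alspac <- data.frame(
  id  = 1:6,
  sex = c("male", "female", "female", "male", "female", "female")
)

# Keep only the rows that satisfy each condition
subset.male   <- subset(toy.alspac, sex == "male")
subset.female <- subset(toy.alspac, sex == "female")

# dim() returns the number of rows followed by the number of columns
dim(subset.male)    # 2 rows, 2 columns
dim(subset.female)  # 4 rows, 2 columns
```

The first element of each `dim` result is the number of participants in that subset; `nrow` would return the same count on its own.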
Exploring the data
- Get summary statistics for each object by using the `summary` function on `subset.male` and `subset.female`.
- Use the `boxplot` function to plot BMI at age 7 against gender. HINT: You will only need to use the arguments `formula=` and `data=`.
- Output your boxplot as a .png file using the `png` function.
- Use the `hist` function to plot histograms of BMI at age 7 for females and males. HINT: You can layer graphs over one another by using the argument `add=T` in the second histogram. The line colour of a histogram can be set using the `border` argument, e.g. `border="red"`.
- Make the plot more readable by using the `legend` function to add an appropriate key.
- Output your histogram as a .png file using the `png` function.
- Use the `plot` function to create a scatter plot of height and weight at age 7 for males.
- Use the `lm` function to generate a linear model called `lm1` for the two variables. HINT: R uses formula notation in the formula argument, e.g. `formula=y~x`.
- Use the `summary` function on `lm1` to get the coefficients. You can add your regression line to the scatterplot by running the `abline` function on `lm1` after your `plot` function.
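The exploration steps above might look something like the sketch below. It runs on simulated stand-in data rather than `sim.alspac`: the column names `sex`, `height_7`, `weight_7` and `bmi_7`, the simulated values, and the output file names are all assumptions made for illustration. Which variable goes on which axis of the scatter plot is also a choice made here, not something the exercise specifies.

```r
# Simulated stand-in for sim.alspac. Every column name and value here is
# hypothetical; use the names given in the data dictionary instead.
set.seed(1)
n <- 200
toy <- data.frame(
  sex      = rep(c("male", "female"), each = n / 2),
  height_7 = rnorm(n, mean = 125, sd = 5)            # height in cm
)
toy$weight_7 <- 0.4 * toy$height_7 - 25 + rnorm(n, sd = 2)  # weight in kg
toy$bmi_7    <- toy$weight_7 / (toy$height_7 / 100)^2

subset.male   <- subset(toy, sex == "male")
subset.female <- subset(toy, sex == "female")

# Summary statistics for each subset
summary(subset.male)
summary(subset.female)

# Boxplot of BMI by sex, written to a .png file.
# png() opens the file device and dev.off() closes it and writes the file.
box.file <- file.path(tempdir(), "bmi_boxplot.png")
png(box.file)
boxplot(bmi_7 ~ sex, data = toy, ylab = "BMI at age 7")
dev.off()

# Overlaid histograms with a legend, written to a second .png file
hist.file <- file.path(tempdir(), "bmi_hist.png")
png(hist.file)
hist(subset.female$bmi_7, border = "red", main = "BMI at age 7", xlab = "BMI")
hist(subset.male$bmi_7, border = "blue", add = TRUE)
legend("topright", legend = c("Female", "Male"),
       col = c("red", "blue"), lty = 1)
dev.off()

# Linear model of weight on height for males
lm1 <- lm(formula = weight_7 ~ height_7, data = subset.male)
summary(lm1)

# Scatter plot with the fitted regression line added by abline()
scatter.file <- file.path(tempdir(), "height_weight_scatter.png")
png(scatter.file)
plot(subset.male$height_7, subset.male$weight_7,
     xlab = "Height at age 7 (cm)", ylab = "Weight at age 7 (kg)")
abline(lm1)
dev.off()
```

In an interactive session the `png` and `dev.off` calls can be left out to draw the plots on screen first; the files here go to `tempdir()` only so the sketch does not write into the working directory.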
...