The names of all other variables end in either .7 or .11, depending on whether they were measured at the age 7 clinic or the age 11 clinic.
male codes sex: 1 = male, 0 = female
age.yrs is the age (in decimal years) on the day of the clinic at age 7 or 11
ht is height in cm
ht.sit is sitting height in cm
ws is waist circumference in cm
hp is hip circumference in cm
wt is weight in kg
sbp is systolic blood pressure (the top of the blood pressure fluctuation), measured (as is conventional) in mm of Hg (mercury)
dbp is diastolic blood pressure (the bottom of the blood pressure fluctuation), measured (as is conventional) in mm of Hg (mercury)
pulse is pulse rate, measured in beats per minute
BMI is body mass index, derived as wt/(ht/100)^2. The height variable is divided by 100 to express it in metres rather than centimetres.
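As a quick check of the BMI formula, the following minimal sketch computes BMI for one child. The values of wt and ht are made up for illustration and are not taken from the dataset.

```r
# Hypothetical measurements for one child (illustrative values only)
wt <- 25.0   # weight in kg
ht <- 125.0  # height in cm

# Convert height from cm to metres, then divide weight by height squared
bmi <- wt / (ht / 100)^2
print(bmi)   # 16
```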
Set your working directory using setwd and read the dataset into R, assigning it to the variable sim.alspac using the read.csv function.
Look up the colnames function in the help file and apply it to sim.alspac to list all the column headings in the data.
Look up the dim function in the help file and apply it to sim.alspac to get the dimensions of the dataset. The number of columns is the number of variables and the number of rows is the number of participants.

Selecting variables can be done in a number of ways, including selection by column number or by column name. It is best practice to use the column name, as the column number may vary between datasets.
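The read.csv, colnames and dim steps can be tried on a small stand-in file before using the real data. The sketch below is only an illustration: the data frame toy, its values and the temporary file path are assumptions, not part of the worksheet's dataset.

```r
# Build a tiny stand-in dataset (made-up values) and write it to a temporary CSV
toy <- data.frame(male = c(1, 0, 1),
                  ht.7 = c(124.1, 121.5, 127.3),
                  wt.7 = c(24.2, 22.8, 26.0))
csv.path <- tempfile(fileext = ".csv")
write.csv(toy, csv.path, row.names = FALSE)

# Read the file back in, as you would with the real dataset
toy.alspac <- read.csv(csv.path)

colnames(toy.alspac)  # column headings: "male" "ht.7" "wt.7"
dim(toy.alspac)       # number of rows, then number of columns: 3 3
```

With the real file, csv.path would be replaced by the file name in your working directory.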
select.1 <- dataframe[, x]    # assign to select.1 column number x of dataframe
select.2 <- dataframe[, "x"]  # assign to select.2 the column named x in dataframe
select.3 <- dataframe$x       # assign to select.3 the column x of dataframe
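To see that the three forms of selection return the same values, here is a small sketch on a made-up data frame. The name toy and its contents are assumptions for illustration only.

```r
# Made-up data frame with two columns
toy <- data.frame(male = c(1, 0, 1),
                  ht.7 = c(124.1, 121.5, 127.3))

by.number <- toy[, 2]       # second column, selected by position
by.name   <- toy[, "ht.7"]  # same column, selected by name
by.dollar <- toy$ht.7       # same column, using the $ operator

identical(by.number, by.name)   # TRUE
identical(by.name, by.dollar)   # TRUE
```

Selection by position only matches here because ht.7 happens to be the second column; if the column order changed, toy[, 2] would return something else while the name-based forms would not.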
It is also possible to use operators to select rows within a range of values. See the help file for the subset function for further explanation.
subset.4 <- subset(dataframe, x < 5)   # subset of the whole dataframe where x < 5
subset.5 <- subset(dataframe, x == 5)  # subset of the whole dataframe where x equals 5
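The following sketch applies subset to a made-up data frame, including a condition that keeps rows between two values by combining comparisons with &. The data frame toy and its values are assumptions for illustration.

```r
# Made-up data frame with an id column and a numeric column x
toy <- data.frame(id = 1:6, x = c(2, 5, 7, 5, 1, 9))

below.5 <- subset(toy, x < 5)             # rows where x is less than 5
equal.5 <- subset(toy, x == 5)            # rows where x equals 5
between <- subset(toy, x >= 5 & x <= 7)   # rows where x lies between 5 and 7 inclusive

nrow(below.5)  # 2
nrow(equal.5)  # 2
nrow(between)  # 3
```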
Create two subsets of sim.alspac: one for males called subset.male and one for females called subset.female. Use dim to check the dimensions of subset.male and subset.female.
male is categorical. Check the class of male using the class function.

Rounding numbers
signif(x, digits = 6)  # set how many significant figures using digits =
or use
format(round(x, 2), nsmall = 2)  # for two d.p.
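A short sketch of how these rounding calls behave, using an arbitrary example value of x. Note that format returns a character string rather than a number, which is why it can keep trailing zeros.

```r
x <- 3.14159265  # arbitrary example value

sig3    <- signif(x, digits = 3)             # 3.14 (three significant figures)
dp2     <- round(x, 2)                       # 3.14 (two decimal places)
dp2.chr <- format(round(x, 2), nsmall = 2)   # "3.14" as text

# round() alone drops trailing zeros when printed; format() keeps them
padded <- format(round(2.5, 2), nsmall = 2)  # "2.50"
```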
text(70, 12, labels = paste("y =", RegM11$coefficients[2], "x +", RegM11$coefficients[1]), col = "orange")  # coefficients[2] is the slope, coefficients[1] the intercept
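Putting rounding and the plot label together, here is a minimal sketch that fits a regression to simulated data and writes a rounded equation on the plot. It assumes that RegM11 in the worksheet is a model fitted with lm; the data, the object name fit and the coordinates are all illustrative assumptions.

```r
# Simulated data standing in for the worksheet's variables (assumption)
set.seed(1)
x <- seq(60, 100, length.out = 30)
y <- 2 + 0.15 * x + rnorm(30, sd = 0.5)

# Fit a simple linear regression of y on x
fit <- lm(y ~ x)

# Round the coefficients so the label is readable
intercept <- round(coef(fit)[1], 2)
slope     <- round(coef(fit)[2], 2)
eq.label  <- paste0("y = ", intercept, " + ", slope, "x")

# Scatter plot with the fitted line and the equation as a text label
plot(x, y)
abline(fit, col = "orange")
text(70, 12, labels = eq.label, col = "orange")
```

The label position (70, 12) has to fall inside the plotted range, so it will usually need adjusting for other data.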