Changing variable class
Occasionally, the class of a variable as defined in the data dictionary is not the class required by a DataSHIELD function. A function may return an error when its input variable is of the wrong class.
In the example below, the variable named cens, a censoring indicator for a survival model (1 = died, 0 = censored), has been read into R. If you declare these two values as the names of the variable's two levels in the categories tab of the data dictionary, the variable is automatically read in as a factor.
However, if you want to use this variable as the outcome in a generalized linear model to analyse survival (e.g. a piecewise exponential regression model), it cannot be a factor; it must be numeric. Variables can be coerced into the class required for analysis. The following example illustrates this by coercing a factor variable into a numeric variable.
#What is the class of the variable cens after it has first been imported into the dataframe called EM?
> ds.class("EM$cens")
$study1
[1] "factor"
$study2
[1] "factor"
#Create a new variable called EVENT that is of class numeric
> ds.asNumeric("EM$cens","EVENT")
#Check that EVENT is of class numeric in both studies
> ds.class("EVENT")
$study1
[1] "numeric"
$study2
[1] "numeric"
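After coercing a factor, it is worth checking the values as well as the class. The sketch below runs in plain local R, with no DataSHIELD server involved, on a made-up five-value vector standing in for cens. It shows that base R's as.numeric() returns a factor's internal level codes (1 and 2) rather than its labels (0 and 1) unless the factor is first converted to character. Whether ds.asNumeric returns codes or labels may depend on the DataSHIELD version installed, so treat this as an illustration of the pitfall rather than a description of what DataSHIELD does.

```r
# Local base-R illustration only; the values below are invented for the example.
cens <- factor(c(1, 0, 0, 1, 1), levels = c(0, 1))
class(cens)                               # "factor"

# Direct coercion returns the internal level codes, not the labels.
codes <- as.numeric(cens)                 # 2 1 1 2 2

# Converting to character first recovers the original 0/1 values.
event <- as.numeric(as.character(cens))   # 1 0 0 1 1
class(event)                              # "numeric"
```

If a coerced event indicator contains the values 1 and 2 instead of 0 and 1, the level codes have been returned, and the variable needs recoding before it is used as a model outcome.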
DataSHIELD Wiki by DataSHIELD is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Based on a work at http://www.datashield.ac.uk/wiki
