This tutorial assumes you have already installed the DataSHIELD training environment (installation takes around half an hour) or that you have a login for a DataSHIELD cloud training server. We recommend that you first familiarise yourself with R by working through our Session 1: Introduction to R tutorial.
Click the Start arrow (or double-click the Opal test server name). You can check whether the Opal test servers are ready by typing the following into your navigation bar
The opal package is used to log in and log out; dsBaseClient, dsStatsClient, dsGraphicsClient and dsModellingClient contain all DataSHIELD functions referred to in this tutorial. Type the library function into the command line as given in the example below:
#update DataSHIELD packages
update.packages(repos='http://cran.obiba.org')
#load libraries
library(opal)
library(dsBaseClient)
library(dsStatsClient)
library(dsGraphicsClient)
library(dsModellingClient)
The output in R/RStudio will look as follows:
library(opal)
#Loading required package: RCurl
#Loading required package: bitops
#Loading required package: rjson
library(dsBaseClient)
#Loading required package: fields
#Loading required package: spam
#Loading required package: grid
library(dsStatsClient)
library(dsGraphicsClient)
library(dsModellingClient)
You might see the following status message, which you can ignore. The message refers to the blocking of functions within the package.
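If a library call fails, it can help to see which of the client packages are actually installed. The following is a minimal sketch in base R, not part of the DataSHIELD tooling; the variable names `required`, `installed` and `missing_pkgs` are chosen here for illustration only.

```r
# Packages named in this tutorial
required <- c("opal", "dsBaseClient", "dsStatsClient",
              "dsGraphicsClient", "dsModellingClient")

# requireNamespace() returns FALSE instead of stopping when a package is absent
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

# Report any package that still needs installing
missing_pkgs <- required[!installed]
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All client packages are installed.")
}
```

Any package reported as not installed needs to be installed before the corresponding library call will work.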
A Horizontal-DataSHIELD process starts with a login to one or more Opal servers that hold the data behind the data provider's firewall. The login details must be in a specific format to log into the Opal servers:
A login template, logindata, is built into the DataSHIELD test environment. Load it with the data function. Calling logindata allows users to view the login data on the screen:
data(logindata)
logindata
# server  url                  user           password  table
# study1  192.168.56.100:8080  administrator  password  CNSIM.CNSIM
# study2  192.168.56.101:8080  administrator  password  CNSIM.CNSIM
You can create a login table to use with /wiki/spaces/DSDEV/pages/12943489 or live research data.
The login details for live research data will have the same format as the login template except:
If you are not using your own data, information for the login table is obtained from the data provider. Please follow the appropriate procedures to gain clearance to analyse their data.
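A login table with the same five columns as the logindata template (server, url, user, password, table) can be sketched in base R as below. Every value is a placeholder invented for illustration, including the server names, URLs and table name; the real values come from your data provider. The object name `my_logindata` is also an assumption.

```r
# Hypothetical login table: all values below are placeholders, not real servers
my_logindata <- data.frame(
  server   = c("studyA", "studyB"),
  url      = c("https://opal-a.example.org", "https://opal-b.example.org"),
  user     = c("your_username", "your_username"),
  password = c("your_password", "your_password"),
  table    = c("PROJECT.TABLE", "PROJECT.TABLE"),
  stringsAsFactors = FALSE
)

# Confirm the column names match those shown for the logindata template
expected_cols <- c("server", "url", "user", "password", "table")
stopifnot(identical(names(my_logindata), expected_cols))

my_logindata
```

Keeping one row per study mirrors the layout of the template, where study1 and study2 each occupy a row.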
Your login details must be loaded via the |
The command below assigns to opals the result of calling the datashield.login function, which logs into the desired Opal servers. In the DataSHIELD test environment, logindata is our login template for the test Opal servers.
opals <- datashield.login(logins=logindata,assign=TRUE)
The output shows that study1 and study2 contain the same 11 variables, listed in capital letters under Variables assigned:
> opals <- datashield.login(logins=logindata,assign=TRUE)
Logging into the collaborating servers
No variables have been specified.
All the variables in the opal table
(the whole dataset) will be assigned to R!
Assigining data:
study1...
study2...
Variables assigned:
study1--LAB_TSC, LAB_TRIG, LAB_HDL, LAB_GLUC_ADJUSTED, PM_BMI_CONTINUOUS, DIS_CVA, MEDI_LPD, DIS_DIAB, DIS_AMI, GENDER, PM_BMI_CATEGORICAL
study2--LAB_TSC, LAB_TRIG, LAB_HDL, LAB_GLUC_ADJUSTED, PM_BMI_CONTINUOUS, DIS_CVA, MEDI_LPD, DIS_DIAB, DIS_AMI, GENDER, PM_BMI_CATEGORICAL
In Horizontal DataSHIELD pooled analysis, the data are harmonized and the variables are given the same names across the studies, as agreed by all data providers.
Users can specify individual variables to assign to the server-side R session. It is best practice to first create a list of the Opal variables you want to analyse.
The example below creates a list called myvar that lists the Opal variables required for analysis: LAB_HDL and GENDER. The variables argument in the function datashield.login uses myvar, so only the variables in this list are assigned.
myvar <- list('LAB_HDL', 'GENDER')
opals <- datashield.login(logins=logindata,assign=TRUE,variables=myvar)
#Logging into the collaborating servers
#Assigining data:
#study1...
#study2...
#Variables assigned:
#study1--LAB_HDL, GENDER
#study2--LAB_HDL, GENDER
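Because variable names must match the Opal table exactly, it can be worth checking a variable list for typos before logging in. The sketch below uses base R only and compares myvar against the 11 variable names reported for the test servers in this tutorial; the names `available` and `unknown` are illustrative, and for other data the list of available variables would have to come from the data provider.

```r
# Variable names reported for study1 and study2 in the test environment
available <- c("LAB_TSC", "LAB_TRIG", "LAB_HDL", "LAB_GLUC_ADJUSTED",
               "PM_BMI_CONTINUOUS", "DIS_CVA", "MEDI_LPD", "DIS_DIAB",
               "DIS_AMI", "GENDER", "PM_BMI_CATEGORICAL")

# Variables we intend to request at login
myvar <- list("LAB_HDL", "GENDER")

# Any requested name that is not in the table is probably a typo
unknown <- setdiff(unlist(myvar), available)
if (length(unknown) > 0) {
  stop("Unknown variables: ", paste(unknown, collapse = ", "))
}
```

The check runs entirely on the client side and does not contact the Opal servers.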
Assigned data are kept in a data frame (table) named D by default. Use the symbol argument in the datashield.login function to change the name of the data frame from D to mytable.
myvar <- list('LAB_HDL', 'GENDER')
opals <- datashield.login(logins=logindata,assign=TRUE,variables=myvar, symbol='mytable')
#Logging into the collaborating servers
#Assigining data:
#study1...
#study2...
#Variables assigned:
#study1--LAB_HDL, GENDER
#study2--LAB_HDL, GENDER
Only DataSHIELD developers will need to change the default value of the last argument, |
====================
If you have not installed the DataSHIELD test environment, you can log in to our cloud Opal test servers using the alternative details below. This requires a good internet connection. Please note that this service is not reliable and will be discontinued soon.
The login template for the cloud Opal test servers can be called using: