Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Tip
titleHelp

DataSHIELD support is freely available in the DataSHIELD forum by the DataSHIELD community. Please use this as the first port of call for any problems you may be having, it is monitored closely for new threads.

DataSHIELD bespoke user support and also user training classes are offered on a fee-paying basis. Please enquire at datashield@newcastle.ac.uk for current prices. 

Introduction

This is the second in a 6-part DataSHIELD tutorial series.

The other parts in this DataSHIELD tutorial series are:

Quick reminder for logging in:

...


Start R/RStudio

Load Packages

Code Block
xml
xml
#load libraries
library(DSI)
library(DSOpal)
library(dsBaseClient)

Build your login dataframe 

Code Block
languagexml
titleBuild your login dataframe
builder <- DSI::newDSLoginBuilder()
builder$append(server = "study1",  url = "http://192.168.56.100:8080/",
               user = "administrator", password = "datashield_test&",
               table = "CNSIM.CNSIM1", driver = "OpalDriver")
builder$append(server = "study2", url = "http://192.168.56.101:8080/",
               user = "administrator", password = "datashield_test&",
               table = "CNSIM.CNSIM2", driver = "OpalDriver")

logindata <- builder$build()

connections <- DSI::datashield.login(logins = logindata, assign = TRUE, symbol = "D")

...

Code Block
languagebash
DSI::datashield.logout(connections)


Basic statistics and data manipulations

Descriptive statistics: variable dimensions and class

It is possible to get some descriptive or exploratory statistics about the assigned variables held in the server-side R session such as number of participants at each data provider, number of participants across all data providers and number of variables. Identifying parameters of the data will facilitate your analysis.

...

Code Block
themeRDark
  Aggregated (exists("D")) [=============================================================] 100% / 0s
  Aggregated (classDS("D$LAB_HDL")) [====================================================] 100% / 1s
$study1
[1] "numeric"

$study2
[1] "numeric"

Descriptive statistics: quantiles and mean

As LAB_HDL is a numeric variable the distribution of the data can be explored.

...

Code Block
themeRDark
  Aggregated (meanDS(D$LAB_HDL)) [=======================================================] 100% / 0s
$Mean.by.Study
       EstimatedMean Nmissing Nvalid Ntotal
study1      1.569416      360   1803   2163
study2      1.556648      555   2533   3088

$Nstudies
[1] 2

$ValidityMessage
       ValidityMessage 
study1 "VALID ANALYSIS"
study2 "VALID ANALYSIS"

Conclusion

The other parts in this DataSHIELD tutorial series are:

...