...
Tip
DataSHIELD support is freely available from the DataSHIELD community in the DataSHIELD forum. Please use the forum as the first port of call for any problems you may be having; it is monitored closely for new threads. Bespoke DataSHIELD user support and user training classes are also offered on a fee-paying basis. Please enquire at datashield@newcastle.ac.uk for current prices.
Introduction
This is the second in a 6-part DataSHIELD tutorial series.
The other parts in this DataSHIELD tutorial series are:
5: Subsetting
6: Modelling
Quick reminder for logging in:
- Follow instructions to Start the Opal VMs.
...
Start R/RStudio
Load Packages
    # load libraries
    library(DSI)
    library(DSOpal)
    library(dsBaseClient)
Build your login dataframe
    # describe one connection per study server
    builder <- DSI::newDSLoginBuilder()
    builder$append(server = "study1",
                   url = "http://192.168.56.100:8080/",
                   user = "administrator", password = "datashield_test&",
                   table = "CNSIM.CNSIM1", driver = "OpalDriver")
    builder$append(server = "study2",
                   url = "http://192.168.56.101:8080/",
                   user = "administrator", password = "datashield_test&",
                   table = "CNSIM.CNSIM2", driver = "OpalDriver")
    logindata <- builder$build()

    # log in and assign each table to the server-side symbol "D"
    connections <- DSI::datashield.login(logins = logindata, assign = TRUE, symbol = "D")
...
    DSI::datashield.logout(connections)
Basic statistics and data manipulations
Descriptive statistics: variable dimensions and class
You can obtain descriptive or exploratory statistics about the assigned variables held in the server-side R session, such as the number of participants at each data provider, the number of participants across all data providers, and the number of variables. Knowing these basic parameters of the data will make the subsequent analysis easier.
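As a minimal sketch of how these checks might look, the helper below wraps three dsBaseClient calls (`ds.dim`, `ds.colnames` and `ds.class`). The helper name `describe_assigned` and its arguments are hypothetical and not part of DataSHIELD; it assumes you are logged in, that the connections object is called `connections`, and that the data were assigned to the symbol `D` as in the login reminder.

```r
# Hypothetical helper (not part of dsBaseClient): gathers basic facts
# about a server-side data frame and one of its variables.
describe_assigned <- function(conns, symbol = "D", variable = "LAB_HDL") {
  list(
    # rows and columns per study and pooled across studies
    dimensions = dsBaseClient::ds.dim(x = symbol, type = "both",
                                      datasources = conns),
    # variable names available in the assigned data frame
    columns = dsBaseClient::ds.colnames(x = symbol, datasources = conns),
    # class of a single variable, e.g. "D$LAB_HDL"
    variable_class = dsBaseClient::ds.class(x = paste0(symbol, "$", variable),
                                            datasources = conns)
  )
}

# Usage, once logged in:
# info <- describe_assigned(connections)
# info$variable_class
```

Only non-disclosive summaries are returned to the client; the individual-level data stay on each server.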
...
      Aggregated (exists("D")) [=============================================================] 100% / 0s
      Aggregated (classDS("D$LAB_HDL")) [====================================================] 100% / 1s
    $study1
    [1] "numeric"

    $study2
    [1] "numeric"
Descriptive statistics: quantiles and mean
As LAB_HDL is a numeric variable, the distribution of the data can be explored.
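A sketch of how the distribution might be explored, assuming an active login with the data assigned to `D`: `ds.quantileMean` and `ds.mean` are dsBaseClient functions, while the wrapper name `explore_numeric` is hypothetical and used here only to keep the example self-contained.

```r
# Hypothetical wrapper (not part of dsBaseClient): per-study quantiles
# and means for one numeric server-side variable.
explore_numeric <- function(conns, x = "D$LAB_HDL") {
  list(
    # quantiles and mean of the variable, reported separately per study
    quantiles = dsBaseClient::ds.quantileMean(x = x, type = "split",
                                              datasources = conns),
    # mean with counts of missing and valid observations per study
    means = dsBaseClient::ds.mean(x = x, type = "split",
                                  datasources = conns)
  )
}

# Usage, once logged in:
# stats <- explore_numeric(connections)
# stats$means$Mean.by.Study
```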
...
      Aggregated (meanDS(D$LAB_HDL)) [=======================================================] 100% / 0s
    $Mean.by.Study
           EstimatedMean Nmissing Nvalid Ntotal
    study1      1.569416      360   1803   2163
    study2      1.556648      555   2533   3088

    $Nstudies
    [1] 2

    $ValidityMessage
           ValidityMessage
    study1 "VALID ANALYSIS"
    study2 "VALID ANALYSIS"
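To show how the per-study results relate to a pooled estimate, the following base R sketch recomputes a combined mean from the values reported above, weighting each study mean by its number of valid observations. The object names `by_study` and `pooled_mean` are illustrative only; the numbers are copied from the output above.

```r
# Per-study results copied from the ds.mean output above
by_study <- data.frame(
  study         = c("study1", "study2"),
  EstimatedMean = c(1.569416, 1.556648),
  Nvalid        = c(1803, 2533)
)

# Pooled mean: each study mean weighted by its number of valid observations
pooled_mean <- with(by_study, sum(EstimatedMean * Nvalid) / sum(Nvalid))
print(pooled_mean)
```

The result is roughly 1.562, which lies between the two study means and closer to study2 because that study contributes more valid observations. `ds.mean` is also documented to accept `type = "combine"`, which should return a pooled estimate directly without this manual step.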
Conclusion
The other parts in this DataSHIELD tutorial series are:
5: Subsetting
6: Modelling
...