...

Note: Prerequisites

We recommend that you first familiarise yourself with R by working through our Introduction to R tutorial.

This tutorial also requires the DataSHIELD training environment to be installed on your machine; see our Installation Instructions for Linux, Windows, or Mac.


Tip: Help

Free DataSHIELD support is provided by the DataSHIELD community in the DataSHIELD forum. Please use the forum as the first port of call for any problems you may be having; it is monitored closely for new threads.

Bespoke DataSHIELD user support and user training classes are offered on a fee-paying basis. Please enquire at datashield@newcastle.ac.uk for current prices.

...

The other parts in this DataSHIELD tutorial series are:

Quick reminder for logging in:

...

There are now two server-side objects that split the data by GENDER class, to which we have assigned the names "CNSIM.subset.Males" and "CNSIM.subset.Females".

Other general examples of sub-setting

...

Sub-setting to remove NAs

  • The example below uses the function ds.completeCases to subset the assigned data frame D by rows (individual records) that have no missing values (missing values are denoted with NA). The output subset is named "D_without_NA":
Code Block
ds.completeCases(x1="D",newobj="D_without_NA", datasources=connections)


Code Block
  Assigned expr. (D_without_NA <- completeCasesDS("D")) [================================] 100% / 1s
  Aggregated (testObjExistsDS("D_without_NA")) [=========================================] 100% / 0s
  Aggregated (messageDS("D_without_NA")) [===============================================] 100% / 0s
$is.object.created
[1] "A data object <D_without_NA> has been created in all specified data sources"

$validity.check
[1] "<D_without_NA> appears valid in all sources"

A subsequent check using ds.dim() will confirm that the new object "D_without_NA" has fewer rows than the original object "D".
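
As a purely local illustration of what complete-case sub-setting does, the sketch below applies base R's complete.cases() to a small made-up data frame. The data frame "toy" and its columns are hypothetical and are not part of the CNSIM data; in DataSHIELD the filtering itself happens on the server side, and only the object dimensions would be inspected with ds.dim().

```r
# Hypothetical toy data frame with some missing values (not CNSIM data)
toy <- data.frame(
  BMI    = c(22.1, NA, 27.4, 30.2),
  GENDER = c("0", "1", NA, "1")
)

# Keep only the rows (individual records) with no NA in any column
toy_without_NA <- toy[complete.cases(toy), ]

# Compare dimensions before and after, analogous to checking with ds.dim()
dim(toy)             # 4 rows, 2 columns
dim(toy_without_NA)  # 2 rows, 2 columns
```

A record is dropped if any of its columns is missing, so the subset can be much smaller than the count of NAs in any single variable would suggest.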

Sub-set by inequality

Say we wanted a subset in which BMI values are ≥ 25, and wanted to call it subset.BMI.25.plus.
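
For intuition only, the same kind of inequality filter can be sketched locally in base R. This is not the DataSHIELD call used in the tutorial: the data frame "toy_bmi" and its BMI column are hypothetical, and the filtering here runs on the client rather than on the server.

```r
# Hypothetical toy data frame (not CNSIM data); one BMI value is missing
toy_bmi <- data.frame(BMI = c(22.1, 25.0, NA, 31.8))

# which() returns only the positions where the comparison is TRUE,
# so records with a missing BMI are left out of the subset
toy_bmi_25_plus <- toy_bmi[which(toy_bmi$BMI >= 25), , drop = FALSE]

nrow(toy_bmi_25_plus)  # 2 records: BMI 25.0 and 31.8
```

Note that ≥ includes the boundary value, so a record with a BMI of exactly 25 is kept.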

...

Tip

Also remember you can:

...