Server side description file and folders

The description file

The description file is one of the most important files in your package directory: without it you cannot build an R package. The description file of a DataSHIELD server side package differs slightly from that of a standard R package because it also declares which functions are aggregate functions (functions that return aggregated summaries to the analyst) and which are assign functions (functions that do not return their output to the analyst but save it on the server side instead). As mentioned elsewhere, our package will contain two aggregate functions and two assign functions.

Our first aggregate function, meanDS, returns the statistical mean of a numeric vector. meanDS calls the function mean from the R base package, but the output is returned only if a particular condition is met, as you will see in the code. Our second aggregate function is simply the function length from the R base package, used without any change. In such a case we do not write a function at all, which is why this function has no name of its own; this will be clarified later.

The first assign function is named replaceNaDS and, as the name hints, its job is to replace the missing values in a vector with the specified value(s). Like all assign functions, it stores its output on the server side. The second assign function computes the natural logarithm of each value of a numeric vector. The first assign function is written from scratch; the second is simply the function log from the R base package.
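
To make the idea of replacing missing values concrete, here is a minimal sketch of what a function of this kind might look like. The name replaceNaSketch and its arguments are hypothetical; this is not the tutorial's actual replaceNaDS, which is written later and may differ.

```r
# Hypothetical sketch, not the tutorial's actual replaceNaDS.
# Replaces each missing value in `x` with the matching entry of
# `replacements`; a single replacement value is reused for every NA.
replaceNaSketch <- function(x, replacements) {
  missing_idx <- which(is.na(x))
  if (length(replacements) == 1) {
    replacements <- rep(replacements, length(missing_idx))
  }
  if (length(replacements) != length(missing_idx)) {
    stop("supply one replacement value, or one per missing value")
  }
  x[missing_idx] <- replacements
  x
}

v <- c(1, NA, 3, NA)
replaceNaSketch(v, 0)          # same value for every NA
replaceNaSketch(v, c(20, 40))  # one value per NA, in order
```

Used as an assign function, the vector returned here would be saved on the server side rather than sent back to the analyst.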

isValidDS is an internal function used to verify that the input vector or table meets the DataSHIELD privacy criteria explained later.

To create the description file, launch RStudio and load your project as already explained, then go to the File menu, select New File and choose Text File. You now have a blank text file open; copy and paste, or type, the lines below into it. Then go to the File menu on the RStudio top menu bar and choose Save As, browse to your project folder and save the file under the name DESCRIPTION (note the capital letters). Now go to the top right pane, open the Build tab and click on Build & Reload. Once this has completed, some new files will appear in your package folder.

Package: dsTutorial
Maintainer: <datashield@obiba.org>
Author: <datashield@obiba.org>
Version: 1.0.0
License: GPL-3
Title: DataSHIELD server side functions for my first package
Description: DataSHIELD server side functions
AggregateMethods: meanDS,
    length=base::length,
    isValidDS
AssignMethods: replaceNaDS,
    log=base::log
Options: datashield.privacyLevel=5

One line in the description file requires particular attention: Options: datashield.privacyLevel=5. This line sets the 'privacy level', which determines the minimum number of observations that makes a table structure or vector valid in DataSHIELD. Remember that one of the goals of DataSHIELD is to protect the privacy and confidentiality of individual-level data (e.g. patient-level data). An input is invalid if it holds at least 1 but fewer than datashield.privacyLevel observations; in our package, for example, input objects with between 1 and 4 observations are invalid. The default value of the parameter datashield.privacyLevel is 5, but the data owner can change this value in their Opal interface, as will be shown later.
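
The validity rule can be sketched in a few lines of R. The helper names isValidSketch and meanSketch are hypothetical, and the sketch assumes that the number of observations is the length of a vector or the number of rows of a table. The real isValidDS and meanDS are written later in the tutorial and may differ.

```r
# Hypothetical helper illustrating the privacy-level rule; not the
# tutorial's actual isValidDS.
isValidSketch <- function(x, privacyLevel = 5) {
  # Assumption: observations = rows of a table, or elements of a vector.
  n <- if (is.data.frame(x) || is.matrix(x)) nrow(x) else length(x)
  # Invalid when there is at least 1 but fewer than `privacyLevel` observations.
  !(n >= 1 && n < privacyLevel)
}

# An aggregate-style function that returns its result only for valid input.
meanSketch <- function(x, privacyLevel = 5) {
  if (!isValidSketch(x, privacyLevel)) {
    return(NA)
  }
  mean(x, na.rm = TRUE)
}

isValidSketch(1:4)          # 4 observations, below the privacy level
isValidSketch(1:5)          # 5 observations, meets the privacy level
meanSketch(c(2, 4, 6))      # too few observations, nothing is disclosed
meanSketch(c(1, 2, 3, 4, 5))
```

Returning NA for invalid input is only one possible design; a function could equally stop with an error message.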

In the description file most fields stay the same from one release to the next. The exceptions are Version (which is increased after each meaningful change) and the AggregateMethods and AssignMethods entries, whose lists of functions change as the developer adds or removes functions. The description file might also require a Depends field if your package depends on another package (i.e. calls functions that are in another package). Our package, however, is a basic one that does not depend on any other package.
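
Because the method lists change as the package grows, it can be handy to check what the description file currently declares. The sketch below uses read.dcf from base R to read a field and split it into its entries. The helper name readMethods is hypothetical, and the demonstration writes a few of the lines shown above to a temporary file; in practice you would pass the path to your own DESCRIPTION file.

```r
# Hypothetical helper: returns the entries of a comma-separated field
# (such as AggregateMethods) from a DESCRIPTION file.
readMethods <- function(path, field) {
  desc <- read.dcf(path, fields = field)
  value <- desc[1, field]
  if (is.na(value)) {
    return(character(0))  # field not present in the file
  }
  # Continuation lines are joined by read.dcf; trim the leftover whitespace.
  trimws(strsplit(value, ",", fixed = TRUE)[[1]])
}

# Demonstration on a temporary file holding the relevant lines.
demo <- tempfile()
writeLines(c(
  "Package: dsTutorial",
  "AggregateMethods: meanDS,",
  "    length=base::length,",
  "    isValidDS",
  "AssignMethods: replaceNaDS,",
  "    log=base::log"
), demo)

readMethods(demo, "AggregateMethods")
readMethods(demo, "AssignMethods")
```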

Using functions from R

As already explained, two of our functions, length and log, are called directly from R without any modification. The expression functionName=RpackageName::functionName literally means:

If the function functionName is called then use the one from the R package RpackageName

This is why we do not need to write these functions: we simply list the R functions we want in the description file as shown. DataSHIELD will understand that it should get these functions from the relevant R package.
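
The following sketch illustrates how an entry of the form functionName=RpackageName::functionName can be turned into a name and the R function it points to. It is an illustration only: resolveMethod is a hypothetical helper and is not how DataSHIELD itself processes the description file.

```r
# Hypothetical illustration only; this is not DataSHIELD's own code.
# Splits an entry such as "length=base::length" into the exposed name
# and the function exported by the named R package.
resolveMethod <- function(entry) {
  parts <- strsplit(entry, "=", fixed = TRUE)[[1]]
  target <- strsplit(parts[2], "::", fixed = TRUE)[[1]]
  fn <- getExportedValue(target[1], target[2])
  list(name = parts[1], fun = fn)
}

m <- resolveMethod("length=base::length")
m$name               # the name the analyst would call
m$fun(c(4, 8, 15))   # runs base::length on the vector
```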

Required directories/folders

A typical R package has two core directories: R and man. The first holds the function scripts (.R files) and the second holds the documentation files (.Rd files). Other directories can be added, but for the server side package these are the only two we need.

To create these two directories, go to the bottom left of your RStudio window (see image), open the Files tab and navigate to your project folder. There, click on New Folder, type in the name of the folder (R) and click on OK. Do the same for the folder man. We are now ready to write our server side function scripts and store them in the R folder.
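
If you prefer the R console to the Files tab, the same two folders can be created with dir.create. In this sketch pkg_dir is an assumption standing in for the path to your project folder; a temporary directory is used so that the snippet can be run safely as it is.

```r
# Assumption: `pkg_dir` is the path to your project folder. A temporary
# directory stands in for it here; replace it with your own path.
pkg_dir <- file.path(tempdir(), "dsTutorial")
dir.create(pkg_dir, showWarnings = FALSE)

# Create the R and man folders inside the project folder.
for (sub in c("R", "man")) {
  dir.create(file.path(pkg_dir, sub), showWarnings = FALSE)
}

# List the folders that now exist in the project folder.
list.dirs(pkg_dir, full.names = FALSE, recursive = FALSE)
```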

DataSHIELD Wiki by DataSHIELD is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Based on a work at http://www.datashield.ac.uk/wiki