Our client tutorial package will contain four functions, one for each of the four functions in the server side package. To make it obvious which server side function a client function calls, the convention is to give the client a name that matches its server side counterpart, with a slight difference that distinguishes the two.
ds.mean is the client that calls meanDS
ds.replaceNA is the client that calls replaceNaDS
ds.log is the client that calls log
ds.length is the client that calls length
Remember, as already explained, that log and length do not have the suffix DS because these two server side functions are called directly from the R package base.
We need some internal functions that carry out tasks common to all client side functions. Follow the links below and copy the internal functions over to the R directory of your project.
findLoginObjects
isDefined
isAssigned
checkClass
extract
getPooledMean
We also need some login data; follow the link below and copy the data object over to the data directory of your project. To create your own login table see this guide.
logindata.rda
Just as for the server side functions, and indeed for any programming, it is paramount to observe good coding practice, so the box below reiterates the same advice.
As with any programming, it is highly important to stick to best practices.
For the remainder of this page we are going to show one function and explain the main sections of the script in detail. Nearly all client functions have a similar structure; differences will be highlighted and explained. The sections we are going to explain are signposted so you can follow the explanations more easily in the last chapter of this page.
In RStudio, go to the File tab in the top menu, select New File and then choose R Script. This will open a new R script file; copy the code below and paste it into the file, or type the lines in. Then go to the File tab on the RStudio top menu bar and choose Save As, browse to the R folder in the project directory and save the file under the name ds.mean. Always use the name of the function as the name of the script file; the file extension should always be .R.
#-------------------------------------- HEADER --------------------------------------------#
#' @title Computes the statistical mean of a given vector
#' @description This function is similar to the R function \code{mean}.
#' @details It is a wrapper for the server side function.
#' @param x a character, the name of a numerical vector
#' @param type a character which represents the type of analysis to carry out.
#' If \code{type} is set to 'combine', a global mean is calculated;
#' if \code{type} is set to 'split', the mean is calculated separately for each study.
#' @param checks a boolean, if TRUE the function runs checks that verify elements on the server side.
#' Such checks lengthen the run-time so the default is FALSE and one can switch these checks
#' on (set to TRUE) when faced with some error(s).
#' @param datasources a list of opal object(s) obtained after logging in to opal servers;
#' these objects also hold the data assigned to R, as \code{data frame}, from opal datasources.
#' @return a numeric
#' @author Gaye A., Isaeva I.
#' @seealso \code{ds.quantileMean} to compute quantiles.
#' @seealso \code{ds.summary} to generate the summary of a variable.
#' @export
#' @examples {
#'
#' # load the data object that contains the login details
#' data(logindata)
#'
#' # login and assign specific variable(s)
#' myvar <- list('LAB_TSC')
#' opals <- datashield.login(logins=logindata,assign=TRUE,variables=myvar)
#'
#' # Example 1: compute the pooled statistical mean of the variable 'LAB_TSC' - default behaviour
#' ds.mean(x='D$LAB_TSC')
#'
#' # Example 2: compute the statistical mean of each study separately
#' ds.mean(x='D$LAB_TSC', type='split')
#'
#' # clear the Datashield R sessions and logout
#' datashield.logout(opals)
#'
#' }
ds.mean = function(x=NULL, type='combine', checks=FALSE, datasources=NULL){

  #-------------------------------------- BASIC CHECKS ----------------------------------------------#
  # if no opal login details are provided look for 'opal' objects in the environment
  if(is.null(datasources)){
    datasources <- findLoginObjects()
  }

  if(is.null(x)){
    stop("Please provide the name of the input vector!", call.=FALSE)
  }

  # the input variable might be given as column table (i.e. D$x)
  # or just as a vector not attached to a table (i.e. x)
  # we have to make sure the function deals with each case
  xnames <- extract(x)
  varname <- xnames$elements
  obj2lookfor <- xnames$holders
  #--------------------------------------------------------------------------------------------------#

  #-------------------------------------- SERVER SIDE CHECKS ----------------------------------------#
  if(checks){
    # check if the input object(s) is(are) defined in all the studies
    if(is.na(obj2lookfor)){
      defined <- isDefined(datasources, varname)
    }else{
      defined <- isDefined(datasources, obj2lookfor)
    }

    # call the internal function that checks the input object is of the same class in all studies.
    typ <- checkClass(datasources, x)

    # the input object must be a numeric or an integer vector
    # (&& is the scalar, short-circuiting operator intended for 'if' conditions)
    if(typ != 'integer' && typ != 'numeric'){
      stop("The input object must be an integer or a numeric vector.", call.=FALSE)
    }
  }
  #----------------------------------------------------------------------------------------------------#

  # number of studies
  num.sources <- length(datasources)

  #-------------------------------------- CALLING SERVER SIDE FUNCTION --------------------------------#
  # mean of the input vector in each study
  cally <- paste0("meanDS(", x, ")")
  mean.local <- datashield.aggregate(datasources, as.symbol(cally))

  # number of entries in each study
  cally <- paste0("NROW(", x, ")")
  length.local <- datashield.aggregate(datasources, cally)

  # get the number of entries with missing values
  cally <- paste0("numNaDS(", x, ")")
  numNA.local <- datashield.aggregate(datasources, cally)
  #-----------------------------------------------------------------------------------------------------#

  #-------------------------------------- FINALIZING RESULTS -------------------------------------------#
  if (type=='split') {
    return(mean.local)
  } else if (type=='combine') {
    length.total = 0
    sum.weighted = 0
    mean.global = NA

    # weight each study mean by its number of non-missing entries
    for (i in seq_len(num.sources)){
      # && short-circuits, so a NULL length is skipped instead of raising an error
      if ((!is.null(length.local[[i]])) && (length.local[[i]]!=0)) {
        completeLength <- length.local[[i]]-numNA.local[[i]]
        length.total = length.total+completeLength
        sum.weighted = sum.weighted+completeLength*mean.local[[i]]
      }
    }

    mean.global = sum.weighted/length.total
    return(list("Global mean"=mean.global))
  } else{
    stop('Function argument "type" has to be either "combine" or "split"')
  }
}
The header of the script (lines starting with #') is used by Roxygen to produce the documentation files. Except for @examples, all the entries in the header have already been explained at the bottom of this page.
Writing comprehensive documentation is paramount to help users. The header of the function is used to produce the R documentation. This documentation is far more important for client side functions than for server side functions, simply because server side functions are called by DataSHIELD developers who are familiar with the code, whilst client functions are run by users with a wide range of R experience (from beginners to experts).
If no login details are provided through the argument datasources, the internal function findLoginObjects looks for login objects in the working environment; if more than one object is found it asks the user to choose one, and if no login object is found the attempted analysis is aborted.
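The lookup described above can be sketched as follows. This is only an illustration, not the code of findLoginObjects: the function name find_login_objects_sketch, the class name "opal" and the choice to stop (rather than prompt the user) when several candidates exist are all assumptions made for the example.

```r
# Hypothetical stand-in for findLoginObjects(); the real internal function may differ.
# Assumption: a login object is a non-empty list whose elements all have class "opal".
find_login_objects_sketch <- function(env = globalenv(), login_class = "opal") {
  obj_names <- ls(envir = env)

  # flag the objects that look like a list of login objects
  is_login <- vapply(obj_names, function(nm) {
    obj <- get(nm, envir = env)
    is.list(obj) && length(obj) > 0 &&
      all(vapply(obj, inherits, logical(1), what = login_class))
  }, logical(1))
  found <- obj_names[is_login]

  if (length(found) == 0) {
    # nothing to work with: abort the attempted analysis
    stop("No login object found; please log in first.", call. = FALSE)
  }
  if (length(found) > 1) {
    # the text says the user is asked to choose; this sketch only reports the candidates
    stop("Several login objects found: ", paste(found, collapse = ", "),
         ". Please set the argument 'datasources'.", call. = FALSE)
  }
  get(found, envir = env)
}

# usage with a mock login object in a private environment
demo_env <- new.env()
assign("opals", list(study1 = structure(list(), class = "opal")), envir = demo_env)
logins <- find_login_objects_sketch(demo_env)
```

Passing the environment as an argument keeps the sketch easy to try out without touching the global workspace.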
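Previous functions systematically ran these checks. It was however flagged that for some functions the checks were time consuming (just minutes though!). So a new argument, checks, was introduced: the checks are skipped by default (checks=FALSE) and can be switched on (checks=TRUE) when an error is encountered.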
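The pattern of an optional, off-by-default verification step can be tried locally with a toy function. Everything in this sketch is hypothetical (slow_input_check and toy_mean are illustration names, and no server is contacted); it only shows how a checks flag gates the extra work.

```r
# Hypothetical stand-in for a slow server side verification step.
slow_input_check <- function(x) {
  if (!is.numeric(x)) {
    stop("The input object must be an integer or a numeric vector.", call. = FALSE)
  }
  invisible(TRUE)
}

# Local toy function: the verification only runs when checks = TRUE.
toy_mean <- function(x, checks = FALSE) {
  if (checks) {
    slow_input_check(x)
  }
  sum(x) / length(x)
}

toy_mean(c(1, 2, 3))                 # checks skipped (default)
toy_mean(c(1, 2, 3), checks = TRUE)  # checks run first, same result
```

With a bad input and checks = TRUE the user gets the explicit message from the check instead of a less helpful error further down.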
The strategy is to construct a call object and pass it on to the opal functions, here datashield.aggregate.
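The construction of the call can be tried without any server connection. The sketch below builds the same kind of string that ds.mean builds with paste0 and checks that it parses as a valid R call; build_call is a hypothetical helper introduced only for this illustration, and the final step of handing the call to datashield.aggregate is left out because it needs a live login.

```r
# Hypothetical helper: paste a server side function name and its argument into a call string.
build_call <- function(fun, x) {
  paste0(fun, "(", x, ")")
}

x <- "D$LAB_TSC"
cally <- build_call("meanDS", x)
print(cally)

# parse the string to confirm it is a syntactically valid call
expr <- parse(text = cally)[[1]]
print(is.call(expr))

# in ds.mean the call is then sent to every study, e.g.
# mean.local <- datashield.aggregate(datasources, as.symbol(cally))
```

Because the call is sent as text, the variable name x is passed as a character (for example 'D$LAB_TSC') and never evaluated on the client.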
The results are either returned separately for each study (type='split') or pooled into one global value (type='combine'). Make sure, as in the last branch of the above illustration, that an error message is thrown if the user attempts to specify anything other than 'combine' or 'split' for the argument type.
In the above section we have used ds.mean to illustrate the development of a client side function that calls an 'aggregate' server side function. The structure is not very different for clients that call an 'assign' function. Follow the links below to copy the scripts of the other three functions over to the R directory of your project.
ds.log
ds.length
ds.replaceNA
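To make the 'combine' branch of ds.mean concrete, the pooling arithmetic can be reproduced locally with made-up per-study results. This is a sketch under stated assumptions: pool_means_sketch is a hypothetical function, the numbers are invented, and every study is assumed to return a value (the loop in ds.mean additionally skips studies with a NULL length).

```r
# Hypothetical local re-implementation of the final step of ds.mean.
# Each argument is a list with one entry per study, as returned by the server calls.
pool_means_sketch <- function(mean.local, length.local, numNA.local, type = "combine") {
  if (!type %in% c("combine", "split")) {
    stop('Function argument "type" has to be either "combine" or "split"', call. = FALSE)
  }
  if (type == "split") {
    return(mean.local)
  }
  # number of non-missing entries per study
  complete <- unlist(length.local) - unlist(numNA.local)
  means <- unlist(mean.local)
  keep <- complete > 0
  # mean of the study means, weighted by the number of non-missing entries
  list("Global mean" = sum(complete[keep] * means[keep]) / sum(complete[keep]))
}

# invented results for two studies
mean.local   <- list(study1 = 2, study2 = 5)
length.local <- list(study1 = 4, study2 = 6)
numNA.local  <- list(study1 = 1, study2 = 0)

# (3 * 2 + 6 * 5) / (3 + 6) = 4
pool_means_sketch(mean.local, length.local, numNA.local)
```

Weighting by the number of complete entries, rather than averaging the study means directly, is what makes the pooled value equal to the mean that would be obtained if the non-missing data from all studies were in one table.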