How to modify the DataSHIELD packages
The logic
The environment that you have set up places a number of virtual machines on your computer.
The virtual machines play the role of the servers of different cohorts; they have Opal and R installed, and each contains some simulated test data.
Your machine, the host, plays the role of the client; that is, the computer that is conducting the analysis.
The DataSHIELD code is also split into client and server packages. The DataSHIELD client packages are installed in R on your machine (so that it can act as a client), and the DataSHIELD server packages are installed in R on each of the virtual machines.
- Modifying the client packages therefore involves downloading the DataSHIELD source code to your machine, making your changes, and then installing the changed client packages on your machine.
- Modifying the server packages also involves downloading the DataSHIELD source code to your machine and making your changes. However, the changed server packages must be installed on each of your virtual machines.
Download the DataSHIELD code
DataSHIELD code is stored publicly on GitHub; a copy can be downloaded using git.
Install git
sudo apt-get install git
Now read the Using Git page
Clone the DataSHIELD repositories
DataSHIELD code is broken down into a number of client and server packages:
Client side packages
- dsbaseclient
- dsmodellingclient
Server side packages
- dsbase
- dsmodelling
If you want to download everything, then simply 'clone' all of these repositories. This creates a local copy of the code on your machine.
mkdir ds-dev
cd ds-dev
git clone https://github.com/datashield/dsbaseclient.git
git clone https://github.com/datashield/dsmodellingclient.git
git clone https://github.com/datashield/dsbase.git
git clone https://github.com/datashield/dsmodelling.git
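The clone step can also be scripted so that it is safe to re-run. The following is a minimal sketch, not an official DataSHIELD tool: the variable names DEV_DIR and DRY_RUN and the function names are illustrative, and the default directory is assumed to be ds-dev in your home directory, matching the paths used in the install commands later on this page. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: clone every DataSHIELD repository listed on this page into one
# directory. DEV_DIR and DRY_RUN are illustrative names, not DataSHIELD settings.
DEV_DIR="${DEV_DIR:-$HOME/ds-dev}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

# Print the GitHub URL for a repository name.
repo_url() {
    printf 'https://github.com/datashield/%s.git\n' "$1"
}

# Clone each repository unless a local copy already exists.
clone_all() {
    for repo in dsbaseclient dsmodellingclient dsbase dsmodelling; do
        if [ -d "$DEV_DIR/$repo" ]; then
            echo "skip $repo (already present)"
        elif [ "$DRY_RUN" = "1" ]; then
            echo "git clone $(repo_url "$repo") $DEV_DIR/$repo"
        else
            mkdir -p "$DEV_DIR" && git clone "$(repo_url "$repo")" "$DEV_DIR/$repo"
        fi
    done
}

clone_all
```

Run it with DRY_RUN=0 to perform the clones once the printed commands look right.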
Installing the client side packages
When we installed the DataSHIELD client packages in order to 'play' with DataSHIELD, we did this by running R and then using the install.packages command; specifically, we told R to get the DataSHIELD client packages from the OBiBa website.
Since we now want to modify the DataSHIELD client packages, we will instead install our personal, modified version of the packages from a local directory on our computer.
First, delete the existing packages, if they are installed:
# R
> remove.packages('dsbaseclient')
> remove.packages('dsmodellingclient')
Now install your local version of the DataSHIELD client source code, using the devtools package:
# R
> library(devtools)
> devtools::install('/home/me/ds-dev/dsbaseclient')
> devtools::install('/home/me/ds-dev/dsmodellingclient')
You are now able to use your modified client code to run analyses on the virtual machines.
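While developing you will probably reinstall the client packages many times. A minimal sketch of a shell helper for this is given here, assuming the source lives under /home/me/ds-dev as in the commands on this page and that devtools is already installed in R. DEV_DIR, DRY_RUN and the function names are illustrative, not part of DataSHIELD. By default the script only prints the Rscript commands.

```shell
#!/bin/sh
# Sketch: reinstall the locally modified client packages after each change.
# DEV_DIR and DRY_RUN are illustrative names; Rscript is installed with R.
DEV_DIR="${DEV_DIR:-/home/me/ds-dev}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

# Build the R expression that installs one package from its source directory.
install_expr() {
    printf "devtools::install('%s/%s')" "$DEV_DIR" "$1"
}

# Install both client packages, stopping at the first failure.
reinstall_clients() {
    for pkg in dsbaseclient dsmodellingclient; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "Rscript -e \"$(install_expr "$pkg")\""
        else
            Rscript -e "$(install_expr "$pkg")" || return 1
        fi
    done
}

reinstall_clients
```

Restart any open R session after reinstalling so that the new version of the package is loaded.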
Installing Server Side DataSHIELD packages on the virtual servers
Official Packages
The official DataSHIELD packages can be installed in their current state through the Opal web interface:
Administration > DataSHIELD > Add Package
If you choose to install them all, this will install dsbase and dsmodelling.
Public In-development Packages
Any package in the DataSHIELD GitHub project (each package is its own git repository) can be installed on the virtual servers from within an R instance running on the host (that is to say, from the computer acting as the client or 'analysis computer').
Installing packages this way uses the dsadmin.install_package function, from R running on the client. This function can be found within the opaladmin package.
For example, one could install a package by specifying a repository from the DataSHIELD GitHub project as follows:
# R
> library('opaladmin')
> dsadmin.install_package(ag.dev.sv)
This would install the package on all the virtual machines.
Private Packages
Installing your own locally modified versions of the DataSHIELD server code is a little more involved.
Installing modified versions of the server code
Our aim is to take the DataSHIELD server source code that we have been modifying on our own computer and to install it on each of the virtual machines. This means you will need to repeat this process on each of the virtual machines you are using.
First, we need to get the code from our computer onto the virtual machine. For example, we can place the code in the home directory of 'user' as follows:
# rsync -av /home/me/ds-dev/dsbase user@192.168.56.100:/home/user
Now ssh into the virtual machine in order to install the code:
# ssh user@192.168.56.100
We have ssh'd in as 'user', but the installation must be done as a different user on the system. This is because, if we installed the packages as 'user', Opal's rserver would not be able to access them.
Instead, we will install them as root, so that all users have access to them.
Switch to the root user and use devtools to install from wherever you copied the package source to on the virtual machine:
# sudo su root
# R
> library(devtools)
> devtools::install('/home/user/dsbase')
This will install the packages in the R system library (/usr/local/lib/R/site-library). This is fine, and they will be available for use by all users. However, you will notice that you cannot delete them using the Opal web interface. Rather, you will have to delete them from the R system library manually.
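Because the copy and install steps have to be repeated on every virtual machine, it can help to script them. The following is a minimal sketch under several assumptions: the login 'user' has sudo rights on each virtual machine, devtools is installed there, and running the install through "sudo Rscript" is an acceptable substitute for the interactive root session shown above. VM_HOSTS, VM_USER, PKG_SRC, DRY_RUN and the function names are illustrative. Only the address 192.168.56.100 comes from this page, so list your own machines in VM_HOSTS. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: copy a modified server package to each VM and install it as root.
# All variable names are illustrative; set VM_HOSTS to a space-separated
# list of your own virtual machine addresses.
VM_HOSTS="${VM_HOSTS:-192.168.56.100}"
VM_USER="${VM_USER:-user}"
PKG_SRC="${PKG_SRC:-/home/me/ds-dev/dsbase}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# For every host: copy the package source, then install it with root rights.
deploy_all() {
    pkg=$(basename "$PKG_SRC")
    for host in $VM_HOSTS; do
        run rsync -av "$PKG_SRC" "$VM_USER@$host:/home/$VM_USER"
        # -t gives sudo a terminal in case it needs to ask for a password.
        run ssh -t "$VM_USER@$host" \
            "sudo Rscript -e \"devtools::install('/home/$VM_USER/$pkg')\""
    done
}

deploy_all
```

Check the printed commands against your own setup before running with DRY_RUN=0.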
DataSHIELD Wiki by DataSHIELD is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Based on a work at http://www.datashield.ac.uk/wiki