Welcome to the DataSHIELD installation instructions for Linux. This page covers all the stages needed to get set up and ready to go.


DataSHIELD support is freely available from the DataSHIELD community in the DataSHIELD forum. Please use the forum as the first port of call for any problems you may be having; it is monitored closely for new threads.


The minimum computer specification for installing the DataSHIELD training environment is:

  • 12GB+ of RAM.
    • Each VM is allocated 4GB of RAM, and some must remain free for other applications.
  • Reasonably powerful processor (for example: Intel i5 or i7).
  • 16GB space on the hard disk.
    • 7GB per VM + VirtualBox installation requirement (80 MB) 
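
To check a machine against these minimums from a terminal, something like the following sketch could be used. It assumes a Linux host where /proc/meminfo is available and that the VMs will be stored under your home directory. The function name meets_spec is only illustrative.

```shell
# Minimums taken from the list above.
MIN_RAM_GB=12
MIN_DISK_GB=16

# meets_spec RAM_GB DISK_GB: succeeds only when both values reach the minimums.
meets_spec() {
    [ "$1" -ge "$MIN_RAM_GB" ] && [ "$2" -ge "$MIN_DISK_GB" ]
}

# Installed RAM in GB, rounded up because the kernel reports slightly less
# than the nominal size of the memory modules.
ram_gb=$(awk '/^MemTotal:/ { print int(($2 + 1048575) / 1048576) }' /proc/meminfo)

# Free space in whole GB on the filesystem holding $HOME (assumed VM location).
disk_gb=$(df -Pk "${HOME:-.}" | awk 'NR == 2 { print int($4 / 1048576) }')

if meets_spec "$ram_gb" "$disk_gb"; then
    echo "Meets the minimum: ${ram_gb}GB RAM, ${disk_gb}GB free disk"
else
    echo "Below the minimum: ${ram_gb}GB RAM, ${disk_gb}GB free disk"
fi
```

If the VMs will live on a different disk, replace $HOME with that location.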

Installing the Virtual Servers

Install VirtualBox

To run the virtual servers you will need to install VirtualBox.

There are three ways to install. To install from your distribution's package repository, run:

$ sudo apt-get install virtualbox

Download the Virtual Servers

Two virtual servers are available for you to test DataSHIELD with. The virtual servers require 4GB RAM each and about 5GB hard-disk space each.
You can download the VMs from Google Drive:


dstutorial-100 (Opal Server 1)

dstutorial-101 (Opal Server 2)

Import the Virtual Servers into VirtualBox

You will need to import the VMs into VirtualBox. Open VirtualBox, and in the top menu, on the left:
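
The import can also be done from a terminal with the VBoxManage tool that ships with VirtualBox. The sketch below assumes the downloads are appliance files (.ova or .ovf). The file names in the example comment are hypothetical, so use whatever names your downloads actually have. The helper name import_vms is illustrative.

```shell
# Command used to talk to VirtualBox; override it (e.g. VBOXMANAGE=echo) for a dry run.
VBOXMANAGE=${VBOXMANAGE:-VBoxManage}

# import_vms FILE...: import each appliance file, stopping at the first failure.
import_vms() {
    for appliance in "$@"; do
        "$VBOXMANAGE" import "$appliance" || return 1
    done
}

# Example (hypothetical file names):
# import_vms ~/Downloads/dstutorial-100.ova ~/Downloads/dstutorial-101.ova
```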

Setting up the Virtual Servers

Setting up the VirtualBox network adapter


Connect to the Virtual Servers

Boot the Virtual Machines

Now that the two virtual machines have been downloaded, imported and configured, it's time to launch them. When a VM launches, it boots up a mini-computer (within your computer) that plays the role of an Opal server, as if it were online and you were connecting remotely to the data stored on it.

To start a VM (i.e. a Virtual Opal Server):

Depending on how powerful your computer is, the Opal servers may take a few minutes to boot and for Opal to start.
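
The VMs can also be started from a terminal. This is a minimal sketch, assuming the imported machines are registered in VirtualBox under the names dstutorial-100 and dstutorial-101. Check the names shown in your VirtualBox window, as they may differ. The helper name start_vms is illustrative.

```shell
# Command used to talk to VirtualBox; override it (e.g. VBOXMANAGE=echo) for a dry run.
VBOXMANAGE=${VBOXMANAGE:-VBoxManage}

# start_vms NAME...: start each VM in its own window, stopping at the first failure.
start_vms() {
    for vm in "$@"; do
        "$VBOXMANAGE" startvm "$vm" --type gui || return 1
    done
}

# Example (assumed VM names):
# start_vms dstutorial-100 dstutorial-101
```

Using --type headless instead of --type gui starts the VM without opening a window.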


On first launching the VMs (Virtual Opal Servers) you will see several grey warning messages. These can be dismissed by clicking the "x" in the top right corner.

Clicking inside a VM window can cause your mouse pointer to become captured within the mini-computer environment. The default key to release it is Right-Ctrl.


For ordinary use, once the VMs have booted there is no need to type commands within them: they simply act as servers that you connect to while doing your analysis in the web interface or in R.

However, if you are learning DataSHIELD for development purposes, you may wish to log in to the VM, for tasks such as reviewing logs produced on the server. Once the machines have been started you can use the credentials:

  • username: administrator
  • password: datashield_test&

Now that your VMs are launched you should check they are ready to be used before getting set up in R.

Virtual Servers' IP addresses

By default, machines on the host-only network can be found at 192.168.56.xxx:

Your computer (the host) will be at 192.168.56.1
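
To confirm from a terminal that the host really holds this address, a check along the following lines may help. It is a sketch that assumes the ip tool from iproute2 is installed. The function name has_host_address is illustrative. The address may only show up once a VM attached to the host-only network is running.

```shell
# has_host_address: reads `ip -o -4 addr show` output on stdin and succeeds
# if any interface holds the host address 192.168.56.1.
has_host_address() {
    awk '{ split($4, a, "/"); if (a[1] == "192.168.56.1") found = 1 }
         END { exit !found }'
}

if command -v ip >/dev/null 2>&1; then
    if ip -o -4 addr show | has_host_address; then
        echo "Host-only network is up at 192.168.56.1"
    else
        echo "No interface has 192.168.56.1 yet"
    fi
fi
```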
The virtual servers have been configured as follows:

Web Access to the Opal Server (How to look inside the VM)

The Opal web interface is accessed in your browser. Simply type the IP address of the VM, followed by the port number.

For example: to access the web interface for dstutorial-100, go to:

Please allow up to 2 minutes after launching a VM to gain access to the Opal web interface.
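
Rather than refreshing the browser, you could poll the server from a terminal until it answers. The sketch below assumes curl is installed and that the web interface is served over plain HTTP. The IP address and port are passed in as arguments and are not filled in here. The function names opal_url and wait_for_opal are illustrative.

```shell
# opal_url HOST PORT: build the web interface address (plain HTTP is an assumption).
opal_url() {
    printf 'http://%s:%s\n' "$1" "$2"
}

# wait_for_opal URL [TIMEOUT_SECONDS]: poll every 5 seconds until the server
# answers successfully or the timeout (default 120 seconds) runs out.
wait_for_opal() {
    url=$1
    remaining=${2:-120}
    while [ "$remaining" -gt 0 ]; do
        if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
            return 0
        fi
        sleep 5
        remaining=$((remaining - 5))
    done
    return 1
}

# Example, with your VM's address and port substituted:
# wait_for_opal "$(opal_url <vm-ip> <port>)" && echo "Opal is ready"
```

The default timeout of 120 seconds mirrors the two minutes mentioned above.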


The following username and password are used to access the web portal of the training servers, e.g. to read about the metadata of the studies or to connect to external resources:

username: administrator
password: datashield_test&

When finished: Shutting down the Opal servers

Assuming no changes have been applied to the Virtual Opal Servers (which will be the case for general users), once you have finished your analysis, to shut down the Opal server:
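
The VMs can also be shut down from a terminal. This is one possible approach, not necessarily the one intended by this guide: it sends each VM an ACPI power-button event, which asks the guest operating system to shut down cleanly. The VM names in the example are assumed to match the names registered in VirtualBox, and the helper name stop_vms is illustrative.

```shell
# Command used to talk to VirtualBox; override it (e.g. VBOXMANAGE=echo) for a dry run.
VBOXMANAGE=${VBOXMANAGE:-VBoxManage}

# stop_vms NAME...: request a clean shutdown of each VM, stopping at the first failure.
stop_vms() {
    for vm in "$@"; do
        "$VBOXMANAGE" controlvm "$vm" acpipowerbutton || return 1
    done
}

# Example (assumed VM names):
# stop_vms dstutorial-100 dstutorial-101
```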

Setting up your R Session

These are instructions for installing R on your own machine. That is, the machine you will be using for analysis, not the virtual servers.

These instructions assume you are using Ubuntu on your local machine.

Installing R in Ubuntu

Note: the CRAN R repository you add depends on the Ubuntu release you are running:

How to check your Ubuntu version:

Open your Terminal, and paste the command:

lsb_release -a

Take note of the major number (for example 20, 18 or 16) next to "Release".
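
If you prefer to get just that number, the short form of the same command can be trimmed down. This is a small sketch and the function name release_major is illustrative.

```shell
# release_major RELEASE: print the part before the first dot, e.g. the 20 in 20.04.
release_major() {
    printf '%s\n' "${1%%.*}"
}

if command -v lsb_release >/dev/null 2>&1; then
    # -r selects the release field, -s prints the value only
    release_major "$(lsb_release -rs)"
fi
```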

Now install R using the following instructions:

Set-up R for DataSHIELD analysis

Install the Opal packages

Open an R Session (whether in terminal, RGui or RStudio), then run:

R
install.packages('DSI')
install.packages('DSOpal')
install.packages('DSLite')

install.packages(c('fields', 'metafor', 'ggplot2', 'gridExtra', 'data.table'))

Install the DataSHIELD client packages

install.packages('dsBaseClient', repos=c(getOption('repos'), 'http://cran.obiba.org'), dependencies=TRUE)


Be aware, however, that this will place the DataSHIELD packages wherever your R libraries are saved. This may be unhelpful for development, in which case you may have to relocate your files.
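
To confirm from a terminal that the packages installed above can be loaded, a check like the following could be used. It is a sketch that assumes Rscript is on your PATH, and the helper name missing_r_packages is illustrative.

```shell
# Command used to run R code; defaults to Rscript.
RSCRIPT=${RSCRIPT:-Rscript}

# missing_r_packages PKG...: print the name of each package that R cannot load.
missing_r_packages() {
    for pkg in "$@"; do
        "$RSCRIPT" -e "quit(status = as.integer(!requireNamespace('$pkg', quietly = TRUE)))" \
            >/dev/null 2>&1 || echo "$pkg"
    done
}

# Example:
# missing_r_packages DSI DSOpal DSLite dsBaseClient
```

No output means every listed package was found. If Rscript itself is not installed, every package is reported as missing.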

DataSHIELD R Package Manuals

See: Current release

Keeping up to date

Please see our keeping up to date wiki page in the user/analyst support section.



What's next?

You are now fully set up. To start using the DataSHIELD test environment, you can try our Tutorial for DataSHIELD users. The tutorial teaches you the basics of DataSHIELD including how to:

  • login
  • run commands to:
    • generate descriptive statistics
    • subset tables and vectors
    • fit some regression models


Further instructions are available for advanced users of the DataSHIELD test environment:

  • Follow the instructions in the Opal management tutorial to learn how to upload your own data.
  • DataSHIELD R package manuals will be available soon on the release notes page.
  • Install non-CRAN R packages to the training Opal servers (coming soon)