v6.0 Linux Installation Instructions


Help

Support is freely available from the DataSHIELD community in the DataSHIELD forum. Please use the forum as the first port of call for any problems you may be having; it is monitored closely for new threads.


The minimum computer specification for installing the DataSHIELD training environment is:

  • Ideally 12GB+ of RAM. Each VM is allocated 4GB of RAM, and some must be left spare for other applications.
  • A reasonably powerful processor (for example: Intel i5 or i7).
  • 16GB of free space on the hard disk.
    • 7GB per VM + VirtualBox installation requirement (80 MB)
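
The disk figure above can be checked with a little arithmetic: two VMs at 7GB each plus about 80 MB for VirtualBox comes to roughly 14GB, which the 16GB recommendation rounds up. The following is a minimal sketch of a pre-flight check, assuming a Linux host where /proc/meminfo and df are available; the function names are illustrative only.

```shell
# Disk needed in MB for a given number of VMs:
# 7 GB per VM plus about 80 MB for VirtualBox itself.
required_disk_mb() {
    echo $(( $1 * 7 * 1024 + 80 ))
}

# Print "ok" if the first number is at least the second, "low" otherwise.
check_at_least() {
    if [ "$1" -ge "$2" ]; then echo "ok"; else echo "low"; fi
}

need_mb=$(required_disk_mb 2)
echo "Disk needed for 2 VMs: ${need_mb} MB"

# MemTotal is reported in kB; it is slightly below the installed RAM.
if [ -r /proc/meminfo ]; then
    ram_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
    echo "Installed RAM: ${ram_mb} MB (12 GB or more is recommended)"
fi

# Free space on the filesystem holding the home directory, in MB.
free_kb=$(df -Pk "${HOME:-.}" 2>/dev/null | awk 'NR==2 {print $4}')
if [ -n "$free_kb" ]; then
    free_mb=$(( free_kb / 1024 ))
    echo "Free disk under ${HOME:-.}: ${free_mb} MB ($(check_at_least "$free_mb" "$need_mb"))"
fi
```

The check uses the home directory because that is where VirtualBox stores VMs by default; adjust the path if you keep them elsewhere.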

Installing the Virtual Servers

Install VirtualBox

To run the virtual servers you will need to install VirtualBox.

On Ubuntu, search for "VirtualBox" in the Ubuntu Software store, or run the following in a terminal:

~$ sudo apt-get install virtualbox

Download the Virtual Servers

Two virtual servers are available for you to test DataSHIELD with. The virtual servers require 4GB RAM each and about 5GB hard-disk space each.
You can download the VMs from Google Drive:


Import the Virtual Servers into VirtualBox

You will need to import the VMs into VirtualBox. Open VirtualBox and select:

File > Import Appliance...
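
The import can also be done from a terminal with VBoxManage, which is installed alongside VirtualBox. This is a minimal sketch; the .ova file names in the usage comment are placeholders, so substitute the names of the files you actually downloaded.

```shell
# Import each appliance file given on the command line.
# Files that do not exist are reported and skipped.
import_appliances() {
    local ova
    for ova in "$@"; do
        if [ ! -f "$ova" ]; then
            echo "skipping $ova: file not found" >&2
            continue
        fi
        VBoxManage import "$ova"
    done
}

# Example (hypothetical file names):
# import_appliances dstesting-100.ova dstesting-101.ova
```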

Setting up the Virtual Servers

Setting up the VirtualBox network adapter

  • On the VirtualBox top menu, at the far right, go to the Global Tools tab, click the drop-down and select Host Network Manager.
    • Ensure the Adapter tab at the bottom of the screen has the following settings:
      • IPv4 address: 192.168.56.1
      • IPv4 Network Mask: 255.255.255.0
      • IPv6 Prefix Length: 0
    • Ensure that on the DHCP Server tab, next to Adapter at the bottom of the screen:
      • Enable Server is unchecked.
    • Click "Apply" to save any changed settings.
  • Now click on Machine Tools at the top of the screen, next to "Global Tools", and select Details from the drop-down.
    • Selecting each VM in turn, check that under the Network tab, the following settings are shown:
      • Adapter 1: Intel PRO/1000 MT Desktop (NAT)
      • Adapter 2: Intel PRO/1000 MT Desktop (Host-only Adapter, 'vboxnet0')
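
The adapter settings above can also be applied from a terminal. This is a sketch only, assuming the host-only interface is called vboxnet0 and the VMs are registered under the names dstesting-100 and dstesting-101 (check with "VBoxManage list hostonlyifs" and "VBoxManage list vms"). The VMs must be powered off when modifyvm is run. The DHCP Server setting is not covered here.

```shell
# Give the host-only interface the address and mask listed above.
setup_hostonly_if() {
    VBoxManage hostonlyif ipconfig vboxnet0 --ip 192.168.56.1 --netmask 255.255.255.0
}

# Adapter 1: NAT. Adapter 2: host-only on vboxnet0.
# 82540EM is VirtualBox's identifier for the Intel PRO/1000 MT Desktop card.
setup_vm_adapters() {
    local vm="$1"
    VBoxManage modifyvm "$vm" \
        --nic1 nat --nictype1 82540EM \
        --nic2 hostonly --nictype2 82540EM --hostonlyadapter2 vboxnet0
}

# Example (assumed VM names):
# setup_hostonly_if
# setup_vm_adapters dstesting-100
# setup_vm_adapters dstesting-101
```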

See here for more information on VirtualBox networking (if a network warning occurs, or you are having trouble connecting to the VM).

Once the machines have been started, you can log in to them directly for tasks such as reviewing logs produced on the server; instructions are here.

Connect to the Virtual Servers

Starting the Opal test servers

To start an Opal server, open VirtualBox, click on a server and then click on the green arrow labelled Start (or just double-click on the Opal server name).

Depending on how powerful your computer is, the Opal servers may take a few minutes to boot and for Opal to start.

Be aware that once the VM windows have started, clicking inside them can cause your mouse pointer to become captured. The default key to release it is Right-Ctrl.
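
If you prefer the terminal, the VMs can also be started with VBoxManage. This is a minimal sketch, assuming the VMs are registered in VirtualBox under the names dstesting-100 and dstesting-101 (check with "VBoxManage list vms"). Starting them headless means no VM window opens, so the mouse pointer cannot be captured.

```shell
# Start each named VM without opening a window.
start_opal_vms() {
    local vm
    for vm in "$@"; do
        VBoxManage startvm "$vm" --type headless
    done
}

# Example (assumed VM names):
# start_opal_vms dstesting-100 dstesting-101
```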

Virtual Servers' IP addresses

By default, machines on the host-only network can be found at 192.168.56.xxx:

Your computer (the host) will be at 192.168.56.1
The virtual servers have been configured as follows:

  • dstesting-100 – 192.168.56.100
  • dstesting-101 – 192.168.56.101
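
A quick way to confirm that a running VM is reachable at its address is to ping it from the host. The sketch below hard-codes the two addresses listed above; the function names are illustrative, and the ping options are those of the Linux iputils version.

```shell
# Name-to-address table taken from the list above.
vm_ip() {
    case "$1" in
        dstesting-100) echo "192.168.56.100" ;;
        dstesting-101) echo "192.168.56.101" ;;
        *) echo "unknown VM: $1" >&2; return 1 ;;
    esac
}

# Send one ping with a 2-second timeout and report the result.
check_vm() {
    local ip
    ip=$(vm_ip "$1") || return 1
    if ping -c 1 -W 2 "$ip" >/dev/null 2>&1; then
        echo "$1 ($ip) is reachable"
    else
        echo "$1 ($ip) is not reachable"
    fi
}

# Example:
# check_vm dstesting-100
# check_vm dstesting-101
```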

Should you wish to create more servers, see here.

Opal web access

The Opal web interface is accessed in your browser. Simply type the IP address of the VM, followed by the port number.

For example: to access the web interface for dstesting-100, go to:

Note: You will get a warning when connecting to 8443 because the SSL certificate is self-signed.

Please allow up to 2 minutes after launching a VM to gain access to the Opal web interface.
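
Rather than reloading the browser, you can poll the server from a terminal until Opal answers. This is a sketch under assumptions: the URL is built as https://, the VM's IP address and port 8443 (the port named in the note above), and curl is installed on the host. The -k option tells curl to accept the self-signed certificate.

```shell
# Build the assumed Opal URL for a VM address.
opal_url() {
    echo "https://$1:8443"
}

# Try up to 24 times with a 5-second pause between attempts.
wait_for_opal() {
    local url attempt
    url=$(opal_url "$1")
    for attempt in $(seq 1 24); do
        if curl -k -s -o /dev/null --max-time 5 "$url"; then
            echo "Opal is answering at $url"
            return 0
        fi
        sleep 5
    done
    echo "No answer from $url" >&2
    return 1
}

# Example:
# wait_for_opal 192.168.56.100
```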

Logging onto the Opal web interface

The following username and password are used to access the web portal, e.g. to upload your own data:

Opal web interface login details
username: administrator
password: datashield_test&

Shutting down the Opal servers

If you have made no changes to the Opal servers (as is typical for general users), shut down each Opal server when you have finished your analysis:

  • Click the cross at the top right of the VM window. This will give you a list of options for how to shut down the machine.
  • Select Power off the machine.

    Turning off a VM "abruptly" can cause problems, hence the specific shutdown instructions.
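
The same "Power off the machine" action is available from a terminal through VBoxManage. This is a minimal sketch, assuming the VMs are registered under the names dstesting-100 and dstesting-101 (check with "VBoxManage list vms").

```shell
# Power off each named VM, matching the "Power off the machine" option.
poweroff_opal_vms() {
    local vm
    for vm in "$@"; do
        VBoxManage controlvm "$vm" poweroff
    done
}

# Example (assumed VM names):
# poweroff_opal_vms dstesting-100 dstesting-101
```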

Setting up your local machine to use the VMs

These are instructions for installing R on your own machine. That is, the machine you will be using for analysis, not the virtual servers.

These instructions assume you are using Ubuntu on your local machine.

Installing R in Ubuntu

Use:

$ sudo -i

to run the following commands as root.

Add a CRAN R repository

Add the following repository to the /etc/apt/sources.list file; this gives a newer version of R than the one in Ubuntu's repositories. Note: the CRAN R repository you add depends on the Ubuntu release you are running:

  • If you are running Ubuntu 18.04 "Bionic Beaver":
# add-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/'
  • If you are running Ubuntu 16.04 "Xenial Xerus":
# add-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu xenial-cran35/'
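
If you are unsure which release you are running, "lsb_release -cs" prints its codename. The sketch below maps the two codenames covered above to their repository lines and refuses anything else; the function name is illustrative only.

```shell
# Print the repository line for a supported Ubuntu codename.
cran_repo_line() {
    case "$1" in
        bionic|xenial)
            echo "deb https://cloud.r-project.org/bin/linux/ubuntu $1-cran35/" ;;
        *)
            echo "no repository line listed here for release: $1" >&2
            return 1 ;;
    esac
}

# Example, run as root:
# add-apt-repository "$(cran_repo_line "$(lsb_release -cs)")"
```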

You may wish to select a different CRAN mirror, closer to your location.

You can find further details about installing R in Ubuntu on the CRAN website.

Add the public key for the CRAN R repository to your list of keys

# apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9

Install R

(Plus one dependency, needed for installing R packages)

  • "apt-get install r-base" will install the latest version of R
# apt-get update
# apt-get install r-base r-base-dev libcurl4-openssl-dev

Additional libraries for DataSHIELD development

Should you wish to do development on DataSHIELD, the following libraries will also need to be installed:

Additional libraries
# apt-get install libxml2-utils
# apt-get install libxml2-dev
# apt-get install libssl-dev
# apt-get install libgit2-dev

Set-up R for DataSHIELD analysis

You shouldn't do this as root. Instead, leave the root shell by running:

# exit

or by pressing Ctrl + D.

Install the Opal packages

Do this either in the R environment in a terminal (see below), or by installing the RStudio app through the Ubuntu application store and running there the commands below that are prefixed with ">".

$ R
> install.packages('DSI')
> install.packages('DSOpal')
> install.packages('DSLite')

Install the DataSHIELD client packages

> install.packages('dsBaseClient', repos=c(getOption('repos'), 'http://cran.obiba.org'), dependencies=TRUE)

Be aware that this will place the DataSHIELD packages wherever your R libraries are saved. This may be unhelpful for development, in which case you may have to relocate your files.

To exit the R environment within the terminal, use:

> q()
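
The same installs can be run in one non-interactive step with Rscript, which ships with R. This is a sketch rather than the documented method. It names the CRAN mirror explicitly (https://cloud.r-project.org, the mirror used earlier on this page) instead of relying on getOption('repos'), on the assumption that a non-interactive session may not have a mirror configured.

```shell
# Install the Opal packages and the DataSHIELD client as the current user.
install_datashield_client() {
    Rscript \
        -e "install.packages(c('DSI', 'DSOpal', 'DSLite'), repos = 'https://cloud.r-project.org')" \
        -e "install.packages('dsBaseClient', repos = c('https://cloud.r-project.org', 'http://cran.obiba.org'), dependencies = TRUE)"
}

# Example (run as your normal user, not root):
# install_datashield_client
```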

DataSHIELD R Package Manuals

See: Current release

Keeping up to date

If you have installed the DataSHIELD client packages using the method above (that is, within R using install.packages and specifying the Obiba repository), then you can update those client packages as follows:

$ R
> update.packages(repos='http://cran.obiba.org')

You are now fully set up. To start using the DataSHIELD test environment, you can try our Tutorial for DataSHIELD users. The tutorial teaches you the basics of DataSHIELD including how to:

  • login
  • run commands to:
    • generate descriptive statistics
    • subset tables and vectors
    • fit some regression models

Further instructions are available for advanced users of the DataSHIELD test environment:

  • Follow the instructions in the Opal management tutorial to learn how to upload your own data.
  • DataSHIELD R package manuals will be available soon on the release notes page.
  • Install non-CRAN R packages to the training Opal servers (coming soon)



DataSHIELD Wiki by DataSHIELD is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Based on a work at http://www.datashield.ac.uk/wiki