Instructions for Linux users v5
Installing the Virtual Servers
Install VirtualBox
To run the virtual servers you will need to install VirtualBox.
On Ubuntu:
$ sudo apt-get install virtualbox
Download the Virtual Servers
Two virtual servers are available for you to test DataSHIELD with. The virtual servers require 4GB RAM each and about 5GB hard-disk space each.
You can download the VMs from Google Drive:
Import the Virtual Servers into VirtualBox
You will need to import the VMs into VirtualBox. Open VirtualBox and go to:
File > Import Appliance
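The import can also be done from a terminal with VirtualBox's VBoxManage tool. The following is a minimal sketch, assuming the downloaded appliances are .ova files sitting in one directory; the file extension and the helper name import_appliances are assumptions, not something stated above.

```shell
# Import every .ova appliance found in a directory (default: current directory).
# Assumption: the downloaded VMs are .ova files; adjust the pattern if not.
import_appliances() {
    dir="${1:-.}"
    for ova in "$dir"/*.ova; do
        # Skip the unexpanded pattern when the directory holds no .ova files.
        [ -e "$ova" ] || continue
        echo "Importing $ova"
        VBoxManage import "$ova" || return 1
    done
}
```

For example, import_appliances ~/Downloads would import each appliance found in your downloads folder.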
Setting up the Virtual Servers
Setting up the VirtualBox network adapter
- On the VirtualBox top menu go to the File tab and select Preferences... to open the VirtualBox settings window.
- Inside that window, under the Network tab, click on Host-only Networks and then click on the screwdriver icon on the right, which says Edit host-only network (Space) when you hover over it.
- Under the Adapter tab the following should be set:
IPv4 address: 192.168.56.1
IPv4 Network Mask: 255.255.255.0
- Under the DHCP Server tab make sure that Enable server is unchecked. Select OK to save the settings.
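The same adapter settings can probably be applied from the command line. This is a sketch only, assuming the host-only adapter is called vboxnet0 (the usual first name on Linux, but not confirmed above) and that your VirtualBox version accepts the --ifname option for dhcpserver; newer releases document it as --interface.

```shell
# Configure the host-only adapter and disable its DHCP server from the CLI.
# Assumption: the adapter is named vboxnet0; list the real names with:
#   VBoxManage list hostonlyifs
configure_hostonly() {
    ifname="${1:-vboxnet0}"
    VBoxManage hostonlyif ipconfig "$ifname" \
        --ip 192.168.56.1 --netmask 255.255.255.0 || return 1
    # This step may fail if no DHCP server is attached to the adapter,
    # in which case there is nothing to disable.
    VBoxManage dhcpserver modify --ifname "$ifname" --disable || \
        echo "No DHCP server changed for $ifname" >&2
}
```

Calling configure_hostonly with no argument targets vboxnet0; pass a different adapter name as the first argument if yours differs.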
Snapshot the Opal Servers
Before starting the Opal servers it is very useful to take a snapshot. This enables you to roll-back the Opal servers to a pristine state, without needing to delete them and repeat the process above to re-import them.
To take a snapshot, first open VirtualBox and click on a server. Now click the Snapshots button (the camera icon) in the top right. For VirtualBox 6, click on a server, click the three horizontal lines to open a menu and select 'Snapshots', then click the 'Take' button.
Remember that you must repeat this for each Opal server.
Snapshots can be named however you like. Also you can take as many as you like, to save different states of the Opal server.
For example, you could take Snapshot 1 straight after importing the Opal server, and then later Snapshot 2 after uploading some of your own simulated data.
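Snapshots can also be taken and restored with VBoxManage. The sketch below assumes the VMs are registered in VirtualBox under the names dstesting-100 and dstesting-101; those names and the default snapshot name "pristine" are assumptions for illustration.

```shell
# Take a named snapshot of each Opal server.
# Assumption: the VMs are registered as dstesting-100 and dstesting-101;
# check the actual names with: VBoxManage list vms
snapshot_servers() {
    name="${1:-pristine}"
    for vm in dstesting-100 dstesting-101; do
        VBoxManage snapshot "$vm" take "$name" || return 1
    done
}

# Roll one server back to a named snapshot (power the VM off first).
restore_server() {
    VBoxManage snapshot "$1" restore "$2"
}
```

For example, snapshot_servers "Snapshot 1" straight after importing, and later restore_server dstesting-100 "Snapshot 1" to return that server to its imported state.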
Connect to the Virtual Servers
Starting the Opal test servers
VirtualBox Networking
If a network warning occurs, or you are having trouble connecting to the VM (e.g. you can't ping it or open the Opal web page), it might be necessary to reset the network adapter. You can do this by powering down the VM, deleting the host-only adapter in the main VirtualBox settings, and then adding a new host-only adapter that matches the original settings.
- The host machine and the virtual servers are able to communicate with each other through a host-only network. This network cannot communicate with any other network your (host) computer might be on. The host-only adapter will be configured when you import the Virtual Servers; no extra setup is necessary.
- The virtual servers also have a second 'NAT' network adapter. This enables them to communicate through a network the host computer might be on. Consequently you will be able to update the servers (e.g. sudo apt-get upgrade) or install R packages from CRAN or GitHub.
Virtual Servers' IP addresses
By default, machines on the host-only network can be found at 192.168.56.xxx:
Your computer (the host) will be at 192.168.56.1
The virtual servers have been configured as follows:
- dstesting-100 – 192.168.56.100
- dstesting-101 – 192.168.56.101
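A quick way to confirm that both servers are reachable from the host is to ping each address once. This is a minimal sketch; the helper name check_servers is hypothetical and the two-second timeout is an arbitrary choice.

```shell
# Report whether each virtual server answers a single ping.
check_servers() {
    for ip in 192.168.56.100 192.168.56.101; do
        # -c 1: send one packet; -W 2: wait at most two seconds for a reply.
        if ping -c 1 -W 2 "$ip" > /dev/null 2>&1; then
            echo "$ip reachable"
        else
            echo "$ip unreachable"
        fi
    done
}
```

If a started server is reported as unreachable, the host-only adapter settings are the first thing to check.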
Should you wish to create more servers, you can clone any one of these.
You will however need to:
- Reinitialise the MAC address of the VM
- Change the static IP address in /etc/network/interfaces (note that the interface name itself, e.g. eth0, may have changed; check with 'ip addr' or 'ifconfig')
- Change the hostname in /etc/hostname and /etc/hosts
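The two file edits in the list above could be scripted along these lines, run as root inside the cloned server. This is a sketch under assumptions: the helper name reconfigure_clone and its optional root-prefix argument are hypothetical (the prefix only exists so the edits can be tried on copies of the files first), and it does not handle a changed interface name or the MAC address, which still need checking by hand.

```shell
# Rewrite the static IP and hostname inside a cloned server.
# Arguments: old hostname, new hostname, old IP, new IP, and an optional
# root prefix (hypothetical; lets you test against copies of the files).
reconfigure_clone() {
    old_host="$1"; new_host="$2"; old_ip="$3"; new_ip="$4"; root="${5:-}"
    # Escape the dots so sed matches the IP address literally.
    old_ip_re=$(printf '%s' "$old_ip" | sed 's/\./\\./g')
    sed -i "s/$old_ip_re/$new_ip/g" "$root/etc/network/interfaces" || return 1
    sed -i "s/$old_host/$new_host/g" "$root/etc/hostname" "$root/etc/hosts"
}
```

For example, reconfigure_clone dstesting-100 dstesting-102 192.168.56.100 192.168.56.102 would turn a clone of the first server into a third server, where dstesting-102 is an illustrative name only. Reboot the clone afterwards so the changes take effect.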
Opal web access
You can use the Opal web interface in your browser. Simply type the IP address of the VM, followed by the port number.
For example: to access the web interface for dstesting-100, go to:
Note: You will get a warning when connecting to 8443 because the SSL certificate is self-signed.
Logging onto the Opal web interface
Should you need to log in to the Opal web interface, e.g. to upload your own data, open the website and use the following username and password:
username: administrator
password: datashield_test&
SSH access
Once the machines have been started you can log in directly, or connect over SSH, using either of the following accounts:
Username: user, password: password
or
Username: root, password: puppet
For example:
$ ssh user@192.168.56.100
Note: you may want to alias the virtual servers in the host's ~/.ssh/config by adding the following:
$ cat >> ~/.ssh/config << 'EOF'
Host server0
    HostName 192.168.56.100
    User user
EOF
Now you can ssh into the virtual server by:
$ ssh server0
Shutting down the Opal servers
See Shutting down the Opal Servers for instructions on how to shut down the servers depending on whether you have or have not made changes to the Opal test servers.
Setting up your local machine to use the VMs
These are instructions for installing R on your own machine. That is, the machine you will be using for analysis, not the virtual servers.
These instructions assume you are using Ubuntu on your local machine.
Installing R in Ubuntu
Do the following as root:
$ sudo su
Add a CRAN R repository
Add the following to /etc/apt/sources.list. This gives a newer version of R than the one in Ubuntu's repositories:
deb https://cloud.r-project.org/bin/linux/ubuntu xenial-cran35/
Note: the CRAN R repository you add depends on the Ubuntu release you are running.
For example: 'xenial' = '16.04', 'bionic' = '18.04'
You may wish to select a different CRAN mirror, closer to your location.
You can find further details about installing R in Ubuntu on the CRAN website.
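Because the repository line depends on the Ubuntu release, it can be built from the release codename rather than typed by hand. This is a minimal sketch, assuming the "-cran35" suffix from the example above also applies to your release (check the CRAN website before relying on it); the helper name cran_repo_line is hypothetical.

```shell
# Print the sources.list line for a given Ubuntu codename.
# With no argument, the codename of the running system is used.
# Assumption: the "-cran35" suffix from the example applies to your release.
cran_repo_line() {
    codename="${1:-$(lsb_release -cs)}"
    echo "deb https://cloud.r-project.org/bin/linux/ubuntu ${codename}-cran35/"
}
```

Run as root, cran_repo_line >> /etc/apt/sources.list would append the line for the release you are running.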
Add the public key for the CRAN R repository to your list of keys
# apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
Install R
(Plus one dependency, needed for installing R packages)
# apt-get update
# apt-get install r-base r-base-dev libcurl4-openssl-dev
Additional libraries for DataSHIELD development
The following libraries will need to be installed should Ubuntu users wish to do DataSHIELD development:
# apt-get install libxml2-utils
# apt-get install libxml2-dev
# apt-get install libssl-dev
# apt-get install libgit2-dev
Set-up R for DataSHIELD analysis
Don't do this as root.
Install the Opal packages
$ R
install.packages('rjson')
install.packages('RCurl')
install.packages('mime')
install.packages('opal', repos='http://cran.obiba.org', type='source')
install.packages('opaladmin', repos='http://cran.obiba.org', type='source')
Install the DataSHIELD client packages
install.packages('dsBaseClient', repos=c(getOption('repos'), 'http://cran.obiba.org'), dependencies=TRUE)
Be aware that this will place the DataSHIELD packages wherever your R libraries are saved, which may be unhelpful for development.
DataSHIELD R Package Manuals
See: Current release
Keeping up to date
If you have installed the DataSHIELD client packages using the method above (that is, within R using install.packages and specifying the Obiba repository), then you can update those client packages as follows:
# R
update.packages(repos='http://cran.obiba.org')
You are now fully set up. To start using the DataSHIELD test environment, work through our Tutorial for DataSHIELD users. The tutorial teaches you the basics of DataSHIELD including how to:
- log in
- run commands to:
  - generate descriptive statistics
  - subset tables and vectors
  - fit some regression models
Further instructions are available for the advanced uses of the DataSHIELD test environment:
- Install non-CRAN R packages to the test Opal servers (coming soon)
- Work through the Opal management tutorial to learn how to Update the test Opal environment and Upload your own data to the test Opal servers.
- All DataSHIELD R package manuals are available as part of the release notes.
DataSHIELD Wiki by DataSHIELD is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Based on a work at http://www.datashield.ac.uk/wiki