...

The image below shows the variables tab for the simulated dataset CNSIM used within the v4 Tutorial for DataSHIELD users.

The table below summarises the columns in the variables tab, with example values taken from the test data built into the training environment (shown in the spreadsheet image above).

| Column Names | Description | Default value | Example value in the test data | Notes |
| --- | --- | --- | --- | --- |
| table | The table name the variable will be added to | Table | Column A (CNSIM) | This is the table name you refer to in your DataSHIELD login details. Note: it is critical that the table name appears in every row. |
| name | The variable name | | Column B (e.g. LAB_TSC) | Mandatory field. Becomes the column name in Opal for that variable. |
| valueType | The value type of the variable | text | Column C (e.g. decimal, integer) | See further information on variable types and classes. |
| entityType | Opal can store data on different entities | Participant | Column D (e.g. Participant) | Examples: Participant (each row corresponds to a different participant), Instrument, Area, Drug. |
| referencedEntityType | If the variable values are entity identifiers, this is the type of the entities that are referenced | | Column E | Can be left blank. |
| mimeType | The MIME type of the variable, to help applications display documents | | Column F | Examples: image/jpeg, application/excel. Can be left blank. |
| unit | The unit in which variables are expressed | | Column G (e.g. Participant) | Examples: cm, kg, ml etc. Can be left blank. |
| repeatable | Repeatable measurements | 0 | Column H (0) | 1 if repeatable, 0 if not (e.g. three measures of blood pressure). |
| occurrenceGroup | Name of a repeatable variable group | | Column I | Example: [measure value, measure date] is a group of variables that can be repeated. Can be left blank. |
| label:en | Label of the variable | | Column J | Can be localized by language (e.g. label:en for English, label:fr for French). |
| alias | Alternative name for the variable, usually used for defining a shorter name for the variable | | Column K | |
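The column layout above can be sketched in code as a small row builder. This is only an illustrative sketch, not part of Opal or DataSHIELD: the helper names `variable_row` and `rows_to_csv` are hypothetical, the value types given for the two example variables are assumptions, and the output is CSV text for inspection only (the dictionary itself is an .xls file, which this sketch does not write).

```python
import csv
import io

# Column order of the variables tab, as listed in the table above.
VARIABLE_COLUMNS = [
    "table", "name", "valueType", "entityType", "referencedEntityType",
    "mimeType", "unit", "repeatable", "occurrenceGroup", "label:en", "alias",
]

# Defaults taken from the "Default value" column; other cells stay blank.
DEFAULTS = {"valueType": "text", "entityType": "Participant", "repeatable": "0"}


def variable_row(table, name, fields=None):
    """Build one variables-tab row (hypothetical helper).

    `table` and `name` are required because the table name must appear
    in every row and the variable name is a mandatory field.
    """
    if not table or not name:
        raise ValueError("table and name are required in every row")
    fields = dict(fields or {})
    unknown = set(fields) - set(VARIABLE_COLUMNS)
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    row = {column: "" for column in VARIABLE_COLUMNS}
    row.update(DEFAULTS)
    row.update(fields)
    row["table"] = table
    row["name"] = name
    return row


def rows_to_csv(rows):
    """Render rows as CSV text so they can be inspected by eye."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=VARIABLE_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# The value types below are assumptions chosen for illustration.
rows = [
    variable_row("CNSIM", "LAB_TSC", {"valueType": "decimal"}),
    variable_row("CNSIM", "PM_BMI_CATEGORICAL", {"valueType": "integer"}),
]
print(rows_to_csv(rows))
```

Filling the defaults in one place keeps every row consistent, and the check on `table` and `name` catches the most likely omission before the dictionary is uploaded.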

...

The image below shows the categories tab for the simulated dataset CNSIM used within the v4 Tutorial for DataSHIELD users. Each category of each variable is represented by a single row in the spreadsheet. For example, in the dictionary file below, three rows (rows 12-14 inclusive) are for PM_BMI_CATEGORICAL, as it has three categories.

...

| Column Names | Description | Default value | Example value in the test data | Notes |
| --- | --- | --- | --- | --- |
| table | The table name the variable will be added to | Table | Column A (CNSIM) | This is the table name you refer to in your DataSHIELD login details. Note: it is critical that the table name appears in every row. |
| variable | The variable name (mandatory field) | | Column B (e.g. DIS_CVA) | Mandatory field. One row per category for each variable. |
| name | The variable category | integer | Column C (e.g. 1) | Mandatory field. One row per category for each variable. |
| code | Can be left blank | | Column D | Can be left blank. |
| missing | Some categories are interpreted as missing answers (e.g. 'Don't know', 'Prefer not to answer') | 0 | Column E | Use 1 for missing and 0 for not missing (normal answer). |
| label:en | Label of the variable category | | Column F | Human-readable text description of the category. Can be localized by language (e.g. label:en for English, label:fr for French). |
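The "one row per category for each variable" rule can be sketched as follows. This is a minimal illustration, not an Opal API: the helper `category_rows` is hypothetical, and the category names and labels shown are placeholders rather than the real CNSIM values.

```python
# Column order of the categories tab, as listed in the table above.
CATEGORY_COLUMNS = ["table", "variable", "name", "code", "missing", "label:en"]


def category_rows(table, variable, categories, missing=()):
    """Return one categories-tab row per (name, label) pair.

    `missing` lists the category names to flag as missing answers
    (1 for missing, 0 for a normal answer).
    """
    rows = []
    for name, label in categories:
        rows.append({
            "table": table,          # repeated in every row
            "variable": variable,    # repeated for each category
            "name": str(name),
            "code": "",              # may be left blank
            "missing": "1" if name in missing else "0",
            "label:en": label,
        })
    return rows


# Three categories give three rows; names and labels are placeholders.
bmi_rows = category_rows(
    "CNSIM",
    "PM_BMI_CATEGORICAL",
    [(1, "category 1"), (2, "category 2"), (3, "category 3")],
)

# A hypothetical variable with a "Don't know" answer flagged as missing.
example_rows = category_rows(
    "CNSIM",
    "EXAMPLE_VAR",
    [(0, "No"), (1, "Yes"), (9, "Don't know")],
    missing=(9,),
)

for row in bmi_rows + example_rows:
    print([row[column] for column in CATEGORY_COLUMNS])
```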

...

  • To create a new project, click on the Projects tab in the top left (after clicking, it appears in green on the dark blue horizontal bar)
  • Click Add Project.

Fill in the details of your project:

  • You must specify a name and this will be used to point to the data. For convenience do not use a very long name. The example below shows the name as CNSIM.
  • The "title" can then be a longer explanatory label for the table
  • The database currently defaults to MongoDB but can also be MySQL
  • You can give an additional description if you wish

Info
Project and table names

In DataSHIELD, to refer uniquely to a table held in Opal you must specify both the Opal project name and the table name. For example, the table EMISS in the SURVSIM project is referred to as SURVSIM.EMISS in your DataSHIELD login template. In the training data, the project and the table within it are both called CNSIM, so the table is referred to as CNSIM.CNSIM.
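The PROJECT.TABLE naming convention can be expressed as a pair of small helpers. This is an illustrative sketch only: the function names are hypothetical, and splitting on the first dot assumes that the project name itself contains no dot.

```python
def qualified_table_name(project, table):
    """Join an Opal project name and table name as PROJECT.TABLE."""
    if not project or not table:
        raise ValueError("both the project and the table name are required")
    return f"{project}.{table}"


def split_qualified_name(qualified):
    """Split PROJECT.TABLE at the first dot.

    Assumption: the project name contains no dot.
    """
    project, separator, table = qualified.partition(".")
    if not separator or not project or not table:
        raise ValueError(f"expected PROJECT.TABLE, got {qualified!r}")
    return project, table


print(qualified_table_name("SURVSIM", "EMISS"))
print(qualified_table_name("CNSIM", "CNSIM"))
print(split_qualified_name("SURVSIM.EMISS"))
```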

...

  • To upload data files from your local computer, click on Dashboard in the top menu bar (the word turns green)

  • Click Manage Files from the left hand menu

  • By default, you will see a list of files currently held in your Opal Home folder. This is where your dictionary and data files will be saved.  

...

  • Click the grey Upload button from the top tab

  • This brings up a new window; click Choose file

  • Browse to the .xls file to upload from your local machine
  • Select that file and click Open (you can simply double-click the file in Windows)
  • Click the dark blue Upload button at the bottom of the window
  • If that file already exists in that location then it will ask you whether to replace it or not (see above tip)
  • Repeat for the .csv (data file)

...

  • Click on the Projects tab from the top menu (it will turn green) and click on your project name (CNSIM in the example below)
  • Click on the large blue +Add Table button that sits above the list of tables in the project you have specified
  • Select Add/update tables from dictionary ... from the drop down menu

  • Use HOMES and SYSTEM on your left menu to navigate to the folder that holds the .xls data dictionary file
  • Click the small square box to the left of the file name (a tick appears) and then click the dark blue Select button towards the bottom right

  • Click the blue Next button
  • Review the Opal table in the pop-up window. If this is a new data table, the information in the window should tell you the name of the Table you have asked to be created and the number of New Variables (corresponding to the number of columns in your .csv data file).
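Before uploading, the match between the dictionary's variable names and the columns of the .csv data file can be checked locally. The sketch below is only an illustration under stated assumptions: `compare_columns` is a hypothetical helper, the sample CSV text is made up, and the optional `id_column` argument assumes the data file may carry an identifier column that is not listed as a variable (pass None if it does not).

```python
import csv
import io


def compare_columns(csv_text, variable_names, id_column=None):
    """Compare the CSV header with the dictionary's variable names.

    Returns (missing_in_csv, missing_in_dictionary); both lists are
    empty when the header and the dictionary agree.
    """
    header = next(csv.reader(io.StringIO(csv_text)), [])
    columns = [column for column in header if column != id_column]
    missing_in_csv = [v for v in variable_names if v not in columns]
    missing_in_dictionary = [c for c in columns if c not in variable_names]
    return missing_in_csv, missing_in_dictionary


# Made-up sample data, for illustration only.
sample_csv = "ID,LAB_TSC,PM_BMI_CATEGORICAL\n1,5.2,2\n2,6.1,1\n"

print(compare_columns(sample_csv, ["LAB_TSC", "PM_BMI_CATEGORICAL"], id_column="ID"))
print(compare_columns(sample_csv, ["LAB_TSC", "DIS_CVA"], id_column="ID"))
```

If either list is non-empty, the count of New Variables reported by Opal is unlikely to match what you expect, so it is worth fixing the dictionary or the data file first.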

...

  • If the information is correct, click the small box to the left of the table name (CNSIM in the example below). A tick will appear. 
  • Click the dark blue Finish button from the bottom right

  • This will take you back to the list of all available tables in the chosen project, and after a few seconds this will be refreshed to include the new table you have created

...

  • Select your project again by clicking on the Projects tab from the top menu and click on your project name (CNSIM in this particular example)
  • Your table is listed as holding 0 Entities (indicating it is empty)

  • Click on the small box at the left of the table name, and a tick will appear.
  • Click the grey Import button from the tabs above the table.  This opens a window to define file format.

...

Tip

Your data has now successfully been uploaded into an Opal server. You will need to repeat the process for each Opal server you wish to use.

To start using the DataSHIELD training environment, work through our Tutorial for DataSHIELD users using your own data. The tutorial teaches you the basics of DataSHIELD, including how to:

  • login
  • run commands to:
    • generate descriptive statistics
    • subset tables and vectors
    • fit some regression models

Assistance with DataSHIELD can be found:

...

Opal is supported by the software creators at Obiba. Opal support is available on the Obiba-users mailing list, where support questions can be posted for free. Opal general enquiries can be sent to info@obiba.org.